diff --git a/1-current-metaprogramming/1-metaprog-and-hpc-overview.tex b/1-current-metaprogramming/1-metaprog-and-hpc-overview.tex index a184fee..a6af251 100644 --- a/1-current-metaprogramming/1-metaprog-and-hpc-overview.tex +++ b/1-current-metaprogramming/1-metaprog-and-hpc-overview.tex @@ -217,9 +217,9 @@ \section{ In the next section, we will take a deeper look at \cpp as it has powerful metaprogramming capabilities, while providing bleeding edge support for -all kinds of computer architectures through a decades-old ecosystem. Moreover, -its development is still very active, with many significant metaprogramming -proposals being adopted throughout the latest -releases\cite{10.1145/3564719.3568692}. +all kinds of computer architectures through a decades-old +ecosystem. Moreover, its development is still very active, with many significant +metaprogramming proposals being adopted throughout the latest releases +\cite{10.1145/3564719.3568692}. \end{document} diff --git a/1-current-metaprogramming/2-cpp-constructs.tex b/1-current-metaprogramming/2-cpp-constructs.tex index f879b7f..8402c40 100644 --- a/1-current-metaprogramming/2-cpp-constructs.tex +++ b/1-current-metaprogramming/2-cpp-constructs.tex @@ -251,10 +251,10 @@ \subsection{ \end{lstlisting} \subsection{ - Compile time logic + Compile time computation } -Compile time logic can be achieved in many \cpp constructs. +Compile time computation can be achieved through many \cpp constructs. \begin{itemize} @@ -456,7 +456,7 @@ \subsection{ enable the creation of high level, portable \glspl{dsel} that resolve into high performance code thanks to a combination of metaprogramming techniques. In the next section, we will see a collection of libraries that go beyond the -idea of using templates for math code generation, and implement or enable the +idea of using templates for math code generation, and enable or facilitate the implementation of arbitrary compile-time programs. 
\section{ @@ -519,7 +519,8 @@ \subsection{ All these libraries either enable \gls{tmp}, or use \gls{tmp} to achieve a specific goal. However with the introduction of \gls{constexpr} programming, a new range of compile-time libraries aims to provide new capabilities -for this new metaprogramming paradigm. +for this new metaprogramming paradigm, including the \cpp standard library +itself. \subsection{ Value based metaprogramming @@ -563,10 +564,10 @@ \subsection{ \lstinline{std::deque} is not usable in \glspl{consteval} whereas its \textbf{C'est} equivalent named \lstinline{cest::deque} is. It was instrumental for this thesis as the research work I present here started -a long time before \cpp23 was adopted and standard libraries as well as +before \cpp23 was adopted, and before standard libraries as well as compilers started implementing it. -Similar to previous examples, listing \ref{lst:cest-example} shows a +Similar to the previous examples, listing \ref{lst:cest-example} shows a compile-time program in which we find the index of the first element of value 6. Note that in this example, we are using properly typed values and functions instead of templates to represent values, predicates, and functions. diff --git a/1-current-metaprogramming/3-gemv.tex b/1-current-metaprogramming/3-gemv.tex index d9efed0..ea0b602 100644 --- a/1-current-metaprogramming/3-gemv.tex +++ b/1-current-metaprogramming/3-gemv.tex @@ -30,8 +30,8 @@ \section{ The efforts of optimizing the performance of \gls{blas} routines fall into two main directions. The first direction is about writing very specific assembly code. This is the case for -almost all the vendor libraries including Intel MKL\cite{hpcs1}, -AMD ACML\cite{hpcs2} etc. To provide the users with efficient \gls{blas} +almost all the vendor libraries including Intel MKL \cite{hpcs1}, +AMD ACML \cite{hpcs2} etc. 
To provide the users with efficient \gls{blas} routines, the vendors usually implement their own routines for their own hardware using assembly code with specific optimizations which is a low level solution that gives the @@ -51,12 +51,12 @@ \section{ abstraction level and the efficiency of the generated codes. This is for example the case of the approach followed by the Formal Linear Algebra Methods Environment (FLAME) -with the Libflame library\cite{hpcs3}. Thus, it offers a framework to -develop dense linear solvers using algorithmic skeletons\cite{hpcs4} +with the Libflame library \cite{hpcs3}. Thus, it offers a framework to +develop dense linear solvers using algorithmic skeletons \cite{hpcs4} and an API which is more user-friendly than LAPACK, giving satisfactory performance results. A more generic approach is the one followed in recent years by \cpp libraries built around -expression templates\cite{hpcs5} or other generative programming\cite{hpcs6} +expression templates \cite{hpcs5} or other generative programming \cite{hpcs6} principles. In this section, we will focus on such an approach. To show the interest of this approach, we consider as example the matrix-vector multiplication kernel (gemv) which @@ -81,7 +81,7 @@ \section{ } As we saw earlier, metaprogramming is used in \cpp \cite{hpcs9},D \cite{hpcs10}, -OCaml\cite{hpcs11} or Haskell\cite{hpcs12}. A subset of basic notions emerges: +OCaml \cite{hpcs11} or Haskell \cite{hpcs12}. A subset of basic notions emerges: \begin{itemize} \item @@ -155,8 +155,9 @@ \section{ or template arguments in a comma-separated code fragment. Its main use was to provide the syntactic support required to write a code with variadic template -arguments. However, Niebler and Parent showed that -this can be used to generate far more complex code +arguments. However, Niebler and Parent +% TODO: ref +showed that this can be used to generate far more complex code when paired with other language constructs. 
Both code replication and a crude form of code unrolling were possible. However, it required the use of some @@ -621,7 +622,7 @@ \subsection{ Again, the performances of our implementation are close to that of OpenBLAS and are even quite better for matrices of small sizes ranging from 4 to 16 elements. For example, for a -matrix of size 8 elements,the automatically generated code has +matrix of size 8 elements, the automatically generated code has a performance that is 3 times better than the OpenBLAS \gls{gemv} kernel (15.78 Gflop/s vs 5.06 Gflop/s). Two phenomenons appear however. The first one is that the increased number diff --git a/2-compilation-time-analysis/2-ctbench-design.tex b/2-compilation-time-analysis/2-ctbench-design.tex index b9e856d..41bc0c5 100644 --- a/2-compilation-time-analysis/2-ctbench-design.tex +++ b/2-compilation-time-analysis/2-ctbench-design.tex @@ -6,7 +6,7 @@ \section{ ctbench features } -ctbench implements a new methodology for the analysis of compilation times: +ctbench implements a new method for the analysis of compilation times: it allows users to define \cpp sizable benchmarks to analyze the scaling performance of \cpp metaprogramming techniques, and compare techniques against each other. @@ -172,7 +172,7 @@ \subsection{ Note that JSON data is not stored directly. This is intentional since a profiling file for a single benchmark repetition can reach volumes up to several hundreds megabytes, therefore data loading - is delayed to prevent RAM overcomsumption. + is delayed to prevent RAM overconsumption. 
\item diff --git a/3-new-approaches-to-metaprogramming/1-technical-background.tex b/3-new-approaches-to-metaprogramming/1-technical-background.tex index 795ffda..5774bf5 100644 --- a/3-new-approaches-to-metaprogramming/1-technical-background.tex +++ b/3-new-approaches-to-metaprogramming/1-technical-background.tex @@ -8,7 +8,7 @@ \section{ % TODO: dig this for references https://youtu.be/q6X9bKpAmnI -As mentioned in \ref{lbl:cpp-meta-constructs}, the \gls{constexpr} +As mentioned in section \ref{lbl:cpp-meta-constructs}, the \gls{constexpr} allows variables and functions to be used in \gls{consteval}, making a whole subset of the \cpp language itself usable for compile-time logic. @@ -73,8 +73,9 @@ \subsection{ // Function template foo takes a polymorphic NTTP template constexpr int foo() { return 1; } -// generate's return value cannot be stored in a constexpr variable -// or used as a NTTP, but it can be used to produce other literal +// generate's return value cannot be stored +// in a constexpr variable or used as a NTTP, +// but it can be used to produce other literals // constexpr auto a = generate(); // ERROR constexpr auto b = generate().size(); // OK @@ -133,7 +134,7 @@ \subsection{ The addition of \gls{constexpr} memory allocation goes hand in hand with the ability to use virtual functions in \glspl{consteval}. This feature allows calls to virtual functions in constant expressions -\cite{virtual-constexpr}. This allows heritage-based polymorphism in +\cite{virtual-constexpr}. This allows inheritance-based polymorphism in \gls{constexpr} programming when used with \gls{constexpr} allocation of polymorphic types. 
diff --git a/3-new-approaches-to-metaprogramming/2-constexpr-codegen-techniques.tex b/3-new-approaches-to-metaprogramming/2-constexpr-codegen-techniques.tex index ee0a467..2d6ff1f 100644 --- a/3-new-approaches-to-metaprogramming/2-constexpr-codegen-techniques.tex +++ b/3-new-approaches-to-metaprogramming/2-constexpr-codegen-techniques.tex @@ -72,7 +72,7 @@ \section{ This section will cover the implementation of \gls{constexpr} \glspl{ast}, and techniques to work around the limitations that prevent their direct use as \glspl{nttp} either through functional wrapping techniques, or through -their convertion of into values that satisfy \gls{nttp} requirements. +their conversion into values that satisfy \gls{nttp} requirements. \subsection{ Code generation from pointer tree data structures @@ -80,7 +80,7 @@ \subsection{ \label{lbl:ptr-tree-codegen} -In this subsection, we introduce three techniques that will allow us to use +In this section, we introduce three techniques that will allow us to use a pointer tree generated from a \gls{constexpr} function as a template parameter for code generation. @@ -106,8 +106,8 @@ \subsection{ a \gls{nttp}, \end{itemize} -The compilation performance measurements in \ref{lbl:compile-time-eval} will -rely on the same data passing techniques, but with more complex examples such +The compilation performance measurements in section \ref{lbl:compile-time-eval} +will rely on the same data passing techniques, but with more complex examples such as embedded compilation of Brainfuck programs, and of \LaTeX math formulae into high performance math computation kernels. @@ -173,7 +173,7 @@ \subsubsection{ The downside of using this value passing technique is that the number of calls of the generator function is proportional to the -number of nodes. Experiments in \ref{lbl:compile-time-eval} highlight +number of nodes. Experiments in section \ref{lbl:compile-time-eval} highlight the scaling issues induced by this code generation method.
And while it is very quick to implement, there are still difficulties related to \gls{constexpr} memory constraints and compiler or library support. @@ -310,7 +310,8 @@ \subsubsection{ type-based paradigms, even when \gls{constexpr} allocated memory is involved. It is worth mentioning that both this technique and the previous one induce -very high compilation times, as we will see in \ref{lbl:bf-parsing-and-codegen}. +very high compilation times, as we will see in section +\ref{lbl:bf-parsing-and-codegen}. \subsubsection{ FLAT - AST serialization @@ -477,11 +478,11 @@ \subsection{ \end{figure} Parsing algorithms may output serialized data. In this case, the serialization -step described in \ref{lbl:flat-technique} is not needed, and the result +step described in section \ref{lbl:flat-technique} is not needed, and the result can be converted into a static array. This makes the code generation process rather straightforward as no complicated transformation is needed, while still scaling decently as we will see in -\ref{lbl:compile-time-eval} where we will be using a +section \ref{lbl:compile-time-eval} where we will be using a Shunting Yard parser \cite{shunting-yard} to parse math formulae to a \gls{rpn}, which is its postfix notation. diff --git a/3-new-approaches-to-metaprogramming/3-brainfuck.tex b/3-new-approaches-to-metaprogramming/3-brainfuck.tex index b704bde..c495bf2 100644 --- a/3-new-approaches-to-metaprogramming/3-brainfuck.tex +++ b/3-new-approaches-to-metaprogramming/3-brainfuck.tex @@ -10,7 +10,7 @@ \section{ pointer trees generated by \gls{constexpr} functions, we will use them in the context of compile time parsing and code generation for the Brainfuck language. Therefore use data structures and code generation techniques introduced in -subsection \ref{lbl:ptr-tree-codegen}. +section \ref{lbl:ptr-tree-codegen}. 
We chose Brainfuck as a first language for several reasons: the language generates approximately one \gls{ast} node per character which makes the size @@ -104,7 +104,7 @@ \subsection{ figuring out how to transform its result, which contains dynamic memory, into \cpp code. -As you may remember from subsection \ref{lbl:constexpr-programming}, +As you may remember from section \ref{lbl:constexpr-programming}, there is no direct way to use values holding pointers to dynamic memory directly as \glspl{nttp}. Therefore it must be conveyed by other means or transformed into \glspl{litval} @@ -121,7 +121,7 @@ \subsection{ The first backend implemented in the poacher project was the \gls{et} backend, where the AST is transformed into a type-based \gls{ir} -as described in \ref{lbl:pbg-et-technique}. It was later simplified to +as described in section \ref{lbl:pbg-et-technique}. It was later simplified to remove the \gls{ir} transformation step, which gave the \gls{pbg} backend. \begin{lstlisting}[ @@ -143,7 +143,7 @@ \subsection{ \end{lstlisting} The implementations of these two backends do not differ significantly from -the ones described in \ref{lbl:pbg-technique} and \ref{lbl:pbg-et-technique}: +the ones described in sections \ref{lbl:pbg-technique} and \ref{lbl:pbg-et-technique}: the generators that evaluate each node are passed as template parameters, only to work around the fact that pointers to \gls{constexpr} allocated memory cannot be used in a \gls{nttp}. @@ -153,7 +153,7 @@ \subsection{ pack of arbitrary types that may be \lstinline{et_token_t} elements for single tokens, or \lstinline{et_while_t} elements for nested while loops. 
-From there, the code generation occurs in the same way as it did in +From there, the code generation occurs in the same way as it did in section \ref{lbl:pbg-et-technique}: the \gls{et} is traversed recursively using overloaded functions to generate the \cpp code that corresponds to every while block, and down to every instruction in the \gls{et}. @@ -167,11 +167,11 @@ \subsection{ The last remaining backend to implement is the one that transforms the \gls{ast} into a serialized, \gls{nttp} compatible \gls{ir}. The case of Brainfuck introduces a notable difference compared to the simple -use case seen in \ref{lbl:flat-technique}: while \gls{ast} nodes in Brainfuck +use case seen in section \ref{lbl:flat-technique}: while \gls{ast} nodes in Brainfuck can have an arbitrary number of children nodes, as opposed to add nodes in the simple use case I presented earlier. This introduces a few technical differences with regard to serialization and code generation, which I will -cover in detail in this subsection. +cover in detail in this section. \begin{lstlisting}[ language=c++, diff --git a/3-new-approaches-to-metaprogramming/4-math-parsing.tex b/3-new-approaches-to-metaprogramming/4-math-parsing.tex index 98bbada..0b77ff6 100644 --- a/3-new-approaches-to-metaprogramming/4-math-parsing.tex +++ b/3-new-approaches-to-metaprogramming/4-math-parsing.tex @@ -13,7 +13,7 @@ \section{ a parsing algorithm that transforms infix formulas into their \gls{rpn} representation. -In \ref{lbl:codegen-from-rpn}, we already demonstrated that generating code +In section \ref{lbl:codegen-from-rpn}, we already demonstrated that generating code from \gls{rpn} formulas is a rather easy task, therefore this section will only cover the Shunting Yard algorithm, and the use of \gls{rpn} code generation applied to high performance computing. 
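Since the Shunting Yard algorithm outputs a postfix (\gls{rpn}) token stream, a compact sketch of consuming such a stream at compile time may help. This is illustrative only: the token set and eval below are invented, and where the thesis generates code through a template metaprogram with a tuple-based stack, this sketch merely evaluates values with a plain \gls{constexpr} function.

```cpp
#include <array>
#include <cstddef>

// Illustrative token set for a tiny RPN stream.
enum class token { lit1, lit2, lit3, add, mul };

// "1 2 + 3 *" in postfix, i.e. (1 + 2) * 3
constexpr std::array<token, 5> rpn{token::lit1, token::lit2, token::add,
                                   token::lit3, token::mul};

// Stack-based RPN evaluation, usable in constant expressions.
template <std::size_t N>
constexpr int eval(std::array<token, N> const &program) {
  std::array<int, N> stack{};
  std::size_t top = 0;
  for (token t : program) {
    switch (t) {
      case token::lit1: stack[top++] = 1; break;
      case token::lit2: stack[top++] = 2; break;
      case token::lit3: stack[top++] = 3; break;
      case token::add: --top; stack[top - 1] += stack[top]; break;
      case token::mul: --top; stack[top - 1] *= stack[top]; break;
    }
  }
  return stack[0];
}

static_assert(eval(rpn) == 9, "(1 + 2) * 3");
```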
@@ -23,7 +23,7 @@ \subsection{ } As parsing algorithms and \gls{constexpr} dynamic data representations were -already covered in \ref{lbl:bf-parsing-and-codegen}, the implementation of +already covered in section \ref{lbl:bf-parsing-and-codegen}, the implementation of the Shunting Yard algorithm will not be covered in detail here. A thoroughly commented \gls{constexpr} implementation is available in appendix \ref{app:shunting-yard-impl}. It features the algorithm itself, as well as @@ -38,7 +38,7 @@ \subsection{ The worst case time and memory complexity of the algorithm is $O(N)$. Once again, code generation from postfix notation formulas was already covered -in \ref{lbl:codegen-from-rpn}, so we will skip straight to the use of Blaze +in section \ref{lbl:codegen-from-rpn}, so we will skip straight to the use of Blaze to generate high performance code from \gls{constexpr} formulas. \subsection{ diff --git a/bibliography/biblio.bib b/bibliography/biblio.bib index a219d12..2e81258 100644 --- a/bibliography/biblio.bib +++ b/bibliography/biblio.bib @@ -140,7 +140,7 @@ @article{10.1145/243439.243447 @article{10.1023/A:1010095604496, author = {Futamura, Yoshihiko}, - title = {Partial Evaluation of Computation Process—AnApproach to a + title = {Partial Evaluation of Computation Process — An Approach to a Compiler-Compiler}, year = {1999}, issue_date = {December 1999}, diff --git a/bibliography/hpcs2018.bib b/bibliography/hpcs2018.bib index 41801f1..261647c 100644 --- a/bibliography/hpcs2018.bib +++ b/bibliography/hpcs2018.bib @@ -133,7 +133,7 @@ @book{hpcs9 } @misc{hpcs10, - title = {Templates revisited - d programming language}, + title = {Templates revisited - D programming language}, author = {Walter Bright}, url = {https://dlang.org/articles/templates-revisited.html}, } @@ -407,7 +407,7 @@ @misc{hpcs21 } @book{hpcs22, - title = {Using the gnu compiler collection: a gnu manual for gcc version 4.3. 
+ title = {Using the GNU compiler collection: a GNU manual for gcc version 4.3. 3}, author = {Stallman, Richard M}, year = {2009}, diff --git a/format.cpp b/format.cpp index f9b977b..dd9cc29 100644 --- a/format.cpp +++ b/format.cpp @@ -1,9 +1,20 @@ -constexpr std::vector<std::vector<int>> get_vector() { - return {{1, 2, 3}, {4, 5, 6}}; +std::tuple<ast_block_t, token_vec_t::const_iterator> +parse_block(token_vec_t::const_iterator parse_begin, + token_vec_t::const_iterator parse_end) { + std::vector<ast_node_ptr_t> block_content; + for (; parse_begin != parse_end; parse_begin++) { + if (*parse_begin == while_end_v) { + return {std::move(block_content), parse_begin}; + } else if (*parse_begin == while_begin_v) { + auto [while_block_content, while_block_end] = + parse_block(parse_begin + 1, parse_end); + block_content.push_back( + std::make_unique<ast_while_t>(std::move(while_block_content))); + parse_begin = while_block_end; + } else if (*parse_begin != nop_v) { + block_content.push_back( + ast_node_ptr_t(std::make_unique<ast_token_t>(*parse_begin))); + } + } + return {ast_block_t(std::move(block_content)), parse_end}; } - -// Pas bien: -// constexpr std::vector<int> subvec_0 = get_vector()[0]; - -// Bien: -constexpr auto get_subvec_0 = []() { return get_vector()[0]; } diff --git a/french-condensed.tex b/french-condensed.tex index b64e200..41270be 100644 --- a/french-condensed.tex +++ b/french-condensed.tex @@ -67,7 +67,7 @@ templates. Ces techniques, en plus des aspects <> du langage \cpp, -on fait na\^itre un \'ecosyst\`eme de biblioth\`eques m\'etaprogramm\'ees +ont fait na\^itre un \'ecosyst\`eme de biblioth\`eques m\'etaprogramm\'ees permettant d'optimiser automatiquement des programmes pour des architectures tr\`es diverses: les architectures SIMD, les processeurs multi-coeurs, les GPU, ou tout autre type d'acc\'el\'erateur programmable. @@ -79,7 +79,7 @@ la m\'etaprogrammation de templates pour impl\'ementer des analyseurs syntaxiques permettant de reconna\^itre des langages arbitraires.
-Depuis C++20, la m\'emoire dynamique peut \^etre utilis\'ee dans les fonctions +Depuis \cpp20, l'allocation dynamique peut \^etre utilis\'ee dans des fonctions ex\'ecut\'ees \`a la compilation d'un programme \cpp. Nous souhaitons \'etudier l'impact de cette nouvelle fonctionnalit\'e sur l'impl\'ementation d'analyseurs syntaxiques de langages arbitraires, autant sur la conception des analyseurs @@ -139,24 +139,25 @@ \section{ impl\'ementations en assembleur d\'eclin\'ees pour chaque micro-architecture fournies par la biblioth\`eque OpenBLAS en l'appliquant sur des matrices de tailles allant de $4 \times 4$ \`a $512 \times 512$. -Malgr\'e des r\'esultats peu concluants en SSE4.2, les r\'esultats se montrent -tr\`es positifs en AVX2 o\`u nous obtenons des r\'esultats sup\'erieurs ou -\'equivalents \`a OpenBLAS. Quant \`a ARM, les r\'esultats nous y obtenons +Malgr\'e les r\'esultats peu concluants en SSE4.2 +(figure \ref{fig:gemv-sse-bench-fr}), les r\'esultats se montrent +tr\`es positifs en AVX2 (figure \ref{fig:gemv-avx-bench-fr}) o\`u nous obtenons +des r\'esultats sup\'erieurs ou \'equivalents \`a OpenBLAS. Quant \`a ARM +(figure \ref{fig:gemv-arm-bench-fr}), nous y obtenons des performances sup\'erieures \`a celles d'OpenBLAS quelle que soit la taille -de la matrice (voir figures \ref{fig:gemv-sse-bench-fr}, -\ref{fig:gemv-avx-bench-fr} et \ref{fig:gemv-arm-bench-fr}). +de la matrice. \\ Ce chapitre a fait l'objet d'une publication scientifique \`a la conf\'erence internationale \textbf{High Performance Computation \& Simulation 2018}. \section{ - Nouvelle m\'ethodologie pour l'analyse des temps de compilation + Nouvelle m\'ethode pour l'analyse des temps de compilation } -Nous proposons ensuite une nouvelle m\'ethodologie pour l'analyse des temps -de compilation des m\'etaprogrammes \cpp, ainsi que des outils impl\'ementant -cette m\'ethodologie de mani\`ere reproductible.
Cette m\'ethodologie +Nous pr\'esentons ensuite une nouvelle m\'ethode pour l'analyse des temps +de compilation de m\'etaprogrammes \cpp, ainsi que des outils impl\'ementant +cette m\'ethode de mani\`ere reproductible. Cette m\'ethode repose sur l'utilisation des donn\'ees de profiling de Clang pour permettre des analyses cibl\'ees, et permet de r\'ealiser des \'etudes de temps de compilation sur des m\'etaprogrammes param\'etrables, et donc d'observer @@ -174,7 +175,7 @@ \section{ \cpp pour exploiter les hi\'erarchies de fichiers g\'en\'er\'ees par \textbf{ctbench}. -Cette nouvelle m\'ethodologie permet de mieux comprendre l'impact des +Cette m\'ethode permet de mieux comprendre l'impact des techniques de m\'etaprogrammation sur les temps de compilation en permettant d'observer finement chaque \'etape de la compilation \'etant donn\'e que les mesures du profiler de Clang sont accompagn\'ees de m\'etadonn\'ees relatives @@ -200,7 +201,8 @@ \section{ \^etre stock\'es, ainsi la fonction g\'en\'erant la structure devra \^etre rappel\'ee pour chaque \'el\'ement de la structure. Cette mani\`ere de passer les \'el\'ements d'une structure arbitraire en param\`etre de template est -baptis\'ee <>, et permet de g\'en\'erer du code \`a partir +baptis\'ee <<passage par g\'en\'erateur>> (ou <<pass by generator>>), +et permet de g\'en\'erer du code \`a partir d'une structure dynamique sans n\'ecessiter de repr\'esentation interm\'ediaire. Elle peut toutefois \^etre utilis\'ee pour g\'en\'erer une repr\'esentation interm\'ediaire sous forme d'arborescence de templates de types (appel\'ees @@ -240,7 +242,7 @@ \section{ des diff\'erentes techniques plus en finesse. Lors de ces benchmarks, nous constatons que la m\'ethode dite -<<pass by generator>> induit des temps de compilation quadratiques en raison +<<passage par g\'en\'erateur>> induit des temps de compilation quadratiques en raison du nombre d'appels \`a la fonction de parsing, appell\'ee elle-m\^eme par les fonctions g\'en\'eratrices pour chaque n\oe{}ud de l'arbre de syntaxe.
Cette technique est suffisante pour des programmes de petite taille @@ -271,16 +273,18 @@ \section{ \\ Ce chapitre a fait l'objet d'une pr\'esentation \`a \textbf{Meeting C++ 2022} -dans le cadre d'un projet commun de recherche avec Paul Keir et Andrew Gozillon. +dans le cadre d'un projet commun de recherche avec Paul Keir et Andrew Gozillon +de l'University of the West of Scotland. \section{ Application: int\'egration d'un langage math\'ematique dans \cpp } Les conclusions tir\'ees du cas de Brainfuck nous permettent d'impl\'ementer -un autre analyseur syntaxique pour un langage math\'ematique que nous appelons -Tiny Math Language (TML). Nous d\'ecidons d'impl\'ementer un parser bas\'e sur -l'algorithme Shunting yard d'Edsger Dijkstra dont la sortie est la formule +un autre analyseur syntaxique pour un langage math\'ematique de d\'emonstration +que nous appelons Tiny Math Language (TML). +Nous d\'ecidons d'impl\'ementer un parser bas\'e sur +l'algorithme Shunting Yard d'Edsger Dijkstra dont la sortie est la formule en notation polonaise inverse. Cette repr\'esentation permet ensuite de g\'en\'erer le programme correspondant \`a l'aide d'un m\'etaprogramme template stockant les r\'esultats interm\'ediaires dans une pile sous forme de tuple. diff --git a/introduction.tex b/introduction.tex index 0d5ba32..7fb33b8 100644 --- a/introduction.tex +++ b/introduction.tex @@ -156,7 +156,7 @@ \section{ In the second part we will focus on \textbf{ctbench}, a tool for the scientific study of \cpp metaprograms compile-time performance. It implements a new -benchmarking methodology that enables the study of metaprogram performance +benchmarking method that enables the study of metaprogram performance at scale, and through the lens of Clang's built-in profiler. We will then use it to study novel \cpp metaprogramming techniques based on compile-time \cpp execution.
The study will be conducted on two embedded diff --git a/main.pdf b/main.pdf index 9b6f628..f6370d4 100644 Binary files a/main.pdf and b/main.pdf differ diff --git a/main.tex b/main.tex index 6725307..9401759 100644 --- a/main.tex +++ b/main.tex @@ -156,27 +156,31 @@ \newglossaryentry{bf}{ name=Brainfuck, - description=empty + description={An esoteric language in which basic + commands are encoded as single characters.} } \newglossaryentry{ir}{ name=intermediate representation, - description=empty + description={An internal data structure used by compilers + for the representation of programs.} } \newglossaryentry{pbg}{ name=pass by generator, - description=empty + description={An indirect value passing technique which consists in passing + generator functions instead of their results.} } \newglossaryentry{tmp}{ name=template metaprogramming, - description=empty + description={The use of templates for the implementation of metaprograms.} } \newglossaryentry{et}{ name=expression template, - description=empty + description={The representation of mathematical expressions in the form of + template trees.} } \newacronym{api}{API}{Application Programming Interface} @@ -307,50 +311,6 @@ \end{titlepage} -\chapter*{Remerciements} - -Merci \`a \textbf{Christine Paulin} d'assurer la pr\'esidence du jury, -aux rapporteurs \textbf{David Hill} et \textbf{Thierry G\'eraud}, -ainsi qu'\`a \textbf{Amina Guermouche} d'examiner la th\`ese avec -Christine Paulin. - -Je tiens \'egalement \`a faire part de ma reconnaissance envers -\textbf{Christine Paulin-Mohring} pour la cr\'eation du Magist\`ere d'Informatique -qui m'a permis de d\'ecouvrir la recherche dans un cadre tr\`es privil\'egi\'e. - -Merci \'egalement \`a \textbf{Jean-Thierry Laprest\'e} et -\textbf{Daniel \'Etiemble} pour leurs relectures. - -Merci, bien s\^ur, \`a \textbf{Jo\"el Falcou}, pour son accompagnement -bienveillant, valorisant, et d\'evou\'e depuis mon premier stage de recherche.
- -% Merci les collegues et les parents - -Merci \`a \textbf{Amal Khabou}, dont l'aide m'a \'et\'e pr\'ecieuse pour la pr\'eparation -de ma premi\`ere conf\'erence. - -Merci \`a \textbf{Hartmut Kaiser}, pour son accueil \`a LSU, -\textbf{Adrian et Cory Lemoine}, qui m'ont aid\'e \`a -traverser un \'ev\'enement difficile et marquant pendant mon s\'ejour, -et bien s\^ur \textbf{au reste de l'\'equipe Ste||ar} pour les amiti\'es -que j'ai pu y nouer, et la gentillesse de chaque personne que j'y ai -rencontr\'ee. - -Merci \`a \textbf{Paul Keir}, pour sa collaboration amicale et bienveillante. - -Merci \`a \textbf{mes parents}, -pour m'avoir transmis la passion de l'informatique, -pour m'avoir soutenu \`a tout \^age, -y compris pendant mes ann\'ees de th\`ese. - -% Merci les potes - -\textbf{Antoine Lanco et Alexandrina Korneva}, pour leur camaraderie, les -d\'ecorations de f\^etes, les f\^etes dans les champs, et les champs de pizzas. - -\textbf{Marie Debard}, pour son soutien moral lorsque je m'inqui\'etais -pour le financement de ma th\`ese. - % page des résumés à garder en 2ème page. % Si les résumés sont trop longs pour tenir sur une seule et même page, % on peut mettre un résumé par page @@ -458,7 +418,7 @@ \chapter*{Remerciements} to study code generation using portable high performance computing libraries. In order to assess the compile-time impact of these various metaprogramming -techniques, we propose a new benchmarking methodology that uses +techniques, we propose a new benchmarking method that uses Clang's built-in profiler, enabling compile-time performance scaling analysis beyond black-box benchmarking. 
\end{multicols} @@ -485,6 +445,50 @@ % ------------------------------------------------------------------------------ % These de Jules + +\chapter*{ + Acknowledgements +} +\addcontentsline{toc}{chapter}{Acknowledgements} + +Thanks to \textbf{Christine Paulin} for presiding over the jury, +as well as \textbf{David Hill} and \textbf{Thierry G\'eraud} for reviewing +this thesis as rapporteurs, and \textbf{Amina Guermouche} +for examining it. + +I would also like to thank \textbf{Sylvain Conchon and Steven Martin} for +the creation of the Magist\`ere d'Informatique, +and thanks to \textbf{Sarah Cohen-Boulakia} for running it and +guiding me to my first research internship. + +Thanks to \textbf{Jean-Thierry Laprest\'e} and \textbf{Daniel \'Etiemble} +for their proofreading. + +And of course, thanks to \textbf{Jo\"el Falcou} for his encouraging, attentive, +and devoted support since my first research internship. + +Thanks to \textbf{Amal Khabou} for her help during +the preparation of my first conference. + +Thanks to \textbf{Hartmut Kaiser} for welcoming me at Louisiana State +University, thanks to \textbf{Adrian and Cory Lemoine} for their kind support +during a very difficult and distressing event in Baton Rouge, and of course thanks +to \textbf{the rest of the STE||AR Group} for the friendships I made along the +way. + +Thanks to \textbf{Paul Keir} for his friendly and supportive collaboration. + +Thanks to \textbf{my parents} for fostering my passion for computer science +and for supporting me throughout my childhood and my student years, +including my years as a PhD student. + +Thanks to \textbf{Antoine Lanco and Alexandrina Korneva} for their +companionship, the party decorations, the parties in open fields, +and the fields of pizza. + +Thanks to \textbf{Marie Debard} for being supportive when I was worried about +the funding of my thesis.
+ \chapter*{ R\'esum\'e en Fran\c{c}ais } @@ -549,7 +553,7 @@ \part{ } \chapter{ - Compile time benchmarking methodology + Compile time benchmarking method } With \gls{tmp} libraries like Eigen\cite{eigen}, Blaze\cite{blazelib}, @@ -559,7 +563,7 @@ \chapter{ even larger as \cpp embeds more features over time to support and extend this kind of practices, like compile time containers \cite{more-constexpr-containers} or static reflection\cite{static-reflection}. However, there is still no clear -cut methodology to compare the performance impact of different metaprogramming +cut method to compare the performance impact of different metaprogramming strategies. As new \cpp features allow for new techniques with alleged better compile time performance, no scientific process can back up those claims. @@ -569,7 +573,7 @@ \chapter{ tools to define and run benchmarks, then aggregate, filter out, and plot the data to analyze them. As such, \ctbench aims to be a foundational layer of a proper -scientific methodology for analyzing compile time program behavior. +scientific method for analyzing compile time program behavior. \\ ctbench puts an emphasis on software quality. @@ -683,7 +687,7 @@ \chapter{ \section{Conclusion} The Brainfuck example shows that using \gls{constexpr} programming to embed -large programs written in foreign languages in \cpp can be achieved. +large programs written in arbitrary languages in \cpp can be achieved. \gls{ast} serialization makes it possible to store parsing results and avoid running into quadratic compilation time issues, as long as the code generation methods do not rely on heavy metaprogramming methods. @@ -713,11 +717,12 @@ \chapter*{Conclusion \& perspectives} even for very low level tasks where the reduction of abstraction overhead is critical to achieve decent performance. +% TODO: ??? \section*{...} Before moving on to \gls{constexpr} programming, we adressed the lack of tools for the scientific study of compilation times with ctbench. 
-We proposed a new compile-time study methodology based on Clang's built-in +We proposed a new compile-time study method based on Clang's built-in profiler that help us understand the impacts of metaprograms on each step of the compilation process. While it offers limited benchmarking capabilities when used with GCC as the latter does not output any detailed profiling data, diff --git a/notes-adum.txt b/notes-adum.txt new file mode 100644 index 0000000..5278693 --- /dev/null +++ b/notes-adum.txt @@ -0,0 +1,5 @@ +Remerciements apres les resumes + +Inverser titres (Anglais en premier) + +Envoyer un mail a Laurence Sauboy apres depot