Commit 5158739

first round of corrections from the reporters

JPenuchot committed Oct 21, 2024
1 parent d86c9f1 commit 5158739
Showing 16 changed files with 157 additions and 128 deletions.
8 changes: 4 additions & 4 deletions 1-current-metaprogramming/1-metaprog-and-hpc-overview.tex
Original file line number Diff line number Diff line change
@@ -217,9 +217,9 @@ \section{

In the next section, we will take a deeper look at \cpp as it has
powerful metaprogramming capabilities, while providing bleeding edge support for
all kinds of computer architectures through a decades-old ecosystem. Moreover,
its development is still very active, with many significant metaprogramming
proposals being adopted throughout the latest
releases\cite{10.1145/3564719.3568692}.
all kinds of computer architectures through a decades-old
ecosystem. Moreover, its development is still very active, with many significant
metaprogramming proposals being adopted throughout the latest releases
\cite{10.1145/3564719.3568692}.

\end{document}
13 changes: 7 additions & 6 deletions 1-current-metaprogramming/2-cpp-constructs.tex
@@ -251,10 +251,10 @@ \subsection{
\end{lstlisting}

\subsection{
Compile time logic
Compile time computation
}

Compile time logic can be achieved in many \cpp constructs.
Compile time computation can be achieved through many \cpp constructs.

\begin{itemize}

@@ -456,7 +456,7 @@ \subsection{
enable the creation of high level, portable \glspl{dsel} that resolve into
high performance code thanks to a combination of metaprogramming techniques.
In the next section, we will see a collection of libraries that go beyond the
idea of using templates for math code generation, and implement or enable the
idea of using templates for math code generation, and enable or facilitate the
implementation of arbitrary compile-time programs.

\section{
@@ -519,7 +519,8 @@ \subsection{
All these libraries either enable \gls{tmp}, or use \gls{tmp} to achieve
a specific goal. However with the introduction of \gls{constexpr} programming,
a new range of compile-time libraries aims to provide new capabilities
for this new metaprogramming paradigm.
for this new metaprogramming paradigm, including the \cpp standard library
itself.

\subsection{
Value based metaprogramming
@@ -563,10 +564,10 @@ \subsection{
\lstinline{std::deque} is not usable in \glspl{consteval} whereas
its \textbf{C'est} equivalent named \lstinline{cest::deque} is.
It was instrumental for this thesis as the research work I present here started
a long time before \cpp23 was adopted and standard libraries as well as
before \cpp23 was adopted, and before standard libraries as well as
compilers started implementing it.

Similar to previous examples, listing \ref{lst:cest-example} shows a
Similar to the previous examples, listing \ref{lst:cest-example} shows a
compile-time program in which we find the index of the first element of value 6.
Note that in this example, we are using properly typed values and functions
instead of templates to represent values, predicates, and functions.
19 changes: 10 additions & 9 deletions 1-current-metaprogramming/3-gemv.tex
@@ -30,8 +30,8 @@ \section{
The efforts to optimize the performance of \gls{blas} routines
fall into two main directions. The first direction is about
writing very specific assembly code. This is the case for
almost all the vendor libraries including Intel MKL\cite{hpcs1},
AMD ACML\cite{hpcs2} etc. To provide the users with efficient \gls{blas}
almost all the vendor libraries including Intel MKL \cite{hpcs1},
AMD ACML \cite{hpcs2} etc. To provide the users with efficient \gls{blas}
routines, the vendors usually implement their own routines
for their own hardware using assembly code with specific
optimizations, which is a low level solution that gives the
@@ -51,12 +51,12 @@ \section{
abstraction level and the efficiency of the generated codes.
This is for example the case of the approach followed by
the Formal Linear Algebra Methods Environment (FLAME)
with the Libflame library\cite{hpcs3}. Thus, it offers a framework to
develop dense linear solvers using algorithmic skeletons\cite{hpcs4}
with the Libflame library \cite{hpcs3}. Thus, it offers a framework to
develop dense linear solvers using algorithmic skeletons \cite{hpcs4}
and an API which is more user-friendly than LAPACK, giving
satisfactory performance results. A more generic approach is
the one followed in recent years by \cpp libraries built around
expression templates\cite{hpcs5} or other generative programming\cite{hpcs6}
expression templates \cite{hpcs5} or other generative programming \cite{hpcs6}
principles. In this section, we will focus on such an approach.
To show the interest of this approach, we consider as
example the matrix-vector multiplication kernel (gemv) which
@@ -81,7 +81,7 @@ \section{
}

As we saw earlier, metaprogramming is used in \cpp \cite{hpcs9}, D \cite{hpcs10},
OCaml\cite{hpcs11} or Haskell\cite{hpcs12}. A subset of basic notions emerges:
OCaml \cite{hpcs11} or Haskell \cite{hpcs12}. A subset of basic notions emerges:

\begin{itemize}
\item
@@ -155,8 +155,9 @@ \section{
or template arguments in a comma-separated code
fragment. Its main use was to provide the syntactic
support required to write code with variadic template
arguments. However, Niebler and Parent showed that
this can be used to generate far more complex code
arguments. However, Niebler and Parent
% TODO: ref
showed that this can be used to generate far more complex code
when paired with other language constructs. Both
code replication and a crude form of code unrolling
were possible. However, it required the use of some
@@ -621,7 +622,7 @@ \subsection{
Again, the performance of our implementation is close
to that of OpenBLAS, and is even noticeably better for matrices of
small sizes ranging from 4 to 16 elements. For example, for a
matrix of size 8 elements,the automatically generated code has
matrix of size 8 elements, the automatically generated code has
a performance that is 3 times better than the OpenBLAS \gls{gemv}
kernel (15.78 Gflop/s vs 5.06 Gflop/s). Two phenomena
appear, however. The first one is that the increased number
4 changes: 2 additions & 2 deletions 2-compilation-time-analysis/2-ctbench-design.tex
@@ -6,7 +6,7 @@ \section{
ctbench features
}

ctbench implements a new methodology for the analysis of compilation times:
ctbench implements a new method for the analysis of compilation times:
it allows users to define sizable \cpp benchmarks to analyze the scaling
performance of \cpp metaprogramming techniques, and compare techniques
against each other.
@@ -172,7 +172,7 @@ \subsection{
Note that JSON data is not stored directly.
This is intentional since a profiling file for a single benchmark repetition
can reach volumes of up to several hundred megabytes, therefore data loading
is delayed to prevent RAM overcomsumption.
is delayed to prevent RAM overconsumption.

\item

@@ -8,7 +8,7 @@ \section{

% TODO: dig this for references https://youtu.be/q6X9bKpAmnI

As mentioned in \ref{lbl:cpp-meta-constructs}, the \gls{constexpr}
As mentioned in section \ref{lbl:cpp-meta-constructs}, the \gls{constexpr}
keyword allows variables and functions to be used in \gls{consteval},
making a whole subset of the \cpp language itself usable for compile-time logic.

@@ -73,8 +73,9 @@ \subsection{
// Function template foo takes a polymorphic NTTP
template<auto bar> constexpr int foo() { return 1; }

// generate's return value cannot be stored in a constexpr variable
// or used as a NTTP, but it can be used to produce other literal
// generate's return value cannot be stored
// in a constexpr variable or used as a NTTP,
// but it can be used to produce other literals

// constexpr auto a = generate(); // ERROR
constexpr auto b = generate().size(); // OK
@@ -133,7 +134,7 @@ \subsection{
The addition of \gls{constexpr} memory allocation goes hand in hand
with the ability to use virtual functions in \glspl{consteval}.
This feature allows calls to virtual functions in constant expressions
\cite{virtual-constexpr}. This allows heritage-based polymorphism in
\cite{virtual-constexpr}. This allows inheritance-based polymorphism in
\gls{constexpr} programming when used with \gls{constexpr} allocation of
polymorphic types.

@@ -72,15 +72,15 @@ \section{
This section will cover the implementation of \gls{constexpr} \glspl{ast},
and techniques to work around the limitations that prevent their direct use as
\glspl{nttp} either through functional wrapping techniques, or through
their convertion of into values that satisfy \gls{nttp} requirements.
their conversion into values that satisfy \gls{nttp} requirements.

\subsection{
Code generation from pointer tree data structures
}

\label{lbl:ptr-tree-codegen}

In this subsection, we introduce three techniques that will allow us to use
In this section, we introduce three techniques that will allow us to use
a pointer tree generated from a \gls{constexpr} function as a template parameter
for code generation.

@@ -106,8 +106,8 @@ \subsection{
a \gls{nttp},
\end{itemize}

The compilation performance measurements in \ref{lbl:compile-time-eval} will
rely on the same data passing techniques, but with more complex examples such
The compilation performance measurements in section \ref{lbl:compile-time-eval}
will rely on the same data passing techniques, but with more complex examples such
as embedded compilation of Brainfuck programs, and of \LaTeX math formulae
into high performance math computation kernels.

@@ -173,7 +173,7 @@ \subsubsection{

The downside of using this value passing technique is that the
number of calls of the generator function is proportional to the
number of nodes. Experiments in \ref{lbl:compile-time-eval} highlight
number of nodes. Experiments in section \ref{lbl:compile-time-eval} highlight
the scaling issues induced by this code generation method.
And while it is very quick to implement, there are still difficulties
related to \gls{constexpr} memory constraints and compiler or library support.
@@ -310,7 +310,8 @@ \subsubsection{
type-based paradigms, even when \gls{constexpr} allocated memory is involved.

It is worth mentioning that both this technique and the previous one induce
very high compilation times, as we will see in \ref{lbl:bf-parsing-and-codegen}.
very high compilation times, as we will see in section
\ref{lbl:bf-parsing-and-codegen}.

\subsubsection{
FLAT - AST serialization
@@ -477,11 +478,11 @@ \subsection{
\end{figure}

Parsing algorithms may output serialized data. In this case, the serialization
step described in \ref{lbl:flat-technique} is not needed, and the result
step described in section \ref{lbl:flat-technique} is not needed, and the result
can be converted into a static array.
This makes the code generation process rather straightforward as no complicated
transformation is needed, while still scaling decently as we will see in
\ref{lbl:compile-time-eval} where we will be using a
section \ref{lbl:compile-time-eval} where we will be using a
Shunting Yard parser \cite{shunting-yard} to parse math formulae into
\gls{rpn}, their postfix notation.

14 changes: 7 additions & 7 deletions 3-new-approaches-to-metaprogramming/3-brainfuck.tex
@@ -10,7 +10,7 @@ \section{
pointer trees generated by \gls{constexpr} functions, we will use them in the
context of compile time parsing and code generation for the Brainfuck language.
Therefore, we use the data structures and code generation techniques introduced in
subsection \ref{lbl:ptr-tree-codegen}.
section \ref{lbl:ptr-tree-codegen}.

We chose Brainfuck as a first language for several reasons: the language
generates approximately one \gls{ast} node per character, which makes the size
@@ -104,7 +104,7 @@ \subsection{
figuring out how to transform its result, which contains dynamic memory,
into \cpp code.

As you may remember from subsection \ref{lbl:constexpr-programming},
As you may remember from section \ref{lbl:constexpr-programming},
there is no direct way to use values holding pointers to dynamic memory
directly as \glspl{nttp}.
Therefore it must be conveyed by other means or transformed into \glspl{litval}
@@ -121,7 +121,7 @@ \subsection{

The first backend implemented in the poacher project was the \gls{et} backend,
where the AST is transformed into a type-based \gls{ir}
as described in \ref{lbl:pbg-et-technique}. It was later simplified to
as described in section \ref{lbl:pbg-et-technique}. It was later simplified to
remove the \gls{ir} transformation step, which gave the \gls{pbg} backend.

\begin{lstlisting}[
@@ -143,7 +143,7 @@ \subsection{
\end{lstlisting}

The implementations of these two backends do not differ significantly from
the ones described in \ref{lbl:pbg-technique} and \ref{lbl:pbg-et-technique}:
the ones described in sections \ref{lbl:pbg-technique} and \ref{lbl:pbg-et-technique}:
the generators that evaluate each node are passed as template parameters,
only to work around the fact that pointers to \gls{constexpr} allocated memory
cannot be used in a \gls{nttp}.
@@ -153,7 +153,7 @@ \subsection{
pack of arbitrary types that may be \lstinline{et_token_t} elements for single
tokens, or \lstinline{et_while_t} elements for nested while loops.

From there, the code generation occurs in the same way as it did in
From there, the code generation occurs in the same way as it did in section
\ref{lbl:pbg-et-technique}: the \gls{et} is traversed recursively using
overloaded functions to generate the \cpp code that corresponds to every
while block, and down to every instruction in the \gls{et}.
@@ -167,11 +167,11 @@ \subsection{
The last remaining backend to implement is the one that transforms
the \gls{ast} into a serialized, \gls{nttp} compatible \gls{ir}.
The case of Brainfuck introduces a notable difference compared to the simple
use case seen in \ref{lbl:flat-technique}: while \gls{ast} nodes in Brainfuck
use case seen in section \ref{lbl:flat-technique}: \gls{ast} nodes in Brainfuck
can have an arbitrary number of child nodes, as opposed to the add nodes in
the simple use case I presented earlier. This introduces a few technical
differences with regard to serialization and code generation, which I will
cover in detail in this subsection.
cover in detail in this section.

\begin{lstlisting}[
language=c++,
6 changes: 3 additions & 3 deletions 3-new-approaches-to-metaprogramming/4-math-parsing.tex
@@ -13,7 +13,7 @@ \section{
a parsing algorithm that transforms infix formulas into their \gls{rpn}
representation.

In \ref{lbl:codegen-from-rpn}, we already demonstrated that generating code
In section \ref{lbl:codegen-from-rpn}, we already demonstrated that generating code
from \gls{rpn} formulas is a rather easy task, therefore this section
will only cover the Shunting Yard algorithm, and the use of \gls{rpn}
code generation applied to high performance computing.
@@ -23,7 +23,7 @@ \subsection{
}

As parsing algorithms and \gls{constexpr} dynamic data representations were
already covered in \ref{lbl:bf-parsing-and-codegen}, the implementation of
already covered in section \ref{lbl:bf-parsing-and-codegen}, the implementation of
the Shunting Yard algorithm will not be covered in detail here.
A thoroughly commented \gls{constexpr} implementation is available in appendix
\ref{app:shunting-yard-impl}. It features the algorithm itself, as well as
@@ -38,7 +38,7 @@ \subsection{
The worst case time and memory complexity of the algorithm is $O(N)$.

Once again, code generation from postfix notation formulas was already covered
in \ref{lbl:codegen-from-rpn}, so we will skip straight to the use of Blaze
in section \ref{lbl:codegen-from-rpn}, so we will skip straight to the use of Blaze
to generate high performance code from \gls{constexpr} formulas.

\subsection{
Expand Down
2 changes: 1 addition & 1 deletion bibliography/biblio.bib
@@ -140,7 +140,7 @@ @article{10.1145/243439.243447

@article{10.1023/A:1010095604496,
author = {Futamura, Yoshihiko},
title = {Partial Evaluation of Computation Process—AnApproach to a
title = {Partial Evaluation of Computation Process — An Approach to a
Compiler-Compiler},
year = {1999},
issue_date = {December 1999},
4 changes: 2 additions & 2 deletions bibliography/hpcs2018.bib
@@ -133,7 +133,7 @@ @book{hpcs9
}

@misc{hpcs10,
title = {Templates revisited - d programming language},
title = {Templates revisited - D programming language},
author = {Walter Bright},
url = {https://dlang.org/articles/templates-revisited.html},
}
@@ -407,7 +407,7 @@ @misc{hpcs21
}

@book{hpcs22,
title = {Using the gnu compiler collection: a gnu manual for gcc version 4.3.
title = {Using the GNU compiler collection: a GNU manual for gcc version 4.3.
3},
author = {Stallman, Richard M},
year = {2009},
27 changes: 19 additions & 8 deletions format.cpp
@@ -1,9 +1,20 @@
constexpr std::vector<std::vector<int>> get_vector() {
return {{1, 2, 3}, {4, 5, 6}};
std::tuple<ast_block_t, token_vec_t::const_iterator>
parse_block(token_vec_t::const_iterator parse_begin,
token_vec_t::const_iterator parse_end) {
std::vector<ast_node_ptr_t> block_content;
for (; parse_begin != parse_end; parse_begin++) {
if (*parse_begin == while_end_v) {
return {std::move(block_content), parse_begin};
} else if (*parse_begin == while_begin_v) {
auto [while_block_content, while_block_end] =
parse_block(parse_begin + 1, parse_end);
block_content.push_back(
std::make_unique<ast_while_t>(std::move(while_block_content)));
parse_begin = while_block_end;
} else if (*parse_begin != nop_v) {
block_content.push_back(
ast_node_ptr_t(std::make_unique<ast_token_t>(*parse_begin)));
}
}
return {ast_block_t(std::move(block_content)), parse_end};
}

// Not good:
// constexpr std::vector<int> subvec_0 = get_vector()[0];

// Good:
constexpr auto get_subvec_0 = []() { return get_vector()[0]; };