From 72c93d29df8974c2ea6182e36b0c8b1e0188e232 Mon Sep 17 00:00:00 2001
From: nick black
Date: Fri, 6 Sep 2024 10:08:22 -0400
Subject: [PATCH 1/5] Update file names for renumbered Ch1 sections

---
 .../1-3 What is performance analysis.md | 11 +++++++++++
 .../1-4 What is performance analysis.md | 16 ++++++++++------
 ....md => 1-5 What is discussed in this book.md} | 0
 .../1-5 What is performance tuning.md | 15 ---------------
 ...=> 1-6 What is not discussed in this book.md} | 0
 .../{1-8 Exercises.md => 1-7 Exercises.md} | 0
 6 files changed, 21 insertions(+), 21 deletions(-)
 create mode 100644 chapters/1-Introduction/1-3 What is performance analysis.md
 rename chapters/1-Introduction/{1-6 What is in the book.md => 1-5 What is discussed in this book.md} (100%)
 delete mode 100644 chapters/1-Introduction/1-5 What is performance tuning.md
 rename chapters/1-Introduction/{1-7 What is not in this book.md => 1-6 What is not discussed in this book.md} (100%)
 rename chapters/1-Introduction/{1-8 Exercises.md => 1-7 Exercises.md} (100%)

diff --git a/chapters/1-Introduction/1-3 What is performance analysis.md b/chapters/1-Introduction/1-3 What is performance analysis.md
new file mode 100644
index 0000000000..03a831654d
--- /dev/null
+++ b/chapters/1-Introduction/1-3 What is performance analysis.md
@@ -0,0 +1,11 @@
+## What Is Performance Analysis?
+
+Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact.
+
+Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*.
+
+Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing.
+
+Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach.
+
+[^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm)
diff --git a/chapters/1-Introduction/1-4 What is performance analysis.md b/chapters/1-Introduction/1-4 What is performance analysis.md
index 03a831654d..f848f1194f 100644
--- a/chapters/1-Introduction/1-4 What is performance analysis.md
+++ b/chapters/1-Introduction/1-4 What is performance analysis.md
@@ -1,11 +1,15 @@
-## What Is Performance Analysis?
+## What Is Performance Tuning?
 
-Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact.
+Locating a performance bottleneck is only half of an engineer’s job. The second half is to fix it properly. Sometimes changing one line in the source code of a program can yield a drastic performance boost. Missing such opportunities can be quite wasteful. Performance analysis and tuning are all about finding and fixing this line.
 
-Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*.
+To take advantage of all the computing power of modern CPUs, you need to understand how they work. Or as performance engineers like to say, you need to have "mechanical sympathy". This term was borrowed from the car racing world. It means that a racing driver with a good understanding of how the car works has an edge over its competitors who don't. The same applies to performance engineering. It is not possible to know all the details of how a modern CPU operates, but you need to have a good mental model of it to squeeze the last bit of performance.
 
-Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing.
+This is what I mean by "low-level optimizations". This is a type of optimization that takes into account the details of the underlying hardware capabilities. It is different from "high-level optimizations" which are more about application-level logic, algorithms, and data structures. As you will see in the book, the majority of low-level optimizations can be applied to a wide variety of modern processors. To successfully implement low-level optimizations, you need to have a good understanding of the underlying hardware.
 
-Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach.
+> "During the post-Moore era, it will become ever more important to make code run fast and, in particular, to tailor it to the hardware on which it runs." [@Leisersoneaam9744]
 
-[^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm)
+In the past, software developers had more mechanical sympathy as they often had to deal with nuances of the hardware implementation. During the PC era, developers usually were programming directly on top of the operating system, with possibly a few libraries in between. As the world moved to the cloud era, the software stack got deeper and more complex. The top layer of the stack (on which most developers work) has moved further away from the hardware. The negative side of such evolution is that developers of modern applications have less affinity to the actual hardware on which their software is running. This book will help you build a strong connection with modern processors.
+
+There is a famous quote by Donald Knuth: "Premature optimization is the root of all evil". But the opposite is often true as well. Postponed performance engineering may be too late and cause as much evil as premature optimization. For developers working with performance-critical projects, it is crucial to know how underlying hardware works. In such industries, it is a failure from the start when a program is being developed without a hardware focus. ClickHouse DB is an example of a successful software product that was built around a small but very efficient core. Performance characteristics of software must be a first-class citizen along with correctness and security starting from day 1. Poor performance can kill a product just as easily as security vulnerabilities.
+
+Performance engineering is important and rewarding work, but it may be very time-consuming. In fact, performance optimization is a never-ending game. There will always be something to optimize. Inevitably, a developer will reach the point of diminishing returns at which further improvement comes at a very high engineering cost and likely will not be worth the effort. Knowing when to stop optimizing is a critical aspect of performance work.
diff --git a/chapters/1-Introduction/1-6 What is in the book.md b/chapters/1-Introduction/1-5 What is discussed in this book.md
similarity index 100%
rename from chapters/1-Introduction/1-6 What is in the book.md
rename to chapters/1-Introduction/1-5 What is discussed in this book.md
diff --git a/chapters/1-Introduction/1-5 What is performance tuning.md b/chapters/1-Introduction/1-5 What is performance tuning.md
deleted file mode 100644
index f848f1194f..0000000000
--- a/chapters/1-Introduction/1-5 What is performance tuning.md
+++ /dev/null
@@ -1,15 +0,0 @@
-## What Is Performance Tuning?
-
-Locating a performance bottleneck is only half of an engineer’s job. The second half is to fix it properly. Sometimes changing one line in the source code of a program can yield a drastic performance boost. Missing such opportunities can be quite wasteful. Performance analysis and tuning are all about finding and fixing this line.
-
-To take advantage of all the computing power of modern CPUs, you need to understand how they work. Or as performance engineers like to say, you need to have "mechanical sympathy". This term was borrowed from the car racing world. It means that a racing driver with a good understanding of how the car works has an edge over its competitors who don't. The same applies to performance engineering. It is not possible to know all the details of how a modern CPU operates, but you need to have a good mental model of it to squeeze the last bit of performance.
-
-This is what I mean by "low-level optimizations". This is a type of optimization that takes into account the details of the underlying hardware capabilities. It is different from "high-level optimizations" which are more about application-level logic, algorithms, and data structures. As you will see in the book, the majority of low-level optimizations can be applied to a wide variety of modern processors. To successfully implement low-level optimizations, you need to have a good understanding of the underlying hardware.
-
-> "During the post-Moore era, it will become ever more important to make code run fast and, in particular, to tailor it to the hardware on which it runs." [@Leisersoneaam9744]
-
-In the past, software developers had more mechanical sympathy as they often had to deal with nuances of the hardware implementation. During the PC era, developers usually were programming directly on top of the operating system, with possibly a few libraries in between. As the world moved to the cloud era, the software stack got deeper and more complex. The top layer of the stack (on which most developers work) has moved further away from the hardware. The negative side of such evolution is that developers of modern applications have less affinity to the actual hardware on which their software is running. This book will help you build a strong connection with modern processors.
-
-There is a famous quote by Donald Knuth: "Premature optimization is the root of all evil". But the opposite is often true as well. Postponed performance engineering may be too late and cause as much evil as premature optimization. For developers working with performance-critical projects, it is crucial to know how underlying hardware works. In such industries, it is a failure from the start when a program is being developed without a hardware focus. ClickHouse DB is an example of a successful software product that was built around a small but very efficient core. Performance characteristics of software must be a first-class citizen along with correctness and security starting from day 1. Poor performance can kill a product just as easily as security vulnerabilities.
-
-Performance engineering is important and rewarding work, but it may be very time-consuming. In fact, performance optimization is a never-ending game. There will always be something to optimize. Inevitably, a developer will reach the point of diminishing returns at which further improvement comes at a very high engineering cost and likely will not be worth the effort. Knowing when to stop optimizing is a critical aspect of performance work.
diff --git a/chapters/1-Introduction/1-7 What is not in this book.md b/chapters/1-Introduction/1-6 What is not discussed in this book.md
similarity index 100%
rename from chapters/1-Introduction/1-7 What is not in this book.md
rename to chapters/1-Introduction/1-6 What is not discussed in this book.md
diff --git a/chapters/1-Introduction/1-8 Exercises.md b/chapters/1-Introduction/1-7 Exercises.md
similarity index 100%
rename from chapters/1-Introduction/1-8 Exercises.md
rename to chapters/1-Introduction/1-7 Exercises.md

From 5966184c4061064f0bc53e43b1ed64bbc22738dc Mon Sep 17 00:00:00 2001
From: nick black
Date: Fri, 6 Sep 2024 10:14:05 -0400
Subject: [PATCH 2/5] Chapter 1 section 3. You swapped post- and pre-!

---
 .../1-Introduction/1-3 What is performance analysis.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/chapters/1-Introduction/1-3 What is performance analysis.md b/chapters/1-Introduction/1-3 What is performance analysis.md
index 03a831654d..c4486987e5 100644
--- a/chapters/1-Introduction/1-3 What is performance analysis.md
+++ b/chapters/1-Introduction/1-3 What is performance analysis.md
@@ -1,11 +1,11 @@
 ## What Is Performance Analysis?
 
-Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact.
+Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even small tweaks to code can trigger noticeable performance changes. Relying on intuition when optimizing an application typically results in random "fixes" without real performance impact.
 
-Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*.
+Inexperienced developers sometimes make changes in their code and claim it *should* run faster. One such example is replacing `i++` (post-increment) with `++i` (pre-increment) all over the code base (assuming that the previous value of `i` is not used). In the general case, this change will make no difference to the generated code: every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition. *Always measure.*
 
-Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing.
+Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is the [XOR swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm).[^2] In reality, simple `std::swap` produces equivalent or faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to tune should be the result of careful performance analysis, not intuition or guessing.
 
-Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach.
+Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, but none of them will necessarily lead you to a certain discovery. With experience, you will develop your own strategies about when to use each approach.
 
 [^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm)

From f343e677c1f3168c08a4adca7b97e55f01227cfa Mon Sep 17 00:00:00 2001
From: Denis Bakhvalov
Date: Sat, 7 Sep 2024 18:17:52 -0400
Subject: [PATCH 3/5] Revert "Chapter 1 section 3. You swapped post- and pre-!"

This reverts commit 5966184c4061064f0bc53e43b1ed64bbc22738dc.
---
 .../1-Introduction/1-3 What is performance analysis.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/chapters/1-Introduction/1-3 What is performance analysis.md b/chapters/1-Introduction/1-3 What is performance analysis.md
index c4486987e5..03a831654d 100644
--- a/chapters/1-Introduction/1-3 What is performance analysis.md
+++ b/chapters/1-Introduction/1-3 What is performance analysis.md
@@ -1,11 +1,11 @@
 ## What Is Performance Analysis?
 
-Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even small tweaks to code can trigger noticeable performance changes. Relying on intuition when optimizing an application typically results in random "fixes" without real performance impact.
+Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact.
 
-Inexperienced developers sometimes make changes in their code and claim it *should* run faster. One such example is replacing `i++` (post-increment) with `++i` (pre-increment) all over the code base (assuming that the previous value of `i` is not used). In the general case, this change will make no difference to the generated code: every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition. *Always measure.*
+Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*.
 
-Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is the [XOR swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm).[^2] In reality, simple `std::swap` produces equivalent or faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to tune should be the result of careful performance analysis, not intuition or guessing.
+Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing.
 
-Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, but none of them will necessarily lead you to a certain discovery. With experience, you will develop your own strategies about when to use each approach.
+Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach.
 
 [^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm)

From 8ab70c56bf1aecc5e1136fc2b882aa780ec5118e Mon Sep 17 00:00:00 2001
From: Denis Bakhvalov
Date: Sat, 7 Sep 2024 18:18:00 -0400
Subject: [PATCH 4/5] Revert "Update file names for renumbered Ch1 sections"

This reverts commit 72c93d29df8974c2ea6182e36b0c8b1e0188e232.
---
 .../1-3 What is performance analysis.md | 11 -----------
 .../1-4 What is performance analysis.md | 16 ++++++----------
 .../1-5 What is performance tuning.md | 15 +++++++++++++++
 ...n this book.md => 1-6 What is in the book.md} | 0
 ...s book.md => 1-7 What is not in this book.md} | 0
 .../{1-7 Exercises.md => 1-8 Exercises.md} | 0
 6 files changed, 21 insertions(+), 21 deletions(-)
 delete mode 100644 chapters/1-Introduction/1-3 What is performance analysis.md
 create mode 100644 chapters/1-Introduction/1-5 What is performance tuning.md
 rename chapters/1-Introduction/{1-5 What is discussed in this book.md => 1-6 What is in the book.md} (100%)
 rename chapters/1-Introduction/{1-6 What is not discussed in this book.md => 1-7 What is not in this book.md} (100%)
 rename chapters/1-Introduction/{1-7 Exercises.md => 1-8 Exercises.md} (100%)

diff --git a/chapters/1-Introduction/1-3 What is performance analysis.md b/chapters/1-Introduction/1-3 What is performance analysis.md
deleted file mode 100644
index 03a831654d..0000000000
--- a/chapters/1-Introduction/1-3 What is performance analysis.md
+++ /dev/null
@@ -1,11 +0,0 @@
-## What Is Performance Analysis?
-
-Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact.
-
-Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*.
-
-Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing.
-
-Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach.
-
-[^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm)
diff --git a/chapters/1-Introduction/1-4 What is performance analysis.md b/chapters/1-Introduction/1-4 What is performance analysis.md
index f848f1194f..03a831654d 100644
--- a/chapters/1-Introduction/1-4 What is performance analysis.md
+++ b/chapters/1-Introduction/1-4 What is performance analysis.md
@@ -1,15 +1,11 @@
-## What Is Performance Tuning?
+## What Is Performance Analysis?
 
-Locating a performance bottleneck is only half of an engineer’s job. The second half is to fix it properly. Sometimes changing one line in the source code of a program can yield a drastic performance boost. Missing such opportunities can be quite wasteful. Performance analysis and tuning are all about finding and fixing this line.
+Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact.
 
-To take advantage of all the computing power of modern CPUs, you need to understand how they work. Or as performance engineers like to say, you need to have "mechanical sympathy". This term was borrowed from the car racing world. It means that a racing driver with a good understanding of how the car works has an edge over its competitors who don't. The same applies to performance engineering. It is not possible to know all the details of how a modern CPU operates, but you need to have a good mental model of it to squeeze the last bit of performance.
+Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*.
 
-This is what I mean by "low-level optimizations". This is a type of optimization that takes into account the details of the underlying hardware capabilities. It is different from "high-level optimizations" which are more about application-level logic, algorithms, and data structures. As you will see in the book, the majority of low-level optimizations can be applied to a wide variety of modern processors. To successfully implement low-level optimizations, you need to have a good understanding of the underlying hardware.
+Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing.
 
-> "During the post-Moore era, it will become ever more important to make code run fast and, in particular, to tailor it to the hardware on which it runs." [@Leisersoneaam9744]
+Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach.
 
-In the past, software developers had more mechanical sympathy as they often had to deal with nuances of the hardware implementation. During the PC era, developers usually were programming directly on top of the operating system, with possibly a few libraries in between. As the world moved to the cloud era, the software stack got deeper and more complex. The top layer of the stack (on which most developers work) has moved further away from the hardware. The negative side of such evolution is that developers of modern applications have less affinity to the actual hardware on which their software is running. This book will help you build a strong connection with modern processors.
-
-There is a famous quote by Donald Knuth: "Premature optimization is the root of all evil". But the opposite is often true as well. Postponed performance engineering may be too late and cause as much evil as premature optimization. For developers working with performance-critical projects, it is crucial to know how underlying hardware works. In such industries, it is a failure from the start when a program is being developed without a hardware focus. ClickHouse DB is an example of a successful software product that was built around a small but very efficient core. Performance characteristics of software must be a first-class citizen along with correctness and security starting from day 1. Poor performance can kill a product just as easily as security vulnerabilities.
-
-Performance engineering is important and rewarding work, but it may be very time-consuming. In fact, performance optimization is a never-ending game. There will always be something to optimize. Inevitably, a developer will reach the point of diminishing returns at which further improvement comes at a very high engineering cost and likely will not be worth the effort. Knowing when to stop optimizing is a critical aspect of performance work.
+[^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm)
diff --git a/chapters/1-Introduction/1-5 What is performance tuning.md b/chapters/1-Introduction/1-5 What is performance tuning.md
new file mode 100644
index 0000000000..f848f1194f
--- /dev/null
+++ b/chapters/1-Introduction/1-5 What is performance tuning.md
@@ -0,0 +1,15 @@
+## What Is Performance Tuning?
+
+Locating a performance bottleneck is only half of an engineer’s job. The second half is to fix it properly. Sometimes changing one line in the source code of a program can yield a drastic performance boost. Missing such opportunities can be quite wasteful. Performance analysis and tuning are all about finding and fixing this line.
+
+To take advantage of all the computing power of modern CPUs, you need to understand how they work. Or as performance engineers like to say, you need to have "mechanical sympathy". This term was borrowed from the car racing world. It means that a racing driver with a good understanding of how the car works has an edge over its competitors who don't. The same applies to performance engineering.
It is not possible to know all the details of how a modern CPU operates, but you need to have a good mental model of it to squeeze the last bit of performance. + +This is what I mean by "low-level optimizations". This is a type of optimization that takes into account the details of the underlying hardware capabilities. It is different from "high-level optimizations" which are more about application-level logic, algorithms, and data structures. As you will see in the book, the majority of low-level optimizations can be applied to a wide variety of modern processors. To successfully implement low-level optimizations, you need to have a good understanding of the underlying hardware. + +> "During the post-Moore era, it will become ever more important to make code run fast and, in particular, to tailor it to the hardware on which it runs." [@Leisersoneaam9744] + +In the past, software developers had more mechanical sympathy as they often had to deal with nuances of the hardware implementation. During the PC era, developers usually were programming directly on top of the operating system, with possibly a few libraries in between. As the world moved to the cloud era, the software stack got deeper and more complex. The top layer of the stack (on which most developers work) has moved further away from the hardware. The negative side of such evolution is that developers of modern applications have less affinity to the actual hardware on which their software is running. This book will help you build a strong connection with modern processors. + +There is a famous quote by Donald Knuth: "Premature optimization is the root of all evil". But the opposite is often true as well. Postponed performance engineering may be too late and cause as much evil as premature optimization. For developers working with performance-critical projects, it is crucial to know how underlying hardware works. 
In such industries, it is a failure from the start when a program is being developed without a hardware focus. ClickHouse DB is an example of a successful software product that was built around a small but very efficient core. Performance characteristics of software must be a first-class citizen along with correctness and security starting from day 1. Poor performance can kill a product just as easily as security vulnerabilities. + +Performance engineering is important and rewarding work, but it may be very time-consuming. In fact, performance optimization is a never-ending game. There will always be something to optimize. Inevitably, a developer will reach the point of diminishing returns at which further improvement comes at a very high engineering cost and likely will not be worth the effort. Knowing when to stop optimizing is a critical aspect of performance work. diff --git a/chapters/1-Introduction/1-5 What is discussed in this book.md b/chapters/1-Introduction/1-6 What is in the book.md similarity index 100% rename from chapters/1-Introduction/1-5 What is discussed in this book.md rename to chapters/1-Introduction/1-6 What is in the book.md diff --git a/chapters/1-Introduction/1-6 What is not discussed in this book.md b/chapters/1-Introduction/1-7 What is not in this book.md similarity index 100% rename from chapters/1-Introduction/1-6 What is not discussed in this book.md rename to chapters/1-Introduction/1-7 What is not in this book.md diff --git a/chapters/1-Introduction/1-7 Exercises.md b/chapters/1-Introduction/1-8 Exercises.md similarity index 100% rename from chapters/1-Introduction/1-7 Exercises.md rename to chapters/1-Introduction/1-8 Exercises.md From 2d52027576ece95e6807602a7904e91abe0c080f Mon Sep 17 00:00:00 2001 From: Denis Bakhvalov Date: Sat, 7 Sep 2024 18:24:43 -0400 Subject: [PATCH 5/5] Reapplied Nick's edits --- .../1-Introduction/1-4 What is performance analysis.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git 
a/chapters/1-Introduction/1-4 What is performance analysis.md b/chapters/1-Introduction/1-4 What is performance analysis.md index 03a831654d..ae4fcc6cb0 100644 --- a/chapters/1-Introduction/1-4 What is performance analysis.md +++ b/chapters/1-Introduction/1-4 What is performance analysis.md @@ -1,11 +1,11 @@ ## What Is Performance Analysis? -Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even a small tweak to the code can trigger a noticeable performance change. Some people rely on intuition when they try to optimize their applications. And usually, it ends up with random fixes here and there without making any real performance impact. +Have you ever found yourself debating with a coworker about the performance of a certain piece of code? Then you probably know how hard it is to predict which code is going to work the best. With so many moving parts inside modern processors, even small tweaks to code can trigger noticeable performance changes. Relying on intuition when optimizing an application typically results in random "fixes" without real performance impact. -Inexperienced developers sometimes make changes in their code and claim it *should* make it faster. One such example is replacing `i++` (pre-increment) with `++i` (post-increment) all over the code base, assuming that the previous value of `i` is not used. In the general case, this change will make no difference to the generated code because every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition, instead *always measure*. +Inexperienced developers sometimes make changes in their code and claim it *should* run faster. 
One such example is replacing `i++` (post-increment) with `++i` (pre-increment) all over the code base (assuming that the previous value of `i` is not used). In the general case, this change will make no difference to the generated code: every decent optimizing compiler will recognize that the previous value of `i` is not used and will eliminate redundant copies anyway. The first piece of advice in this book is: don't solely rely on your intuition. *Always measure.* -Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is using [XOR-based swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm),[^2] while in reality, simple `std::swap` produces faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to fix should be a result of careful performance analysis, not intuition or guessing. +Many micro-optimization tricks that circulate around the world were valid in the past, but current compilers have already learned them. Additionally, some people tend to overuse legacy bit-twiddling tricks. One such example is the [XOR swap idiom](https://en.wikipedia.org/wiki/XOR_swap_algorithm).[^2] In reality, simple `std::swap` produces equivalent or faster code. Such accidental changes likely won’t improve the performance of an application. Finding the right place to tune should be the result of careful performance analysis, not intuition or guessing. -Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. 
We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies, however, not each one of them will necessarily lead you to a discovery. With experience, you will develop your own strategy about when to use each approach. +Performance analysis is a process of collecting information about how a program executes and interpreting it to find optimization opportunities. Any change that ends up being made in the source code of a program should be driven by analyzing and interpreting collected data. We will show you how to use performance analysis techniques to discover optimization opportunities even in a large and unfamiliar codebase. There are many performance analysis methodologies. Depending on the problem, some will be more efficient than others. With experience, you will develop your own strategies about when to use each approach. -[^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm) +[^2]: XOR-based swap idiom - [https://en.wikipedia.org/wiki/XOR_swap_algorithm](https://en.wikipedia.org/wiki/XOR_swap_algorithm) \ No newline at end of file
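
Review note on the XOR swap sentence in the hunk above: the claim that `std::swap` produces equivalent or faster code is easy to check rather than argue about. The sketch below is a minimal illustration, not code from the book. The helper names `xor_swap` and `swaps_agree` are hypothetical, and the block only demonstrates that the two approaches give the same result. It makes no timing claim; which one is faster on a given compiler and target still has to be measured.

```cpp
#include <utility>

// Hypothetical helper: the legacy XOR swap idiom for unsigned integers.
// Without the aliasing guard, swapping an object with itself would zero it.
inline void xor_swap(unsigned& a, unsigned& b) {
    if (&a == &b) return;  // guard against self-swap
    a ^= b;
    b ^= a;
    a ^= b;
}

// Hypothetical helper: returns true when xor_swap and std::swap
// leave a pair of values in the same final state.
inline bool swaps_agree(unsigned x, unsigned y) {
    unsigned a1 = x, b1 = y;
    unsigned a2 = x, b2 = y;
    xor_swap(a1, b1);
    std::swap(a2, b2);
    return a1 == a2 && b1 == b2 && a1 == y && b1 == x;
}
```

To compare the generated code, one option is to compile both functions with optimizations enabled and inspect the assembly, then benchmark them inside the real workload instead of relying on intuition.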