Commit
[Chapter 8] - summary
dendibakh committed May 24, 2024
1 parent aabdd44 commit 95cfd8c
Showing 2 changed files with 5 additions and 7 deletions.
@@ -3,5 +3,5 @@
1. Solve the `perf-ninja::data_packing` lab assignment, in which you need to make the data structure more compact.
2. Solve the `perf-ninja::huge_pages_1` lab assignment using methods we discussed in [@sec:secDTLB]. Observe any changes in performance, huge page allocation in `/proc/meminfo`, and CPU performance counters that measure DTLB loads and misses.
3. Solve the `perf-ninja::swmem_prefetch_1` lab assignment by implementing explicit memory prefetching for future loop iterations.
-4. Describe what it takes for a piece of code to be cache-friendly?
-5. Run the application that you're working with on a daily basis. Measure its memory usage, memory footprint and analyze heap allocations using memory profilers that we discussed in [@sec:MemoryProfiling]. Use a general profiler like Intel VTune or Linux perf to identify hot memory accesses. Is the application cache-friendly? Is there a way to improve it?
+4. Describe in general terms what it takes for a piece of code to be cache-friendly.
+5. Run the application that you're working with daily. Measure its memory utilization and analyze heap allocations using memory profilers that we discussed in [@sec:MemoryProfiling]. Identify hot memory accesses using Linux perf, Intel VTune, or other profiler. Is there a way to improve those accesses?
8 changes: 3 additions & 5 deletions chapters/8-Optimizing-Memory-Accesses/8-9 Chapter Summary.md
@@ -1,9 +1,7 @@
-[TODO]: update
-
## Chapter Summary {.unlisted .unnumbered}

-* Most real-world applications experience memory-related performance bottlenecks.
-* Performance of the memory subsystem is not growing as fast as CPU performance. Yet, memory accesses are a frequent source of performance problems in many applications. Speeding up such programs requires revising the way they access memory.
-* In [@sec:MemBound], we discussed some of the popular recipes for cache-friendly data structures, data reorganization techniques, how to utilize large memory pages to improve DTLB performance, and explicit memory prefetching.
+* Most real-world applications experience memory-related performance bottlenecks. Emerging application domains, such as machine learning and big data, are particularly demanding in terms of memory bandwidth and latency.
+* Performance of the memory subsystem is not growing as fast as the CPU performance. Yet, memory accesses are a frequent source of performance problems in many applications. Speeding up such programs requires revising the way they access memory.
+* In [@sec:MemBound], we discussed frequently used recipes for developing cache-friendly data structures, explored data reorganization techniques, learned how to utilize large memory pages to improve DTLB performance, and how to use explicit memory prefetching to reduce the number of cache misses.

\sectionbreak
