ChampSim is a trace-based simulator for microarchitecture studies. You can subscribe to the public mailing list by sending an empty email to [email protected]. Traces for the 3rd Data Prefetching Championship (DPC-3) are available here (https://dpc3.compas.cs.stonybrook.edu/?SW_IS). A set of traces used for the 2nd Cache Replacement Championship (CRC-2) is available at this link (http://bit.ly/2t2nkUj).
Install PIN
Decompress PIN version 3.11. The version supported by ChampSim in the original document (version 3.2) no longer supports the latest kernels.
tar -zxvf pin-3.11-97998-g7ecce2dac-gcc-linux.tar.gz
Change the path of the trace collecting script
In tracer/run_tracer.sh, there are several variables that need to be changed in order to run the script on your own machine.
${PIN_SOURCE}: The path to the PIN binary
${BENCH_BIN}: The path to the binaries of all benchmarks that will be instrumented
${TRACE_DIR}: The path where the trace files will be located.
In tracer/make_tracer.sh, change the first line to the path where your PIN binary is located.
export PIN_ROOT=PATH_TO_YOUR_PIN_BINARY
In run_champsim.sh, change the default trace directory. The default is $PWD/dpc3_traces.
TRACE_DIR=PATH_TO_YOUR_TRACE_FILE # should equal ${TRACE_DIR} in tracer/run_tracer.sh
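Before running the scripts, it can help to confirm that the configured directory actually exists. The following is a minimal sketch, not part of ChampSim: `require_dir` is a hypothetical helper name, and the fallback value simply mirrors the default mentioned above.

```shell
# Hypothetical helper (not part of ChampSim): fail early if a configured
# path is not an existing directory.
require_dir() {
    local name="$1" path="$2"
    if [ ! -d "$path" ]; then
        echo "error: ${name}='${path}' is not a directory" >&2
        return 1
    fi
    echo "ok: ${name}=${path}"
}

# Example value only; replace it with the path on your machine.
TRACE_DIR="${TRACE_DIR:-$PWD/dpc3_traces}"
require_dir TRACE_DIR "$TRACE_DIR" || echo "create the directory or fix TRACE_DIR first"
```

The same check could be applied to ${BENCH_BIN} if it points to a directory on your setup.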
git clone https://github.com/ChampSim/ChampSim.git
ChampSim takes six parameters: branch predictor, L1D prefetcher, L2C prefetcher, LLC prefetcher, LLC replacement policy, and the number of cores.
For example, ./build_champsim.sh bimodal no no no lru 1
builds a single-core processor with the bimodal branch predictor, no L1D/L2C/LLC data prefetchers, and the baseline LRU replacement policy for the LLC.
$ ./build_champsim.sh bimodal no no no lru 1
$ ./build_champsim.sh ${BRANCH} ${L1D_PREFETCHER} ${L2C_PREFETCHER} ${LLC_PREFETCHER} ${LLC_REPLACEMENT} ${NUM_CORE}
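The binary names used in the run commands in this README (for example bimodal-no-no-no-lru-1core) appear to be the build arguments joined with hyphens, with "core" appended to the core count. A small sketch of that naming pattern follows; `champsim_binary_name` is a hypothetical helper, not a ChampSim script, and the pattern is inferred from the examples rather than from the build script itself.

```shell
# Hypothetical helper: compose the expected binary name from the six
# build arguments (pattern inferred from the examples in this README).
champsim_binary_name() {
    if [ "$#" -ne 6 ]; then
        echo "usage: champsim_binary_name BRANCH L1D_PREF L2C_PREF LLC_PREF LLC_REPL NUM_CORE" >&2
        return 1
    fi
    echo "$1-$2-$3-$4-$5-${6}core"
}

champsim_binary_name bimodal no no no lru 1   # prints bimodal-no-no-no-lru-1core
```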
Execute run_champsim.sh with proper input arguments. The default TRACE_DIR in run_champsim.sh is set to $PWD/dpc3_traces.
- Single-core simulation: Run the simulation with the run_champsim.sh script. NOTE that N_WARM and N_SIM here should be full instruction counts, not numbers of millions. For example, to warm up with 1 million instructions, pass 1000000, not 1.
Usage: ./run_champsim.sh [BINARY] [N_WARM] [N_SIM] [TRACE] [OPTION]
$ ./run_champsim.sh bimodal-no-no-no-lru-1core 1 10 400.perlbench-41B.champsimtrace.xz
${BINARY}: ChampSim binary compiled by "build_champsim.sh" (bimodal-no-no-no-lru-1core)
${N_WARM}: number of instructions for warmup (1 million)
${N_SIM}: number of instructions for detailed simulation (10 million)
${TRACE}: trace name (400.perlbench-41B.champsimtrace.xz)
${OPTION}: extra option for "-low_bandwidth" (src/main.cc)
Simulation results are stored under "results_${N_SIM}M" in the form "${TRACE}-${BINARY}-${OPTION}.txt".
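If the note above applies to your copy of run_champsim.sh (that is, N_WARM and N_SIM are full instruction counts), a tiny conversion helper avoids miscounting zeros. This is only a sketch: `millions` is a hypothetical helper, and the command is printed rather than executed.

```shell
# Hypothetical helper: convert a count given in millions to a full number.
millions() {
    echo $(( $1 * 1000000 ))
}

N_WARM="$(millions 1)"    # 1000000
N_SIM="$(millions 10)"    # 10000000

# Print the resulting command line instead of running it.
echo "./run_champsim.sh bimodal-no-no-no-lru-1core ${N_WARM} ${N_SIM} 400.perlbench-41B.champsimtrace.xz"
```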
- Multi-core simulation: Run the simulation with the run_4core.sh script.
Usage: ./run_4core.sh [BINARY] [N_WARM] [N_SIM] [N_MIX] [TRACE0] [TRACE1] [TRACE2] [TRACE3] [OPTION]
$ ./run_4core.sh bimodal-no-no-no-lru-4core 1 10 0 400.perlbench-41B.champsimtrace.xz \
401.bzip2-38B.champsimtrace.xz 403.gcc-17B.champsimtrace.xz 410.bwaves-945B.champsimtrace.xz
Note that we need to specify multiple trace files for run_4core.sh. N_MIX is used to represent a unique ID for mixed multi-programmed workloads.
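Because run_4core.sh expects exactly four traces after N_MIX, a wrapper that checks the argument count can catch a missing trace before a long simulation starts. The sketch below is an illustration only: `run_4core_cmd` is a hypothetical helper that prints the command line rather than running it, and it ignores the optional [OPTION] argument.

```shell
# Hypothetical helper: print a run_4core.sh command line, refusing anything
# other than BINARY, N_WARM, N_SIM, N_MIX and exactly four traces.
run_4core_cmd() {
    if [ "$#" -ne 8 ]; then
        echo "usage: run_4core_cmd BINARY N_WARM N_SIM N_MIX TRACE0 TRACE1 TRACE2 TRACE3" >&2
        return 1
    fi
    echo "./run_4core.sh $*"
}

run_4core_cmd bimodal-no-no-no-lru-4core 1 10 0 \
    400.perlbench-41B.champsimtrace.xz 401.bzip2-38B.champsimtrace.xz \
    403.gcc-17B.champsimtrace.xz 410.bwaves-945B.champsimtrace.xz
```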
Add your own branch predictor, data prefetchers, and replacement policy (For Hawkeye, this step can be ignored)
Copy an empty template
$ cp branch/branch_predictor.cc branch/mybranch.bpred
$ cp prefetcher/l1d_prefetcher.cc prefetcher/mypref.l1d_pref
$ cp prefetcher/l2c_prefetcher.cc prefetcher/mypref.l2c_pref
$ cp prefetcher/llc_prefetcher.cc prefetcher/mypref.llc_pref
$ cp replacement/llc_replacement.cc replacement/myrepl.llc_repl
Work on your algorithms with your favorite text editor
$ vim branch/mybranch.bpred
$ vim prefetcher/mypref.l1d_pref
$ vim prefetcher/mypref.l2c_pref
$ vim prefetcher/mypref.llc_pref
$ vim replacement/myrepl.llc_repl
Compile and test
$ ./build_champsim.sh mybranch mypref mypref mypref myrepl 1
$ ./run_champsim.sh mybranch-mypref-mypref-mypref-myrepl-1core 1 10 bzip2_183B
Running python run.py under the tracer folder is the easiest approach to generate the polyhedral benchmark traces. To use the Pin tool directly, run it like this:
pin -t obj-intel64/champsim_tracer.so -- <your program here>
The tracer has three options you can set:
-o
Specify the output file for your trace.
The default is default_trace.champsim
-s <number>
Specify the number of instructions to skip in the program before tracing begins.
The default value is 0.
-t <number>
The number of instructions to trace, after -s instructions have been skipped.
The default value is 1,000,000.
For example, you could trace 200,000 instructions of the program ls, after skipping the first 100,000 instructions, with this command:
pin -t obj-intel64/champsim_tracer.so -o traces/ls_trace.champsim -s 100000 -t 200000 -- ls
Traces created with the champsim_tracer.so are approximately 64 bytes per instruction, but they generally compress down to less than a byte per instruction using xz compression.
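The 64 bytes per instruction figure gives a quick way to estimate disk usage before tracing. The helper below is a sketch of that arithmetic only: `trace_bytes` is a hypothetical name, and the result is the approximate uncompressed size.

```shell
# Hypothetical helper: approximate uncompressed trace size in bytes,
# using the ~64 bytes per instruction figure quoted above.
trace_bytes() {
    echo $(( $1 * 64 ))
}

trace_bytes 200000     # 12800000 bytes for the ls example above
trace_bytes 1000000    # 64000000 bytes for the default -t value
```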
Use scripts
# under tracer/
./run_tracer.sh ${BENCHMARK_NAME} ${INSTR_TO_IGNORE} ${INSTR_TO_TRACE}
# For example
./run_tracer.sh 2mm 1000000 10000000
We provide instruction_count.csv under the tracer directory. It contains the number of dynamic instructions in the kernels of each polyhedral benchmark.
Here is an example:
(benchmark), (instr_num to ignore), (total_instr_num), (kernel_instr_num)
2mm,2079427,288675485,286596058
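In the example row, the kernel count equals the total count minus the ignored count (288675485 - 2079427 = 286596058). The sketch below splits such a row and prints the benchmark name, the instructions to ignore, and the kernel instructions. Using those three values as the tracer script's arguments is an assumption, and `tracer_args_from_row` is a hypothetical helper.

```shell
# Hypothetical helper: split one instruction_count.csv row and print
# "<benchmark> <instr to ignore> <kernel instr>".
tracer_args_from_row() {
    local bench skip total kernel
    IFS=',' read -r bench skip total kernel <<EOF
$1
EOF
    # Sanity check: warn if the row does not satisfy kernel = total - ignored.
    if [ $(( total - skip )) -ne "$kernel" ]; then
        echo "warning: kernel count != total - ignored for ${bench}" >&2
    fi
    echo "${bench} ${skip} ${kernel}"
}

tracer_args_from_row "2mm,2079427,288675485,286596058"   # prints 2mm 2079427 286596058
```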
ChampSim measures IPC (Instructions Per Cycle) as its performance metric.
Other useful metrics are printed at the end of the simulation.
Good luck and be a champion!