[DRAMsys 2.0] Terapool DRAMsys Merge Request #93

Closed
wants to merge 45 commits into from
45 commits
9db3fa3
[TeraPool] Configurations Changes for TeraPool merge into MemPool
yichao-zh May 17, 2023
87300d9
[DRAMsys] add dram rtl model
Jun 7, 2023
20761d4
[DRAMsys] fix simulation bug
Jun 7, 2023
7d7776c
[DRAMsys] setting to add dramsys support
Jun 7, 2023
84013e2
[DRAMsys] fix bugs: stack overflow when reading from dramsys
Jun 7, 2023
645e364
[Software] Temp change for easy debug
yichao-zh May 31, 2023
f6ae1d4
[DMA] DMA bug fix and mempool trace bug fix
yichao-zh Jun 2, 2023
429915b
[DRAM] Update DRAM lib with AXI reordering
yichao-zh Jun 13, 2023
d0ece15
[DRAM] Format codes
yichao-zh Jun 13, 2023
390eb51
[DRAM] Merge SRAM and DRAM simulation in one RTL file
yichao-zh Jun 13, 2023
4b9b88c
[Software] Update memcpy kernel
yichao-zh Jun 13, 2023
edc7de4
[Makefile] Update Makefile control simulation with dram var
yichao-zh Jun 13, 2023
31ada7c
[DRAM] Delete old file
yichao-zh Jun 13, 2023
922ff48
[Bender] Remove the deleted RTL file
yichao-zh Jun 14, 2023
b5f0b9d
[RTL] Change the AXI MUX to AXI Xbar to connect the DRAM
yichao-zh Jun 14, 2023
6ffb02b
[DRAM] DRAM update to support interleaved address mapping
yichao-zh Jun 14, 2023
72e25f7
[DRAM] Update DRAM model to support interleaved mode and fix write bugs
yichao-zh Jul 13, 2023
711c095
[Hardware] Support the different interleave mode for DRAM access
yichao-zh Jul 13, 2023
3d277f4
[Config] L2 address and size update
yichao-zh Jul 17, 2023
6840ad8
[Kernel] memcpy kernel update
yichao-zh Jul 17, 2023
7394a88
[DRAM] Non-Ideal PHY latency support
yichao-zh Jul 18, 2023
7c3afcd
[DRAM] Python Script for DRAM Bandwidth Analysis
yichao-zh Jul 18, 2023
8af0b71
[Format] Format the files for CI check
yichao-zh Jul 19, 2023
53ec53e
[Format] Format and put licenses to files for CI checking
yichao-zh Jul 19, 2023
5e1c600
[Format] Format DRAM python script for CI check
yichao-zh Jul 19, 2023
addc7fc
[AXI] Update Auto Spliter Adding, Update Interleave SystemVerilog Wri…
yichao-zh Nov 23, 2023
1a7a2b3
[HBM2E] Update DRAM HBM model to MICRON HBM2E-3600
yichao-zh Nov 23, 2023
d8a5278
[Env] Update some configurations, include the fifo size and DRAM conf…
yichao-zh Dec 11, 2023
65efc1f
[Rebase] Rebase the DRAM work on top of main branch
yichao-zh Feb 16, 2024
66cf7ca
[Config] Complete MinPool config for CI checking
yichao-zh Feb 16, 2024
5ab178e
[memcpy] Reduce transfer size for MinPool CI check
yichao-zh Feb 16, 2024
0a3f31a
[memcpy] Reduce transfer size for MinPool CI check
yichao-zh Feb 16, 2024
ed07d95
[memcpy] Remove unused dump from kernel
yichao-zh Feb 16, 2024
4455f4b
[FIFO depth] The Fifo depth tune for support 8 outstanding transction…
yichao-zh Feb 22, 2024
2108979
[DRAMsys] Remove the local version of DRAMsys hardware folder, add th…
yichao-zh Mar 22, 2024
516b3e8
[Config] Move the dram related configurations to the config.mk
yichao-zh Mar 22, 2024
1936ee1
[Makefile] Modify Makefile for updating submodule, patching dram conf…
yichao-zh Mar 22, 2024
1511852
[hardware] hardware change for the new version dramsys support
yichao-zh Mar 22, 2024
f361a6b
[DRAM] Add the configuration files for HBM2 DRAM simulation, these fi…
yichao-zh Mar 22, 2024
9528572
[tb] Change back the simulation clk period to 2ns, but 1ns will have …
yichao-zh Mar 22, 2024
e32c4e5
[Bender] Update bender to the correct RTL name, as DRAMsys updated th…
yichao-zh Mar 22, 2024
0c1ac85
[software] Update memcpy kernel with reasonable transfer size and tur…
yichao-zh Mar 22, 2024
ef0c820
[config] Change simulation to SRAM as L2 for CI checking
yichao-zh Mar 22, 2024
989c7dc
[CI test] Fix tb whitespace tailing and change the bender to compile …
yichao-zh Mar 22, 2024
2d19b34
[CHANGELOG and README] Add changelog and readme for DRAM co-simulation
yichao-zh Mar 22, 2024
3 changes: 3 additions & 0 deletions .gitmodules
@@ -34,3 +34,6 @@
[submodule "hardware/deps/apb"]
path = hardware/deps/apb
url = https://github.com/pulp-platform/apb.git
[submodule "hardware/deps/dram_rtl_sim"]
path = hardware/deps/dram_rtl_sim
url = https://github.com/pulp-platform/dram_rtl_sim.git
4 changes: 4 additions & 0 deletions Bender.yml
@@ -56,6 +56,10 @@ sources:
- hardware/tb/traffic_generator.sv
# Level 2
- hardware/tb/mempool_tb.sv
# DRAMsys
- hardware/deps/dram_rtl_sim/src/sim_dram.sv
- hardware/deps/dram_rtl_sim/src/axi_dram_sim.sv
- hardware/deps/dram_rtl_sim/src/dram_sim_engine.sv

- target: mempool_verilator
files:
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
### Added
- Add `apb` dependency of version 0.2.4
- Add support for the `FENCE` instruction
- Add support for DRAMsys5.0 co-simulation

### Changes
- Add physical feasible TeraPool configuration with SubGroup hierarchy.
36 changes: 35 additions & 1 deletion Makefile
@@ -24,7 +24,7 @@ BENDER_INSTALL_DIR ?= ${INSTALL_DIR}/bender
VERILATOR_INSTALL_DIR ?= ${INSTALL_DIR}/verilator
RISCV_TESTS_DIR ?= ${ROOT_DIR}/${SOFTWARE_DIR}/riscv-tests

CMAKE ?= cmake
CMAKE ?= cmake-3.28.3
# CC and CXX are Makefile default variables that are always defined in a Makefile. Hence, overwrite
# the variable if it is only defined by the Makefile (its origin in the Makefile's default).
ifeq ($(origin CC),default)
@@ -157,6 +157,40 @@ update-deps:
done
git apply hardware/deps/patches/*

# Build, update and patch the DRAMsys submodule
$(eval DRAM_PATH=$(realpath $(shell git config --file .gitmodules --get-regexp dram_rtl_sim.path | awk '/hardware/{ print $$2 }')))
$(eval DRAM_LIB_PATH=$(DRAM_PATH)/dramsys_lib)
$(eval DRAMSYS_PATH=$(DRAM_LIB_PATH)/DRAMSys)
$(eval DRAMSYS_PATCH_PATH=$(DRAM_LIB_PATH)/dramsys_lib_patch)
$(eval DRAMSYS_SO_PATH=$(DRAMSYS_PATH)/build)

clean-dram:
if [ -d "$(DRAMSYS_PATH)" ]; then \
rm -rf $(DRAMSYS_PATH); \
fi

build-dram: clean-dram
if [ ! -d "$(DRAMSYS_PATH)" ]; then \
git clone https://github.com/tukl-msd/DRAMSys.git $(DRAMSYS_PATH); \
fi
cd $(DRAMSYS_PATH) && git reset --hard 8e021ea && git apply $(DRAMSYS_PATCH_PATH)

config-dram: build-dram
@cp hardware/include/dram_config/am_hbm2e_16Gb_pc_brc.json $(DRAMSYS_PATH)/configs/addressmapping/.
@cp hardware/include/dram_config/mc_hbm2e_fr_fcfs_grp.json $(DRAMSYS_PATH)/configs/mcconfig/.
@cp hardware/include/dram_config/ms_hbm2e_16Gb_3600.json $(DRAMSYS_PATH)/configs/memspec/.
@cp hardware/include/dram_config/simconfig_hbm2e.json $(DRAMSYS_PATH)/configs/simconfig/.
@mv $(DRAMSYS_PATH)/configs/hbm2-example.json $(DRAMSYS_PATH)/configs/hbm2-example.json.ori
@cp hardware/include/dram_config/HBM2E-3600.json $(DRAMSYS_PATH)/configs/hbm2-example.json

setup-dram: config-dram
cd $(DRAMSYS_PATH) && \
if [ ! -d "build" ]; then \
mkdir build && cd build; \
CC=gcc-11.2.0 CXX=g++-11.2.0 cmake -DCMAKE_CXX_FLAGS=-fPIC -DCMAKE_C_FLAGS=-fPIC -D DRAMSYS_WITH_DRAMPOWER=ON .. ; \
make -j; \
fi

# Helper targets
.PHONY: clean format apps

38 changes: 38 additions & 0 deletions README.md
@@ -187,6 +187,39 @@ To get a visualization of the traces, check out the `scripts/tracevis.py` script

We also provide Synopsys Spyglass linting scripts in the `hardware/spyglass`. Run `make lint` in the `hardware` folder, with a specific MemPool configuration, to run the tests associated with the `lint_rtl` target.

## DRAMsys Co-Simulation

The MemPool system supports either on-chip SRAM or off-chip DRAM as the higher-level memory in simulation. For off-chip DRAM co-simulation, it includes the `dram_rtl_sim` tool as a submodule at `hardware/deps/dram_rtl_sim`. Built on DRAMSys5.0, the tool provides a co-simulation environment between RTL models and DRAMSys5.0 for simulating DRAM and memory-controller models of contemporary off-chip DRAM technologies (e.g., LPDDR, DDR, HBM).

The `dram_rtl_sim` tool is open source and can be found here:
[https://github.com/pulp-platform/dram_rtl_sim](https://github.com/pulp-platform/dram_rtl_sim)

### Building DRAMsys Co-Simulation

To prepare for DRAMsys co-simulation, set `l2_sim_type` to `dram` in `config/config.mk`. Then run the following command in the project's root directory to set up the DRAMsys environment:

```bash
make setup-dram
```

This Makefile target automates several tasks:
1. Removes the existing DRAMSys5.0 checkout, if one was previously built.
2. Clones the DRAMSys5.0 repository again and applies the necessary patches within `hardware/deps/dram_rtl_sim/dramsys_lib/`.
3. Copies the HBM2E DRAM configuration files tailored for the MemPool system simulation into the DRAMSys `configs` directory.
4. Compiles the DRAMSys dynamically linked library in `hardware/deps/dram_rtl_sim/dramsys_lib/DRAMSys/build`.

**Important:** This environment requires `cmake` version 3.28.1 or higher and GCC version 11.2.0 or above.

### DRAM Chip Configuration

DRAMsys supports a range of contemporary off-chip DRAM technologies, including LPDDR, DDR, and HBM. Configuration files in `.json` format are located in `hardware/deps/dram_rtl_sim/dramsys_lib/DRAMSys/configs`. We also provide a recommended HBM2E configuration for the MemPool system in `hardware/include/dram_config`; `make setup-dram` copies it into that directory and applies it as the default. Review and modify these configurations as needed to meet your simulation requirements.

### Testing MemPool-DRAMSys Co-Simulation

To test data transfers between the MemPool system and the higher-level memory through the DMA, use the example kernel in `software/tests/baremetal/memcpy`. For details on building applications and setting up RTL simulation, refer to the sections above.

**Note:** Currently, off-chip DRAM co-simulation is not supported with an open-source simulator. It works exclusively with the `Questasim` simulator.

## Publications
If you use MemPool in your work or research, you can cite us:

@@ -602,5 +635,10 @@ The open-source simulator [Verilator](https://www.veripool.org/verilator) can be

- `toolchain/verilator` is licensed under GPL. See [Verilator's license](https://github.com/verilator/verilator/blob/master/LICENSE) for more details.

### DRAMsys5.0

- The `dram_rtl_sim` submodule, located at `hardware/deps/dram_rtl_sim`, is licensed under the Solderpad Hardware License 0.51. You can review the license [here](https://github.com/pulp-platform/dram_rtl_sim/blob/main/LICENSE).
- [DRAMSys5.0](https://github.com/tukl-msd/DRAMSys) is utilized for DRAM simulations. For details on its usage and licensing, please refer to the DRAMSys5.0 [license information](https://github.com/tukl-msd/DRAMSys).

</p>
</details>
9 changes: 4 additions & 5 deletions config/config.mk
@@ -30,8 +30,6 @@ boot_addr ?= 2684354560 # A0000000

# L2 memory configuration (in dec)
l2_base ?= 2147483648 # 80000000
l2_size ?= 4194304 # 400000
l2_banks ?= 4

# L1 size per bank (in dec)
l1_bank_size ?= 1024
@@ -52,9 +50,6 @@ axi_data_width ?= 512
# Read-only cache line width in AXI interconnect (in bits)
ro_line_width ?= 512

# Number of DMA backends in each group
dmas_per_group ?= 4

#############################
## Xqueues configuration ##
#############################
@@ -72,3 +67,7 @@ xpulpimg ?= 1
# This parameter is only used for TeraPool configurations
num_sub_groups_per_group ?= 1
remote_group_latency_cycles ?= 7

# DRAMsys co-simulation: dram/sram
l2_sim_type ?= sram
dram_axi_width_interleaved ?= 16
9 changes: 8 additions & 1 deletion config/mempool.mk
@@ -21,7 +21,14 @@ num_cores_per_tile ?= 4
banking_factor ?= 4

# Radix for hierarchical AXI interconnect
axi_hier_radix ?= 20
axi_hier_radix ?= 17
Review comment (Member):

I checked, and this change should not impact the backend too much, so if it actually helps with the DRAM's performance, we can keep it that way. But it would be interesting to know why we can't get full performance with smaller bursts on the DRAM.


# Number of AXI masters per group
axi_masters_per_group ?= 1

# Number of DMA backends in each group
dmas_per_group ?= 1 # Burst Length = 16

# L2 Banks/Channels
l2_size ?= 4194304 # 400000
l2_banks ?= 4
10 changes: 7 additions & 3 deletions config/minpool.mk
@@ -29,11 +29,15 @@ axi_data_width ?= 256
# Read-only cache line width in AXI interconnect (in bits)
ro_line_width ?= 256

# Number of DMA backends in each group
dmas_per_group ?= 1

# Radix for hierarchical AXI interconnect
axi_hier_radix ?= 2

# Number of AXI masters per group
axi_masters_per_group ?= 1

# Number of DMA backends in each group
dmas_per_group ?= 1 # Burst Length = 16
Review comment (Member):

In MinPool, the burst length will be 8, right?


# L2 Banks/Channels
l2_size ?= 4194304 # 400000
l2_banks ?= 4
3 changes: 2 additions & 1 deletion config/terapool.mk
@@ -37,7 +37,8 @@ axi_hier_radix ?= 9
axi_masters_per_group ?= 4

# Number of DMA backends in each group
dmas_per_group ?= 4
dmas_per_group ?= 4 # Burst Length = 16

# L2 Banks/Channels
l2_banks = 16
l2_size ?= 16777216 # 1000000
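
The decimal values and their hexadecimal comments in these config fragments are easy to get out of sync, so a quick arithmetic check can help. The sketch below is not part of the PR; it only re-derives the hex comments and the per-bank share from the numbers shown in `config.mk`, `mempool.mk`, `minpool.mk`, and `terapool.mk`. The helper name `l2_summary` is hypothetical, and the even split of L2 across banks is an assumption.

```python
# Values copied from the config fragments in this diff (decimal, as in the .mk files).
L2_BASE = 2147483648
L2_CONFIGS = {
    "mempool": {"l2_size": 4194304, "l2_banks": 4},
    "minpool": {"l2_size": 4194304, "l2_banks": 4},
    "terapool": {"l2_size": 16777216, "l2_banks": 16},
}


def l2_summary(l2_size, l2_banks):
    """Return the hex form of the size and the bytes held by each bank.

    Assumption (not stated in the diff): the L2 space is split evenly
    across the banks/channels.
    """
    if l2_size % l2_banks != 0:
        raise ValueError("l2_size is not a multiple of l2_banks")
    return {
        "size_hex": format(l2_size, "X"),
        "bytes_per_bank": l2_size // l2_banks,
    }


if __name__ == "__main__":
    print("l2_base = 0x" + format(L2_BASE, "X"))
    for name, cfg in L2_CONFIGS.items():
        print(name, l2_summary(**cfg))
```

Under that assumption, all three configurations end up with the same 1 MiB share per bank; TeraPool scales the total size and the bank count together.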
11 changes: 10 additions & 1 deletion hardware/Makefile
@@ -48,6 +48,12 @@ python ?= python3
# Enable tracing
snitch_trace ?= 1

# Path to DRAMsys
dramsys_resouces_path ?= $(MEMPOOL_DIR)/hardware/deps/dram_rtl_sim/dramsys_lib/DRAMSys/configs
dramsys_lib_path ?= $(MEMPOOL_DIR)/hardware/deps/dram_rtl_sim/dramsys_lib/DRAMSys/build/lib
questa_args += +DRAMSYS_RES=$(dramsys_resouces_path)
questa_args += -sv_lib $(dramsys_lib_path)/libsystemc -sv_lib $(dramsys_lib_path)/libDRAMSys_Simulator

# Check if the specified QuestaSim version exists
ifeq (, $(shell which $(questa_cmd)))
# Spaces are needed for indentation here!
@@ -112,8 +118,11 @@ vlog_defs += -DRO_LINE_WIDTH=$(ro_line_width)
vlog_defs += -DDMAS_PER_GROUP=$(dmas_per_group)
vlog_defs += -DAXI_HIER_RADIX=$(axi_hier_radix) -DAXI_MASTERS_PER_GROUP=$(axi_masters_per_group)
vlog_defs += -DSEQ_MEM_SIZE=$(seq_mem_size) -DXQUEUE_SIZE=$(xqueue_size)
# This parameter is only used for TeraPool configurations
# The below parameter is only used for TeraPool configurations
vlog_defs += -DNUM_SUB_GROUPS_PER_GROUP=$(num_sub_groups_per_group) -DREMOTE_GROUP_LATENCY_CYCLES=$(remote_group_latency_cycles)
# The below parameter is only used for DRAMsys co-simulation
vlog_defs += -D${l2_sim_type}
vlog_defs += -DDRAM_AXI_WIDTH_INTERLEAVED=${dram_axi_width_interleaved}

# Traffic generation enabled
ifdef tg
1 change: 1 addition & 0 deletions hardware/deps/dram_rtl_sim
Submodule dram_rtl_sim added at 15caf3
15 changes: 15 additions & 0 deletions hardware/include/dram_config/HBM2E-3600.json
@@ -0,0 +1,15 @@
{
"simulation": {
"addressmapping": "am_hbm2e_16Gb_pc_brc.json",
"mcconfig": "mc_hbm2e_fr_fcfs_grp.json",
"memspec": "ms_hbm2e_16Gb_3600.json",
"simconfig": "simconfig_hbm2e.json",
"simulationid": "hbm2e",
"tracesetup": [
{
"clkMhz": 1800,
"name": "HBM2E.stl"
}
]
}
}
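
The top-level file above only names the four sub-configurations, and the `config-dram` Makefile target copies each of them into a matching subdirectory of the DRAMSys `configs` folder. A minimal sketch of a pre-flight check is shown below. It is not part of the PR; the helper names `referenced_files` and `missing_files` are hypothetical, and the one-subdirectory-per-key layout is assumed from the `cp` destinations in the Makefile.

```python
import json
from pathlib import Path

# Top-level configuration, copied from HBM2E-3600.json in this diff.
TOP_LEVEL = """
{
  "simulation": {
    "addressmapping": "am_hbm2e_16Gb_pc_brc.json",
    "mcconfig": "mc_hbm2e_fr_fcfs_grp.json",
    "memspec": "ms_hbm2e_16Gb_3600.json",
    "simconfig": "simconfig_hbm2e.json",
    "simulationid": "hbm2e",
    "tracesetup": [{"clkMhz": 1800, "name": "HBM2E.stl"}]
  }
}
"""

# Keys whose value is a file name stored in a subdirectory of the same name
# (assumption based on the cp destinations in the config-dram target).
FILE_KEYS = ("addressmapping", "mcconfig", "memspec", "simconfig")


def referenced_files(top_level_json, configs_dir):
    """Map each sub-configuration key to the path where it is expected."""
    simulation = json.loads(top_level_json)["simulation"]
    return {key: Path(configs_dir) / key / simulation[key] for key in FILE_KEYS}


def missing_files(top_level_json, configs_dir):
    """List the expected sub-configuration files that do not exist on disk."""
    paths = referenced_files(top_level_json, configs_dir).values()
    return [str(path) for path in paths if not path.is_file()]


if __name__ == "__main__":
    # Hypothetical invocation from the repository root after `make setup-dram`.
    configs = "hardware/deps/dram_rtl_sim/dramsys_lib/DRAMSys/configs"
    print(missing_files(TOP_LEVEL, configs) or "all sub-configurations found")
```

An empty list from `missing_files` suggests the copy step placed every file where the top-level configuration expects it.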
47 changes: 47 additions & 0 deletions hardware/include/dram_config/am_hbm2e_16Gb_pc_brc.json
@@ -0,0 +1,47 @@
{
"addressmapping": {
"BYTE_BIT": [
0,
1,
2
],
"COLUMN_BIT": [
3,
4,
8,
9,
10,
11,
12
],
"PSEUDOCHANNEL_BIT":[
5
],
"BANK_BIT": [
16,
17
],
"BANKGROUP_BIT":[
6,
7,
13
],
"ROW_BIT": [
14,
15,
18,
19,
20,
21,
22,
23,
24,
25,
26,
27,
28,
29,
30
]
}
}
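
To make the bit assignment above easier to read, the following sketch decodes an address into its DRAM coordinates. It is an illustration and not code from the PR: the bit lists are copied from `am_hbm2e_16Gb_pc_brc.json`, while the function names and the convention that the first listed bit is the least significant bit of each field are assumptions.

```python
# Bit positions copied from am_hbm2e_16Gb_pc_brc.json (address bit 0 is the LSB).
ADDRESS_MAPPING = {
    "BYTE_BIT": [0, 1, 2],
    "COLUMN_BIT": [3, 4, 8, 9, 10, 11, 12],
    "PSEUDOCHANNEL_BIT": [5],
    "BANK_BIT": [16, 17],
    "BANKGROUP_BIT": [6, 7, 13],
    "ROW_BIT": [14, 15] + list(range(18, 31)),
}


def decode_address(addr, mapping=ADDRESS_MAPPING):
    """Split an address into one integer per field.

    Assumption: the first bit listed for a field is that field's least
    significant bit.
    """
    fields = {}
    for name, bits in mapping.items():
        value = 0
        for position, bit in enumerate(bits):
            value |= ((addr >> bit) & 1) << position
        fields[name] = value
    return fields


def check_mapping(mapping=ADDRESS_MAPPING):
    """Return the address width, raising if a bit is reused or skipped."""
    used = sorted(bit for bits in mapping.values() for bit in bits)
    if used != list(range(len(used))):
        raise ValueError("address bits overlap or leave gaps")
    return len(used)


if __name__ == "__main__":
    print("address bits:", check_mapping())
    for addr in (0x00, 0x20, 0x40, 0x100):
        print(hex(addr), decode_address(addr))
```

Read this way, the mapping covers 31 contiguous address bits. Bits 0 to 4 stay within one 32-byte span, bit 5 selects the pseudo channel, and bits 6 and 7 select the bank group, so sequential accesses alternate between pseudo channels and bank groups before moving to further columns.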
20 changes: 20 additions & 0 deletions hardware/include/dram_config/mc_hbm2e_fr_fcfs_grp.json
@@ -0,0 +1,20 @@
{
"mcconfig": {
"PagePolicy": "Open",
"Scheduler": "FrFcfsGrp",
"SchedulerBuffer": "Bankwise",
"RequestBufferSize": 128,
"CmdMux": "Oldest",
"RespQueue": "Fifo",
"RefreshPolicy": "AllBank",
"RefreshMaxPostponed": 8,
"RefreshMaxPulledin": 8,
"PowerDownPolicy": "NoPowerDown",
"Arbiter": "Simple",
"PhyDelayFw": 8,
"PhyDelayBw": 9,
"ThinkDelayFw": 12,
"ThinkDelayBw": 12,
"MaxActiveTransactions": 128
}
}
48 changes: 48 additions & 0 deletions hardware/include/dram_config/ms_hbm2e_16Gb_3600.json
@@ -0,0 +1,48 @@
{
"memspec": {
"memarchitecturespec": {
"burstLength": 4,
"dataRate": 2,
"nbrOfBankGroups": 8,
"nbrOfBanks": 32,
"nbrOfColumns": 128,
"nbrOfPseudoChannels": 2,
"nbrOfRows": 32768,
"width": 64,
"nbrOfDevices": 1,
"nbrOfChannels": 1
},
"memoryId": "Test MemPool-TeraPool with HBM2 upto 3600bps (16Gb, Single Channel)",
"memoryType": "HBM2",
"memtimingspec": {
"CCDL": 4,
"CCDS": 2,
"CKE": 8,
"DQSCK": 2,
"FAW": 9,
"PL": 2,
"RAS": 30,
"RC": 45,
"RCDRD": 16,
"RCDWR": 12,
"REFI": 3900,
"REFISB": 122,
"RFC": 260,
"RFCSB": 200,
"RL": 41,
"RP": 15,
"RRDL": 2.22,
"RRDS": 2.22,
"RREFD": 8,
"RTP": 4,
"RTW": 18,
"WL": 8,
"WR": 41,
"WTRL": 6,
"WTRS": 4,
"XP": 10,
"XS": 270,
"clkMhz": 1800
}
}
}
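
As a sanity check on the memory specification above, the capacity and peak transfer rate can be derived from the architecture fields. The sketch below is not from the PR. It assumes that `width` is the data width of one pseudo channel in bits and that `nbrOfBanks` counts the banks of one pseudo channel; the function names are hypothetical.

```python
# Values copied from ms_hbm2e_16Gb_3600.json in this diff.
ARCH = {
    "burstLength": 4,
    "dataRate": 2,
    "nbrOfBanks": 32,
    "nbrOfColumns": 128,
    "nbrOfPseudoChannels": 2,
    "nbrOfRows": 32768,
    "width": 64,
    "nbrOfDevices": 1,
    "nbrOfChannels": 1,
}
CLK_MHZ = 1800


def capacity_bits(arch):
    """Total storage in bits (assumes width and bank count are per pseudo channel)."""
    return (arch["nbrOfRows"] * arch["nbrOfColumns"] * arch["nbrOfBanks"]
            * arch["nbrOfPseudoChannels"] * arch["width"]
            * arch["nbrOfDevices"] * arch["nbrOfChannels"])


def transfers_per_second_mega(arch, clk_mhz):
    """Data rate in mega-transfers per second."""
    return clk_mhz * arch["dataRate"]


def peak_gbytes_per_s_per_pseudo_channel(arch, clk_mhz):
    """Peak bandwidth of one pseudo channel in GB/s (10^9 bytes per second)."""
    return transfers_per_second_mega(arch, clk_mhz) * arch["width"] / 8 / 1000


def bytes_per_burst(arch):
    """Bytes moved by one burst on one pseudo channel."""
    return arch["burstLength"] * arch["width"] // 8


if __name__ == "__main__":
    print("capacity (Gib):", capacity_bits(ARCH) / 2**30)
    print("MT/s:", transfers_per_second_mega(ARCH, CLK_MHZ))
    print("GB/s per pseudo channel:", peak_gbytes_per_s_per_pseudo_channel(ARCH, CLK_MHZ))
    print("bytes per burst:", bytes_per_burst(ARCH))
```

Under these assumptions the product of the architecture fields gives 2^34 bits, which agrees with the "16Gb" in the file name, and an 1800 MHz clock at double data rate gives the 3600 MT/s named in the configuration.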
15 changes: 15 additions & 0 deletions hardware/include/dram_config/simconfig_hbm2e.json
@@ -0,0 +1,15 @@
{
"simconfig": {
"AddressOffset": 0,
"CheckTLM2Protocol": false,
"DatabaseRecording": true,
"Debug": false,
"EnableWindowing": true,
"PowerAnalysis": false,
"SimulationName": "hbm2e",
"SimulationProgressBar": true,
"StoreMode": "Store",
"UseMalloc": false,
"WindowSize": 300
}
}