Hi,
I'm running into an issue when trying to run the benchmark for 3d-unet.
When I ran
make run RUN_ARGS="--benchmarks=3d-unet --scenarios=offline,server"
I got the following error, which also shows the system I'm working on:
Detected system did not match any known systems. Exiting. SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='AMD EPYC 9554 64-Core Processor', architecture=<CPUArchitecture.x86_64: AliasedName(name='x86_64', aliases=(), patterns=())>, core_count=56, threads_per_core=2): 2}), host_mem_conf=MemoryConfiguration(host_memory_capacity=Memory(quantity=1.5849542239999999, byte_suffix=<ByteSuffix.TB: (1000, 4)>, _num_bytes=1584954224000), comparison_tolerance=0.05), accelerator_conf=AcceleratorConfiguration(layout=defaultdict(<class 'int'>, {GPU(name='NVIDIA H100 PCIe', accelerator_type=<AcceleratorType.Discrete: AliasedName(name='Discrete', aliases=(), patterns=())>, vram=Memory(quantity=79.6474609375, byte_suffix=<ByteSuffix.GiB: (1024, 3)>, _num_bytes=85520809984), max_power_limit=350.0, pci_id='0x233110DE', compute_sm=90): 8})), numa_conf=NUMAConfiguration(numa_nodes={}, num_numa_nodes=16), system_id=None)
Driver Version: 550.90.07 CUDA Version: 12.4
I didn't manually add any system configuration under /work/code/common/system, since I didn't see any 4.1 inference results submitter on H100 PCIe 80GB customize it.
Any suggestions that could help me get past this error?
Thanks a lot!
Hi @loganwuw, the Nvidia implementation detects the system configuration, which includes the GPUs and CPUs. If the system is not a known one, separate scripts need to be called to initialize it. If you want to benchmark 3d-unet, you can use the below CM wrapping, where those manual steps are automated. We actually do nightly runs for these benchmarks and store the results here
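To illustrate why an unregistered machine ends up with system_id=None, here is a minimal sketch of matching a detected system against a list of known ones. This is not the Nvidia code: `SystemSpec`, `match_system`, and the example system IDs are hypothetical names made up for illustration. The only details taken from the log above are the GPU name, the GPU count, the VRAM byte count, and the 0.05 comparison tolerance, and applying that tolerance to VRAM is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SystemSpec:
    """Hypothetical description of one known system."""
    system_id: str
    gpu_name: str
    gpu_count: int
    vram_bytes: int


def within_tolerance(detected: float, expected: float, tolerance: float = 0.05) -> bool:
    """True if `detected` is within a relative `tolerance` of `expected`."""
    return abs(detected - expected) <= tolerance * expected


def match_system(
    gpu_name: str,
    gpu_count: int,
    vram_bytes: int,
    known: Sequence[SystemSpec],
    tolerance: float = 0.05,
) -> Optional[str]:
    """Return the ID of the first known system that matches, else None."""
    for spec in known:
        if (
            spec.gpu_name == gpu_name
            and spec.gpu_count == gpu_count
            and within_tolerance(vram_bytes, spec.vram_bytes, tolerance)
        ):
            return spec.system_id
    return None


# Hypothetical list of known systems: only a 1-GPU entry is registered.
known_systems = [SystemSpec("example_h100_pcie_x1", "NVIDIA H100 PCIe", 1, 80 * 1024**3)]

# Values from the log above: 8x 'NVIDIA H100 PCIe' with 85520809984 bytes of VRAM each.
detected_id = match_system("NVIDIA H100 PCIe", 8, 85520809984, known_systems)
```

With only the 1-GPU entry registered, `detected_id` is None because the GPU count differs. Under this sketch, the 8-GPU machine would match once an entry describing it is added to the list.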