Adds device options for MPS (Apple GPU) and XPU (Intel GPU), mirroring the existing support for NVIDIA GPUs via CUDA.
In theory there are quite a few additional devices we could add (full list here / here), but from discussions with @jatkinson1000 these two are of most interest.
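For illustration, device selection at model-load time might look like the following (a sketch only: the enum names `torch_kMPS`/`torch_kXPU` and the `device_type` argument are assumed to follow FTorch's existing `torch_kCUDA` convention):

```fortran
use ftorch

type(torch_model) :: model

! Load the TorchScript model onto the Apple GPU; torch_kMPS is assumed to
! mirror the existing torch_kCUDA device enum (torch_kXPU likewise for Intel).
call torch_model_load(model, "saved_resnet18_model.pt", device_type=torch_kMPS)
```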
I haven't been able to test the XPU device, but basic tests with MPS suggest it's working as expected.
In example 2, resnet_infer_fortran, setting the model's device to MPS without also changing the input tensor's device throws an error.
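Roughly, the mismatched configuration looks like this (a sketch assuming call signatures analogous to the existing CUDA example; exact names may differ):

```fortran
! Model loaded onto MPS ...
call torch_model_load(model, args(1), device_type=torch_kMPS)

! ... but the input tensor is still created on the CPU, so the forward
! pass fails with a device-mismatch error.
call torch_tensor_from_array(in_tensors(1), in_data, tensor_layout, torch_kCPU)
call torch_model_forward(model, in_tensors, out_tensors)
```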
Similarly, setting the input tensor's device but not the model's throws an error.
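That is, the opposite mismatch, sketched with assumed FTorch call names:

```fortran
! Input tensor created on MPS, model left on the CPU: the forward pass
! again fails with a device-mismatch error.
call torch_tensor_from_array(in_tensors(1), in_data, tensor_layout, torch_kMPS)
call torch_model_load(model, args(1), device_type=torch_kCPU)
call torch_model_forward(model, in_tensors, out_tensors)
```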
Setting both works, and the expected output is produced.
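For completeness, the working configuration (a sketch; enum and argument names assumed by analogy with FTorch's CUDA support):

```fortran
! Model and input tensor both on MPS: inference succeeds.
call torch_model_load(model, args(1), device_type=torch_kMPS)
call torch_tensor_from_array(in_tensors(1), in_data, tensor_layout, torch_kMPS)
call torch_model_forward(model, in_tensors, out_tensors)
```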
I also see spikes in activity on my GPU (for the largest spikes, I added a loop around the example inference).
Note that when running 10,000 iterations of the inference I got an error, which might suggest a problem with cleanup.
I don't think this is specific to MPS, so it might be worth checking on CUDA too (you can reduce the available CUDA memory to make the problem easier to reproduce, if that helps).
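One thing to check is whether the per-iteration tensors are being freed. A minimal sketch of what I mean, assuming FTorch's `torch_tensor_delete`/`torch_model_delete` cleanup routines (names assumed; tensor setup outside the loop omitted):

```fortran
do i = 1, 10000
   ! Recreate the input tensor and run inference each iteration
   call torch_tensor_from_array(in_tensors(1), in_data, tensor_layout, torch_kMPS)
   call torch_model_forward(model, in_tensors, out_tensors)

   ! Without explicit deletion, device-side allocations may accumulate
   ! across iterations, which could explain the failure at high counts.
   call torch_tensor_delete(in_tensors(1))
end do
call torch_model_delete(model)
```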