[BUG] Matvis does not work with CUDA 12 #90
Comments
Thanks @rlbyrne. The issue with CUDA 11 seems to be an environment issue with your system. It would be good to get […]
@steven-murray There is some time pressure, so a workaround sounds better than waiting a few weeks. Any ideas how to resolve the environment issue? |
Hi @rlbyrne. To clarify @steven-murray's reply a bit, yes, […] As for your error, it does indeed seem to be a compiling issue coming from […] The version of the CUDA driver on your system also matters. On an HPC, I have to load the correct version of the CUDA driver, usually with the […]
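For reference, a minimal PyCUDA check (independent of matvis) will report which CUDA toolkit PyCUDA was built against and which driver the system provides; a mismatch between the two is a common source of kernel compile failures like the one described here:

```python
# Report the CUDA toolkit and driver versions visible to PyCUDA.
# Assumes pycuda imports cleanly and at least one GPU is visible.
import pycuda.driver as drv

drv.init()
print("GPUs found:           ", drv.Device.count())
print("Device 0:             ", drv.Device(0).name())
print("CUDA toolkit (build): ", drv.get_version())         # toolkit PyCUDA was compiled against
print("CUDA driver version:  ", drv.get_driver_version())  # driver installed on the system
```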
@piyanatk I'm trying to install cuda via the nvidia channel […] The […]
@rlbyrne I am not sure about the issue with installing from the nvidia channel ... maybe switch to […] For loading the driver with […]
@piyanatk I don't see any cuda packages under […]
@rlbyrne Can you check your […] BTW, are you running […]
@rlbyrne. OK. I got it working without the system CUDA installed.
It turns out that if you only install […] Although not cuda related, […] @steven-murray We should update the package requirements until we have time to update to CUDA 12. I will make an environment file for GPU installation and put it somewhere, and also update the documentation if I can find time.
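For anyone rebuilding the environment along these lines, a throwaway PyCUDA kernel is a quick way to confirm that JIT compilation actually works before running a full simulation (this sketch is purely illustrative and not part of matvis):

```python
# Quick check that this environment can compile and launch a CUDA kernel.
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a context on the default GPU)
import pycuda.gpuarray as gpuarray
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void scale(float *x, float a)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    x[i] *= a;
}
""")
scale = mod.get_function("scale")

x = gpuarray.to_gpu(np.ones(256, dtype=np.float32))
scale(x, np.float32(2.0), block=(256, 1, 1), grid=(1, 1))
assert np.allclose(x.get(), 2.0)  # fails earlier if compilation is broken
print("CUDA kernel compilation and launch succeeded.")
```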
Hey @piyanatk, yes that would be much appreciated! There is a PR open for hera_sim to make it compatible with numpy 2 and pyuvdata 3, so hopefully it will be merged within the week. A GPU environment file would be very useful (even if it does go out of date rather fast).
Ok things are looking promising! I think the job is running. Thank you @piyanatk
Unfortunately running things with the GPUs didn't speed things up at all. A single time and frequency step for the OVRO-LWA took 140 minutes with the GPU setting and 132 minutes without, so it was actually slower with the GPU setting. Any idea what's going wrong?
@rlbyrne Can you please share the command that you use?
@piyanatk I'm just setting […] The full call is […]
@rlbyrne I think the bigger question is what is the configuration of your simulation? How many baselines and sky sources/pixels? And also are your beams analytic or UVBeams, and if UVBeams, how many pixels? For some smaller simulations (per time and freq) the overheads are dominant and GPU isn't that useful.
You could also try running some line-profiling to see what the dominant bottleneck is. You can also check out the […]
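Even the standard-library profiler gives a useful first look at where the time goes (beam interpolation vs. the matrix product) before reaching for full line profiling; in this sketch, `run_single_snapshot` is a placeholder for whatever function drives one time/frequency step of your simulation:

```python
# Function-level profiling of one simulation step.
# `run_single_snapshot` is a hypothetical stand-in for your own driver code.
import cProfile
import pstats


def run_single_snapshot():
    ...  # call matvis / hera_sim here for a single time and frequency


profiler = cProfile.Profile()
profiler.runcall(run_single_snapshot)
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)  # top 20 by cumulative time
```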
@rlbyrne I also want to suggest using the wrapper in […]
The simulation has 1 time, 1 frequency, 62,128 baselines, and I would estimate 1,572,864 "sources" (pixels in the hemisphere for a Healpix map with nside 512). I do suspect the beam size has some impact. I was previously using a lower resolution beam and things were running faster, although I didn't actually profile the speed. I can try downsampling the beam and hope it doesn't have much impact on the result. Does matvis support any other beam formats that could run faster? Like a Gaussian decomposition? I haven't worked with […]
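For what it's worth, the quoted source count follows directly from the map resolution, and a short healpy calculation (assuming a standard HEALPix sky model) also shows what downsampling would buy:

```python
# Pixel counts for the sky model, assuming a standard HEALPix map.
import healpy as hp

npix_full = hp.nside2npix(512)                              # 3,145,728 pixels on the full sphere
print("Hemisphere pixels at nside 512:", npix_full // 2)    # ~1,572,864 "sources"
print("Hemisphere pixels at nside 256:", hp.nside2npix(256) // 2)  # 4x fewer

# Downsampling the sky map itself could look like this (sky_map is your HEALPix array):
# degraded = hp.ud_grade(sky_map, nside_out=256)
```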
Hmm, the GPU should speed things up with that number of sources and baselines based on my testing. What kind of beam model are you using? Is it an e-field CST beam in a UVBeam file? You can try to specify the order of beam spatial interpolation through the […] The interface in […]
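If it helps to answer those questions, a few pyuvdata attributes expose the beam type, coordinate grid, and resolution; in this sketch `beam.fits` is just a placeholder filename:

```python
# Inspect a beam file: type, coordinate system, and grid size.
from pyuvdata import UVBeam

beam = UVBeam()
beam.read_beamfits("beam.fits")  # "beam.fits" is a placeholder path

print("Beam type:         ", beam.beam_type)                # 'efield' or 'power'
print("Coordinate system: ", beam.pixel_coordinate_system)  # e.g. 'az_za' or 'healpix'
if beam.pixel_coordinate_system == "az_za":
    print("Grid size (az x za):", beam.Naxes1, "x", beam.Naxes2)
else:
    print("HEALPix nside:      ", beam.nside)
```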
Sorry to be late on this again -- I was on vacation. I agree with @piyanatk that for the simulation size you're using, there should be significant speed up with GPU, so I'm not quite sure what is going on here. Is your beam on a rectilinear alt/az grid or in healpix? I am assuming the former because currently the GPU version of matvis can't handle the latter. Also, if you're using the same beam for each antenna, make sure you specify only one unique beam in the beam list (and use the same beam index for each antenna). For this size of simulation with a single unique beam, I wouldn't expect the beam interpolation to be the bottleneck, but rather the single big matrix multiplication. This should be much faster on the GPU no matter how you slice it.
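One way to separate an environment problem from anything matvis-specific is to time a plain dense matrix product of comparable size on both devices; CuPy is used here only as an independent check and is not part of matvis:

```python
# Rough CPU-vs-GPU timing of a dense matrix product, independent of matvis.
import time

import numpy as np
import cupy as cp

n = 4096
a = np.random.standard_normal((n, n)).astype(np.float32)
b = np.random.standard_normal((n, n)).astype(np.float32)

t0 = time.perf_counter()
np.matmul(a, b)
print(f"NumPy (CPU): {time.perf_counter() - t0:.3f} s")

a_gpu, b_gpu = cp.asarray(a), cp.asarray(b)
cp.matmul(a_gpu, b_gpu)            # warm-up: kernel compilation and memory transfer
cp.cuda.Device().synchronize()

t0 = time.perf_counter()
cp.matmul(a_gpu, b_gpu)
cp.cuda.Device().synchronize()     # wait for the GPU before stopping the clock
print(f"CuPy (GPU):  {time.perf_counter() - t0:.3f} s")
```

If the GPU does not win this comparison by a wide margin, the problem is more likely in the environment or driver setup than in the simulation configuration.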
Attempting to run matvis with GPUs using CUDA 12 produces this error:

[…]

I've attempted a workaround by installing CUDA 11 with conda, but it hasn't worked. The installations I performed were:
```
conda install -c conda-forge cudatoolkit=11
conda install -c conda-forge pycuda
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
```
The resulting error traceback is:

[…]
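As a side note for anyone reproducing the workaround above, it's worth confirming that the PyTorch install really resolved to a CUDA 11 build (the PyCUDA side can be checked with the version sketch earlier in the thread); nothing here is specific to matvis:

```python
# Confirm the CUDA version the installed PyTorch build was compiled against.
import torch

print("PyTorch CUDA build:", torch.version.cuda)   # expect something like '11.8'
print("GPU available:     ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device 0:          ", torch.cuda.get_device_name(0))
```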