Function outlining errors in mirgecom #224

Open
MTCam opened this issue Apr 20, 2023 · 0 comments

Trying out the outlining capability from #221 with mirgecom.

To reproduce, install mirgecom on the outlining branch:
emirge/install.sh --branch=production-outlining

Then run any example that uses the CNS (compressible Navier-Stokes) operator, from the examples directory:
python -m mpi4py combozzle-mpi.py

The error is as follows:

frozen_inv_metric_deriv_vol: check array access within bounds: started 3s ago
frozen_inv_metric_deriv_vol: check array access within bounds: completed (3.94s wall 1.00x CPU)
frozen_inv_metric_deriv_vol: generate code: completed (2.18s wall 1.00x CPU)
build program: kernel 'frozen_inv_metric_deriv_vol' was part of a lengthy source build resulting from a binary cache miss (2.60 s)
/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/pyopencl/invoker.py:366: UserWarning: Kernel 'frozen_inv_metric_deriv_vol_0' has 468 arguments with a total size of 3744 bytes, which approaches the limit of 4352 bytes on <pyopencl.Device 'Tesla V100-SXM2-16GB' on 'Portable Computing Language' at 0x101f47d38>. This might lead to compilation errors, especially on GPU devices.
  warn(f"Kernel '{function_name}' has {num_cl_args} arguments with "
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/mpi4py/__main__.py", line 7, in <module>
    main()
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/mpi4py/run.py", line 198, in main
    run_command_line(args)
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/miniforge3/envs/prediction.env/lib/python3.11/site-packages/mpi4py/run.py", line 47, in run_command_line
    run_path(sys.argv[0], run_name='__main__')
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "combozzle-mpi.py", line 1309, in <module>
    main(use_logmgr=args.log, use_leap=args.leap, input_file=input_file,
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/mpi.py", line 200, in wrapped_func
    func(*args, **kwargs)
  File "combozzle-mpi.py", line 1211, in main
    advance_state(rhs=my_rhs, timestepper=timestepper,
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/steppers.py", line 439, in advance_state
    _advance_state_stepper_func(
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/steppers.py", line 164, in _advance_state_stepper_func
    state = timestepper(state=state, t=t, dt=dt, rhs=maybe_compiled_rhs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/integrators/lsrk.py", line 66, in euler_step
    return lsrk_step(EulerCoefs, state, t, dt, rhs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/integrators/lsrk.py", line 53, in lsrk_step
    k = coefs.A[i]*k + dt*rhs(t + coefs.C[i]*dt, state)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/arraycontext/arraycontext/impl/pytato/compile.py", line 316, in __call__
    output_template = self.f(
                      ^^^^^^^
  File "combozzle-mpi.py", line 1154, in cfd_rhs
    ns_operator(
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/mirgecom/mirgecom/navierstokes.py", line 462, in ns_operator
    grad_cv = get_grad_cv(state)
              ^^^^^^^^^^^^^^^^^^
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/arraycontext/arraycontext/impl/pytato/outline.py", line 159, in __call__
    call_site_output = func_def(**call_parameters)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/gpfs1/mtcampbe/CEESD/AutomatedTesting/MIRGE-Timing/timing/y3-prediction-testing/emirge/pytato/pytato/function.py", line 172, in __call__
    if expected_arg.dtype != kwargs[argname].dtype:
                             ~~~~~~^^^^^^^^^
KeyError: '_actx_in_1_0_mass_0'
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

As an initial guess, it looks like the outlining infrastructure picks the array containers apart into their component DOFArrays, but then expects the function arguments to be those DOFArrays rather than the containers themselves?
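
If that guess is right, the failure mode would look something like the sketch below: the call site flattens the container into leaf arrays under generated names, while the traced function definition expects leaf names generated elsewhere (e.g. `_actx_in_1_0_mass_0`), so the lookup in pytato's dtype check raises a KeyError on the missing name. This is a minimal, self-contained illustration in plain Python/numpy, not the actual arraycontext/pytato code; all helper names and the naming scheme here are hypothetical.

```python
# Minimal sketch (assumptions, not the real outlining code) of a name mismatch
# between flattened call parameters and traced function parameters.
import numpy as np


def flatten_container(prefix, container):
    """Flatten a nested dict-of-arrays into leaf arrays with generated names,
    loosely mimicking '_actx_in_1_0_mass_0'-style naming."""
    leaves = {}
    for key, value in container.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            leaves.update(flatten_container(name, value))
        else:
            leaves[f"{name}_0"] = value
    return leaves


class FunctionDefinition:
    """Stand-in for a traced function definition: remembers the parameter
    names/dtypes it was traced with and checks them at call time."""

    def __init__(self, expected_params):
        self.expected_params = expected_params  # name -> dtype

    def __call__(self, **kwargs):
        for argname, expected_dtype in self.expected_params.items():
            # Mirrors the failing check: kwargs must contain *exactly*
            # the traced parameter names, else this lookup raises KeyError.
            if expected_dtype != kwargs[argname].dtype:
                raise ValueError(f"dtype mismatch for {argname}")
        return "ok"


# Traced (hypothetically) against leaf names generated one way...
func_def = FunctionDefinition({"_actx_in_1_0_mass_0": np.dtype("float64")})

# ...while the call site flattens the container under a different scheme,
# producing keys like '_actx_in_0_cv_mass_0'.
state = {"cv": {"mass": np.zeros(4)}}
call_parameters = flatten_container("_actx_in_0", state)

func_def(**call_parameters)  # -> KeyError: '_actx_in_1_0_mass_0'
```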
