obs-nvenc: Fix incorrect CUDA array size allocation #11924

Open
wants to merge 2 commits into
base: master

@94Bo 94Bo commented Mar 5, 2025

Description

Fix incorrect CUDA array size allocation.

Motivation and Context

With the P010 format, the obs-nvenc module allocated a CUDA array twice the required size for use as a shared texture.

The original code snippet:

desc.Format = CU_AD_FORMAT_UNSIGNED_INT16;
desc.Height += enc->cy / 2;
desc.NumChannels = 2; // number of bytes per element

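According to the cuArray3DCreate documentation, the size of an element is the per-channel size of Format multiplied by NumChannels.
With CU_AD_FORMAT_UNSIGNED_INT16 (2 bytes per channel) and NumChannels = 2, the previous descriptor requested 4 bytes per element, which was incorrect: P010 stores each sample in 2 bytes.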
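
The arithmetic behind that statement can be sketched in plain C. The enum and helper names below are hypothetical stand-ins, not part of the CUDA driver API or of obs-nvenc; the only rule assumed is that one array element occupies the per-channel size of the format multiplied by NumChannels, and driver-side padding is ignored.

```c
#include <stddef.h>

/* Hypothetical stand-ins for CU_AD_FORMAT_UNSIGNED_INT8 / _INT16,
 * so this sketch compiles without cuda.h. */
enum sketch_format { SKETCH_UINT8, SKETCH_UINT16 };

/* Size in bytes of a single channel of the given format. */
static size_t channel_bytes(enum sketch_format fmt)
{
	return fmt == SKETCH_UINT16 ? 2 : 1;
}

/* Size in bytes of one array element: channel size times NumChannels. */
static size_t element_bytes(enum sketch_format fmt, unsigned num_channels)
{
	return channel_bytes(fmt) * num_channels;
}

/* Total bytes for a cx-by-cy frame plus the half-height UV plane,
 * mirroring desc.Height += enc->cy / 2 (padding not modelled). */
static size_t frame_bytes(size_t cx, size_t cy, enum sketch_format fmt,
			  unsigned num_channels)
{
	return cx * (cy + cy / 2) * element_bytes(fmt, num_channels);
}
```

For a 1920x1080 frame, frame_bytes(1920, 1080, SKETCH_UINT16, 2) comes out at exactly twice frame_bytes(1920, 1080, SKETCH_UINT16, 1), which is the doubling this PR removes.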

How Has This Been Tested?

The changes were tested on Ubuntu 24.10 with an RTX 3060 Ti GPU.

To force obs-nvenc to use shared textures, as if it were running on a different GPU,
nvenc_create_base was called with texture=false in the hevc_nvenc_create function within nvenc.c.

The frontend configuration used the P010 color format with the NVENC HEVC encoder.

The following recording scenarios were tested:

  1. CU_AD_FORMAT_UNSIGNED_INT16 with channels=2 (original): Everything OK
  2. CU_AD_FORMAT_UNSIGNED_INT16 with channels=1: Everything OK
  3. CU_AD_FORMAT_UNSIGNED_INT8 with channels=2: Everything OK
  4. CU_AD_FORMAT_UNSIGNED_INT8 with channels=1: Error occurred.
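
One possible reading of these four results, offered as an interpretation rather than something confirmed in this thread: a P010 row needs 2 bytes per sample, so a row of the array is oversized in scenario 1, exactly sized in scenarios 2 and 3, and too small in scenario 4. The helper names in this sketch are hypothetical and only model row sizes; they do not call into CUDA.

```c
#include <stddef.h>

/* Bytes in one row of an array: width times bytes per channel times channels. */
static size_t row_bytes(size_t width, size_t bytes_per_channel,
			unsigned num_channels)
{
	return width * bytes_per_channel * num_channels;
}

/* P010 stores each sample in a 16-bit word, so a row needs width * 2 bytes. */
static size_t p010_row_bytes(size_t width)
{
	return width * 2;
}

/* Returns -1 if the row is too small for P010, 0 if it matches exactly,
 * and 1 if it is larger than needed. */
static int compare_to_p010(size_t width, size_t bytes_per_channel,
			   unsigned num_channels)
{
	size_t have = row_bytes(width, bytes_per_channel, num_channels);
	size_t need = p010_row_bytes(width);

	if (have < need)
		return -1;
	return have > need ? 1 : 0;
}
```

Under this model, compare_to_p010(1920, 2, 2) returns 1, the calls for scenarios 2 and 3 return 0, and compare_to_p010(1920, 1, 1) returns -1, matching the one scenario that reported an error.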

Types of changes

Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.
  • I have included updates to all appropriate documentation.

Change cuda array use 2 bytes per element with P010 format.
Member

@derrod derrod left a comment

This seems fine but please fix the commit message.

This fix corrects the CUDA array size allocation with CU_AD_FORMAT_UNSIGNED_INT16 and 1 channel in P010 format.
@94Bo 94Bo requested a review from derrod March 5, 2025 05:11
@WizardCM WizardCM added the Bug Fix Non-breaking change which fixes an issue label Mar 5, 2025