Make AMF encoder utilize all VCNs #11752

Open
wants to merge 1 commit into
base: master

nikita-pletnev

Description

Make the AMF encoder use all VCNs (the hardware encoding units in AMD GPUs) present in the GPU in use; currently only one VCN is used. Reduce the number of D3D devices created by sharing a single device across all AMF encode sessions.
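
One way to picture the "one shared device for all sessions" idea is a lazily created, reference-counted object. The sketch below is only an illustration under that assumption: `SharedGpuContext` and `SharedContextProvider` are hypothetical names standing in for the real AMF context and D3D11 device, and none of this is taken from the PR's code.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for the shared AMF context plus D3D11 device/context.
struct SharedGpuContext {
    explicit SharedGpuContext(int id_) : id(id_) {}
    int id;
};

// Hands the same context to every encode session and recreates it only
// after all sessions have dropped their references.
class SharedContextProvider {
public:
    std::shared_ptr<SharedGpuContext> get()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        std::shared_ptr<SharedGpuContext> ctx = cached_.lock();
        if (!ctx) {
            // No live session holds the context, so create a fresh one.
            ctx = std::make_shared<SharedGpuContext>(++created_);
            cached_ = ctx;
        }
        return ctx;
    }

    // Number of contexts created so far (for illustration only).
    int created() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return created_;
    }

private:
    mutable std::mutex mutex_;
    std::weak_ptr<SharedGpuContext> cached_;
    int created_ = 0;
};
```

Each session would call `get()` once and keep the returned pointer for its lifetime, so the underlying object is torn down only when the last session ends.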

Motivation and Context

When the obs-multi-rtmp plugin is used to run multiple streams at once, this change significantly improves performance, allowing more streams to run simultaneously.

How Has This Been Tested?

Ran OBS on Windows 10 with an AMD Radeon RX 6800 and ran multiple streams with the obs-multi-rtmp plugin.

Types of changes

  • Performance enhancement (non-breaking change which improves efficiency)

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.
  • I have included updates to all appropriate documentation.

plugins/obs-ffmpeg/texture-amf.cpp (three review threads, outdated and resolved)

VCNs are HW units that can run encode jobs in parallel, boosting
performance when several encoders run simultaneously. Added a controller
to balance the load across VCNs. Made the AMF context and the D3D11 device
and context shared between all encoder instances.
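
The load-balancing controller could be sketched roughly as follows. This is a minimal illustration, assuming the controller simply counts active sessions per VCN and assigns each new encoder to the least-loaded one; the PR may weigh load differently. `VcnLoadBalancer` and its methods are hypothetical names, not part of the AMF SDK or of this change.

```cpp
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <vector>

// Hypothetical controller: tracks active encode sessions per VCN instance
// and hands out the least-loaded instance to each new encoder.
class VcnLoadBalancer {
public:
    explicit VcnLoadBalancer(std::size_t vcn_count) : sessions_(vcn_count, 0)
    {
        if (vcn_count == 0)
            throw std::invalid_argument("at least one VCN is required");
    }

    // Pick the VCN with the fewest active sessions (lowest index wins ties).
    std::size_t acquire()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t best = 0;
        for (std::size_t i = 1; i < sessions_.size(); ++i) {
            if (sessions_[i] < sessions_[best])
                best = i;
        }
        ++sessions_[best];
        return best;
    }

    // Give the slot back when an encoder is destroyed.
    void release(std::size_t vcn)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (vcn >= sessions_.size() || sessions_[vcn] == 0)
            throw std::out_of_range("release without matching acquire");
        --sessions_[vcn];
    }

    // Current number of sessions assigned to one VCN.
    std::size_t active(std::size_t vcn) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return sessions_.at(vcn);
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::size_t> sessions_;
};
```

With two VCNs, three consecutive `acquire()` calls would land on instances 0, 1, and 0, so parallel streams are spread across both units instead of queuing on one.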

Using a KeyedMutex to synchronize the copy from the OBS core texture to
the texture allocated and used by the encoder caused many HW fences to be
submitted to all HW queues. Parallel encode queues then waited for other
queues' jobs to finish, which degraded performance significantly.
To work around this, the encoder's texture is created as shared; it is
opened, and the copy is done, on the device used in OBS core, with
synchronization on the CPU.
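
The CPU-side handoff can be pictured with ordinary thread primitives. The sketch below is an assumption-laden illustration only: it replaces the shared D3D11 texture with a byte buffer and uses a mutex and condition variable to show the ordering (copy first, then signal, then encode). `SharedFrameSlot` is a hypothetical name and does not reflect how the PR actually synchronizes.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the encoder-owned shared texture.
class SharedFrameSlot {
public:
    // Producer side (the device used in OBS core): copy the frame into the
    // shared slot, then signal on the CPU that the copy has finished.
    void publish(const std::vector<std::uint8_t> &frame)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pixels_ = frame;
            ready_ = true;
        }
        cv_.notify_one();
    }

    // Consumer side (the encoder): wait on the CPU until a frame is ready,
    // then take it. No GPU-side fence is involved in this model.
    std::vector<std::uint8_t> consume()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return ready_; });
        ready_ = false;
        return std::move(pixels_);
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::vector<std::uint8_t> pixels_;
    bool ready_ = false;
};
```

The point of the design is that the wait happens on a CPU thread of one encoder only, rather than as a fence visible to every hardware queue.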
@nikita-pletnev
Author

Addressed the comments above, squashed the commits, and rebased onto the latest master.
