prov/efa: Fix error handling in efa_rdm_cq_poll_ibv_cq #9973

Merged 1 commit on Apr 3, 2024


shijin-aws
Contributor

When ibv_start_poll or ibv_next_poll returns an error, it may have read a cqe from a destroyed EP (QP). In this case we shouldn't call efa_rdm_cq_get_prov_errno, which may use a freed EP obtained from the pkt_entry (wr_id) and cause a segmentation fault.


Signed-off-by: Shi Jin <[email protected]>
@shijin-aws shijin-aws requested a review from a team April 2, 2024 23:02
@darrylabbate
Member

Is it possible to capture this in a condition inside efa_rdm_cq_get_prov_errno() instead?

@shijin-aws
Contributor Author

shijin-aws commented Apr 2, 2024

> Is it possible to capture this in a condition inside efa_rdm_cq_get_prov_errno() instead?

Only if we pass the return code (say err) of ibv_start_poll as an input to efa_rdm_cq_get_prov_errno(). When err is EINVAL, the polled pkt entry is from a destroyed QP.

There is no good way to tell that from pkt_entry (wr_id) itself.

So I prefer to just call ibv_wc_read_vendor_err directly and avoid this trouble.
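
A minimal sketch of the idea, not the actual patch: it uses mock types instead of the real libfabric and rdma-core structures, and every `mock_*` name is hypothetical. It assumes that the vendor error can be read from the CQ alone, while the EP-aware lookup has to dereference the EP through the pkt_entry (wr_id). When the poll call itself fails, only the CQ is consulted.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the libfabric or rdma-core definitions. */
struct mock_ep {
    int prov_errno;               /* result of an EP-specific error lookup */
};

struct mock_pkt_entry {
    struct mock_ep *ep;           /* dangling if the EP (QP) was destroyed */
};

struct mock_cq {
    uint32_t vendor_err;          /* error value the CQ itself reports */
    struct mock_pkt_entry *wr_id; /* packet entry of the current cqe */
};

/* Mock of reading the vendor error from the CQ; never touches wr_id. */
static uint32_t mock_read_vendor_err(const struct mock_cq *cq)
{
    return cq->vendor_err;
}

/* Mock of an EP-aware lookup; dereferences pkt_entry->ep. */
static int mock_get_prov_errno(const struct mock_cq *cq)
{
    return cq->wr_id->ep->prov_errno;
}

/*
 * poll_ret is the return code of the (mocked) start/next poll call.
 * Returns the provider errno to report, or 0 when there is nothing
 * to report.
 */
static int mock_prov_errno_after_poll(const struct mock_cq *cq, int poll_ret)
{
    if (poll_ret == ENOENT)
        return 0;                 /* CQ is empty, not an error */

    if (poll_ret != 0) {
        /* The poll call failed: wr_id may reference a freed EP,
         * so read the error from the CQ only. */
        return (int)mock_read_vendor_err(cq);
    }

    /* The poll call succeeded: pkt_entry and its EP are assumed valid. */
    return mock_get_prov_errno(cq);
}
```

With this shape, a failed poll never follows the wr_id pointer, so a stale pkt_entry cannot be dereferenced on that path.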

@shijin-aws
Contributor Author

bot:aws:retest

@shijin-aws shijin-aws merged commit ce244a7 into ofiwg:main Apr 3, 2024
12 of 14 checks passed