prov/efa: Fix bugs when reposting queued read and long cts pkts
When efa_rdm_ope_post_read returns an error, we should write the
txe/rxe error and then continue with the next queued ope instead of
returning.

When posting a ctsdata pkt returns FI_EAGAIN, we should continue
instead of break, because the queued opes may come from different eps.

Signed-off-by: Shi Jin <[email protected]>
shijin-aws committed Jun 27, 2024
1 parent e72f856 commit 71e0ef3
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions prov/efa/src/efa_domain.c
@@ -566,7 +566,7 @@ void efa_domain_progress_rdm_peers_and_queues(struct efa_domain *domain)
 				efa_rdm_txe_handle_error(ope, -ret, FI_EFA_ERR_READ_POST);
 			else
 				efa_rdm_rxe_handle_error(ope, -ret, FI_EFA_ERR_READ_POST);
-			return;
+			continue;
 		}

 		ope->internal_flags &= ~EFA_RDM_OPE_QUEUED_READ;
@@ -611,7 +611,7 @@ void efa_domain_progress_rdm_peers_and_queues(struct efa_domain *domain)
 		ret = efa_rdm_ope_post_send(ope, EFA_RDM_CTSDATA_PKT);
 		if (OFI_UNLIKELY(ret)) {
 			if (ret == -FI_EAGAIN)
-				break;
+				continue;

 			efa_rdm_txe_handle_error(ope, -ret, FI_EFA_ERR_PKT_POST);
 			continue;
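
The commit message's reasoning about FI_EAGAIN can be illustrated with a small standalone sketch. It is not libfabric code: `toy_ep`, `toy_ope`, `toy_post`, and `toy_progress` are hypothetical stand-ins, and the per-endpoint "credits" counter is only an assumed way to make a post fail with `-EAGAIN`. The point it shows is that when one queue holds ops from several endpoints, breaking on the first `-EAGAIN` also stalls ops whose endpoint could still accept them, while `continue` lets those make progress.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libfabric types. */
struct toy_ep {
	int credits;	/* number of posts this endpoint can still accept */
};

struct toy_ope {
	struct toy_ep *ep;	/* endpoint that owns this operation */
	int posted;		/* set to 1 once the post succeeded */
};

/* Pretend post: returns -EAGAIN when the op's endpoint has no credits left. */
static int toy_post(struct toy_ope *ope)
{
	if (ope->ep->credits == 0)
		return -EAGAIN;
	ope->ep->credits--;
	ope->posted = 1;
	return 0;
}

/*
 * Walk a queue shared by several endpoints and return how many ops
 * were posted. stop_on_eagain != 0 models the old behaviour (break),
 * stop_on_eagain == 0 models the new behaviour (continue).
 */
static size_t toy_progress(struct toy_ope *queue, size_t n, int stop_on_eagain)
{
	size_t i, posted = 0;

	for (i = 0; i < n; i++) {
		int ret = toy_post(&queue[i]);

		if (ret == -EAGAIN) {
			if (stop_on_eagain)
				break;		/* later ops are never tried */
			continue;		/* skip only this op */
		}
		posted++;
	}
	return posted;
}
```

With a three-entry queue whose first op belongs to an endpoint with zero credits and whose other two ops belong to an endpoint with two credits, the `break` variant posts nothing, while the `continue` variant posts the two ops of the second endpoint and leaves the first one queued.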