-
Notifications
You must be signed in to change notification settings - Fork 399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prov/efa: Distinguish max RMA size from max message transfer size #10082
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, I am not sure whether we have to validate the size every time during transmission calls.
I'll take another look at this - could be superfluous |
d756592
to
01fb587
Compare
if (len > efa_rdm_ep->inject_size) { | ||
EFA_WARN(FI_LOG_CQ, "invalid message size %ld for inject.\n", len); | ||
return -FI_EINVAL; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need an assert here right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I omitted these since the documentation for len
in fi_tagged(3) doesn't have the same qualifier re: valid lengths, as in fi_msg(3) and fi_rma(3). This could easily be an oversight since the documentation was written ~10 years ago and never changed. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's clearly an oversight in the documentation (cc @j-xiong ). You can add assert and update the man page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
if (len > efa_rdm_ep->inject_size) { | ||
EFA_WARN(FI_LOG_CQ, "invalid message size %ld for inject.\n", len); | ||
return -FI_EINVAL; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need an assert
prov/efa/src/rdm/efa_rdm_msg.c
Outdated
@@ -978,6 +966,7 @@ ssize_t efa_rdm_msg_recv(struct fid_ep *ep_fid, void *buf, size_t len, | |||
struct efa_rdm_ep *ep; | |||
|
|||
ep = container_of(ep_fid, struct efa_rdm_ep, base_ep.util_ep.ep_fid.fid); | |||
assert(len <= ep->max_msg_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to apply max_msg_size for recv. Though the documentation doesn't say that, I think it's totally fine that user post a bigger recv buffer than the incoming msg size, i.e. the max_msg_size only applies to transmission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise for RMA read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RMA read is still a transmission, so the limit applies.
I think we also need to revert 642fc3a as part of this PR |
d0f5652
to
1282b93
Compare
Unit test failure after (under) reverting 642fc3a. Likely due to other unit tests utilizing the ability to specify @shijin-aws Are we still happy with the revert of 642fc3a? Reading through #9388, it seems like it was not hastily merged. If so, I can probably alter the unit test logic to use |
Ah yes, I think it is caused by these tests https://github.com/ofiwg/libfabric/blob/main/prov/efa/test/efa_unit_test_ep.c#L646-L709. In order to fix it, you need to make it use fi_setopt to set max_msg_size. |
1282b93
to
dbe7731
Compare
New opts: FI_OPT_MAX_MSG_SIZE, FI_OPT_INJECT_MSG_SIZE Signed-off-by: Darryl Abbate <[email protected]>
Signed-off-by: Darryl Abbate <[email protected]>
Signed-off-by: Darryl Abbate <[email protected]>
Signed-off-by: Darryl Abbate <[email protected]>
This reverts commit 642fc3a. Signed-off-by: Darryl Abbate <[email protected]>
dbe7731
to
b9a6431
Compare
This only affects unit tests Signed-off-by: Darryl Abbate <[email protected]>
b9a6431
to
ae802bf
Compare
This also adds getopt/setopt logic for the new supported endpoint options, as well as some validation logic for the
len
arguments to applicable operations.