-
Notifications
You must be signed in to change notification settings - Fork 398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prov/efa: Add new efa-direct fi_info objects #10735
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My high level comment is that I don't think efa direct should be a separate ep type. I understand you want to use this to simplify your branching. But the branching should be on prov_name instead of ep type. I thought our plan is to make efa direct cover both dgram and rdm ep type. But that will break existing dgram user which use efa as prov_name. I don't know if this user actually exists, but to avoid the risk I am ok to make dgram ep type stay in efa
and make efa-direct
only cover rdm
.
2437ba1
to
470cd6d
Compare
prov/efa/src/efa_base_ep.h
Outdated
#define EFA_INFO_TYPE_IS_RDM(_info) \ | ||
(_info && _info->ep_attr && (_info->ep_attr->type == FI_EP_RDM) && !strcasecmp(_info->fabric_attr->name, EFA_PROV_NAME)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the linear string comparisons necessary? These are already pretty gnarly chained conditionals.
If the fi_info.fabric_attr.name
is truly the only way to disambiguate efa-rdm from efa-direct (I.e. we can't add some other enum field elsewhere), I'd like to explore employing constant-time comparison schemes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is strncmp
with the length of the longer string good enough? These macros are only used in the fi_getinfo
and fi_domain
calls, so they are not performance critical.
470cd6d
to
7e90a94
Compare
41129e0
to
0d73582
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me, thank you! Only a few nits.
AWS CI failure seems to still to be some tests are not using |
0d73582
to
c76cca8
Compare
I still hit bad qpn error when I only test rdm_pingpong with 1K... which means there is something wrong. Need to address this before can merge the PR
|
I would carefully review |
c76cca8
to
76c9ecf
Compare
46210fe
to
300556d
Compare
prov/efa/src/efa_prov_info.c
Outdated
} | ||
|
||
/* efa-direct path requires FI_CONTEXT2 */ | ||
prov_info->tx_attr->mode |= FI_CONTEXT2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both dgram and direct requires FI_CONTEXT2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
300556d
to
1226f67
Compare
prov/efa/src/efa_prov_info.c
Outdated
@@ -251,15 +217,29 @@ void efa_prov_info_set_tx_rx_attr(struct fi_info *prov_info, | |||
struct efa_device *device, | |||
enum fi_ep_type ep_type) | |||
{ | |||
/* efa-direct and DGRAM paths require FI_CONTEXT2 */ | |||
prov_info->tx_attr->mode |= FI_CONTEXT2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mode will be washed by the change in L225....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to move to the place after the if
block
efa-direct is a new path that provides direct access to the EFA device with minimal overhead and features. It does not support SAS ordering, tagged messaging, atomics. It does not have copy based protocols and cannot support sending/receiving large message sizes and HMEM devices that don't have p2p support. The functionality provided by the efa-direct path is a subset of the efa rdm path. When applications do not require capabilities only supported by the efa rdm path, it is more performant for them to use the efa-direct path. Accordingly, the efa-direct fi_info objects are returned before the efa rdm fi_info objects in the fi_getinfo call. Signed-off-by: Sai Sunku <[email protected]>
Previously, device->rdm_info->ep_attr->max_msg_size would always be the max message size for FI_MSG operations (corresponding to the device MTU size). With efa direct and related changes, max_msg_size can be the maximum RMA size supported by the device which is 1GB. So use device->ibv_port_attr.max_msg_sz instead when MTU size is required. Signed-off-by: Sai Sunku <[email protected]>
Remove temporary hacks added previously to test the efa-direct code path and add new tests Signed-off-by: Sai Sunku <[email protected]>
1226f67
to
ea02120
Compare
bot:aws:retest |
return; | ||
} | ||
} | ||
enable_hmem = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this variable enable_hmem
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. To catch the case when HAVE_CUDA
is defined but ofi_hmem_is_initialized(FI_HMEM_CUDA)
is false. That happens when you compile with CUDA but run without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems better to do an early return forif !ofi_hmem_is_initialized
, then you do not need this enable_hmem.
return; | ||
} | ||
|
||
EFA_HMEM_IFACE_FOREACH(i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check for system here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No but FI_HMEM_SYSTEM will never fail the check below, so I don't think it's worth skipping FI_HMEM_SYSTEM explicitly
libfabric/prov/efa/src/efa_hmem.c
Lines 265 to 268 in c3420a7
info->initialized = true; | |
if (iface == FI_HMEM_SYNAPSEAI || iface == FI_HMEM_SYSTEM) { | |
info->p2p_supported_by_device = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a EFA_HMEM_IFACE_FOREACH_NON_SYSTEM
bot:aws:retest |
efa-direct is a new provider that provides direct access to the EFA
device with minimal overhead and features. It does not support SAS
ordering, tagged messaging, atomics. It does not have copy based
protocols and cannot support sending/receiving large message sizes
and HMEM devices that don't have p2p support.
The functionality provided by the efa-direct provider is a subset of the
efa rdm provider. When applications do not require capabilities only
supported by the efa rdm provider, it is more performant for them to use
the efa-direct provider. Accordingly, the efa-direct fi_info objects are
returned before the efa rdm fi_info objects in the fi_getinfo call.