[iGPU] The device does not have the ext_intel_free_memory aspect #1352
# Motivation Friendly handle the runtime error message if the device doesn't support querying the available free memory. See intel/torch-xpu-ops#1352 Pull Request resolved: pytorch#146899 Approved by: https://github.com/EikanWang
More details for future reproduction (compile with `icpx demo.cpp -o demo.exe -fsycl`):
```cpp
#include <sycl/sycl.hpp>
#include <cstdint>
#include <iostream>
#include <memory>
#include <vector>

namespace {

struct DevicePool {
  std::vector<std::unique_ptr<sycl::device>> devices;
  std::unique_ptr<sycl::context> context;
} gDevicePool;

void enumDevices(std::vector<std::unique_ptr<sycl::device>>& devices) {
  for (const auto& platform : sycl::platform::get_platforms()) {
    if (platform.get_backend() != sycl::backend::ext_oneapi_level_zero) {
      continue;
    }
    for (const auto& device : platform.get_devices()) {
      if (device.is_gpu()) {
        devices.push_back(std::make_unique<sycl::device>(device));
      }
    }
    break;
  }
}

inline void initGlobalDevicePoolState() {
  // Enumerate all GPU devices and record them.
  enumDevices(gDevicePool.devices);
  if (gDevicePool.devices.empty()) {
    return;
  }
  gDevicePool.context = std::make_unique<sycl::context>(
      gDevicePool.devices[0]->get_platform().ext_oneapi_get_default_context());
}

} // anonymous namespace

sycl::device& get_raw_device(int device) {
  return *gDevicePool.devices[device];
}

sycl::context& get_device_context() {
  return *gDevicePool.context;
}

int device_count() {
  return static_cast<int>(gDevicePool.devices.size());
}

int main() {
  initGlobalDevicePoolState();
  const auto count = device_count();
  std::cout << "device count is " << count << std::endl;
  if (count <= 0) {
    return 0;
  }
  for (auto i = 0; i < count; i++) {
    auto& device = get_raw_device(i);
    std::cout << i << "th device name is " << device.get_info<sycl::info::device::name>()
              << ", total memory is "
              << device.get_info<sycl::info::device::global_mem_size>() / 1024. / 1024. / 1024.
              << " GB.";
    if (device.has(sycl::aspect::ext_intel_free_memory)) {
      std::cout << " free device memory is "
                << device.get_info<sycl::ext::intel::info::device::free_memory>() / 1024. / 1024. / 1024.
                << " GB.";
    } else {
      // This happens on LNL, which lacks the sycl::aspect::ext_intel_free_memory aspect.
      std::cout << " ERROR: free device memory is not available.";
    }
    std::cout << std::endl;
  }
  std::cout << "finish!" << std::endl;
}
```

The output is below; you can see that the device does not support `sycl::aspect::ext_intel_free_memory`:
Running with UR_TRACE enabled produces the following output:
The root cause is that the driver does not support this query yet. See GSD-10758 for the internal tracking.
This should not be a blocking issue: PyTorch already emits a warning when the free-memory query is unavailable, so the user's overall experience is not affected.
Additional reference: intel/compute-runtime#742
🐛 Describe the bug
This issue only happens on iGPUs on Windows; it passes on BMG. The error message is below:
Versions
PyTorch version: 2.7.0.dev20250209+xpu
XPU used to build PyTorch: 20250000
Is XPU available: True
Intel GPU driver version:
Intel GPU models onboard:
Intel GPU models detected: