Enable reading binary files on systems that did not produce them by addressing platform-specific issues in the current serialization process. Currently, a number of issues (detailed below) prevent us from consuming these .bin files within the visualizer. To move forward we'll need to implement a solution in the source repository that generates these files.
Device-Specific Storage
The current serialization system for DeviceStorage and MultiDeviceStorage depends on device-specific memory (e.g., GPU, FPGA). This is reflected in the following part of the code:
TT_THROW("Device storage isn't supported");
Serialization skips device-specific tensors entirely, meaning that device-related data cannot be deserialized on systems without the same hardware.
Memory Configuration
The MemoryConfig serialized data (e.g., TensorMemoryLayout, BufferType) assumes that the target system can reconstruct the memory architecture when the memory configuration is read back from the file.
This layout may depend on the original system’s memory architecture, leading to potential deserialization issues on different systems.
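One way to decouple the reader from the writer's memory architecture is a self-describing encoding that records symbolic enum names rather than raw enum values or in-memory struct layouts. A minimal sketch of the idea, with hypothetical field names that are not the actual TTNN schema:

```python
import json

# Hypothetical portable encoding of a memory configuration: symbolic
# names survive enum renumbering and differing struct layouts between
# the writing and reading hosts.
memory_config = {
    "memory_layout": "TensorMemoryLayout::INTERLEAVED",
    "buffer_type": "BufferType::DRAM",
}

encoded = json.dumps(memory_config)
decoded = json.loads(encoded)
assert decoded["buffer_type"] == "BufferType::DRAM"
```

Any self-describing container (JSON, flatbuffers, msgpack) works equally well here; the point is that the reader never reinterprets raw bytes whose meaning depends on the writer's platform.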
DistributedTensorConfig
The DistributedTensorConfig specifies how tensors are distributed across multiple devices; the code handles this in the multi-device storage logic.
This configuration is system-dependent and cannot be easily reproduced on another system without a similar device setup.
Device-Dependent Code Paths
The MeshDevice configuration depends on the system's multi-device setup. For instance:
if (device != nullptr) {
tensor = tensor.to(device, memory_config);
}
This code assumes the presence of specific devices (e.g., MeshDevice), making it impossible to deserialize and map tensor data properly if such devices are absent.
Data Types
The code supports several data types, including hardware-specific ones like BFLOAT16. These types may not be available on all systems.
System dependency arises here, as some platforms may lack support for certain data types, leading to deserialization errors.
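BFLOAT16 in particular is simply the upper 16 bits of an IEEE-754 float32, so a reader without hardware support can still widen it in software. A pure-Python sketch of that conversion (the function name is ours, not a TTNN API):

```python
import struct

def bfloat16_bits_to_float32(bits: int) -> float:
    """Widen a bfloat16 bit pattern to float32 by appending 16 zero bits.

    bfloat16 shares float32's sign and exponent layout, so this is exact.
    """
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

# 0x3F80 is the bfloat16 encoding of 1.0; 0xC000 encodes -2.0.
assert bfloat16_bits_to_float32(0x3F80) == 1.0
assert bfloat16_bits_to_float32(0xC000) == -2.0
```

A host-independent format could store such values either pre-widened to float32 or as raw 16-bit patterns plus a dtype tag, letting the reader apply this widening itself.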
Tensor Layout
The code serializes tensor layouts (e.g., ROW_MAJOR, TILE), which may be optimized for certain hardware architectures, and this layout is read and written as part of the tensor data.
If the target system has a different memory architecture, it may not be able to reconstruct the tensor layout correctly.
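The TILE layout stores data tile by tile rather than row by row, but a reader on a different host can recover ROW_MAJOR order purely in software. A toy untilize sketch using 2x2 tiles (the real tile dimensions and traversal order are assumptions for illustration):

```python
def untilize(flat, rows, cols, tile_h, tile_w):
    """Convert tile-ordered data back to row-major order.

    Assumes tiles are stored left-to-right, top-to-bottom, and that
    each tile's elements are themselves row-major.
    """
    out = [0] * (rows * cols)
    i = 0
    for tr in range(0, rows, tile_h):        # tile row origin
        for tc in range(0, cols, tile_w):    # tile column origin
            for r in range(tile_h):
                for c in range(tile_w):
                    out[(tr + r) * cols + (tc + c)] = flat[i]
                    i += 1
    return out

# A 4x4 matrix holding 0..15, stored as four 2x2 tiles.
tiled = [0, 1, 4, 5,  2, 3, 6, 7,  8, 9, 12, 13,  10, 11, 14, 15]
assert untilize(tiled, 4, 4, 2, 2) == list(range(16))
```

With the tile shape recorded in the file, this kind of conversion removes the need for the reading host to share the writer's memory architecture.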
Device Context During Deserialization
Deserialization relies on device context: when a device is available, the deserialized tensor is transferred onto it with its memory configuration:
tensor = tensor.to(device, memory_config);
Without the necessary devices, this part of the code cannot function properly, leading to deserialization failures on systems without similar hardware.
Version-Specific Serialization
The code includes version checks to ensure compatibility between different serialization versions:
if (version_id >= 2) {
input_stream.read(reinterpret_cast<char*>(&has_memory_config), sizeof(bool));
}
Mismatched versions between the writing and reading systems could result in failed or incorrect deserialization.
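A defensive reader can at least fail loudly on a version it does not understand instead of silently misreading the stream. A sketch of that pattern in Python (field widths and the constant are hypothetical, not the actual .bin header layout):

```python
import io
import struct

SUPPORTED_VERSION = 2  # hypothetical: highest version this reader knows

def read_header(stream):
    """Read a version id and, for version >= 2, a has_memory_config flag."""
    (version_id,) = struct.unpack("<Q", stream.read(8))
    if version_id > SUPPORTED_VERSION:
        raise ValueError(f"Unsupported serialization version {version_id}")
    has_memory_config = False
    if version_id >= 2:
        has_memory_config = bool(stream.read(1)[0])
    return version_id, has_memory_config

# A version-2 header with the memory-config flag set.
buf = io.BytesIO(struct.pack("<Q", 2) + b"\x01")
assert read_header(buf) == (2, True)
```

The key point is that each version's field set is pinned down explicitly, so a reader either parses a file correctly or rejects it, never interprets bytes under the wrong schema.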
Custom Buffers and Memory Management
The custom buffer types OwnedBuffer and BorrowedBuffer manage memory during serialization, and their buffer sizes are system-dependent.
These custom buffers may not translate well across systems with different memory architectures, leading to issues during deserialization.
Endianness and Platform-Specific Binary Formats
The binary format relies on system-specific properties like endianness, which are not handled explicitly in the current code.
This could cause byte-swapping issues when reading binary files on systems with different endianness.
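A minimal illustration of the hazard, using Python's struct module: the same four bytes decode to different values depending on the byte order assumed, which is what happens when multi-byte fields are written with the writer's native order and read back naively.

```python
import struct

# A 32-bit length field as written natively on a little-endian machine.
raw = struct.pack("<I", 1024)

# Reading with an explicit little-endian format is correct on any host.
assert struct.unpack("<I", raw)[0] == 1024

# Reading the same bytes as big-endian (a naive reinterpret on a
# big-endian host) yields a completely different value.
assert struct.unpack(">I", raw)[0] == 262144
```

Fixing a single byte order in the format specification (and converting explicitly on read/write) removes this class of failure.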
Proposal
Write All Tensors in a Host-Independent Format
Given that we cannot read the .bin file on a different host system, we need to store the tensor in a host-independent format. Currently, the database logic has a conditional that writes the tensor either in the custom TTNN tensor format (producing a .bin file) or via PyTorch's save method.
Unfortunately, simply saving the tensors as .pt is not enough to allow reading them on a different host. The tensors also need to be detached and moved to host-independent memory via the cpu method.
def store_tensor(report_path, tensor):
    import torch

    DETACH_SAVED_TENSORS = True  # TODO Read from a configuration

    tensors_path = report_path / TENSORS_PATH
    tensors_path.mkdir(parents=True, exist_ok=True)
    if isinstance(tensor, ttnn.Tensor):
        if DETACH_SAVED_TENSORS:
            tensor_file_name = tensors_path / f"{tensor.tensor_id}.pt"
        else:
            tensor_file_name = tensors_path / f"{tensor.tensor_id}.bin"
        if tensor_file_name.exists():
            return
        if DETACH_SAVED_TENSORS:
            torch_tensor = ttnn.to_torch(tensor)
            torch_tensor = torch_tensor.detach().cpu()
            torch.save(torch_tensor, tensor_file_name)
        else:
            ttnn.dump_tensor(
                tensor_file_name,
                ttnn.from_device(tensor),
            )
    elif isinstance(tensor, torch.Tensor):
        tensor_file_name = tensors_path / f"{tensor.tensor_id}.pt"
        if tensor_file_name.exists():
            return
        torch_tensor = torch.Tensor(tensor)
        if DETACH_SAVED_TENSORS:
            torch_tensor = torch.Tensor(tensor).detach().cpu()
        torch.save(torch_tensor, tensor_file_name)
    else:
        raise ValueError(f"Unsupported tensor type {type(tensor)}")