Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#15889: Fix handling of mantissa rounding to respect ties round to even #16997

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

TT-BrianLiu
Copy link
Contributor

Ticket

Link to Github Issue

Problem description

See issue for more context and examples. When we handle rounding of mantissa when going from float to block float, we always round up, which doesn't folow the "ties round to even" rule. This PR fixes that and adds lower c++ tests to test the different cases.

What's changed

  • Update host logic with correct rounding of mantissa
  • Add tt_metal gtest to test convert_u32_to_bfp rounding logic for bfp8
  • Remove unused functions in tt_metal/api/tt-metalium/bfloat8.hpp

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • (For models and ops writers) Full new models tests passes
  • New/Existing tests provide coverage for changes

Comment on lines +11 to +13
float float_input;
uint32_t expected_mantissa;
float expected_float_output;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please init

// Set shared exponent as original float exponent (ie. skip logic for handling shared exponents)
auto shared_exp = uint32_input >> 23 & 0xFF;

auto output_mantissa = convert_u32_to_bfp<tt::DataFormat::Bfp8_b, false>(uint32_input, shared_exp, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what false means here, in both cases?

auto output_mantissa = convert_u32_to_bfp<tt::DataFormat::Bfp8_b, false>(uint32_input, shared_exp, false);
EXPECT_EQ(output_mantissa, expected_mantissa);

uint32_t uint32_output = convert_bfp_to_u32(tt::DataFormat::Bfp8_b, output_mantissa, shared_exp, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder why one function is templated and the other accepts bfp type as argument


void roundtrip_test_for_mantissa_rounding_with_bfp8(
float float_input, uint8_t expected_mantissa, float expected_float_output) {
auto uint32_input = *reinterpret_cast<const uint32_t*>(&float_input);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we allow conversion functions to accept/return or float or uint32t?

- Add tt_metal gtest to test convert_u32_to_bfp rounding logic for bfp8
- Remove unused functions in tt_metal/api/tt-metalium/bfloat8.hpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants