-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Floating Point Square Root Module with Variable Odd Mantissa and No Rounding #188
base: main
Are you sure you want to change the base?
Conversation
Add a base class for floating point square root. Implement a basic floating point square root module. Mantissa is variable, but limited to odd numbers from 1 to 51. No rounding implemented. Implement a fixed point square root for values with three integral bits and up to 51 fractional bits. Add a fixed point square root method. Add tests for fixed-point and floating-point square root. Implements intel#171 Co-authored-by: James Farwell <[email protected]> Co-authored-by: Curtis Anderson <[email protected]> Signed-off-by: Stephen Weeks <[email protected]> Signed-off-by: James Farwell <[email protected]> Signed-off-by: Curtis Anderson <[email protected]>
/// Abstract base class | ||
abstract class FixedPointSqrtBase extends Module { | ||
/// Width of the input and output fields. | ||
final int numWidth; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just 'width'?
final expError = Const(0); | ||
|
||
expect(compResult.floatingPointValue.withinRounding(expResult), true, | ||
reason: '\t${fp.floatingPointValue} ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an opportunity to use
reason: '''
${fp.floatingPointValue}
<avoiding having to inject newlines into a string>
'''
);
|
||
final sqrtDUT = FloatingPointSqrtSimple(fp); | ||
final rand = Random(513); | ||
for (var i = 0; i < 4096; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you compute the limit (4096) from eWidth and mWidth?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly? Not 100% the question here. I wanted to choose something reasonably large since we do end up discarding negative numbers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking ahead to a variable width component for narrow and wider sqrts -- it would be great if the testing framework scaled.
// An abstract base class defining the API for floating-point square root. | ||
// | ||
// 2025 March 5 | ||
// Authors: James Farwell <[email protected]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Align your names to get top billing -- like to see your names clearly.
: super( | ||
definitionName: 'FloatingPointSquareRootSimple_' | ||
'E${a.exponent.width}M${a.mantissa.width}') { | ||
final outputSqrt = FloatingPoint( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed the 'normal' tests: are subnormals supported? If not, you should check and throw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I forgot that! I'll fix that this weekend though I may need pointers on how to do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have an error flag, so you would just need to assign it to ~fp.isNormal();
|
||
// use fixed sqrt unit | ||
final aFixed = FixedPoint(signed: false, m: 3, n: a.mantissa.width); | ||
aFixed <= [Const(1, width: 3), a.mantissa.getRange(0)].swizzle(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have a chance, in a test you can check out your naming of Verilog upon output. You can create good names by attaching .named('aFixed'); especially after swizzles.
File('name.sv').writeAsStringSync(mod.generateSynth());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if that was "too many names" when writing this, so tried to keep it to named structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome work! Got me thinking about expanding for multi-cycle or pipelined versions :)
// Weeks <[email protected]> | ||
|
||
/// An abstract API for fixed point square root. | ||
library; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this library
keyword here has no effect and is unnecessary? Or is there a reason?
import 'package:rohd/rohd.dart'; | ||
import 'package:rohd_hcl/rohd_hcl.dart'; | ||
|
||
/// Abstract base class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: these doc comments go into API docs, in general it's best to describe more about what the class does rather than why the class is here in the context of the file. i.e. this is not just an "abstract base class", but a base class for fixed-point square root
late final FixedPoint a; | ||
|
||
/// getter for the computed output. | ||
late final FixedPoint sqrtF = a.clone(name: 'sqrtF')..gets(output('sqrtF')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: why the name "sqrtF" rather than just "sqrt"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this answers your question below! We did originally have both do sqrt. However, dart was having naming conflicts with math.sqrt() that we needed to run our tests. And honestly, sqrt didn't seem great after that. So I opted for "sqrtF" for sqrt fixed and "sqrtR" which is sqrt result. Though that one should probably be "sqrtFP".
late final FpType a; | ||
|
||
/// getter for the computed [FloatingPoint] output. | ||
late final FloatingPoint sqrtR = (a.clone(name: 'sqrtR') as FpType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: what's the "R" mean? why not "sqrt"?
late final FloatingPoint sqrtR = (a.clone(name: 'sqrtR') as FpType) | ||
..gets(output('sqrtR')); | ||
|
||
/// getter for the [error] output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused by this setup here with error
and errorSig
. Since the error
is already a single-bit Logic
, there's no need for this style with an internal and external version of the signal, like there is with the "casting" of LogicStructure
s. Unless I'm misunderstanding, I think you can just do a single late final Logic error = output('error');
and use it for both internal and external purposes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally was planning to use the internal error signal to also be used for logic testing throughout when we had our first rough draft of the block diagram. I think I just missed this.
final int mantissaWidth; | ||
|
||
/// The [clk] : if a non-null clock signal is passed in, a pipestage is added | ||
/// to the square root to help optimize frequency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the clk
and reset
used anywhere? No sequentials in here (yet?)? Did you mean to use it or is this plumbed for future upgrades?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to make future work easier. In most cases, like all ALUs, you get your best bang for buck when it is pipelined. I wasn't sure how to do that yet, but I wanted to have the ground work. It also keeps it consistent with the other FloatingPoint modules that are also capable of pipeline features. So setup for future work to make it a bit easier (and once I figure out how to do it in ROHD).
@@ -123,6 +123,13 @@ class FixedPoint extends Logic { | |||
} | |||
} | |||
|
|||
/// Multiply | |||
Logic fpMultiply(dynamic other) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can either move this directly into the implementation of operator *
, or at least make this private. If you keep it as a separate function, why not just _multiply
since it's already applicable to FixedPoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This I will probably need some Dart education on. I like how we can use .eq()
and .lte()
methods or the override. That could just be my preference so not married to it. Would like to set it private though.
Logic fpMultiply(dynamic other) { | ||
_verifyCompatible(other); | ||
final product = Multiply(this, other).out; | ||
return FixedPoint.of(product, signed: false, m: product.width - n, n: n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since Multiply
has the same output width as inputs, product.width - n
should be the same as m
right? Or is it 1 bigger due to sign bit (if signed)?
Something to consider: do we want FixedPoint
to widen or keep the same width when multiplied by another FixedPoint
? To keep with API consistency, it seems reasonable to keep the same width, since FixedPoint
is a type of Logic
. But do we need a way to appropriately "expand" a FixedPoint
, sort of like a signExtend
but that makes it the "right" shape (m and n) for an output of a multiplication?
This is probably beyond the scope of this PR, but since this change is here and you're the first user of this new *
, thought I'd ask for thoughts :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I spent a lot of time on this last night. You do want your Fixed point to grow, or you loose A LOT of precision. Specifically, when we multiply to N
values we end up with 2*N - 1
result. Because our n doesn't move, that means our m grows by the difference of the output width and n: m = total_width - n
. So you could sign extend, but you will lose data. If you look at most multipliers in general your product usually grows with it. You can than leave it to the user to truncate how they will.
I'm hesitant to force it to the same size as the incoming products because you just loose so much information. Instead of opting to give precise precision to the user to then truncate as they want (or make an override to do this?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did my math here wrong (darn being out of practice for so long). It is still 2*N-1 but the results are 2*m
and 2*n
. So yes, we either need to sign extend or not. Going to push a fix for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for example, if we do 3.5 * 1.25 we expect 4.375. These numbers are (11.10 *01.01) = 0100.0110. If we truncate that from the Q4.4 format back to Q2.2 format we get 00.01 which is 0.5 and inaccurate. So I DO need to fix this to be m: 2*m, n: 2*n
but I think we should keep it widened. (Never not double your check after coding late at night so I'm glad you called this out).
tearDown(() async { | ||
await Simulator.reset(); | ||
}); | ||
test('sqrt(1)', () async { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: since most of these tests look very similar, you could consider making a list of "test vectors" and then generate all these tests, to avoid copy-paste
final fpvExpected = FixedPointValue.ofDouble(sqrt(3.9999998), | ||
signed: fixed.signed, m: fixed.m, n: fixed.n); | ||
expect(fpvResult, fpvExpected); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: how did you arrive at these numbers for the tests? are there any special corner cases to consider? what if there's a negative input?
Description & Motivation
Sometimes we want to be able to do square root for floating points.
This is especially true for ML hardware.
This PR addresses that by adding a Floating Point Square Root module and extending
Fixed Point to have multiplication
Related Issue(s)
#171
Testing
Test cases were created for the Fixed Point Square Root and Floating Point.
Fixed Square Root tested:
Floating Point Tested:
Backwards-compatibility
No.
Documentation
Yes. This required changes to the Floating Point and Fixed Point docs within the HCL library. They are included within this change