Create payload->blob type system #1120

litt3 · 2025-01-16T18:26:21Z

Why are these changes needed?

Checks

I've made sure the tests are passing. Note that there might be a few flaky tests; in that case, please comment that they are not relevant.
I've checked the new test coverage and the coverage percentage didn't drop.
Testing Strategy
- Unit tests
- Integration tests
- This PR is not tested :(

Signed-off-by: litt3 <[email protected]>

This reverts commit 422c2a8. Signed-off-by: litt3 <[email protected]>

Signed-off-by: litt3 <[email protected]>

samlaf

First pass. Still have more to review but tired.

samlaf · 2025-02-12T23:11:19Z

api/clients/codecs/blob.go

+	// This is the blob length IN SYMBOLS, not in bytes
+	blobLength uint32


consider renaming to blobLengthSymbols

Although is symbol standard here? I feel like our whitepaper is the only place I've ever seen it. Should we just use FE for field element everywhere? BlobLengthFE?

I originally used blobLength to try to keep consistency with the struct in the commitment, but I agree, it's not ideal that I have to put in comments everywhere that it's in symbols

I don't have a preference between BlobLengthFE or BlobLengthSymbols. Usage of Symbols is pretty pervasive in our codebase, but also I don't think anyone would be confused by the FE terminology. Do you have a preference?

samlaf · 2025-02-12T23:15:08Z

api/clients/codecs/blob.go

+	coeffPolynomial *coeffPoly
+	// blobLength must be a power of 2, and should match the blobLength claimed in the BlobCommitment
+	// This is the blob length IN SYMBOLS, not in bytes
+	blobLength uint32


unclear how coeffPolynomial and blobLength are related. Does len(coeffPolynomial) == blobLength? Guessing we allow last zeros to be truncated. We should explain that

also can't we just enforce that blobLength = nextPowerOf2(len(coeffPolynomial)) and then we don't need blobLength here?

There is a corner case here:

Imagine a user disperses a very small blob, only 64 bytes, and the last 40 bytes are trailing 0s

When a different user fetches the blob from a relay, we've established that the relay could remove trailing 0s

If we were to say that blobLength = nextPowerOf2(len(coeffPolynomial)), then the user fetching and reconstructing this blob would determine that the blob length is 1 symbol, when it's actually 2

I will add a description of this corner case as a comment, if you confirm my logic here
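
To make the corner case concrete, here is a minimal sketch. The helper names (`nextPowerOf2`, `symbolsNeeded`, `trimTrailingZeros`) are made up for illustration and are not the PR's code; the 32-byte symbol size is assumed from the "64 bytes is 2 symbols" example above.

```go
package main

import "fmt"

// bytesPerSymbol assumes one symbol is a 32-byte field element.
const bytesPerSymbol = 32

// nextPowerOf2 returns the smallest power of 2 that is >= n (1 for n == 0).
// Sketch only: n must not exceed 1<<31, otherwise the shift overflows.
func nextPowerOf2(n uint32) uint32 {
	p := uint32(1)
	for p < n {
		p <<= 1
	}
	return p
}

// symbolsNeeded returns how many symbols are required to hold numBytes bytes.
func symbolsNeeded(numBytes int) uint32 {
	return uint32((numBytes + bytesPerSymbol - 1) / bytesPerSymbol)
}

// trimTrailingZeros mimics a relay that drops trailing zero bytes.
func trimTrailingZeros(data []byte) []byte {
	end := len(data)
	for end > 0 && data[end-1] == 0 {
		end--
	}
	return data[:end]
}

func main() {
	// A 64-byte blob whose last 40 bytes are zeros.
	blob := make([]byte, 64)
	for i := 1; i < 24; i++ {
		blob[i] = 0xAB
	}

	claimed := nextPowerOf2(symbolsNeeded(len(blob)))
	trimmed := trimTrailingZeros(blob)
	inferred := nextPowerOf2(symbolsNeeded(len(trimmed)))

	// Prints: trimmed bytes=24 claimed=2 inferred=1
	fmt.Printf("trimmed bytes=%d claimed=%d inferred=%d\n", len(trimmed), claimed, inferred)
}
```

The mismatch between `claimed` and `inferred` is why the length cannot be derived from the received bytes alone in this scenario.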

samlaf · 2025-02-12T23:16:28Z

api/clients/codecs/blob.go

+func BlobFromBytes(bytes []byte, blobLength uint32) (*Blob, error) {
+	poly, err := coeffPolyFromBytes(bytes)
+	if err != nil {
+		return nil, fmt.Errorf("polynomial from bytes: %w", err)
+	}
+
+	return BlobFromPolynomial(poly, blobLength)
+}


see comment above, do we not want to enforce some relationship between len(bytes) and blobLength?
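
One possible shape for such a check, as a sketch only: `checkBlobLength` is a hypothetical helper, and the rule "bytes may be shorter than the claimed length, but never longer" is an assumption that follows from allowing trailing zeros to be truncated.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bytesPerSymbol assumes one symbol is a 32-byte field element.
const bytesPerSymbol = 32

// checkBlobLength is a hypothetical validation helper. It verifies that the
// claimed blob length (in symbols) is a power of 2, and that the provided
// bytes fit within that length. Fewer bytes are accepted, since trailing
// zeros may have been removed.
func checkBlobLength(numBytes int, blobLengthSymbols uint32) error {
	if bits.OnesCount32(blobLengthSymbols) != 1 {
		return fmt.Errorf("blob length %d is not a power of 2", blobLengthSymbols)
	}
	maxBytes := uint64(blobLengthSymbols) * bytesPerSymbol
	if uint64(numBytes) > maxBytes {
		return fmt.Errorf("%d bytes exceed the %d bytes implied by blob length %d",
			numBytes, maxBytes, blobLengthSymbols)
	}
	return nil
}

func main() {
	fmt.Println(checkBlobLength(24, 2)) // shorter than claimed: accepted
	fmt.Println(checkBlobLength(96, 2)) // longer than claimed: rejected
	fmt.Println(checkBlobLength(64, 3)) // not a power of 2: rejected
}
```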

samlaf · 2025-02-12T23:17:09Z

api/clients/codecs/blob.go

+func (b *Blob) GetBytes() []byte {
+	return b.coeffPolynomial.getBytes()
+}


have a preference for Bytes() without the get prefix

or toBytes() to be consistent with ToPayload?

I personally tend to understand a nuanced difference between To and Get

to (or in other contexts, compute) seems to indicate that the data is changing form, or being processed in some way

get implies that the method is simply fetching something that already exists, byte for byte

So, I chose ToPayload because the method is transforming the data, to produce a payload. And it's GetBytes, because we are returning the literal bytes, as they already exist.

But maybe I imagine this nuance, in which case I'm ok changing the current names. What would your preferred naming be for these two methods? Bytes() and ToPayload()?

samlaf · 2025-02-12T23:21:20Z

api/clients/codecs/blob.go

+// The payloadStartingForm indicates how payloads are constructed by the dispersing client. Based on the starting form
+// of the payload, we can determine what operations must be done to the blob in order to reconstruct the original payload
+func (b *Blob) ToPayload(payloadStartingForm PolynomialForm) (*Payload, error) {


I think I'd prefer if we got rid of the starting terminology. It introduces time, which I personally find confusing. For me it's really just "how does a rollup interpret its payloads as polynomials, ALWAYS", i.e. it's more of an invariant or a way of thinking than a "starting point".

Wdyt of just payloadForm or payloadPolyForm?

I think payloadForm is good. I will change it

samlaf · 2025-02-12T23:37:44Z

api/clients/codecs/default_blob_codec.go

@@ -22,7 +22,7 @@ func (v DefaultBlobCodec) EncodeBlob(rawData []byte) ([]byte, error) {
 	codecBlobHeader := make([]byte, 32)
 	// first byte is always 0 to ensure the codecBlobHeader is a valid bn254 element
 	// encode version byte
-	codecBlobHeader[1] = byte(DefaultBlobEncoding)
+	codecBlobHeader[1] = byte(PayloadEncodingVersion0)


doesn't this function EncodeBlob need to be changed? it returns []byte, shouldn't it return a Blob in your type system? Or whatever else this is (prob just an encodedPayload?)

default_blob_codec will not be used in v2, all encoding is contained in the new structs

samlaf · 2025-02-12T23:39:09Z

api/clients/codecs/blob_codec.go

+	case PayloadEncodingVersion0:
 		return DefaultBlobCodec{}, nil


this looks a bit weird now. Why is payload version 0 mapping to a defaultBlobCodec? Should we use BlobCodecVersion0 instead? Or do we want separate versioning for the payloadEncoding and BlobCodec? Or are these things the same?

I renamed the constant to what I felt made sense for the new code, but the name doesn't work so well in the old blob_codec code. If I don't rename the constant, then the name doesn't work so well in the new code

Or are these things the same?

Yes, they are essentially the same thing.

samlaf · 2025-02-12T23:42:00Z

api/clients/codecs/blob_test.go

+func testBlobConversionForForm(t *testing.T, payloadBytes []byte, form PolynomialForm) {
+	payload := NewPayload(payloadBytes)
+
+	blob, err := payload.ToBlob(form)


The natural interpretation for this reads wrong :(
It reads "payload is getting to blob in this form" instead of "payload, interpreted in this form, gets converted to a blob".
Maybe there's a different way to structure the method such that the interpretation makes more sense?

Like perhaps defining on Blob a method .fromPayloadInForm()? Then we could use like
new(Blob).fromPayloadInForm(payload, form). It's def more ugly but at least it reads more easily I feel?

What do you think of this idea: when constructing a payload, declare in the constructor what form it is. The sequence would then read:

payload := NewPayload(payloadBytes, form)
blob, err := payload.ToBlob()

I personally really dislike the pattern of creating a new empty object, just to call a method that creates another instance of the object
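
A rough sketch of what the constructor-based idea could look like. The field names, the `iota` values of `PolynomialForm`, and the placeholder `Blob` are assumptions for illustration; the actual conversion logic is left out.

```go
package main

import "fmt"

// PolynomialForm says how payload bytes are interpreted as a polynomial.
// The concrete values here are assumed, not taken from the PR.
type PolynomialForm uint8

const (
	PolynomialFormEval PolynomialForm = iota
	PolynomialFormCoeff
)

// Payload remembers the form it was declared with at construction time.
type Payload struct {
	bytes []byte
	form  PolynomialForm
}

// NewPayload declares the form up front, so later calls don't need it.
func NewPayload(payloadBytes []byte, form PolynomialForm) *Payload {
	return &Payload{bytes: payloadBytes, form: form}
}

// Blob is a placeholder; the real type holds a coefficient polynomial.
type Blob struct {
	data []byte
}

// ToBlob takes no form argument because the payload already carries it.
func (p *Payload) ToBlob() (*Blob, error) {
	switch p.form {
	case PolynomialFormEval, PolynomialFormCoeff:
		// The real encoding, and any transform the form requires, is omitted.
		return &Blob{data: append([]byte(nil), p.bytes...)}, nil
	default:
		return nil, fmt.Errorf("unknown polynomial form: %d", p.form)
	}
}

func main() {
	payload := NewPayload([]byte("hello"), PolynomialFormEval)
	blob, err := payload.ToBlob()
	fmt.Println(len(blob.data), err)
}
```

With this shape the call site reads "payload (already in a known form) to blob", which avoids the ambiguous `ToBlob(form)` reading.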

samlaf · 2025-02-12T23:44:44Z

api/clients/codecs/blob_test.go

+	for i := 0; i < iterations; i++ {
+		originalData := testRandom.Bytes(testRandom.Intn(1024) + 1)
+		testBlobConversionForForm(t, originalData, PolynomialFormEval)
+		testBlobConversionForForm(t, originalData, PolynomialFormCoeff)
+	}
+}


fuzzing would be better here as it does a smarter search than this random iid generation of test examples.

also we should be testing the edge cases like empty bytes

I'm on board with fuzz testing, but when do we actually run our fuzz tests? Do they run in CI?

Took a stab at fuzz testing in c70fad8d. LMK what you think

samlaf · 2025-02-12T23:51:49Z

api/clients/codecs/encoded_payload.go

+// blobLength is the length of the blob IN SYMBOLS, as claimed by the blob commitment. This is needed to make sure
+// that the claimed length in the encoded payload header is valid relative to the total blob length
+func encodedPayloadFromElements(fieldElements []fr.Element, blobLength uint32) (*encodedPayload, error) {


this comment makes me wonder whether we should instead define this method on the blob itself. Why do we need this method here? Seems like it's in the wrong place given that it requires a blobLength.

cody-littley · 2025-02-13T17:15:22Z

api/clients/codecs/coeff_poly.go

+type coeffPoly struct {
+	fieldElements []fr.Element
+}


This duplicates an existing type (currently in encoding/rs/frame_coeffs.go in master; it is named something different if you aren't up to date with the latest master).

type FrameCoeffs []fr.Element

Also, there is a bunch of serialization logic in that file as well for going back and forth between serialized and deserialized form.

Is the plan to circle back and replace these old types in a follow up PR?

I see []fr.Element used directly as a type in many places in this PR. Is there a reason why you don't use the fancy type you define here?

Is there a reason why you put this into a struct instead of doing type coeffPoly []fr.Element?
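
To illustrate the two options being compared, a small sketch. `element` is a stand-in for `fr.Element` so the snippet needs no external module, and the type names `coeffPolyStruct` and `coeffPolySlice` are hypothetical.

```go
package main

import "fmt"

// element stands in for fr.Element in this sketch.
type element [4]uint64

// Option A: struct wrapper. Outside the package, the elements are reachable
// only through whatever methods the type chooses to expose.
type coeffPolyStruct struct {
	fieldElements []element
}

func (p *coeffPolyStruct) length() int {
	return len(p.fieldElements)
}

// Option B: defined slice type. Built-ins such as len, indexing, and range
// work directly, and a plain []element converts to it without copying.
type coeffPolySlice []element

func main() {
	raw := make([]element, 4)

	a := coeffPolyStruct{fieldElements: raw}
	b := coeffPolySlice(raw)

	fmt.Println(a.length(), len(b)) // both report 4 elements
}
```

The struct form hides the slice from callers in other packages, while the slice form keeps call sites shorter; which trade-off fits here is the open question in this thread.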

cody-littley · 2025-02-13T17:22:01Z

api/clients/codecs/coeff_poly.go

+func (p *coeffPoly) getBytes() []byte {
+	return rs.FieldElementsToBytes(p.fieldElements)
+}


My personal preference for methods like this where we convert to/from []byte is to use a serialize/deserialize naming schema. Open to discussion if you don't agree.

cody-littley · 2025-02-13T17:23:55Z

api/clients/codecs/eval_poly.go

+type evalPoly struct {
+	fieldElements []fr.Element
+}


Particular reason why you use a struct instead of type fieldElements []fr.Element?

cody-littley · 2025-02-13T17:28:52Z

encoding/test_utils.go

+	powers := make([]T, powersToGenerate)
+	for i := T(0); i < powersToGenerate; i++ {
+		powers[i] = T(math.Pow(2, float64(i)))
+	}
+
+	return powers


You can get powers of two more efficiently by doing the following:

Suggested change

 	powers := make([]T, powersToGenerate)
 	for i := T(0); i < powersToGenerate; i++ {
-		powers[i] = T(math.Pow(2, float64(i)))
+		powers[i] = 1 << i
 	}
 
 	return powers
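
A self-contained version of the shift-based approach, as a sketch. The function name `powersOfTwo` and the `unsignedInt` constraint are assumptions; the helper in `encoding/test_utils.go` may use a different name and constraint.

```go
package main

import "fmt"

// unsignedInt is an assumed constraint covering the unsigned integer types.
type unsignedInt interface {
	~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64
}

// powersOfTwo returns [2^0, 2^1, ..., 2^(count-1)] using bit shifts.
// If count exceeds the bit width of T, the shifted-out entries are 0.
func powersOfTwo[T unsignedInt](count T) []T {
	powers := make([]T, count)
	for i := T(0); i < count; i++ {
		powers[i] = T(1) << i
	}
	return powers
}

func main() {
	fmt.Println(powersOfTwo[uint32](5)) // [1 2 4 8 16]
}
```

Besides being cheaper than `math.Pow`, the shift avoids a round trip through `float64`, which cannot represent every 64-bit integer exactly.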


litt3 added 4 commits January 16, 2025 12:34

Create type system

9b222e0

Signed-off-by: litt3 <[email protected]>

Rename encoding

bae8843

Signed-off-by: litt3 <[email protected]>

Merge branch 'master' into payload-blob-type-system

ebf9979

Fix padding logic

d66ddf2

Signed-off-by: litt3 <[email protected]>

litt3 requested review from samlaf and bxue-l2 January 16, 2025 20:44

litt3 added 3 commits January 16, 2025 16:01

Pad upon poly construction

f6021ea

Signed-off-by: litt3 <[email protected]>

Change name to ProtoBlob

422c2a8

Signed-off-by: litt3 <[email protected]>

Update comments

a3d184b

Signed-off-by: litt3 <[email protected]>

litt3 self-assigned this Jan 16, 2025

litt3 added 15 commits January 17, 2025 09:11

Revert "Change name to ProtoBlob"

856f5fc

This reverts commit 422c2a8. Signed-off-by: litt3 <[email protected]>

Rename PayloadHeader to EncodedPayloadHeader

1eeec85

Signed-off-by: litt3 <[email protected]>

Merge branch 'master' into payload-blob-type-system

a30a4ad

Rename to PayloadEncodingVersion

5eab8dd

Signed-off-by: litt3 <[email protected]>

Do a bunch of cleanup

132e8de

Signed-off-by: litt3 <[email protected]>

Fix PayloadVersion naming

da63536

Signed-off-by: litt3 <[email protected]>

Merge branch 'master' into payload-blob-type-system

3016cac

Use new conversion method

9f98462

Signed-off-by: litt3 <[email protected]>

Add better descriptions to codec utils

58b76c1

Signed-off-by: litt3 <[email protected]>

Do test work

916573f

Signed-off-by: litt3 <[email protected]>

Add length checks

7eed618

Signed-off-by: litt3 <[email protected]>

Merge branch 'master' into payload-blob-type-system

c3a6291

Signed-off-by: litt3 <[email protected]>

Write test for power of 2 util

b604c82

Signed-off-by: litt3 <[email protected]>

Write more tests

2ef8e99

Signed-off-by: litt3 <[email protected]>

Finish utils tests

13d16c7

Signed-off-by: litt3 <[email protected]>

litt3 requested a review from anupsv February 12, 2025 16:46

litt3 added 3 commits February 12, 2025 13:18

Do FFT work

22e8134

Signed-off-by: litt3 <[email protected]>

Clean up encoded payload

23b49c4

Signed-off-by: litt3 <[email protected]>

Merge branch 'master' into payload-blob-type-system

2add571

litt3 requested a review from cody-littley February 12, 2025 18:35

litt3 marked this pull request as ready for review February 12, 2025 18:35

samlaf reviewed Feb 12, 2025


cody-littley reviewed Feb 13, 2025


litt3 added 3 commits February 13, 2025 13:10

Make encoding version uint8 instead of byte

e5363e4

Signed-off-by: litt3 <[email protected]>

Add explanation for 0x00 padding

7801044

Signed-off-by: litt3 <[email protected]>

Use fuzz testing

c70fad8

Signed-off-by: litt3 <[email protected]>
