bpf2go: ergonomic enums #1636

mejedi · 2024-12-20T13:00:24Z

Constant names emitted for enum elements end up quite unwieldy. They are
prefixed with the enum name because in C, struct, union and enum names live in
a separate namespace, allowing for overlaps with identifiers, e.g.:

enum E { E };

While such overlaps are possible in theory, people usually avoid
them. If a typical C naming convention is followed, we get

enum something {
    SOMETHING_FOO, SOMETHING_BAR
};

generating <STEM>SomethingSOMETHING_FOO and <STEM>SomethingSOMETHING_BAR.

In addition to "safe" long names, generate shorter ones if the
respective name is not taken. <STEM>SOMETHING_FOO and
<STEM>SOMETHING_BAR are much nicer to work with.

bpf2go orders generated types by name to ensure the output is stable. Adjust data structures such that we can rely on implicit sorting in text/template, removing sortTypes altogether.

The added reservedNames hash is handy for rejecting "reserved" type names such as "<STEM>Specs", and for the upcoming "ergonomic" enum feature (generate short names for enum members if not yet taken).

Signed-off-by: Nick Zavaritsky <[email protected]>

ti-mo

While yes, this makes for shorter identifiers, it also makes it less deterministic who will get the prefix-less one in case of conflicts. The first-declared one? The one appearing first alphabetically? Is it random? This needs to be clearly defined.

I also kind of don't really see the problem, could you give a real-world example of an identifier that gets unwieldy because of this? There's a case to be made for anonymous enums, which I'm not sure we currently support; at least I couldn't find any examples.

I'm also not sure about adding GoFormatter.ShortEnumIdentifier, we already have EnumIdentifier that could probably be extended.

ti-mo · 2025-01-08T12:37:56Z

cmd/bpf2go/gen/output_test.go

@@ -116,3 +65,46 @@ func TestObjects(t *testing.T) {
 	qt.Assert(t, qt.StringContains(str, "Var1 *ebpf.Variable `ebpf:\"var_1\"`"))
 	qt.Assert(t, qt.StringContains(str, "ProgFoo1 *ebpf.Program `ebpf:\"prog_foo_1\"`"))
 }
+
+func TestEnums(t *testing.T) {


Would it be possible to write this test without the range over conflict true/false? I'm not a big fan of treating invariants like a list of test cases, because you inevitably end up sprinkling if <invariant> around anyway.

These are technically separate tests; perhaps you could extract the series of qt.Assert calls into a test() closure within the function to avoid repetition.

ti-mo · 2025-01-08T12:40:16Z

cmd/bpf2go/gen/output_test.go

+	}
+}
+
+func wsSeparated(terms ...string) *regexp.Regexp {


Is this strictly required? IMO it's harder to read wsSeparated("barEnumNameV1", "barEnumName", "=", "1") than simply "barEnumNameV1 barEnumName = 1". Or is the amount of whitespace not deterministic?

is the amount of whitespace not deterministic?

The amount of whitespace depends on the length of other identifiers in the set since formatter inserts extra spaces for alignment. It doesn't improve readability indeed, but makes it easier to update test code in the future.
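
The body of wsSeparated is not shown in this thread, so the following is only a guess at how such a helper could work: quote each term and join the terms with `\s+`, so that the assertion keeps matching when the formatter changes the alignment padding.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wsSeparated is a hypothetical reconstruction, not the code under review.
// It builds a regexp that matches the given terms in order, separated by
// one or more whitespace characters.
func wsSeparated(terms ...string) *regexp.Regexp {
	quoted := make([]string, 0, len(terms))
	for _, t := range terms {
		// QuoteMeta keeps characters such as "=" or "." literal.
		quoted = append(quoted, regexp.QuoteMeta(t))
	}
	return regexp.MustCompile(strings.Join(quoted, `\s+`))
}

func main() {
	re := wsSeparated("barEnumNameV1", "barEnumName", "=", "1")
	// Extra alignment spaces inserted by the formatter still match.
	fmt.Println(re.MatchString("barEnumNameV1    barEnumName = 1")) // true
	// Terms that are not separated by whitespace do not match.
	fmt.Println(re.MatchString("barEnumNameV1barEnumName = 1")) // false
}
```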

ti-mo · 2025-01-08T12:59:46Z

cmd/bpf2go/gen/output.go

-	types, err := sortTypes(typeNames)
-	if err != nil {
-		return err
+	typeByName := map[string]btf.Type{}


Use make() for consistency?

ti-mo · 2025-01-08T13:53:02Z

cmd/bpf2go/gen/output.go

-// sortTypes returns a list of types sorted by their (generated) Go type name.
-//
-// Duplicate Go type names are rejected.
-func sortTypes(typeNames map[btf.Type]string) ([]btf.Type, error) {


Does the output no longer need to be sorted?

https://pkg.go.dev/text/template#hdr-Actions

{{range pipeline}} T1 {{end}}
The value of the pipeline must be an array, slice, map, or channel.
If the value of the pipeline has length zero, nothing is output;
otherwise, dot is set to the successive elements of the array,
slice, or map and T1 is executed. If the value is a map and the
keys are of basic type with a defined order, the elements will be
visited in sorted key order.
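
The quoted behaviour can be checked with a small standalone program. This is a minimal sketch, not the bpf2go template: the template text and the map contents are made up for illustration.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// render executes a tiny template that ranges over a map keyed by
// (hypothetical) generated Go type names.
func render(types map[string]string) (string, error) {
	tmpl, err := template.New("types").Parse(
		"{{range $name, $kind := .}}{{$name}}={{$kind}};{{end}}")
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, types); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Go map iteration order is unspecified, but text/template visits
	// string keys in sorted order, so the output is stable.
	out, err := render(map[string]string{
		"fooZeta":  "struct",
		"fooAlpha": "enum",
		"fooMid":   "union",
	})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out + "\n") // fooAlpha=enum;fooMid=union;fooZeta=struct;
}
```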

ti-mo · 2025-01-08T13:53:23Z

cmd/bpf2go/gen/output.go

 		args.ObjectFile,
 	}

 	var buf bytes.Buffer
 	if err := commonTemplate.Execute(&buf, &ctx); err != nil {
-		return fmt.Errorf("can't generate types: %s", err)
+		return fmt.Errorf("can't generate types: %v", err)


Why %v over %s? This change feels a little arbitrary.

I actually meant %w.

ti-mo · 2025-01-08T16:11:37Z

cmd/bpf2go/gen/output.go

 		Identifier: args.Identifier,
+		ShortEnumIdentifier: func(_, element string) string {


Could this be made a part of EnumIdentifier instead? Not sure why this deserves its own hook, it's just another step during enum identifier generation.

Not sure why this deserves its own hook, it's just another step during enum identifier generation.

I propose generating BOTH long and short identifiers to maintain backwards compatibility.

ti-mo · 2025-01-08T16:14:50Z

cmd/bpf2go/gen/output_test.go

@@ -12,58 +13,6 @@ import (
 	"github.com/cilium/ebpf/cmd/bpf2go/internal"
 )

-func TestOrderTypes(t *testing.T) {


Basically same question as above. Do we have another way of ensuring stability in the generated output? Was this always unnecessary?

When we range over a map in text/template, items are first ordered by key.

ti-mo · 2025-01-08T16:33:42Z

cmd/bpf2go/gen/output.go

-		// NB: This also deduplicates types.
-		typeNames[typ] = args.Stem + args.Identifier(typ.TypeName())
+	tn := templateName(args.Stem)
+	reservedNames := map[string]struct{}{


At first sight, I'd model this as a property to be specified during construction of the GoFormatter. Every time the formatter generates an identifier (struct or otherwise), it can check uniqueness and push it to the set of seen names. I think it's a good feature to have in the formatter to prevent it from emitting invalid code, it doesn't need to be b2g-specific.

This way, you also avoid having to close over reservedNames and typeByName to get it into a ShortEnumIdentifier function.
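
A rough sketch of that idea, assuming nothing about the real GoFormatter: the type and method names below are invented for illustration. The set is seeded with reserved names at construction, and every generated identifier has to be claimed before it is emitted.

```go
package main

import "fmt"

// uniqueNames is a hypothetical helper, not part of the library's API.
// It tracks every identifier that has been handed out so far.
type uniqueNames struct {
	seen map[string]struct{}
}

// newUniqueNames seeds the set with names that must never be generated,
// such as a "<STEM>Specs"-style reserved type name.
func newUniqueNames(reserved ...string) *uniqueNames {
	u := &uniqueNames{seen: make(map[string]struct{})}
	for _, r := range reserved {
		u.seen[r] = struct{}{}
	}
	return u
}

// claim records name and reports whether it was still free.
func (u *uniqueNames) claim(name string) bool {
	if _, taken := u.seen[name]; taken {
		return false
	}
	u.seen[name] = struct{}{}
	return true
}

func main() {
	names := newUniqueNames("fooSpecs")
	fmt.Println(names.claim("fooSpecs"))     // false: reserved
	fmt.Println(names.claim("fooSomething")) // true: first use
	fmt.Println(names.claim("fooSomething")) // false: duplicate
}
```

With a set like this owned by the formatter, a short enum identifier would just be one more claim call, and a failed claim would mean "emit only the long name".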

mejedi · 2025-01-08T19:35:04Z

@ti-mo

I also kind of don't really see the problem, could you give a real-world example of an identifier that gets unwieldy because of this?

enum reassm_rc {
  REASSM_RC_SUCCESS,
  REASSM_RC_INTERNAL_ERROR,
  REASSM_RC_TOO_MANY_REASSEMBLIES,
  ...
};

yields

const (
    frReassmRcREASSM_RC_SUCCESS               frReassmRc = 0
    frReassmRcREASSM_RC_INTERNAL_ERROR        frReassmRc = 1
    frReassmRcREASSM_RC_TOO_MANY_REASSEMBLIES frReassmRc = 2
    ....

This code is meant to be reused across several projects, therefore I can't simply have SUCCESS, INTERNAL_ERROR and TOO_MANY_REASSEMBLIES.

mejedi · 2025-01-09T10:33:49Z

@ti-mo

it also makes it less deterministic who will get the prefix-less one in case of conflicts. The first-declared one? The one appearing first alphabetically? Is it random? This needs to be clearly defined.

Let's talk about conflicts. Assuming a C compiler, the only conflict possible is with a type name. E.g.

struct foo {};
enum E {foo};

Enum item name MUST be unique. The following program fails to compile:

enum A {foo};
enum B {foo}; // error: redefinition of enumerator 'foo'

In case of conflict, a type name wins.

In theory, the input object file could have duplicate enum identifiers. It is possible with non-C languages or multiple C compilation units linked together with bpftool. It is a niche use case as of today.

Types are visited in alphabetical order; items in an enum are visited in BTF order, which typically matches declaration order.
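
The rules above can be sketched as follows. This is an illustration under simplifying assumptions, not the code in this PR: the enum type and shortNames function are hypothetical, type names are taken as already-final Go identifiers, and a short name is simply the stem followed by the C enumerator name.

```go
package main

import (
	"fmt"
	"sort"
)

// enum is a hypothetical stand-in for a BTF enum.
type enum struct {
	name    string   // generated Go type name
	members []string // C enumerator names in BTF order
}

// shortNames returns the short identifiers that would be granted.
// Type names are reserved first, so a type name always wins. Enums are
// then visited in alphabetical order and members in BTF order; the first
// claimant of a short name keeps it.
func shortNames(stem string, typeNames []string, enums []enum) []string {
	taken := make(map[string]struct{})
	for _, t := range typeNames {
		taken[t] = struct{}{}
	}

	sorted := append([]enum(nil), enums...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].name < sorted[j].name })

	var granted []string
	for _, e := range sorted {
		for _, m := range e.members {
			short := stem + m
			if _, ok := taken[short]; ok {
				continue // conflict: only the long name is emitted
			}
			taken[short] = struct{}{}
			granted = append(granted, short)
		}
	}
	return granted
}

func main() {
	got := shortNames("fr", []string{"frFoo"}, []enum{
		{name: "frB", members: []string{"X", "Y"}},
		{name: "frA", members: []string{"Y", "Foo"}},
	})
	// frA is visited first and takes frY; frFoo loses to the type name;
	// frB then gets frX but not frY.
	fmt.Println(got) // [frY frX]
}
```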

mejedi requested a review from dylandreimerink as a code owner December 20, 2024 13:00

mejedi force-pushed the ergonomic-enums branch from 0c3db6f to 345d337 Compare December 20, 2024 13:01

mejedi changed the title ~~Ergonomic enums~~ bpf2go: Ergonomic enums Dec 20, 2024

mejedi mentioned this pull request Dec 20, 2024

bpf2go: generate assignment structs and Go types for Variables and VariableSpecs #1610

Merged

mejedi requested a review from ti-mo December 20, 2024 13:05

mejedi force-pushed the ergonomic-enums branch from 345d337 to ad7e33b Compare December 20, 2024 13:12

mejedi force-pushed the ergonomic-enums branch from ad7e33b to d3bc971 Compare December 20, 2024 13:15

mejedi force-pushed the ergonomic-enums branch from d3bc971 to 12b600e Compare December 20, 2024 13:22

mejedi changed the title ~~bpf2go: Ergonomic enums~~ bpf2go: ergonomic enums Dec 20, 2024

ti-mo requested changes Jan 8, 2025
