Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: correct the description of integer representation #291

Merged
merged 3 commits into from
Feb 7, 2025

Conversation

david-christiansen
Copy link
Collaborator

Closes #286

@david-christiansen
Copy link
Collaborator Author

@TwoFX: Thanks for the great issue!

One thing I'm not totally happy about with this version of the text is that it uses the term "unboxed" in a somewhat ambiguous manner. I'm trying to balance conceptual clarity against overwhelming readers with pedantic details, but I'm not sure this strikes the best balance - what do you think?

In particular, "unboxed" is meaning two different things:

  1. Representing small values inside a pointer, rather than on the heap, in a lean_object*
  2. Using a completely different, definitely non-pointer type like uint64_t based on flow-sensitive properties of monomorphic contexts

In both cases, there's neither heap allocation nor indirection, so in that sense they're the same.

@david-christiansen david-christiansen added the HTML available HTML has been generated for this PR label Feb 7, 2025
@TwoFX
Copy link
Member

TwoFX commented Feb 7, 2025

I had another look at lean.h and I'm no longer sure I believe the text as written. Here are the boxing and unboxing functions:

static inline lean_object * lean_box(size_t n) { return (lean_object*)(((size_t)(n) << 1) | 1); }
static inline size_t lean_unbox(lean_object * o) { return (size_t)(o) >> 1; }

static inline lean_obj_res lean_box_uint32(uint32_t v) {
    if (sizeof(void*) == 4) {
        /* 32-bit implementation */
        lean_obj_res r = lean_alloc_ctor(0, 0, sizeof(uint32_t));
        lean_ctor_set_uint32(r, 0, v);
        return r;
    } else {
        /* 64-bit implementation */
        return lean_box(v);
    }
}

static inline unsigned lean_unbox_uint32(b_lean_obj_arg o) {
    if (sizeof(void*) == 4) {
        /* 32-bit implementation */
        return lean_ctor_get_uint32(o, 0);
    } else {
        /* 64-bit implementation */
        return lean_unbox(o);
    }
}

static inline lean_obj_res lean_box_uint64(uint64_t v) {
    lean_obj_res r = lean_alloc_ctor(0, 0, sizeof(uint64_t));
    lean_ctor_set_uint64(r, 0, v);
    return r;
}

static inline uint64_t lean_unbox_uint64(b_lean_obj_arg o) {
    return lean_ctor_get_uint64(o, 0);
}

lean_box and lean_unbox are generated for Nat as well as UIntX for X <= 16. For UInt32 and UInt64, the compiler generates calls to the special functions.

So to me this looks like the statement "When boxed, fixed-width integer types that take at least as many bits as the platform's pointer type are represented the same way as Nat" is not correct for 64-bit integers on 64-bit systems and for 32/64-bit integers on 32-bit systems, since these always have an indirection, not just for large values.

@david-christiansen
Copy link
Collaborator Author

Thank you for the thorough investigation and correction of misunderstandings - I think this is convincing.

What do you think of this update?

Copy link

github-actions bot commented Feb 7, 2025

Preview for this PR is ready! 🎉 (also as a proofreading version). built with commit 69df5a7.

@TwoFX
Copy link
Member

TwoFX commented Feb 7, 2025

I really like this latest version!

@david-christiansen
Copy link
Collaborator Author

Thanks again for the very useful help!

@david-christiansen david-christiansen merged commit 35bdc96 into main Feb 7, 2025
8 checks passed
@david-christiansen david-christiansen deleted the int-fix branch February 7, 2025 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
HTML available HTML has been generated for this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Runtime representation of fixed-width types is described incorrectly
2 participants