Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return array from sha3 and keccak hash functions #1

Open
wants to merge 4 commits into
base: staging
Choose a base branch
from

Conversation

anstylian
Copy link

Motivation

This pull request introduces changes to the Hash instructions in the synthesizer/program/src/logic/instruction/operation/hash.rs file. The main focus is to improve the handling of destination types and add support for additional array types in hashing operations (sha3 and keccak).

It is a step forward for implementing ARC-81.

Test Plan

For the moment, I have tested that I can get the results in an array, but I would also like to test that the previous functionality works as expected. You are welcome to suggest changes to the test plan.

Copy link

@diogofriggo diogofriggo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so I have to take this from the standpoint of a "get-the-feedback-rolling PR". I was trying to address some of the duplicated code and other issues. But in the end it's not worth it at this stage, so I just want to point out the things I'm not entirely comfortable with (these are a direct consequence of us trying to fit something new into the larger snarkVM and this feeling out-of-place, there's nothing wrong with your work there, it's actually quite good)
With that out of the way:

to_circuit_array_u8 and to_circuit_array_u16

fn to_circuit_array_u8<A: circuit::Aleo>(hash: &[circuit::Boolean<A>], len: usize) -> Result<circuit::Plaintext<A>> {
    use circuit::{Eject, Inject};

    ensure!(hash.len() % 8 == 0, "Expected hash length to be a multiple of 8, but found {}", hash.len());

    let mut out = vec![];

    for element in hash.chunks(8) {
        let mut value = 0;
        for (i, bit) in element.iter().enumerate() {
            if bit.eject_value() {
                value |= 1 << i;
            }
        }

        let value = console::types::U8::new(value);
        let value = circuit::Plaintext::Literal(
            circuit::Literal::U8(circuit::types::U8::new(element.eject_mode(), value)),
            Default::default(),
        );

        out.push(value);
    }

    ensure!(len == out.len(), "Expected to produce {len} elements, but produced {}", out.len());

    Ok(circuit::Plaintext::Array(out, Default::default()))
}

fn to_circuit_array_u16<A: circuit::Aleo>(hash: &[circuit::Boolean<A>], len: usize) -> Result<circuit::Plaintext<A>> {
    use circuit::{Eject, Inject};

    ensure!(hash.len() % 16 == 0, "Expected hash length to be a multiple of 16, but found {}", hash.len());

    let mut out = vec![];

    for element in hash.chunks(16) {
        let mut value = 0;
        for (i, bit) in element.iter().enumerate() {
            if bit.eject_value() {
                value |= 1 << i;
            }
        }

        let value = console::types::U16::new(value);
        let value = circuit::Plaintext::Literal(
            circuit::Literal::U16(circuit::types::U16::new(element.eject_mode(), value)),
            Default::default(),
        );

        out.push(value);
    }

    ensure!(len == out.len(), "Expected to produce {len} elements, but produced {}", out.len());

    Ok(circuit::Plaintext::Array(out, Default::default()))
}
  • they are nearly the same code copy pasted, is the intention to have
    • to_circuit_array_i8,
    • to_circuit_array_i16,
    • to_circuit_array_i32,
    • to_circuit_array_i64,
    • to_circuit_array_i128,
    • to_circuit_array_u8,
    • to_circuit_array_u16,
    • to_circuit_array_u32,
    • to_circuit_array_u64,
    • to_circuit_array_u128, ?

Essentially the same function copy-pasted 9 times. If it were simple to fix it you'd have done it already but again about my intro.
This may not be worth it at this stage. But for reference const generics can help to a certain extent there

to_console_array

  • once more very similar code to to_circuit_array_u8 and to_circuit_array_u16 but here instead of creating two functions you managed to go with just one
fn to_console_array<N: Network>(hash: &[bool], len: usize, literal_type: LiteralType) -> Result<Plaintext<N>> {
    let chunk_size = match literal_type {
        LiteralType::U8 => 8,
        LiteralType::U16 => 16,
        _ => bail!("Unsupported literal type: {literal_type} for array"),
    };

    ensure!(
        hash.len() % chunk_size == 0,
        "Expected hash length to be a multiple of {chunk_size}, but found {}",
        hash.len()
    );

    let mut out = vec![];
    for element in hash.chunks(chunk_size) {
        let mut value = 0;
        for (i, bit) in element.iter().enumerate() {
            if *bit {
                value |= 1 << i;
            }
        }
        let value = match literal_type {
            LiteralType::U8 => {
                Plaintext::Literal(Literal::U8(console::types::U8::new(value.try_into()?)), Default::default())
            }
            LiteralType::U16 => {
                Plaintext::Literal(Literal::U16(console::types::U16::new(value.try_into()?)), Default::default())
            }
            _ => bail!("Unsupported literal type: {literal_type} for array"),
        };
        out.push(value);
    }
    ensure!(len == out.len(), "Expected to produce {len} elements, but produced {}", out.len());
    Ok(Plaintext::Array(out, Default::default()))
}

Vec of bools to u8

  • at the very least that can be extracted to a function and reused. We don't want to have that kind of logic spread around multiple places.
  • is bitvec an option? Much cleaner and less error prone but indeed would introduce a new dependency

to_destination_literal_type and to_destination_circuit_literal

  • Again essentially the same function but not trivial to generalize so don't worry about it now
fn to_destination_literal_type<N: Network>(
    output: Literal<N>,
    destination_type: &PlaintextType<N>,
) -> Result<Value<N>> {
    let dest = match destination_type {
        PlaintextType::Literal(literal_type) => output.cast_lossy(*literal_type)?,
        PlaintextType::Struct(..) => bail!("Cannot hash into a struct"),
        PlaintextType::Array(..) => bail!("Cannot hash into an array (yet)"),
    };

    Ok(Value::Plaintext(Plaintext::from(dest)))
}

fn to_destination_circuit_literal<A: circuit::Aleo, N: Network>(
    output: circuit::Literal<A>,
    destination_type: &PlaintextType<N>,
) -> Result<circuit::Value<A>> {
    let dest = match destination_type {
        PlaintextType::Literal(literal) => output.cast_lossy(*literal)?,
        PlaintextType::Struct(..) => bail!("Cannot hash into a struct"),
        PlaintextType::Array(..) => bail!("Cannot hash into an array (yet)"),
    };

    Ok(circuit::Value::Plaintext(circuit::Plaintext::Literal(dest, Default::default())))
}

misc

why is [u16; 24] allowed?
In general I'm not a big fan of hardcoded constants that don't make sense on their own (4-6, 12-14) but it's clear that this usage is pervasive in this module so ok by me.
This is also just a nitpick, feel free to ignore, but I prefer to reduce the horizontal span of code if I can:

PlaintextType::Array(array) => {
    let length = **array.length();
    let base_element_type = *array.base_element_type();
    match variant {
        4 | 12 => length == 32 && base_element_type == PlaintextType::Literal(LiteralType::U8),
        5 | 13 => length == 24 && base_element_type == PlaintextType::Literal(LiteralType::U16),
        6 | 14 => length == 32 && base_element_type == PlaintextType::Literal(LiteralType::U16),
        _ => false,
    }
},

Go for just what we need or do a bit more?

  • I've been thinking about this ever since I've started reviewing. You chose to cover [u8] and [u16] as destinations. My initial thought was to go for [u8] only but seeing your code I think adding [u16] gives a better idea of what it would look like to support more types so I like it.
  • Similarly you chose to support 384 and 512 bits, same thoughts
  • you've moved these checks from after the match to inside the match, that made the code so much more verbose. It's great that you've moved it to functions but still could it have been kept as a check after? My guess is that it's not simple so I'd leave it as is (since you have to convert to Value::PlainText you lose access to PlaintextType to do the checks)
  • same for both the matches in evaluate and execute

Conclusion

A lot of things feel out of place, it seems to be a consequence of their design, their choice of types and so on. It's a good first PR to get the ball rolling and see whether this leads to something that fits well with the system.

@anstylian
Copy link
Author

thanks for the review. I agree with your points.
the reason for some of the duplicate code is that console and circuit use different types. So Literals are different types in console and circuit.

the bitvec is a great option. It's not used until now in their project, but I can ask about it.

Good point on [u16;24], great opportunity for them to see it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants