Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent printing of unordered collections across backends #4108

Open
robin-aws opened this issue May 31, 2023 · 4 comments
Open

Inconsistent printing of unordered collections across backends #4108

robin-aws opened this issue May 31, 2023 · 4 comments
Labels
kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: runtime Happens in Dafny's runtime (Add a `lang` tag if relevant) testing-method: uniform-backend-testing Issues found by ensuring uniform testing across backends

Comments

@robin-aws
Copy link
Member

This is one of those issues that most of us are fully aware of but have lived with for a long time without writing down (unless I just can't find the existing issue).

method Main() {
   print {"abc", "def"}, "\n";
}
% dafny run -t:java src/Scratch.dfy

Dafny program verifier finished with 0 verified, 0 errors
{['a', 'b', 'c'], ['d', 'e', 'f']}
% dafny run -t:cs src/Scratch.dfy

Dafny program verifier finished with 0 verified, 0 errors
{['d', 'e', 'f'], ['a', 'b', 'c']}

Not really a bug per se since the behavior of print is underspecified, but this makes consistent differential testing really annoying, whether testing a user program or the backends themselves.

My proposal for the most beneficial solution: implement the collections such as set, multiset and map in Dafny in the same way we now have a Dafny implementation of seq: https://github.com/dafny-lang/dafny/blob/master/Source/DafnyRuntime/DafnyRuntimeDafny/src/dafnyRuntime.dfy. This would depend on having a common implementation of hash code functions for all Dafny types, instead of relying on target language runtime hash codes. We'd then have unpredictable but deterministic ordering for these collections. This is also worth doing in order to provide more efficient implementations of these immutable data types, rather than wrapping mutable versions from the target runtimes and making expensive defensive copies everywhere.

See also #2062.

@robin-aws robin-aws added testing-method: uniform-backend-testing Issues found by ensuring uniform testing across backends kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: runtime Happens in Dafny's runtime (Add a `lang` tag if relevant) labels May 31, 2023
@robin-aws
Copy link
Member Author

Even worse, Python salts its hashes so its orderings are not even deterministic from one run to the next. This makes it effectively impossible to use %testDafnyForEachCompiler for such tests.

@MikaelMayer
Copy link
Member

I understand it's an issue for printing, but as you present it, it looks like you are only concerned about tests for now.
For tests, I think it could be solved by doing the following. For every collection where we expect a result for print like this:

...
  print myset;
// expect file
{"abc", "def"}
  1. Define a sequence of the expected result
  2. Verify that each element of the expected sequence is in the collection (with appropriate value if it's a map
  3. Verify that all keys of the sequence are different
  4. Verify that the length of the sequence is equal to the length of the map.
...
  var expected := ["abc", "def"];
  for i <- 0 to |expected| {
    if expected[i] !in myset {
      print "bad";
    }
  }
  if |myset| != |expected| || |myset| != |set k <- expected| {
    print "bad"
  }

@robin-aws
Copy link
Member Author

Indeed, I agree that would unblock the immediate testing pain, although not without additional complexity. I suggest the deeper change to implement these unordered collection types in Dafny code for the additional benefits (verified implementations, consistent performance across backends, etc.)

@robin-aws
Copy link
Member Author

Even worse, Python salts its hashes so its orderings are not even deterministic from one run to the next. This makes it effectively impossible to use %testDafnyForEachCompiler for such tests.

Turns out we could use PYTHONHASHSEED=... to make this deterministic at least: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: runtime Happens in Dafny's runtime (Add a `lang` tag if relevant) testing-method: uniform-backend-testing Issues found by ensuring uniform testing across backends
Projects
None yet
Development

No branches or pull requests

2 participants