Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lengths of generated lists of derived sorted lists are too short #128

Open
buzden opened this issue Mar 25, 2024 · 5 comments · May be fixed by #129
Open

Lengths of generated lists of derived sorted lists are too short #128

buzden opened this issue Mar 25, 2024 · 5 comments · May be fixed by #129
Labels
derive: core Something in between single type generator and its constructors issue: distribution When distribution of generation is wrong part: derivation Related to automated derivation of generators part: examples Related to the usage examples status: confirmed bug Something isn't working

Comments

@buzden
Copy link
Owner

buzden commented Mar 25, 2024

In the sorted-list example, there is a data type for sorted list, which has a derived generator. For some reason, this derived generator always produces lists of length ~2-3, sometimes 4, not more, even of fuel is over say, 100. Influence of the fuel to the values inside lists are visible clearly, but somewhy this does not affect the length of the generated lists.

@buzden buzden added status: confirmed bug Something isn't working part: derivation Related to automated derivation of generators issue: distribution When distribution of generation is wrong derive: least-effort Relates to the `LeastEffort` derivation algorithm part: examples Related to the usage examples labels Mar 25, 2024
@buzden
Copy link
Owner Author

buzden commented Mar 26, 2024

The reason for such behaviour is the following. Data definitions are the following (ignoring forward declarations):

data SortedList : Type where
  Nil  : SortedList
  (::) : (x : Nat) -> (xs : SortedList) -> FirstGT x xs => SortedList

data FirstGT : Nat -> SortedList -> Type where
  E  : FirstGT n []
  NE : x `GT` n -> FirstGT n $ (x::xs) @{prf}

By definition of (::) and dependencies in it, we can see that the parameter xs would be generated by the parameter of type FirstGT x xs in a dependent pair. So, actual recursive xs would be generated not by generator of SortedList, but by FirstGT. But FirstGT is not a recursive data type, so its constructors would be distributed evenly, so the original argument xs of (::) would get empty and non-empty lists with equal probability. Thus, we get usually lists of length 1 or 2, and sometimes 0 or 3.

@buzden
Copy link
Owner Author

buzden commented Mar 26, 2024

One possible solution for this would be tuning frequences not only for recursive constructors, but for constructors that are potentially indexed by self-recursive constructors, in case when these indices are generated, not given.

Potentially indexed means the following: if we have a data type, say,

data Indexed : SortedList -> Type where
  Zero : Indexed []
  Many : Indexed (x::xs)
  Any : Indexed xs

then in the generator Gen MaybeEmpty (xs ** Indexed xs) only Zero constructor would have frequency 1, both other constructors must have frequency proportional to left fuel, since Any may also have recursive constructor as an index.


Anyway, this idea needs a good thought, and I'm not sure this is a correct way to proceed.

@buzden
Copy link
Owner Author

buzden commented Mar 27, 2024

Two more thoughts:

  1. It seems that this problem is mostly relevant when the simplification hack [ hack ] An alternative version of the simplification hack #96 is used

  2. it's unclear what to do if the index is of form, say, MkX (MkY y) (MkZ z) with different combinations of direct recursiveness status of MkX, MkY and MkZ. I'm assuming now that a disjunction of all involved constructors in the index expression is enough to trigger leftDepth weight for frequency.

@buzden buzden added derive: core Something in between single type generator and its constructors and removed derive: least-effort Relates to the `LeastEffort` derivation algorithm labels Mar 28, 2024
@buzden
Copy link
Owner Author

buzden commented Mar 29, 2024

The current modification of the criteria is the following. Consider all the names in all generated indices of the constructor-under-the-question (1). Those names are either constructor's arguments (2), or some external names (3). If any external name (2) is a recursive constructor or a non-constructor, then we consider the whole original constructor (1) to be potentially recursively-indexed. If not, then consider types of all mentioned as (2), and look if their types are recursive or not. If yes, then consider the constructor (1) to be potentitally recursively-indexed. If both checks are false, consider it as a non-recursive constructor.

EDIT: no, this idea is wrong. For example, using this we'll get equal probabilities for both constructors of Fin, which is not good.

I was trying to formulate universal criteria covering cases like

data Prop : List X -> Type where
  P1 : Prop [] -- non-recursive 
  P2 : Prop (x::xs) -- indexed by recursive
  P3 : Prop xs -- indexed by potentially recursive
  P4 : Prop (externalFun x) -- indexed by potentially recursive
  P5 : (f : Y -> List X) -> (x : Y) -> Prop (f x) -- indexed by potentially recursive

@buzden
Copy link
Owner Author

buzden commented Apr 3, 2024

The latest thought was that we should first clear the common application prefix of index expressions, so that, say, S in indices of Fin wouldn't count. I even implemented it in the commit 5bf57ca but it doesn't seem to work properly, because it starts to count X2 as recursively indexed in types like

data X : Nat -> Type where
  X1 : X 1
  X2 : X 2

since the index has expressions S Z and S (S Z). Removing the prefix leads to Z and S Z respectively, after which S in X2 selects high weight. In practise it make some distributions worse than before.


Now, I'm feeling like a blind rat searching for an exit in a maze....


More suprisingly, I realised that my original explanation of bad distribution in the sorted-list example is wrong, because FirstGT's constructor NE is honestly recursive, thanks to the prf argument. It means, that I was wrong about the reason of observed bad distribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
derive: core Something in between single type generator and its constructors issue: distribution When distribution of generation is wrong part: derivation Related to automated derivation of generators part: examples Related to the usage examples status: confirmed bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant