Lengths of generated lists of derived sorted lists are too short #128

buzden · 2024-03-25T18:26:51Z

In the sorted-list example, there is a data type for sorted list, which has a derived generator. For some reason, this derived generator always produces lists of length ~2-3, sometimes 4, not more, even of fuel is over say, 100. Influence of the fuel to the values inside lists are visible clearly, but somewhy this does not affect the length of the generated lists.

The text was updated successfully, but these errors were encountered:

buzden · 2024-03-26T15:37:01Z

The reason for such behaviour is the following. Data definitions are the following (ignoring forward declarations):

data SortedList : Type where
  Nil  : SortedList
  (::) : (x : Nat) -> (xs : SortedList) -> FirstGT x xs => SortedList

data FirstGT : Nat -> SortedList -> Type where
  E  : FirstGT n []
  NE : x `GT` n -> FirstGT n $ (x::xs) @{prf}

By definition of (::) and dependencies in it, we can see that the parameter xs would be generated by the parameter of type FirstGT x xs in a dependent pair. So, actual recursive xs would be generated not by generator of SortedList, but by FirstGT. But FirstGT is not a recursive data type, so its constructors would be distributed evenly, so the original argument xs of (::) would get empty and non-empty lists with equal probability. Thus, we get usually lists of length 1 or 2, and sometimes 0 or 3.

buzden · 2024-03-26T15:43:14Z

One possible solution for this would be tuning frequences not only for recursive constructors, but for constructors that are potentially indexed by self-recursive constructors, in case when these indices are generated, not given.

Potentially indexed means the following: if we have a data type, say,

data Indexed : SortedList -> Type where
  Zero : Indexed []
  Many : Indexed (x::xs)
  Any : Indexed xs

then in the generator Gen MaybeEmpty (xs ** Indexed xs) only Zero constructor would have frequency 1, both other constructors must have frequency proportional to left fuel, since Any may also have recursive constructor as an index.

Anyway, this idea needs a good thought, and I'm not sure this is a correct way to proceed.

buzden · 2024-03-27T18:27:08Z

Two more thoughts:

It seems that this problem is mostly relevant when the simplification hack [ hack ] An alternative version of the simplification hack #96 is used
it's unclear what to do if the index is of form, say, MkX (MkY y) (MkZ z) with different combinations of direct recursiveness status of MkX, MkY and MkZ. I'm assuming now that a disjunction of all involved constructors in the index expression is enough to trigger leftDepth weight for frequency.

buzden · 2024-03-29T13:14:52Z

The current modification of the criteria is the following. Consider all the names in all generated indices of the constructor-under-the-question (1). Those names are either constructor's arguments (2), or some external names (3). If any external name (2) is a recursive constructor or a non-constructor, then we consider the whole original constructor (1) to be potentially recursively-indexed. If not, then consider types of all mentioned as (2), and look if their types are recursive or not. If yes, then consider the constructor (1) to be potentitally recursively-indexed. If both checks are false, consider it as a non-recursive constructor.

EDIT: no, this idea is wrong. For example, using this we'll get equal probabilities for both constructors of Fin, which is not good.

I was trying to formulate universal criteria covering cases like

data Prop : List X -> Type where
  P1 : Prop [] -- non-recursive 
  P2 : Prop (x::xs) -- indexed by recursive
  P3 : Prop xs -- indexed by potentially recursive
  P4 : Prop (externalFun x) -- indexed by potentially recursive
  P5 : (f : Y -> List X) -> (x : Y) -> Prop (f x) -- indexed by potentially recursive

buzden · 2024-04-03T12:29:41Z

The latest thought was that we should first clear the common application prefix of index expressions, so that, say, S in indices of Fin wouldn't count. I even implemented it in the commit 5bf57ca but it doesn't seem to work properly, because it starts to count X2 as recursively indexed in types like

data X : Nat -> Type where
  X1 : X 1
  X2 : X 2

since the index has expressions S Z and S (S Z). Removing the prefix leads to Z and S Z respectively, after which S in X2 selects high weight. In practise it make some distributions worse than before.

Now, I'm feeling like a blind rat searching for an exit in a maze....

More suprisingly, I realised that my original explanation of bad distribution in the sorted-list example is wrong, because FirstGT's constructor NE is honestly recursive, thanks to the prf argument. It means, that I was wrong about the reason of observed bad distribution.

buzden added derive: core Something in between single type generator and its constructors and removed derive: least-effort Relates to the `LeastEffort` derivation algorithm labels Mar 28, 2024

buzden linked a pull request Mar 28, 2024 that will close this issue

[ derive ] Fix distribution for tricky dependency+recursion case #129

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lengths of generated lists of derived sorted lists are too short #128

Lengths of generated lists of derived sorted lists are too short #128

buzden commented Mar 25, 2024

buzden commented Mar 26, 2024

buzden commented Mar 26, 2024

buzden commented Mar 27, 2024

buzden commented Mar 29, 2024 •

edited

Loading

buzden commented Apr 3, 2024

Lengths of generated lists of derived sorted lists are too short #128

Lengths of generated lists of derived sorted lists are too short #128

Comments

buzden commented Mar 25, 2024

buzden commented Mar 26, 2024

buzden commented Mar 26, 2024

buzden commented Mar 27, 2024

buzden commented Mar 29, 2024 • edited Loading

buzden commented Apr 3, 2024

buzden commented Mar 29, 2024 •

edited

Loading