-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing few features #1
base: master
Are you sure you want to change the base?
Conversation
lib/jake.ex
Outdated
_ -> | ||
list | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we have to do all these? can't we just use member_of, assuming the given schema is valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the case when enum list for type number consists of null or string items along with regular numbers, then if we use only member_of without filtering first, then we may get wrong result, even if it is possible to get correct result.
You can also check out the test integer enum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had assumed enum value would be always correct, I don't understand why would anybody use invalid values.
But let's say we follow the spec and validate the value against enclosing spec, we only seem to handle the type
right? what if it's a complex case which has other constraints like maximum
etc? Do you think this could be handled in a simpler way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had assumed enum value would be always correct
In fact, I had also assumed the same, but later on looked at schema doc, which has mentioned that
the elements in the enum array should also be valid against the enclosing schema:
, therefore, I decided to filter the enum list depending on the schema type.
But let's say we follow the spec and validate the value against enclosing spec, we only seem to handle the type right?
yes
what if it's a complex case which has other constraints like maximum etc? Do you think this could be handled in a simpler way?
I also thought about it, but it is difficult and hence handled only simple case. I think, validating particular value against schema is out of scope of this project and is a separate project in itself. For this we can use something like ExJsonSchema.Validator
, in the beginning, to filter out valid enum values, which then can feed to StreamData.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possibly a check on the sanity of the schema itself? It complicates the code here. Probably a separate tool could be used to warn/show error to user, if the schema has wrong enclosing type for enums given. This library could assume that the schema is correct. And only concentrate on generating data for correct schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This library could assume that the schema is correct.
It will certainly simplify things.
lib/jake/array.ex
Outdated
|
||
def gen_array(map, type), do: arraytype(map, map["enum"], map["items"]) | ||
|
||
def arraytype(map, enum, items) when enum != nil, do: Jake.gen_enum(enum, "array") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this happen? won't the enum get handle at the Jake module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider schema,
{"type": "integer", "enum": [10,12, null]}
It will first check type and enter into Jake.Number module. If it finds, enum list there then it will be handled by it.
If schema is {"enum": [10,12, null]}
, without mentioning type then it will be handled by Jake module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but the pattern match for enum happens early? so this will never get executed?
def gen_all(map, enum, type) when enum != nil, do: gen_enum(enum, type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I'll move it lower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, is it really necessary to handle enum in all type-specific generator? anyway it's going to call the root module? we could instead just remove the enum handling from all the type-specific generators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking the same. I'll check it out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could instead just remove the enum handling from all the type-specific generators?
It won't handle case like {"type":["integer", "string"], "enum": [1,2,3,"hello", null, true]}
. Otherwise all other cases will work. In order to accept above case, we'll need to change/override Jake.gen_enum()
method to handle list of types.
I'll see, if it can be done without complicating things.
lib/jake/array.ex
Outdated
for(n <- @type_list, is_map(n), do: Jake.gen_init(n)) |> StreamData.one_of() | ||
end | ||
|
||
def add_additional_items(list, bool, max, min) when is_boolean(bool) and bool do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the same as pattern matching bool = true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is same as bool == true, but gives much better idea about code, since bool can be map also, which is handled by another overloaded function.
lib/jake/array.ex
Outdated
|
||
def add_additional_items(list, bool, max, min) when is_boolean(bool) and not bool do | ||
if length(list) in min..max do | ||
StreamData.fixed_list(list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what will happen on else? could it return nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have list, but additional items are not allowed and length(list) (i.e. total items) do not fit in the range then test will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen this pattern in multiple places, my question is why are returning nil when we know it's going to fail anyway. Assuming the expected return type is a generator, returning nil here would cause some random error in some unrelated place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way, I look at it is that it is easier to understand what code is doing even after few years.
Assuming the expected return type is a generator, returning nil here would cause some random error in some unrelated place?
Yes, it is possible. May be we can use try..catch block or raise some custom error to make things better. I think, nil can also be used as error code in this case, which means no generator, and this case needs to be handled by caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throwing error would be better. Nil being used as error is an assumption that would have to be remembered. Explicit is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throwing error would be better
I'm also thinking the same.
lib/jake/array.ex
Outdated
end | ||
|
||
def get_one_of() do | ||
for(n <- @type_list, is_map(n), do: Jake.gen_init(n)) |> StreamData.one_of() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this reuse the notype generator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. In notype, a type will be added to the schema depending on the properties.
In get_one_of, it is not needed, since we already know type, here we using it to generate random values from random types, when no information is given about array. I think here, we can also use StreamData.term()
. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StreamData.term()
might not generate valid json values, tuple for example. I was assuming notype should handle cases like empty spec {}
, and would generate some valid value
lib/jake/number.ex
Outdated
StreamData.integer() | ||
@num_min 0 | ||
|
||
@num_max 100_000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the default so low? The generator should ideally generate all possible value, assuming the cost of generation is not that high.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are just temporary, I also wanted to discuss about possible limits. Cost of generation is not high, except for floating point multiple, which I'll look into afterwards.
lib/jake/number.ex
Outdated
end | ||
|
||
def get_float_number(min, max) do | ||
StreamData.filter(StreamData.float([{:min, min}, {:max, max}]), fn x -> true end, 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we use filter here? seems to return true always
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For smaller floating point ranges, it was throwing FilterTooNarrowError. Therefore, I'm using filter only for the sake of keep on trying for 100 times.
If there is any other way to keep on trying even after failure, then we can change that option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filter & bind_filter should be avoided if possible or should be used in cases where we know we will only filter a small number of values generated by the underlying filter
I might be misunderstanding the problem. Why would StreamData.float throw FilterTooNarowError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I made mistake while testing. It won't throw FilterTooNarrowError. I was using different method earlier which was giving that error. After, studying StreamData properly, and then using :min, :max
with StreamData.float, I did not get that error. I'll remove unnecessary filter.
lib/jake/string.ex
Outdated
re = Randex.stream(~r/[a-zA-Z0-9\_]{#{min},#{max}}/) | ||
|
||
if min <= max do | ||
StreamData.bind_filter(StreamData.string(:alphanumeric), fn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use min_length/max_length instead of bind_filter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was written when I did not read StreamData.string method properly. Later on I kept it, as it is just to know your opinion on the above manual method.
lib/jake/string.ex
Outdated
{:cont, StreamData.constant(x)} | ||
|
||
x when true -> | ||
{:cont, Enum.take(re, 1) |> StreamData.member_of()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will return the same value every time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, when I checked, it was returning different value every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, but it's not supposed to be used like this. you could just use Randex.stream(~r/[a-zA-Z0-9\_]{#{min},#{max}}/, mod: Randex.Generator.StreamData)
, because randex itself just returns a generator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not supposed to be used like this
I've also realised it
In the beginning, I was somewhat confused about how to convert Randex.stream into StreamData. Later I came to know about it when I looked at activesphere/jake implementation. But as I said in the previous post, I wanted to know your opinion on it.
I'll change the method.
lib/jake/string.ex
Outdated
|
||
if min <= max do | ||
StreamData.bind_filter(pat, fn | ||
x when byte_size(x) in min..max -> {:cont, StreamData.constant(x)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the different between bind_filter and filter? is bind_filter necessary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check this out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bind_filter is not necessary here, one can do it with just filter.
lib/jake.ex
Outdated
_ -> | ||
list | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possibly a check on the sanity of the schema itself? It complicates the code here. Probably a separate tool could be used to warn/show error to user, if the schema has wrong enclosing type for enums given. This library could assume that the schema is correct. And only concentrate on generating data for correct schema.
{min, max} = get_min_max(map) | ||
|
||
case map["additionalItems"] do | ||
x when (is_boolean(x) and x) or is_nil(x) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
x when is_map(x) ->
generate_list(list, Jake.gen_init(map), max, min)
x when x == true or x == nil ->
generate_list(list, get_one_of(), max, min)
_ ->
unless length(list) in min..max, do: raise ...
StreamData.fixed_list(list)
Avoid extra indirection? Otherwise it becomes difficult to understand the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better. I'll apply necessary changes.
lib/jake/array.ex
Outdated
|
||
def add_additional_items(list, bool, max, min) when is_boolean(bool) and not bool do | ||
if length(list) in min..max do | ||
StreamData.fixed_list(list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throwing error would be better. Nil being used as error is an assumption that would have to be remembered. Explicit is better.
lib/jake/array.ex
Outdated
when length(list) in min..max -> | ||
{:cont, StreamData.constant(list)} | ||
|
||
nlist when true -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nlist when true -> | |
_ -> :skip |
Won't that be enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't that be enough?
certainly
end | ||
|
||
def decide_min_max(map, item, min, max) | ||
when is_integer(min) and is_integer(max) and min < max do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if this doesn't match? If it should throw error, then being explicit about it might be better.
lib/jake/array.ex
Outdated
{min, max} | ||
end | ||
|
||
def arraytype(map, enum, items) when is_nil(items) and is_nil(enum) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly could be combined with the previous arraytype
definition (Assuming we are handling enum like one of the previous comments).
def list_gen(map, items) when is_nil(items) or is_map(items) do
item = if is_nil(items), do: get_on_of(), else: Jake.gen_init(items)
{min, max} = get_min_max(map)
decide_min_max(map, item, min, max)
end
lib/jake/array.ex
Outdated
|
||
@max_items 1000 | ||
|
||
def gen_array(map, type), do: arraytype(map, map["enum"], map["items"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be we can name things by what they are about. E.g. Array could be about list or typle. The code seems to handle both cases, but probably the cases could be named better?
E.g. Rename arraytype methods to either list_gen
or tuple_gen
based on the case handled.
def gen_array(%{"items": items}=map, _) do
cond do
is_map(items) or items == nil -> list_gen(map, items) # note: not passing map["enum"], as mentioned in previous comment
is_array(items) -> tuple_gen(map, items)
_ -> raise error # ?
end
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about naming things properly. It will avoid unnecessary ambiguity .
Following features, corresponding to json schema v4 have been implemented