Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add column name/utility accessors for Legolas.Schema #52

Closed
wants to merge 10 commits into from

Conversation

OTDE
Copy link
Member

@OTDE OTDE commented Aug 9, 2022

This PR addresses #41. It adds two functions:

    schema_field_names(::Type{<:Legolas.Schema})

Get a tuple with the names of the fields of this `Legolas.Schema`, including names that
have been inherited from this `Legolas.Schema`'s parent schema.
    schema_field_types(::Legolas.Schema{name,version})

Get a tuple with the types of the fields of this `Legolas.Schema`, including types of fields that
have been inherited from this `Legolas.Schema`'s parent schema.

Tests have been added to make sure these functions behave like we expect.

Parent = @row("parent@1",
              first_parent_field::Int=1,
              second_parent_field::String="second")

# returns (:first_parent_field, :second_parent_field)
Legolas.schema_field_names(Legolas.Schema{:parent,1}) 
Legolas.schema_field_names(Legolas.Schema("parent@1"))
Legolas.schema_field_names(Parent())
Legolas.schema_field_names(Parent)

# returns (Int, String)
Legolas.schema_field_types(Schema{:parent,1})
Legolas.schema_field_types(Schema("parent@1"))
Legolas.schema_field_types(Parent())
Legolas.schema_field_types(Parent)

This includes tests for schema inheritance as well:

Child = @row("child@1" > "parent@1",
              first_child_field::Symbol=:first,
              second_child_field="I can be anything")

# returns (:first_child_field, :second_child_field, :first_parent_field, :second_parent_field)
Legolas.schema_field_names(Child) 

# returns (Symbol, Any, Int, String)
Legolas.schema_field_types(Child)

@OTDE OTDE added the enhancement New feature or request label Aug 9, 2022
@OTDE OTDE requested a review from jrevels August 9, 2022 23:07
@OTDE OTDE self-assigned this Aug 9, 2022
src/rows.jl Outdated Show resolved Hide resolved
src/rows.jl Outdated Show resolved Hide resolved
test/runtests.jl Outdated Show resolved Hide resolved
src/rows.jl Outdated
Get a tuple with the names of the fields of this `Legolas.Schema`, including names that
have been inherited from this `Legolas.Schema`'s parent schema.
"""
schema_field_names(::Type{S}) where {S<:Schema} = throw(UnknownSchemaError(S()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think the MethodError you'd get from removing this fallback method is more correct than an UnknownSchemaError; it could be that a schema exists (and thus is "known") but a method for schema_field_names simply wasn't defined for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be that a schema exists (and thus is "known") but a method for schema_field_names simply wasn't defined for it.

Hmmm how so?

I maybe skimmed too fast but using UnknownSchemaError here seems to be consistent with the other usages of it (and is useful here for the same reason it's useful in those other contexts)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm how so?

Very hypothetical case of defining a schema without using @row.

If it's consistent and useful then that's good enough for me.

Comment on lines +108 to +126
print(
io,
"""
encountered unknown `Legolas.Schema` type: $(e.schema)

This generally indicates that this schema has not been defined (i.e.
the schema's corresponding `@row` statement has not been executed) in
the current Julia session.

In practice, this can arise if you try to read a Legolas table with a
prescribed schema, but haven't actually loaded the schema definition
(or commonly, haven't loaded the dependency that contains the schema
definition - check the versions of loaded packages/modules to confirm
your environment is as expected).

Note that if you're in this particular situation, you can still load
the raw table as-is without Legolas; e.g., to load an Arrow table, call `Arrow.Table(path)`.
"""
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix style?

Comment on lines +233 to +251
"""
schema_field_names(::Type{<:Legolas.Schema})

Get a tuple with the names of the fields of this `Legolas.Schema`, including names that
have been inherited from this `Legolas.Schema`'s parent schema.
"""
schema_field_names(s::Schema) = schema_field_names(typeof(s))
schema_field_names(::Row{S}) where {S} = schema_field_names(S)
schema_field_names(::Type{<:Row{S}}) where {S} = schema_field_names(S)

"""
schema_field_types(::Legolas.Schema{name,version})

Get a tuple with the types of the fields of this `Legolas.Schema`, including types of fields that
have been inherited from this `Legolas.Schema`'s parent schema.
"""
schema_field_types(s::Schema) = schema_field_types(typeof(s))
schema_field_types(::Row{S}) where {S} = schema_field_types(S)
schema_field_types(::Type{<:Row{S}}) where {S} = schema_field_types(S)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of schema_field_names / schema_field_types, should we define Tables.Schema(::Legolas.Schema) or something similar?

https://tables.juliadata.org/stable/#Tables.Schema is the natural object to use for this purpose in Tables.jl ecosystem IIUC

note that I think Tables.schema(::Vector{<:Legolas.Row})might already "just work" on main (the lowercase schema is a generic accessor). Would be worth checking.

Also we should keep in mind that Legolas Schema fields are explicitly unordered - might be best to call that out in the function docs as well

Copy link
Member

@jrevels jrevels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely in favor!

see above comments for change suggestions

also before merging we should bump the package version as part of the PR

@jrevels
Copy link
Member

jrevels commented Oct 3, 2022

superseded by #54

@jrevels jrevels closed this Oct 3, 2022
@ararslan ararslan deleted the sc/column-name-utility-functions branch October 3, 2022 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants