Variables View: For data frame columns, consider showing str representation rather than individual values #968

Open
jjallaire opened this issue Aug 7, 2023 · 3 comments
Labels
area: variables Issues related to Variables category. enhancement New feature or request

Comments

@jjallaire
Contributor

Consider the bulldozer blue book data frame (details on downloading and reading it are here: #966)

The environment pane currently shows up to 100 individual values when you expand this column:

Screen Shot 2023-08-07 at 8 09 07 AM

However, there is a print method for the column that IMO provides a much more useful summary (R data frames also support this behavior):

Screen Shot 2023-08-07 at 8 06 43 AM

I think that in the Environment pane we should have the notion of data types that terminate in a textual summary (rather than just recursing into long displays of individual scalars).
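To make the "terminate in a textual summary" idea concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: `TERMINAL_SUMMARIZERS`, `summarize_sequence`, and `expand` are illustrative names and not part of any existing pane API. The 100-item threshold mirrors the current limit mentioned above.

```python
from collections import Counter

# Hypothetical registry: types whose expansion ends in a text summary
# instead of a list of child rows.
TERMINAL_SUMMARIZERS = {}


def summarize_sequence(values):
    """Summarize a sequence by length and most common values.

    Assumes the values are hashable (true for scalars such as bools).
    """
    counts = Counter(values).most_common(3)
    top = ", ".join(f"{value!r} x{n}" for value, n in counts)
    return f"length {len(values)}; most common: {top}"


TERMINAL_SUMMARIZERS[list] = summarize_sequence
TERMINAL_SUMMARIZERS[tuple] = summarize_sequence


def expand(value, max_children=100):
    """Return child rows for small containers, or a one-line summary."""
    summarizer = TERMINAL_SUMMARIZERS.get(type(value))
    if summarizer is not None and len(value) > max_children:
        # Terminal type that is too long to list: stop recursing here.
        return summarizer(value)
    if isinstance(value, (list, tuple)):
        return [repr(item) for item in value]
    return repr(value)
```

Under these assumptions a short list still expands into individual rows, while a long one collapses into a single summary string.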

@jjallaire jjallaire added this to the Internal Preview milestone Aug 7, 2023
@jjallaire
Contributor Author

Here's a related example for a variable that is an array of > 400,000 booleans. We show the first 100 values (all True):

Screen Shot 2023-08-07 at 8 16 35 AM

Whereas the describe() method shows a more useful summary:

Screen Shot 2023-08-07 at 8 18 09 AM
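For readers without the screenshots, the contrast can be reproduced roughly as follows. This is a sketch that assumes the variable is (or can be wrapped in) a pandas Series; the name `flags` and the exact length are made up for illustration.

```python
import pandas as pd

# Illustrative stand-in for the variable in the screenshots: all True.
flags = pd.Series([True] * 400_001)

# Roughly what the pane lists today: the first 100 individual values.
first_values = flags.head(100).tolist()

# For a boolean Series, describe() reports count, unique, top and freq.
summary = flags.describe()

print(summary)
```

The four-row summary says everything the 100 rows of `True` do, plus the total count.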

I am hopeful that numpy, pandas, torch, etc. all have ways of inspecting data at a higher level that we can leverage in the environment pane (note that b/c these are all drill downs they don't need to be computed eagerly).
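On the "don't need to be computed eagerly" point, one possible shape is a node that computes its summary only when it is first expanded and caches the result. This is a sketch only: `VariableNode` and its fields are hypothetical names, and `summary_calls` exists purely to make the laziness observable.

```python
from functools import cached_property


class VariableNode:
    """Hypothetical pane entry whose summary is computed on first expand."""

    def __init__(self, name, value, summarize):
        self.name = name
        self._value = value
        self._summarize = summarize
        self.summary_calls = 0  # only here to demonstrate laziness

    @cached_property
    def summary(self):
        # Runs once, on first access; later accesses reuse the cached text.
        self.summary_calls += 1
        return self._summarize(self._value)


node = VariableNode(
    "flags",
    [True] * 1000,
    lambda values: f"{len(values)} values, {sum(values)} True",
)
```

Nothing is computed when `node` is created; the summarizer runs the first time `node.summary` is read, which would correspond to the user drilling down.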

@jjallaire
Contributor Author

Another example is our treatment of Matrices. Here is the Environment pane view (nested lists):

Screen Shot 2023-08-07 at 9 06 25 AM

Whereas here is the print() view:

Screen Shot 2023-08-07 at 9 07 17 AM
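A small illustration of the same contrast, assuming the matrix is a NumPy array (the screenshots may show a different type, and the tiny 2x3 array here is invented for the example):

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)

# Nested-list shape, similar to what the pane expands into.
nested = matrix.tolist()

# The aligned text that print() would show.
printed = str(matrix)

print(printed)
```

Even at this size the printed form keeps the row and column alignment that the nested lists lose.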

@petetronic petetronic changed the title for data frame columns, consider showing str representation rather than individual values in environment pane Variables View: For data frame columns, consider showing str representation rather than individual values Nov 23, 2023
@wesm wesm added area: variables Issues related to Variables category. enhancement New feature or request labels Feb 29, 2024
@wesm
Contributor

wesm commented Dec 6, 2024

Aside: now that we have the data explorer code to power summary statistics, fulfilling summary requests for the variables pane should be reasonably straightforward once we have settled on what the new UI treatment will look like.


3 participants