As a user of intake-solr I would like to access datasets/queries that are larger than memory.
I believe the intake way to solve this is by creating a Solr driver that has partitioned access and that implements the to_dask() method. Is that correct?
I'm afraid not - to_dask will produce a dask dataframe with a single partition containing all the data, which is the default behaviour in the absence of any dask-specific code. The pysolr package, which executes the query, has no way to split the output into partitions in a way that would be useful. I might be out of date, though - if you know pysolr better, or if there is a more recent executor, I'd be happy to point you towards implementing it for intake.
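For what it's worth, one possible direction (not something pysolr does today) would be to page through results with Solr's cursorMark deep paging and treat each page as one partition. The sketch below simulates that idea in plain Python, with `fetch_page` standing in for a real Solr query call; all names here are illustrative, not part of pysolr or intake.

```python
# Sketch: partitioned reads over a cursor-style paging API.
# `fetch_page` is a stand-in for a Solr cursorMark query; here it is
# simulated against an in-memory list so the example is self-contained.

def fetch_page(docs, cursor, page_size):
    """Return one page of results plus the cursor for the next call."""
    page = docs[cursor:cursor + page_size]
    next_cursor = cursor + len(page)
    return page, next_cursor

def read_partitions(docs, page_size):
    """Yield pages lazily; each page could back one dask partition."""
    cursor = 0
    while True:
        page, cursor = fetch_page(docs, cursor, page_size)
        if not page:  # empty page means the cursor is exhausted
            break
        yield page

docs = [{"id": i} for i in range(10)]
partitions = list(read_partitions(docs, page_size=4))
# 10 docs with page_size=4 -> partitions of sizes 4, 4, 2
```

In a real driver, each call to `read_partitions` would map to a driver's partition-fetch method, and the pages could be wrapped in delayed calls to build a multi-partition dask dataframe - but that depends on the Solr endpoint supporting cursor-based paging for the query in question.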