Unable to load csv-file while CSV does not have a problem #44
Comments
With CSVFiles
If I truncate the file to the first 26 lines, CSVFiles reads it without a problem. Below the first 27 lines (that causes the problem) as an example of the data:
This was a very interesting bug! It uncovered a whole bunch of problems that come up in the diagnostic display when parsing fails. I think I fixed all of them in queryverse/TextParse.jl#114. With that PR, things still don't work, but one gets a slightly more helpful error message:
What is happening here is that the type detection algorithm classifies column 12 (and I believe 14 as well) as

Two options to solve this for now: 1) you can manually specify that these columns should be parsed as

I do have a plan to make this more robust in general, i.e. a way to recover if the type detection fails (which can always happen, even if one samples more lines), but it will be a while until that is done.

And this file also highlights that our default table printing code messes up the column widths when there is some serious Unicode in there :)
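The failure mode described above (a column type guessed from the first rows, then a later row that does not fit) can be illustrated with a small, library-independent sketch. This is not TextParse's actual algorithm: `detect_type`, `parse_column`, the `sample_rows` and `override` parameters, and the example column are all hypothetical names and data made up for illustration, and the sketch is in Python rather than Julia to keep it free of any assumed package API.

```python
from typing import Callable, List, Optional, Sequence


def detect_type(sample: Sequence[str]) -> Callable[[str], object]:
    """Guess a parser from a sample: int if every sampled value parses as int, else str."""
    try:
        for value in sample:
            int(value)
    except ValueError:
        return str
    return int


def parse_column(
    values: Sequence[str],
    sample_rows: int,
    override: Optional[Callable[[str], object]] = None,
) -> List[object]:
    """Parse a column with the detected parser, or with a manually specified one."""
    # Hypothetical behavior: only the first `sample_rows` values are inspected.
    parser = override if override is not None else detect_type(values[:sample_rows])
    return [parser(v) for v in values]


# Made-up column: 26 numeric rows followed by one non-numeric row.
column = [str(i) for i in range(26)] + ["n/a"]
```

With this made-up column, `parse_column(column[:26], sample_rows=20)` succeeds, while `parse_column(column, sample_rows=20)` raises a `ValueError` on the 27th value because the sample only ever saw integers. Passing `override=str` mirrors the first workaround in the comment above: the column type is stated up front instead of being guessed.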
Hi @johannspies 👋 Coincidentally, I ran into the same issue with the same data set 😅 Since we were looking at the same data set, I thought I'd let you know I just created these two repos: based off this data set 😊 A bit of explanation and request for feedback can be found here JuliaFinance/Roadmap#5 Cheers 😊
After spending a little time overcoming mixed content types in CSV columns, I found the solution in [Issue queryverse#44](queryverse#44 (comment)). I thought I'd add an example showing this, as it's probably quite a common problem with large, web-scraped datasets.