Full DB Dump #29
Comments
Hi there. This is far from a full data dump, but I have started pulling the Plenary-related data here: https://github.com/scipima/ep_vote_collect.git
Datasets can be downloaded from the EP Open Data Portal: https://data.europarl.europa.eu/en/datasets
Thank you for the suggestion, but the dataset portal only contains a fraction of the API data, and the 236 files have to be downloaded manually. A full dump would be greatly appreciated. In the meantime, I'm working on a JS library to dump the API, similar to @scipima's work, and might open-source it at some point.
Can you be more specific about this? What is in the API data that is not in the datasets? I understand that the datasets are not as fresh as the API data, but other than that, I would expect the RDF content to be identical to what the API returns.
If one can script thousands of API calls, one can script 236 file downloads :-) (in reality, 236 * 28 languages = 6,608 files). This could be an alternative way to recreate a full DB dump (but, as I said, probably not as fresh) without stressing the API.
@tfrancart I was thinking of Thank you for your help, I am still new to this ecosystem. I have rate-limited my dump scripts for now.
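To illustrate the rate-limited approach mentioned above, here is a minimal Python sketch, not the script anyone in this thread actually uses. It fetches a list of URLs one at a time and leaves at least a fixed interval between request starts. The names `throttled_fetch_all` and `http_get`, the one-second default interval, and the example URLs are all assumptions made for illustration; the real dataset file URLs would have to be taken from the portal.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable


def http_get(url: str) -> bytes:
    """Fetch one URL with the standard library and return the raw body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def throttled_fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_get,
    min_interval: float = 1.0,  # assumed polite delay, in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, bytes]:
    """Fetch URLs sequentially, starting requests at least min_interval apart."""
    results: Dict[str, bytes] = {}
    last_start = None
    for url in urls:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last request.
            remaining = min_interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results[url] = fetch(url)
    return results


# Hypothetical usage (placeholder URLs, not real dataset locations):
# files = throttled_fetch_all(
#     ["https://example.org/dataset-1.ttl", "https://example.org/dataset-2.ttl"],
#     min_interval=2.0,
# )
```

The `fetch`, `sleep`, and `clock` parameters are injectable only so the throttling logic can be exercised without touching the network; in normal use the defaults are enough.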
Is there a full data dump available somewhere?
I'm doing research and dataviz (and I suspect many here do too), which requires all the data at once.
Scraping the API is cumbersome and also uses precious CPU time on this service.
Having a giant CSV, JSON, or SQL file updated once a month would be awesome.
Wikipedia and Stack Overflow provide similar dumps, which are quite popular.