
Full DB Dump #29

Open
arnaudsm opened this issue Apr 28, 2024 · 5 comments

Comments


arnaudsm commented Apr 28, 2024

Is there a full data dump available somewhere?

I'm doing research and data visualization (and I suspect many here do too), which requires all the data at once.
Scraping the API is cumbersome and also consumes precious CPU time on this service.

A giant CSV, JSON, or SQL file updated once a month would be awesome.
Wikipedia and Stack Overflow provide a similar service, and their dumps are quite popular.


scipima commented Apr 28, 2024

Hi there, this is far from a full data dump, but I have started pulling data for the Plenary here: https://github.com/scipima/ep_vote_collect.git.
The README explains how to get either the data for the daily Plenary or for the full mandate.
Hope this helps,
Marco

@tfrancart

Datasets can be downloaded from the EP Open Data Portal: https://data.europarl.europa.eu/en/datasets

@arnaudsm
Author

Thank you for the suggestion, but the dataset portal only contains a fraction of the API data, and the 236 files have to be downloaded manually.

A full dump would be greatly appreciated.

In the meantime I'm working on a JS library to dump the API, similar to @scipima's work, and might open-source it at some point.

@tfrancart

> Thank you for the suggestion, but the dataset portal only contains a fraction of the API data,

Can you be more specific about this? What is in the API data that is not in the datasets? I can understand that the datasets are not as fresh as the API data, but other than that, I would expect the RDF content to be identical to what the API returns.

> and the 236 files have to be downloaded manually.

If one can scrape thousands of API calls, one can also scrape 236 file downloads :-) (in reality, 236 × 28 languages). This could be an alternative way to recreate a full DB dump (though, as I said, probably not as fresh) without stressing the API.
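Such a bulk download is easy to script. A minimal Python sketch, assuming a hypothetical URL template and placeholder dataset slugs (the portal's real download URLs and dataset identifiers would have to be looked up on the datasets page):

```python
import time
import urllib.request
from itertools import product

# All identifiers below are placeholders for illustration only; the real
# dataset slugs and download-URL scheme must come from the Open Data Portal.
LANGS = ["en", "fr", "de"]                # in reality, the 28 language codes
DATASETS = ["plenary-votes", "meetings"]  # placeholder dataset slugs

def dataset_urls(datasets, langs,
                 template="https://data.europarl.europa.eu/download/{ds}-{lang}.json"):
    """Build the dataset x language cross-product of download URLs."""
    return [template.format(ds=ds, lang=lang) for ds, lang in product(datasets, langs)]

def polite_download(urls, delay_s=1.0, fetch=None):
    """Fetch each URL sequentially, sleeping between requests to stay polite."""
    fetch = fetch or (lambda u: urllib.request.urlopen(u).read())
    results = {}
    for url in urls:
        results[url] = fetch(url)
        time.sleep(delay_s)
    return results
```

The `fetch` parameter is injectable so the loop can be tested without network access; swapping in a real HTTP client is a one-line change.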


arnaudsm commented May 3, 2024

@tfrancart I was thinking of /meetings/{event-id}/vote-results. Is there a way to retrieve it from the datasets page?

Thank you for your help, I am still new to this ecosystem. I have rate-limited my dump scripts for now.
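For reference, a rate-limited loop over that endpoint could look like the following minimal Python sketch. The base URL and the example event IDs are assumptions and should be checked against the official API documentation; only the /meetings/{event-id}/vote-results path comes from the discussion above.

```python
import time
import urllib.request

BASE = "https://data.europarl.europa.eu/api"  # assumed base URL; verify against the API docs

def fetch_vote_results(event_ids, min_interval_s=2.0, fetch=None):
    """Fetch /meetings/{event-id}/vote-results for each event, issuing at most
    one request per min_interval_s (a simple fixed-interval rate limit)."""
    fetch = fetch or (lambda url: urllib.request.urlopen(url).read())
    results = {}
    last = float("-inf")  # so the first request is never delayed
    for event_id in event_ids:
        wait = min_interval_s - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        results[event_id] = fetch(f"{BASE}/meetings/{event_id}/vote-results")
    return results
```

A fixed interval between requests is the simplest way to avoid stressing the service; a token-bucket limiter would allow short bursts if that ever matters.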
