I have a large list of wikidata id's or Q Numbers and I'd like to filter out purely these entities. Does this already exist/is this possible to implement?
Thank you!
It's not implemented but could be done fairly easily with grep (which will be much faster, see documentation on prefiltering):
# Create a file with one pattern per id, each matching the start of a dump line
echo "Q1
Q2
Q3" | awk '{print "^{\"type\":\"item\",\"id\":\"" $1 "\","}' > qid_filter
# Filter the dump with that shortlist of ids
cat latest-all.json.gz | gzip -d | grep -E -f qid_filter | sed 's/,$//' > selected_entities.ndjson
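To try this out before running it on the full dump, here is a minimal sketch that runs the same pipeline on a tiny made-up sample. The file names `sample-dump.json` and `ids.txt` and the sample entities are assumptions for illustration only. It also assumes GNU grep, which treats the leading `{` in the patterns as a literal character.

```shell
# Work in a throwaway directory so nothing real gets overwritten
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical miniature dump with the same layout as the real one:
# a JSON array with one entity per line, each followed by a comma
cat > sample-dump.json <<'EOF'
[
{"type":"item","id":"Q1","labels":{}},
{"type":"item","id":"Q10","labels":{}},
{"type":"item","id":"Q2","labels":{}},
{"type":"item","id":"Q3","labels":{}}
]
EOF
gzip sample-dump.json

# Hypothetical id list, one id per line
printf '%s\n' Q1 Q3 > ids.txt

# Turn each id into a pattern anchored at the start of a dump line.
# The closing quote and comma after the id keep Q1 from also matching Q10.
awk '{print "^{\"type\":\"item\",\"id\":\"" $1 "\","}' ids.txt > qid_filter

# Decompress, keep only the matching lines, and strip the trailing comma
gzip -dc sample-dump.json.gz | grep -E -f qid_filter | sed 's/,$//' > selected_entities.ndjson

cat selected_entities.ndjson
```

The resulting `selected_entities.ndjson` should contain the lines for Q1 and Q3 only, with Q10 left out even though its id starts with "Q1".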