The problem comes from an untested situation, where you use both a --claim filter and --simplify:
the claim filter did not take ranks into account, so it let deprecated statements through,
while the simplify function, by default, keeps only the truthy statements.
The latter behavior can be disabled by passing a keepNonTruthy=true flag to the simplify function, but we could also consider having the filter check the simplify option to decide whether a match that rests only on a non-truthy statement should be dropped.
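To make that interaction concrete, here is a minimal Python sketch of the two steps. It is a toy model, not the wikibase-dump-filter or wikibase-sdk code: the function names, the entity layout, and the item `Q0` are invented for illustration, and it assumes that a property whose statements are all non-truthy simplifies to an empty list.

```python
# Toy model of the filter/simplify interaction. All names are hypothetical;
# this is not the actual wikibase-dump-filter implementation.

def has_claim(entity, prop):
    """Rank-agnostic filter: any statement for `prop` counts as a match."""
    return bool(entity["claims"].get(prop))

def truthy(statements):
    """Keep preferred-rank statements if any exist, otherwise normal-rank ones."""
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]

def simplify(entity):
    """Reduce each property to the values of its truthy statements."""
    return {
        "id": entity["id"],
        "claims": {
            prop: [s["value"] for s in truthy(statements)]
            for prop, statements in entity["claims"].items()
        },
    }

# Made-up item whose only P698 statement is deprecated.
entity = {
    "id": "Q0",
    "claims": {"P698": [{"rank": "deprecated", "value": "12345"}]},
}

passes_filter = has_claim(entity, "P698")
simplified = simplify(entity)
# Same shape as the jq projection [.id, .claims.P698, .claims.P921]
row = [simplified["id"], simplified["claims"].get("P698"), simplified["claims"].get("P921")]
print(passes_filter, row)
```

In this model the item passes the filter because a P698 statement exists, but the simplified output has no P698 value left. The same mechanism would turn a P921 property with only deprecated statements into `[]`, whereas an item with no P921 statements at all yields `null` in jq.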
Applying the command
bzcat latest-all.json.bz2 | wikibase-dump-filter --simplify --claim 'P698' | jq '[.id,.claims.P698,.claims.P921]' -c > PMID.ndjson
results in >30M lines like this:
where the first case is an item with a P698 claim but without P921 claims, and the second has both P698 and P921 claims. However, out of these 30M there are at least six (6) that are different:
ralf@ark:~/wikidata> grep '[]' PMID.ndjson
Note that 3 don't have P698 (which should not happen given the filter), and 3 have [] instead of null for no P921.
I'm not claiming there is a bug in wikibase-dump-filter, just that this needs investigating, and the ticket is a start. But maybe you have seen this and have an immediate explanation?
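For anyone who wants to reproduce the check without relying on a grep pattern, the following Python sketch scans rows of the form `[id, P698, P921]` and reports the two kinds of outliers described above. The function name `find_anomalies` and the sample rows are made up for illustration; the sample assumes simplified claim values appear as arrays of strings, which may not match the real output exactly.

```python
import json

def find_anomalies(lines):
    """Yield (id, reason) for rows [id, P698, P921] that look unusual."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item_id, pmid, subjects = json.loads(line)
        if not pmid:  # null or [] although the dump was filtered on P698
            yield item_id, "no P698 value"
        if subjects == []:  # empty list where null would be expected
            yield item_id, "P921 is [] instead of null"

# Made-up sample rows, not taken from the real dump.
sample = [
    '["Q1",["111"],null]',
    '["Q2",[],null]',
    '["Q3",["333"],[]]',
]
for item_id, reason in find_anomalies(sample):
    print(item_id, reason)
```

To run it against the real file, pass an open file handle instead of the sample list, for example `find_anomalies(open("PMID.ndjson"))`; the file is read line by line, so memory use stays flat even with more than 30M rows.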