I would like to clarify a few points about the Newsela data setup:
Am I right that the originally released data in 2015 (1,130 articles) was used in the paper? (That is, the text file in “newsela_data_share-20150302” in the Newsela release)
Following the description in Section 5, I used the first 1,070 articles for training, the next 30 for development, and the next 30 for testing, and then filtered out sentence pairs corresponding to alignment levels 0-1, 1-2, and 2-3. This gave me numbers of sentence pairs that differ from the paper: 94,944 training, 2,531 development, and 2,462 test sentence pairs. How can I arrive at the 94,208 training, 1,129 development, and 1,076 test sentence pairs reported in the paper?
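For concreteness, the split-and-filter procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout `(doc_id, src_level, tgt_level, src_sent, tgt_sent)`, the function name `split_and_filter`, and the choice to order articles by first appearance are all hypothetical, and none of this is confirmed to match the paper's actual preprocessing.

```python
def split_and_filter(records, n_train=1070, n_dev=30, n_test=30):
    """Split aligned sentence pairs by article, then drop adjacent-level pairs.

    records: iterable of (doc_id, src_level, tgt_level, src_sent, tgt_sent)
    tuples (an assumed layout, not necessarily the release format).
    """
    records = list(records)

    # Order articles by first appearance (assumption about what "first" means).
    doc_order = list(dict.fromkeys(r[0] for r in records))

    dev_end = n_train + n_dev
    test_end = dev_end + n_test
    split_docs = {
        "train": set(doc_order[:n_train]),
        "dev": set(doc_order[n_train:dev_end]),
        "test": set(doc_order[dev_end:test_end]),
    }

    # Alignment levels to discard, as described in the text: 0-1, 1-2, 2-3.
    dropped_levels = {(0, 1), (1, 2), (2, 3)}

    result = {name: [] for name in split_docs}
    for rec in records:
        doc_id, src_level, tgt_level = rec[0], rec[1], rec[2]
        if (src_level, tgt_level) in dropped_levels:
            continue
        for name, docs in split_docs.items():
            if doc_id in docs:
                result[name].append(rec)
                break
    return result
```

Calling `split_and_filter(records)` with the default arguments and taking `len()` of each returned list gives per-split pair counts that can be compared against the figures above.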
Thank you.
Regards,
Christian