spurious discourse relation set on CSV import #31

Open
cbogart opened this issue Jul 31, 2018 · 1 comment

cbogart commented Jul 31, 2018

After a CSV import, one record had a "parent" contribution even though the CSV file contained no parent column.

cbogart commented Aug 1, 2018

Figured out why: by default, CSV import assumes that each successive posting in the same forum is a reply to the previous one. (This is a good assumption in some cases, like the "crito" dataset, but a bad one in others.) In this case, the file was a batch of independent test answers imported with scores, and whenever two consecutive answers were classified the same way, they were treated as a posting and its reply.
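
To make the failure mode concrete, here is a minimal sketch of the kind of default described above. It is not the importer's actual code: the function name `infer_parents` and the column names `id` and `forum` are hypothetical, and it assumes "previous" means the immediately preceding row.

```python
from __future__ import annotations

from typing import Optional


def infer_parents(rows: list[dict]) -> list[Optional[str]]:
    """Return, for each row, the id of its inferred parent (or None).

    Hypothetical illustration of the default: a row is treated as a
    reply to the row just before it when both are in the same forum.
    """
    parents: list[Optional[str]] = []
    previous: Optional[dict] = None
    for row in rows:
        if previous is not None and previous["forum"] == row["forum"]:
            parents.append(previous["id"])  # spurious "reply" relation
        else:
            parents.append(None)
        previous = row
    return parents


# Three independent answers; the first two happen to share a forum.
rows = [
    {"id": "a1", "forum": "correct"},
    {"id": "a2", "forum": "correct"},
    {"id": "a3", "forum": "incorrect"},
]
print(infer_parents(rows))  # [None, 'a1', None]
```

Under these assumptions, `a2` picks up `a1` as a parent even though the two answers are unrelated, which matches the symptom reported in this issue.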

Workaround: add a blank 'replyto' column to any CSV you import where you do not want the default reply structure to be inferred.
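
One way to apply the workaround before importing is to append the empty column with a small script. This is only a sketch using Python's standard `csv` module; the helper name `add_blank_replyto` and the sample columns `id`, `forum`, and `text` are assumptions for illustration. Only the 'replyto' column name comes from the workaround above.

```python
import csv
import io


def add_blank_replyto(csv_text: str, column: str = "replyto") -> str:
    """Return csv_text with an empty `column` appended if it is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        fieldnames.append(column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row.setdefault(column, "")  # leave existing values untouched
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data: two independent answers in the same forum.
sample = "id,forum,text\na1,correct,foo\na2,correct,bar\n"
print(add_blank_replyto(sample))
```

For a file on disk, read its contents, pass them through the helper, and write the result to a new file that is then used for the import.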

cbogart added the wontfix label Aug 1, 2018