Abstract writing encoding error #2
Comments
This is on Windows 10, so encoding can often be a problem.
Will see if this works:
# write the abstract to a file
Also, thanks for doing this. Now I just have to do the text extraction piece.
@bluetyson I wanted to make sure you were aware of EarthArXiv's pending move to the California Digital Library (CDL). We will be leaving the Center for Open Science at the end of August. There are many benefits to moving to CDL; the downside is that there will be a new API and this code will no longer work. CDL hasn't released the specs for the new API yet. If you're using the current API for text extraction or analysis, please have everything you need from it by Friday, August 21. EarthArXiv will likely be offline for a few weeks after that. I'll update this repository with a new version as soon as we know more about the new API.
Thanks, I did read that. Yes, I have downloaded everything I need for my test now. That answers the question of whether a pull request is useful, too.
Downloading preprint 1 of 1582
Downloading preprint 2 of 1582
Downloading preprint 3 of 1582
UnicodeEncodeError Traceback (most recent call last)
in
74 # write the abstract to a file
75 abstF = open(localAbstract, 'w')
---> 76 abstF.write( preprint.description )
77 abstF.close()
78
~\miniconda3\envs\avant2\lib\encodings\cp1252.py in encode(self, input, final)
17 class IncrementalEncoder(codecs.IncrementalEncoder):
18 def encode(self, input, final=False):
---> 19 return codecs.charmap_encode(input,self.errors,encoding_table)[0]
20
21 class IncrementalDecoder(codecs.IncrementalDecoder):
UnicodeEncodeError: 'charmap' codec can't encode character '\u1e9f' in position 541: character maps to <undefined>
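The traceback shows `open(localAbstract, 'w')` falling back to the cp1252 codec, which has no mapping for '\u1e9f'. A minimal sketch of one common workaround is to pass an explicit `encoding` when opening the file. The helper name `write_abstract`, the sample text, and the temporary path below are hypothetical and only for illustration; they are not taken from the repository.

```python
import os
import tempfile


def write_abstract(path, text):
    """Write text to path as UTF-8 instead of the platform default encoding."""
    # An explicit encoding avoids the locale codec (cp1252 in the traceback above).
    with open(path, "w", encoding="utf-8") as abst_f:
        abst_f.write(text)


# Hypothetical abstract text containing the character named in the traceback.
sample = "An abstract containing \u1e9f."

# Check whether cp1252 can represent the text at all.
try:
    sample.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Write to a temporary location and read the text back as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "abstract.txt")
write_abstract(path, sample)

with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

Anything that later reads these files would need to open them with `encoding="utf-8"` as well, since the Windows default would otherwise decode them incorrectly.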