updated extract_text_one method to wrap each page's text extraction in Timeout.timeout while looping through the pages of a PDF; pdf-reader sometimes hangs, so each page is allowed at most 1 second before skipping to the next page
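A minimal sketch of the per-page timeout pattern, not the actual extract_text_one code. The helper name `extract_text_with_page_timeout` and its `seconds:` keyword are hypothetical; the only assumption about the input is that each page responds to `#text`, as the page objects returned by pdf-reader's `PDF::Reader#pages` do.

```ruby
require 'timeout'

# Hypothetical helper (not from the commit): collects text page by page and
# gives up on any single page that takes longer than `seconds`.
# `pages` is any enumerable of objects responding to #text,
# for example PDF::Reader.new(path).pages.
def extract_text_with_page_timeout(pages, seconds: 1)
  pages.each_with_object([]) do |page, texts|
    begin
      # Timeout.timeout raises Timeout::Error if the block runs too long.
      texts << Timeout.timeout(seconds) { page.text }
    rescue Timeout::Error
      # This page hung; skip it and continue with the next page.
    end
  end
end
```

Putting the timeout around each page, instead of around the whole document, means one slow page costs at most `seconds` and the text of the remaining pages is still collected.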
added error handling to extract_from_metadata method for XML parsing of PDF metadata; PDF metadata sometimes contains XML that cannot be parsed
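A rough sketch of guarding the metadata parse, under stated assumptions: the commit does not say which XML parser extract_from_metadata uses, so REXML is swapped in here for illustration, and the helper name `parse_metadata_xml` is hypothetical.

```ruby
require 'rexml/document'

# Hypothetical helper (not from the commit): returns a parsed document,
# or nil when the metadata is missing or REXML rejects it as malformed.
# REXML is an assumption; the real method may use a different parser.
def parse_metadata_xml(raw_xml)
  return nil if raw_xml.nil? || raw_xml.strip.empty?

  REXML::Document.new(raw_xml)
rescue REXML::ParseException
  # Malformed metadata XML: signal "no usable metadata" instead of raising.
  nil
end
```

A caller can then treat `nil` as "no metadata available" and fall back to another source for the field it wanted.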