Earnings Calls Extraction
This repository contains several attempts at processing earnings call transcripts and writing the desired text to a txt file. This requires removing summaries, annotations, headers, and footers. The scripts also check for duplicate files.
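As an illustration of the duplicate check, here is a minimal sketch that groups PDF files by a hash of their contents. It is not necessarily how the scripts in this repository detect duplicates: the function names `file_digest` and `find_duplicates` are hypothetical, and it assumes that "duplicate" means byte-identical files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(folder: Path) -> list[list[Path]]:
    """Group the PDFs in `folder` that have identical contents.

    Assumption: two files count as duplicates only if they are
    byte-identical. Returns one list per group of two or more files.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(folder.glob("*.pdf")):
        groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Hashing the raw bytes will miss transcripts that were saved twice with different metadata; comparing the extracted text instead would catch those cases, at the cost of parsing every file first.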
I also wrote a script to retrieve file information such as date, quarter, sector, and company name. A further script retrieves the company ticker from a PDF document and crawls EPS surprise data from Yahoo Finance, then writes the combined results to an Excel file.
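To give a rough idea of how one piece of file information could be pulled out of the transcript text, the sketch below extracts the quarter and year with a regular expression. The wording it looks for (for example "Q3 2019") is an assumption about the transcript format, and `extract_quarter` is a hypothetical helper, not a function from this repository.

```python
import re
from typing import Optional, Tuple

# Assumption: the transcript mentions the period as "Q<1-4> <4-digit year>".
QUARTER_PATTERN = re.compile(r"\bQ([1-4])\s+(\d{4})\b")


def extract_quarter(text: str) -> Optional[Tuple[int, int]]:
    """Return (quarter, year) from the first match in `text`, or None."""
    match = QUARTER_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For example, `extract_quarter("Acme Corp Q3 2019 Earnings Call")` would give `(3, 2019)`, where "Acme Corp" is a made-up company name. Other fields such as sector or company name would need their own patterns, matched to the actual layout of the transcripts.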
These scripts are all written for the specific format of these earnings call transcripts. To process a folder of transcripts, add a Pdfs folder containing all the files to the working directory.
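The folder convention can be sketched as follows. This is only an illustration of the expected layout: `list_transcripts` is a hypothetical helper, and the assumption is that every transcript is a file with a `.pdf` extension placed directly inside the Pdfs folder.

```python
from pathlib import Path


def list_transcripts(working_dir: Path = Path(".")) -> list[Path]:
    """Return the PDF files found directly inside `<working_dir>/Pdfs`.

    Raises FileNotFoundError if the Pdfs folder is missing, so the
    problem is reported before any processing starts.
    """
    folder = working_dir / "Pdfs"
    if not folder.is_dir():
        raise FileNotFoundError(
            f"Expected a folder named 'Pdfs' in {working_dir.resolve()}"
        )
    # Match the extension case-insensitively (.pdf and .PDF alike).
    return sorted(p for p in folder.iterdir() if p.suffix.lower() == ".pdf")
```

Calling `list_transcripts()` from the working directory would then yield the paths to hand to the extraction scripts.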