NLP tool used to extract the name, email, and phone number from an input document
Project Description
This project extracts the name, email address, and phone number from an input file. It uses the Apache OpenNLP library
to perform NER (Named Entity Recognition) for names, and a regular expression for email addresses written in accordance with the RFC and SMTP standards.
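To illustrate the regex side of the extraction, here is a minimal sketch using only the Java standard library. The class name `ContactFieldSketch`, both patterns, and the phone-number handling are assumptions made for illustration and are not taken from the project source. The email pattern is a deliberately simplified approximation and does not cover the full RFC address grammar. The phone logic assumes that separators are stripped to leave digits only, which matches the format of the sample output.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContactFieldSketch {
    // Simplified address pattern (assumption): narrower than the full RFC grammar.
    private static final Pattern EMAIL = Pattern.compile(
            "[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*\\.[A-Za-z]{2,}");

    // A run of digits and common separators, e.g. "(410) 555-1234" (assumption).
    private static final Pattern PHONE =
            Pattern.compile("\\+?[0-9(][0-9 ().-]{8,}[0-9]");

    // Returns the first substring that looks like an email address, if any.
    public static Optional<String> findEmail(String text) {
        Matcher m = EMAIL.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Returns the first phone-like run with at least 10 digits, separators removed.
    public static Optional<String> findPhoneDigits(String text) {
        Matcher m = PHONE.matcher(text);
        while (m.find()) {
            String digits = m.group().replaceAll("[^0-9]", "");
            if (digits.length() >= 10) {
                return Optional.of(digits);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical card text, not one of the bundled test files.
        String card = "Jane Roe\nSoftware Engineer\nPhone: (410) 555-1234\njane.roe@example.com";
        System.out.println("Email: " + findEmail(card).orElse("not found"));
        System.out.println("Number: " + findPhoneDigits(card).orElse("not found"));
    }
}
```

Name extraction is not shown here because it depends on the OpenNLP model files (`en-token.bin` and `en-ner-person.bin`) shipped in the repository.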
- To run the program, first clone the repository and navigate to the location of the clone using the commands below.
- Then execute the runnable JAR file with the java -jar command shown below.
- Follow the instructions presented by the user interface. Example input and output are shown below.
git clone https://github.com/Arjun-Vijay/Business-Card-Parser.git
cd /pathToClone
java -jar BusinessCardParser.jar
Welcome to the Business Card Parser
Please Enter Either 'Test X' to run tests 1-3, 'Self Test' to run a new test, or 'Exit' to quit
Enter: Test 1
Name: John Doe
Email: [email protected]
Number: 4105551234
Note: the commands must be entered exactly as presented by the user interface. Any trailing whitespace will result in invalid input.
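One plausible reason for this strictness is that the input is compared against the accepted commands as exact strings, without trimming. The sketch below shows that behavior. It is an assumption about how such a check could work, not the project's actual code, and the class name `CommandCheckSketch` is hypothetical.

```java
import java.util.Set;

public class CommandCheckSketch {
    // Commands as listed in the prompt; compared exactly, with no trimming.
    private static final Set<String> COMMANDS =
            Set.of("Test 1", "Test 2", "Test 3", "Self Test", "Exit");

    // True only when the input matches a command character for character.
    public static boolean isValidCommand(String input) {
        return COMMANDS.contains(input);
    }

    public static void main(String[] args) {
        System.out.println(isValidCommand("Test 1"));   // true
        System.out.println(isValidCommand("Test 1 "));  // false: trailing space
        System.out.println(isValidCommand("test 1"));   // false: different case
    }
}
```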
- To run a new test, first add the test file to the root directory of the cloned repository. You can confirm that the file exists by listing the directory contents.
- You can then run the Self Test command during execution, as shown below.
Arjuns-MBP:Business-Card-Parser arjunvijay$ ls
BusinessCardParser.jar en-ner-person.bin selfTest.txt test1.txt test3.txt
bin en-token.bin src test2.txt
Enter: Self Test
Enter the name of your .txt file: selfTest.txt
Name: Bill Gates
Email: [email protected]
Number: 1112223333
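If a self test fails to find the file, it can help to check the file from Java directly. The following is a minimal sketch, assuming the file is resolved against the directory the program is started from. The class name `SelfTestFileSketch` and the method `readCard` are hypothetical and are not part of the project.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class SelfTestFileSketch {
    // Returns the lines of a .txt file in dir, or throws if the name or location is wrong.
    public static List<String> readCard(Path dir, String fileName) throws IOException {
        if (!fileName.endsWith(".txt")) {
            throw new IllegalArgumentException("Expected a .txt file: " + fileName);
        }
        Path file = dir.resolve(fileName);
        if (!Files.isRegularFile(file)) {
            throw new IOException("File not found in " + dir.toAbsolutePath() + ": " + fileName);
        }
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "." is the current working directory, assumed here to be the repository root.
        for (String line : readCard(Paths.get("."), "selfTest.txt")) {
            System.out.println(line);
        }
    }
}
```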