NLP tool used to extract the name, email, and phone number from an example business card. Built using Apache OpenNLP.

Arjun-Vijay/Business-Card-Parser

Business-Card-Parser

NLP tool used to extract the name, email, and phone number from an input document

Project Description
This project extracts the name, email, and phone number from an input file. It uses the Apache OpenNLP library to perform NER (Named Entity Recognition) and an email regular expression written in accordance with the RFC and SMTP standards.
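As an illustration of the regex-based email step, here is a minimal sketch. The pattern is a simplified assumption and not the expression used in this repository; it does not cover the full RFC address grammar. The class name `EmailExtractorSketch`, the method `findEmail`, and the sample card text are all hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractorSketch {
    // Simplified pattern (an assumption, not the project's actual expression):
    // a local part of letters, digits, and common punctuation, an '@',
    // then dot-separated domain labels ending in an alphabetic TLD.
    private static final Pattern EMAIL = Pattern.compile(
        "[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)*\\.[A-Za-z]{2,}");

    /** Returns the first email-like substring in the text, if any. */
    public static Optional<String> findEmail(String text) {
        Matcher m = EMAIL.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical business card text used only for this example.
        String card = "Jane Roe\nSoftware Engineer\njane.roe@example.com\nTel: (410) 555-1234";
        System.out.println("Email: " + findEmail(card).orElse("not found"));
    }
}
```

Returning an `Optional` keeps the "no email on the card" case explicit instead of relying on `null`.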

Execute Pre-Built Tests

  1. To run the program, first clone the repository and navigate to the clone's location using the commands below:

       git clone https://github.com/Arjun-Vijay/Business-Card-Parser.git
       cd /pathToClone

  2. Then execute the runnable JAR file with the command below:

       java -jar BusinessCardParser.jar

  3. Follow the instructions presented by the user interface. Example input/output:

       Welcome to the Business Card Parser
       Please Enter Either 'Test X' to run tests 1-3, 'Self Test' to run a new test, or 'Exit' to quit
       Enter: Test 1


       Name: John Doe
       Email: [email protected]
       Number: 4105551234

    Note: commands must be entered exactly as presented by the user interface. Any trailing whitespace will result in invalid input.
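The example output above prints the phone number as bare digits (`4105551234`). The following is a minimal sketch of how a formatted number could be located and reduced to digits. It is an assumption about one possible approach, not the repository's implementation; the class name `PhoneExtractorSketch`, the method `findPhone`, the pattern, and the 10-digit minimum are all hypothetical choices.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneExtractorSketch {
    // Loose pattern (an assumption): optional '+', optional '(', a digit,
    // then digits, spaces, dots, hyphens, or parentheses, ending in a digit.
    private static final Pattern PHONE =
        Pattern.compile("\\+?\\(?\\d[\\d ().-]{8,}\\d");

    /** Returns the first candidate with at least 10 digits, stripped to digits only. */
    public static Optional<String> findPhone(String text) {
        Matcher m = PHONE.matcher(text);
        while (m.find()) {
            // Remove every non-digit character from the matched span.
            String digits = m.group().replaceAll("\\D", "");
            if (digits.length() >= 10) {
                return Optional.of(digits);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println("Number: " + findPhone("Tel: (410) 555-1234").orElse("not found"));
    }
}
```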

Execute New Tests

  1. To run a new test, first add the test file to the root directory of the cloned repository. You can confirm that the file exists by listing the directory contents:

         Arjuns-MBP:Business-Card-Parser arjunvijay$ ls
         BusinessCardParser.jar	en-ner-person.bin	selfTest.txt		test1.txt		test3.txt
         bin			        en-token.bin		src			test2.txt

  2. Then enter the appropriate command during execution, as shown below:

         Enter: Self Test
         Enter the name of your .txt file: selfTest.txt

         Name: Bill Gates
         Email: [email protected]
         Number: 1112223333

Language Dependencies

The NameFinder uses a model trained on English. Therefore, when performing NER, it identifies names commonly seen in the English language.
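For reference, a minimal sketch of person-name detection with Apache OpenNLP is shown here. It assumes the OpenNLP library is on the classpath and that the `en-token.bin` and `en-ner-person.bin` model files from the repository root are in the working directory. The class name `NameFinderSketch`, the method `findNames`, and the sample sentence are hypothetical, and the repository's own code may be organized differently.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;
import opennlp.tools.util.Span;

public class NameFinderSketch {
    /** Tokenizes the text and returns every span the person model labels as a name. */
    public static List<String> findNames(String text, String tokenModelPath,
                                         String nerModelPath) throws IOException {
        try (InputStream tokenIn = new FileInputStream(tokenModelPath);
             InputStream nerIn = new FileInputStream(nerModelPath)) {
            // Load the English tokenizer and person-name models.
            TokenizerME tokenizer = new TokenizerME(new TokenizerModel(tokenIn));
            NameFinderME finder = new NameFinderME(new TokenNameFinderModel(nerIn));

            String[] tokens = tokenizer.tokenize(text);
            Span[] spans = finder.find(tokens);

            // Convert token spans back into the matched name strings.
            return Arrays.asList(Span.spansToStrings(spans, tokens));
        }
    }

    public static void main(String[] args) throws IOException {
        // Paths assume the program is run from the repository root.
        List<String> names = findNames("John Doe is a software engineer.",
                                       "en-token.bin", "en-ner-person.bin");
        System.out.println("Names: " + names);
    }
}
```

Because the person model was trained on English text, names that are uncommon in English may not be detected, so callers should be prepared for an empty list.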
