Commit

Update README.md
nabihanaqvie authored Jan 5, 2022
1 parent a46a3d4 commit e683e55
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion README.md
@@ -1 +1,10 @@
# SANA-Project
# SANA-Project

The SANA project's goal is to create an Islam-specific database for research purposes. My contribution to this goal is a model that predicts a record's category from its abstract and title.

To accomplish this, I have so far split the work into two variants, one that keeps punctuation and one that removes it, to compare how much noise each introduces.
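
The two variants could be produced along these lines. This is only a minimal sketch: the function name `preprocess` and its `strip_punctuation` flag are hypothetical and not taken from the project code, and `string.punctuation` covers ASCII punctuation only, so Arabic punctuation marks would need to be added separately.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text, strip_punctuation):
    """Lowercase and collapse whitespace; optionally drop ASCII punctuation.

    Hypothetical helper for illustration, not the project's actual code.
    """
    if strip_punctuation:
        text = text.translate(_PUNCT_TABLE)
    return " ".join(text.lower().split())


sample = "Ibn Sina's Canon: A Study (2nd ed.)"
with_punct = preprocess(sample, strip_punctuation=False)
without_punct = preprocess(sample, strip_punctuation=True)
```

Feeding both outputs to the same classifier would make the effect of punctuation on the predictions directly comparable.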

Next Steps:
1) Create a dictionary, or import a list, of Arabic names so that spelling variants can be grouped. Ex: Mohammed and Mohamad --> Mohammad
2) Try removing only some punctuation marks while keeping others
3) Write the machine learning classifiers as a pipeline
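
Steps 1 and 2 might look roughly like the sketch below. The `NAME_VARIANTS` map, `normalize_names`, and `remove_selected` are all hypothetical names invented for illustration; a real variant list would be far larger and would probably be imported rather than hand-written.

```python
import re

# Hypothetical variant-to-canonical map (keys lowercase); illustrative only.
NAME_VARIANTS = {
    "mohammed": "Mohammad",
    "mohamad": "Mohammad",
    "muhammad": "Mohammad",
}


def normalize_names(text, variants=NAME_VARIANTS):
    """Replace each whole-word name variant with its canonical spelling."""
    def swap(match):
        word = match.group(0)
        # Look the word up case-insensitively; leave unknown words untouched.
        return variants.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, text)


def remove_selected(text, chars):
    """Delete only the punctuation characters listed in `chars`."""
    return text.translate(str.maketrans("", "", chars))


grouped = normalize_names("Mohammed and Mohamad")
cleaned = remove_selected("Al-Ghazali, A. (ed.): Works", ",.:()")
```

Here `remove_selected` keeps the hyphen in "Al-Ghazali" because it is not in the list of characters to delete, which is the kind of selective removal step 2 describes.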
