SANA-Project

The SANA project's goal is to create an Islamic-specific database for research purposes. My contribution to this goal is a model that predicts a record's category from its abstract and title.

To accomplish this, I have so far split the work into two variants, one that removes punctuation and one that leaves it in place, to compare how much noise each approach introduces.
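The two preprocessing variants could look roughly like the sketch below. This is a minimal illustration and not the code from the notebooks: the function names, the `keep` argument, and the sample title and abstract are all hypothetical, and it only handles ASCII punctuation from Python's `string.punctuation`.

```python
import string


def strip_punctuation(text, keep=""):
    """Remove ASCII punctuation from text, except any characters listed in keep."""
    to_remove = "".join(ch for ch in string.punctuation if ch not in keep)
    return text.translate(str.maketrans("", "", to_remove))


def preprocess(title, abstract, remove_punct=True, keep=""):
    """Join title and abstract into one lowercased string (assumed input shape)."""
    text = f"{title} {abstract}".lower()
    if remove_punct:
        text = strip_punctuation(text, keep=keep)
    # Collapse any runs of whitespace left behind by the removal.
    return " ".join(text.split())


# Hypothetical record, for illustration only.
title = "Al-Ghazali's Ethics: A Re-reading"
abstract = "This paper re-examines al-Ghazali's moral theory."

with_punct = preprocess(title, abstract, remove_punct=False)
without_punct = preprocess(title, abstract, remove_punct=True)
hyphens_kept = preprocess(title, abstract, remove_punct=True, keep="-")

print(with_punct)
print(without_punct)
print(hyphens_kept)
```

Running both variants over the same records makes the trade-off visible: removing every mark merges tokens such as "al-ghazali's" into "alghazalis", while keeping hyphens preserves the name prefix.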

Next Steps:

Create a dictionary, or import a list, of Arabic names so that spelling variants can be grouped. Ex: Mohammed and Mohamad --> Mohammad
Experiment with removing only some punctuation marks rather than all of them
Write the machine learning classifiers as a pipeline
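The name-grouping and pipeline steps above might be combined as in the following sketch. It assumes scikit-learn is available and swaps in TF-IDF features with logistic regression purely as placeholders, since the README does not say which classifiers are planned. `NAME_VARIANTS`, `normalize_names`, `build_pipeline`, and the toy texts and labels are hypothetical.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical variant -> canonical spelling map; a real list would be much longer.
NAME_VARIANTS = {
    "mohammed": "mohammad",
    "mohamad": "mohammad",
}


def normalize_names(text):
    """Lowercase the text and replace known name variants with one spelling."""
    return re.sub(
        r"\w+",
        lambda m: NAME_VARIANTS.get(m.group(0), m.group(0)),
        text.lower(),
    )


def build_pipeline():
    """Chain name normalization, TF-IDF features, and a classifier."""
    return Pipeline([
        # The preprocessor runs on each document before tokenization.
        ("tfidf", TfidfVectorizer(preprocessor=normalize_names)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Toy data, invented for illustration: title and abstract joined into one string.
texts = [
    "Mohammed on Islamic law and jurisprudence",
    "Mohamad and the history of the caliphate",
    "Legal rulings in classical jurisprudence",
    "A history of early dynasties",
]
labels = ["law", "history", "law", "history"]

model = build_pipeline()
model.fit(texts, labels)
prediction = model.predict(["Mohammad on jurisprudence"])
print(prediction[0])
```

Keeping the normalization inside the pipeline means the same spelling map is applied at both training and prediction time, so the two cannot drift apart.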

Latest commit: nabihanaqvie, Update README.md, e683e55, Jan 5, 2022 (10 commits)

Name	Last commit message	Last commit date
README.md	Update README.md	Jan 5, 2022
SANA-P.ipynb	Punct Removal	Dec 9, 2021
SANA-P2.ipynb	no punct removal	Dec 9, 2021
SANA.ipynb	Updated	Dec 7, 2021
