GitHub - joliciel-informatique/talismane: NLP framework: sentence detector, tokeniser, pos-tagger and dependency parser

Talismane is a natural language processing framework with sentence detector, tokeniser, pos-tagger and dependency syntax parser. Current available language packs include French (standard and Universal Dependencies) and English.

Sample input:

Les amoureux qui se bécotent sur les bancs publics ont des petites gueules bien sympathiques.

Sample output: a syntax tree, shown below in CoNLL-X format, also available as a Java object for manipulation in code.

1	Les	les	DET	DET	n=p|	2	det	2	det
2	amoureux	amoureux	NC	NC	g=m|	10	suj	10	suj
3	qui	qui	PROREL	PROREL	n=s|	5	suj	5	suj
4	se	se	CLR	CLR	n=p|p=3|	5	aff	5	aff
5	bécotent	bécoter	V	V	n=p|t=PS|p=3|	2	mod_rel	2	mod_rel
6	sur	sur	P	P		5	mod	5	mod
7	les	les	DET	DET	n=p|	8	det	8	det
8	bancs	banc	NC	NC	n=p|g=m|	6	prep	6	prep
9	publics	public	ADJ	ADJ	n=p|g=m|	8	mod	8	mod
10	ont	avoir	V	V	n=p|t=P|p=3|	0	root	0	root
11	des	des	DET	DET	n=p|	13	det	13	det
12	petites	petit	ADJ	ADJ	n=p|g=f|	13	mod	13	mod
13	gueules	gueule	NC	NC	n=p|	10	obj	10	obj
14	bien	bien	ADV	ADV		15	mod	15	mod
15	sympathiques	sympathique	ADJ	ADJ	n=p|	13	mod	13	mod
16	.	.	PONCT	PONCT		15	ponct	15	ponct

Downloads: The latest release and language packs can be downloaded on the releases pages.

Wiki: Simple instructions for use can be found on the Talismane wiki.

Command-line usage: follow the setup instructions, and then run a command similar to the following:

java -Xmx1G -Dconfig.file=talismane-fr-X.X.X.conf -jar talismane-core-X.X.X.jar --analyse --sessionId=fr --encoding=UTF8 --inFile=data/frTest.txt --outFile=data/frTest.tal

Calling from Java: For syntax analysis within Java code via the API, see this Java code example.

JavaDoc API: You may also consult the full JavaDoc API online.

User's manual: An out-of-date users's manual can be found on the GitHub Talismane project page. For up-to-date documentation, you're far better off consulting the wiki or the JavaDoc API .

Additional information on the project can be found on the CLLE-ERSS laboratory Talismane project home page.

Language pack usage

The French language pack can be used for research purposes provided that you have a license for the French Treebank. The model included is not optimised as it uses a Maximum Entropy model (which only requires about 1G of RAM) rather than a Linear SVM model (which requires about 24G RAM). If you would like the more optimised Linear SVM model, please contact Assaf Urieli.
The English language pack can be used for research purposes provided that you have a license for the Penn Treebank. WARNING: the English model is only an initial version, with no attempts at optimisation.

Name		Name	Last commit message	Last commit date
Latest commit History 547 Commits
examples		examples
talismane_core		talismane_core
talismane_distribution		talismane_distribution
talismane_examples		talismane_examples
talismane_extensions		talismane_extensions
talismane_fr		talismane_fr
talismane_machine_learning		talismane_machine_learning
talismane_utils		talismane_utils
.gitignore		.gitignore
.travis.yml		.travis.yml
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE.txt		LICENSE.txt
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases 57

Packages

Contributors 5

Languages

License

joliciel-informatique/talismane

Folders and files

Latest commit

History

Repository files navigation

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 57

Packages 0

Contributors 5

Languages

Packages