#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""The setup script."""
from setuptools import setup, find_packages
readme = """
# Yet Another Keyword Extractor (Yake)
Unsupervised Approach for Automatic Keyword Extraction using Text Features.
YAKE! is a lightweight unsupervised automatic keyword extraction method that relies on statistical text features extracted from single documents to select the most important keywords of a text. Our system does not need to be trained on a particular set of documents, nor does it depend on dictionaries, external corpora, text size, language, or domain. To demonstrate the merits and significance of our proposal, we compare it against ten state-of-the-art unsupervised approaches (TF.IDF, KP-Miner, RAKE, TextRank, SingleRank, ExpandRank, TopicRank, TopicalPageRank, PositionRank and MultipartiteRank), and one supervised method (KEA). Experimental results on twenty datasets show that our method significantly outperforms state-of-the-art methods across collections of different sizes, languages, and domains. In addition to the Python package described here, we also make available a <a href="http://yake.inesctec.pt" target="_blank">demo</a>, an <a href="http://yake.inesctec.pt/apidocs/#!/available_methods/post_yake_v2_extract_keywords" target="_blank">API</a> and a <a href="https://play.google.com/store/apps/details?id=com.yake.yake" target="_blank">mobile app</a>.
## Main Features
* Unsupervised approach
* Corpus-Independent
* Domain and Language Independent
* Single-Document
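
As a rough illustration of what "single-document" and "corpus-independent" mean in practice, the toy sketch below ranks words using only statistics taken from the text itself (term frequency and first position). It is not YAKE's feature set or scoring formula; `toy_keyword_scores` and `STOPWORDS` are hypothetical names invented for this example.

```python
from collections import Counter
import string

# Hypothetical, tiny stopword list for this sketch only (not YAKE's StopwordsList).
STOPWORDS = {"a", "an", "and", "the", "of", "on", "in", "is", "to", "from"}


def toy_keyword_scores(text):
    # Split on whitespace, strip surrounding punctuation, and lowercase.
    tokens = [w.strip(string.punctuation).lower() for w in text.split()]
    tokens = [w for w in tokens if w and w not in STOPWORDS]

    # Two statistics computed from this one document only.
    freq = Counter(tokens)
    first_pos = {}
    for i, tok in enumerate(tokens):
        first_pos.setdefault(tok, i)

    # In this sketch a lower score means a better candidate (frequent and early).
    scores = {w: (1 + first_pos[w]) / freq[w] for w in freq}
    return sorted(scores.items(), key=lambda kv: kv[1])


doc = "Keyword extraction selects keywords from a document. Keyword extraction needs no corpus."
ranked = toy_keyword_scores(doc)
print(ranked[:3])
```

Here `keyword` ranks first because it occurs twice and appears earliest; no external corpus or training data is involved.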
## Where can I find YAKE!?
YAKE! is available [online](http://yake.inesctec.pt), as an open source [Python package](https://github.com/LIAAD/yake), and on [Google Play](https://play.google.com/store/apps/details?id=com.yake.yake).
## References
Please cite the following works when using YAKE:
<b>In-depth journal paper at Information Sciences Journal</b>
Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C. and Jatowt, A. (2020). YAKE! Keyword Extraction from Single Documents using Multiple Local Features. In Information Sciences Journal. Elsevier, Vol 509, pp 257-289. [pdf](https://doi.org/10.1016/j.ins.2019.09.013)
<b>ECIR'18 Best Short Paper</b>
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). A Text Feature Based Automatic Keyword Extraction Method for Single Documents. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 684 - 691. [pdf](https://link.springer.com/chapter/10.1007/978-3-319-76941-7_63)
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). YAKE! Collection-independent Automatic Keyword Extractor. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 806 - 810. [pdf](https://link.springer.com/chapter/10.1007/978-3-319-76941-7_80)
## Awards
[ECIR'18](http://ecir2018.org) Best Short Paper
"""
requirements = [
    "tabulate",
    "click>=6.0",
    "numpy",
    "segtok",
    "networkx",
    "jellyfish",
]

setup_requirements = [
    "pytest-runner",
]

test_requirements = [
    "pytest",
    "flake8",
]

setup(
    name="yake",
    version="0.4.8",
    description="Keyword extraction Python package",
    long_description=readme,
    long_description_content_type="text/markdown",
    url="https://pypi.python.org/pypi/yake",
    project_urls={
        "Documentation": "https://liaad.github.io/yake/",
        "Source": "https://github.com/LIAAD/yake",
    },
    packages=find_packages(include=["yake", "StopwordsList"]),
    # Installs a `yake` command that calls `keywords` in yake/cli.py.
    entry_points={
        "console_scripts": [
            "yake=yake.cli:keywords",
        ]
    },
    # NOTE: `license` says LGPLv3 while the license classifier below declares
    # GPLv3; the two should agree with the project's LICENSE file.
    license="LGPLv3",
    include_package_data=True,
    install_requires=requirements,
    zip_safe=False,
    keywords="yake",
    classifiers=[
        "Development Status :: 3 - Alpha",
        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.5",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Topic :: Scientific/Engineering :: Information Analysis",
        "Topic :: Software Development :: Libraries",
        "Topic :: Text Processing",
        "Topic :: Text Processing :: Linguistic",
    ],
    test_suite="tests",
    tests_require=test_requirements,
    setup_requires=setup_requirements,
)