Does it use the default mode (accurate mode)? How can I switch it to full mode myself?
Email received~
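One more question: I tested segmentation on the sentence “我喜欢看电视,不喜欢看电影”. In the default mode it is split into these tokens: 我, 喜欢, 看电视, ,, 不, 喜欢, 看, 电影. But when I extract keywords with TF-IDF using topK=None (that is, without limiting the number of keywords), the result does not include single-character words such as “我” and “不”. Why is that? How can I make the TF-IDF keyword results include single characters as well?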
When extracting keywords, TFIDF filters out any word whose length is less than 2.
    # jieba/analyse/tfidf.py
    class TFIDF(KeywordExtractor):
        ...
        def extract_tags(self, sentence, topK=20, withWeight=False, allowPOS=(), withFlag=False):
            ...
            # Skip words shorter than 2 characters, and stop words
            if len(wc.strip()) < 2 or wc.lower() in self.stop_words:
                continue
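Since the length check in the snippet above looks hard-coded, one way to keep single characters is to do the counting yourself on already segmented words. The following is a minimal sketch of that idea, not jieba's implementation: the function name `extract_tags_min_len`, the `min_len` parameter, the empty IDF table, and the default IDF of 1.0 are all assumptions made for illustration. It has no dependency on jieba and takes the token list from the question as input.

```python
from collections import Counter


def extract_tags_min_len(words, idf, default_idf, stop_words=frozenset(),
                         min_len=2, top_k=None):
    """Rank pre-segmented words by TF-IDF (hypothetical helper, not jieba's API).

    Words shorter than `min_len` characters or present in `stop_words`
    are skipped, mirroring the filter quoted above when min_len=2.
    """
    freq = Counter()
    for w in words:
        if len(w.strip()) < min_len or w.lower() in stop_words:
            continue
        freq[w] += 1

    total = sum(freq.values())
    if total == 0:
        return []

    # Term frequency times IDF; words missing from the table get default_idf.
    scores = {w: (n / total) * idf.get(w, default_idf) for w, n in freq.items()}
    # Sort by descending score, breaking ties by the word itself.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked if top_k is None else ranked[:top_k]


# Tokens from the question (default-mode segmentation of the test sentence).
words = ["我", "喜欢", "看电视", ",", "不", "喜欢", "看", "电影"]

# min_len=2 reproduces the reported behaviour: single characters are dropped.
default_tags = extract_tags_min_len(words, idf={}, default_idf=1.0)

# min_len=1 keeps single characters; punctuation then has to be excluded
# through the stop-word set, because the length check no longer removes it.
all_tags = extract_tags_min_len(words, idf={}, default_idf=1.0,
                                stop_words={","}, min_len=1)
```

The `words` list can come from any segmentation, so the same helper would also accept the output of full mode, for example `list(jieba.cut(sentence, cut_all=True))`. A real IDF table would have to be supplied separately for the scores to be meaningful.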