update documentation
bab2min committed Aug 4, 2020
1 parent 98a5033 commit c850b42
Showing 5 changed files with 71 additions and 5 deletions.
18 changes: 17 additions & 1 deletion README.kr.rst
@@ -35,7 +35,7 @@ What is tomotopy?

Please visit https://bab2min.github.io/tomotopy/index.kr.html to see more information.

The most recent version of tomotopy is 0.8.2.
The most recent version of tomotopy is 0.9.0.

Getting Started
---------------
Expand Down Expand Up @@ -253,6 +253,22 @@ tomotopy의 Python3 예제 코드는 https://github.com/bab2min/tomotopy/blob/ma

History
-------
* 0.9.0 (2020-08-04)
    * The `tomotopy.LDAModel.summary()` method, which prints a human-readable summary of the model's state, has been added.
    * The random number generator has been replaced with `EigenRand`_, which speeds up random number generation and eliminates differences in results between platforms.
        * As a result, even with the same `seed`, model training results may differ from versions before 0.9.0.
    * Fixed a training error that occurred intermittently in `tomotopy.HDPModel`.
    * `tomotopy.DMRModel.alpha` now shows the prior parameter of the per-metadata topic distribution.
    * `tomotopy.DTModel.get_count_by_topics()` has been modified to return a 2-dimensional `ndarray`.
    * `tomotopy.DTModel.alpha` has been modified to return the same value as `tomotopy.DTModel.get_alpha()`.
    * Fixed an issue where the `metadata` value could not be obtained for a document of `tomotopy.GDMRModel`.
    * `tomotopy.HLDAModel.alpha` now shows the prior parameter of the per-document level distribution.
    * `tomotopy.LDAModel.global_step` has been added.
    * `tomotopy.MGLDAModel.get_count_by_topics()` now shows the word counts for both global and local topics.
    * `tomotopy.PAModel.alpha`, `tomotopy.PAModel.subalpha`, and `tomotopy.PAModel.get_count_by_super_topic()` have been added.

.. _EigenRand: https://github.com/bab2min/EigenRand

* 0.8.2 (2020-07-14)
    * New properties `tomotopy.DTModel.num_timepoints` and `tomotopy.DTModel.num_docs_by_timepoint` have been added.
    * Partially fixed an issue where results differed across platforms even when the `seed` was the same. As a result, training results of the 32-bit version differ from those of previous versions.
18 changes: 17 additions & 1 deletion README.rst
@@ -36,7 +36,7 @@ The current version of `tomoto` supports several major topic models including

Please visit https://bab2min.github.io/tomotopy to see more information.

The most recent version of tomotopy is 0.8.2.
The most recent version of tomotopy is 0.9.0.

Getting Started
---------------
Expand Down Expand Up @@ -259,6 +259,22 @@ meaning you can use it for any reasonable purpose and remain in complete ownersh

History
-------
* 0.9.0 (2020-08-04)
    * The `tomotopy.LDAModel.summary()` method, which prints a human-readable summary of the model, has been added.
    * The package's random number generator has been replaced with `EigenRand`_. This speeds up random number generation and eliminates differences in results between platforms.
        * As a result, even with the same `seed`, model training results may differ from versions before 0.9.0.
    * Fixed a training error in `tomotopy.HDPModel`.
    * `tomotopy.DMRModel.alpha` now shows the Dirichlet prior of the per-document topic distribution by metadata.
    * `tomotopy.DTModel.get_count_by_topics()` has been modified to return a 2-dimensional `ndarray`.
    * `tomotopy.DTModel.alpha` has been modified to return the same value as `tomotopy.DTModel.get_alpha()`.
    * Fixed an issue where the `metadata` value could not be obtained for a document of `tomotopy.GDMRModel`.
    * `tomotopy.HLDAModel.alpha` now shows the Dirichlet prior of the per-document depth distribution.
    * `tomotopy.LDAModel.global_step` has been added.
    * `tomotopy.MGLDAModel.get_count_by_topics()` now returns the word counts for both global and local topics.
    * `tomotopy.PAModel.alpha`, `tomotopy.PAModel.subalpha`, and `tomotopy.PAModel.get_count_by_super_topic()` have been added.

.. _EigenRand: https://github.com/bab2min/EigenRand

* 0.8.2 (2020-07-14)
    * New properties `tomotopy.DTModel.num_timepoints` and `tomotopy.DTModel.num_docs_by_timepoint` have been added.
    * Partially fixed a bug that caused different results across platforms even when the `seed` was the same.
18 changes: 17 additions & 1 deletion tomotopy/documentation.kr.rst
@@ -18,7 +18,7 @@ What is tomotopy?
* Correlated Topic Model (`tomotopy.CTModel`)
* Dynamic Topic Model (`tomotopy.DTModel`)

The most recent version of tomotopy is 0.8.2.
The most recent version of tomotopy is 0.9.0.

.. image:: https://badge.fury.io/py/tomotopy.svg

Expand Down Expand Up @@ -292,6 +292,22 @@ tomotopy의 Python3 예제 코드는 https://github.com/bab2min/tomotopy/blob/ma

History
-------
* 0.9.0 (2020-08-04)
    * The `tomotopy.LDAModel.summary()` method, which prints a human-readable summary of the model's state, has been added.
    * The random number generator has been replaced with `EigenRand`_, which speeds up random number generation and eliminates differences in results between platforms.
        * As a result, even with the same `seed`, model training results may differ from versions before 0.9.0.
    * Fixed a training error that occurred intermittently in `tomotopy.HDPModel`.
    * `tomotopy.DMRModel.alpha` now shows the prior parameter of the per-metadata topic distribution.
    * `tomotopy.DTModel.get_count_by_topics()` has been modified to return a 2-dimensional `ndarray`.
    * `tomotopy.DTModel.alpha` has been modified to return the same value as `tomotopy.DTModel.get_alpha()`.
    * Fixed an issue where the `metadata` value could not be obtained for a document of `tomotopy.GDMRModel`.
    * `tomotopy.HLDAModel.alpha` now shows the prior parameter of the per-document level distribution.
    * `tomotopy.LDAModel.global_step` has been added.
    * `tomotopy.MGLDAModel.get_count_by_topics()` now shows the word counts for both global and local topics.
    * `tomotopy.PAModel.alpha`, `tomotopy.PAModel.subalpha`, and `tomotopy.PAModel.get_count_by_super_topic()` have been added.

.. _EigenRand: https://github.com/bab2min/EigenRand

* 0.8.2 (2020-07-14)
    * New properties `tomotopy.DTModel.num_timepoints` and `tomotopy.DTModel.num_docs_by_timepoint` have been added.
    * Partially fixed an issue where results differed across platforms even when the `seed` was the same. As a result, training results of the 32-bit version differ from those of previous versions.
18 changes: 17 additions & 1 deletion tomotopy/documentation.rst
@@ -18,7 +18,7 @@ The current version of `tomoto` supports several major topic models including
* Correlated Topic Model (`tomotopy.CTModel`)
* Dynamic Topic Model (`tomotopy.DTModel`).

The most recent version of tomotopy is 0.8.2.
The most recent version of tomotopy is 0.9.0.

.. image:: https://badge.fury.io/py/tomotopy.svg

Expand Down Expand Up @@ -295,6 +295,22 @@ meaning you can use it for any reasonable purpose and remain in complete ownersh

History
-------
* 0.9.0 (2020-08-04)
    * The `tomotopy.LDAModel.summary()` method, which prints a human-readable summary of the model, has been added.
    * The package's random number generator has been replaced with `EigenRand`_. This speeds up random number generation and eliminates differences in results between platforms.
        * As a result, even with the same `seed`, model training results may differ from versions before 0.9.0.
    * Fixed a training error in `tomotopy.HDPModel`.
    * `tomotopy.DMRModel.alpha` now shows the Dirichlet prior of the per-document topic distribution by metadata.
    * `tomotopy.DTModel.get_count_by_topics()` has been modified to return a 2-dimensional `ndarray`.
    * `tomotopy.DTModel.alpha` has been modified to return the same value as `tomotopy.DTModel.get_alpha()`.
    * Fixed an issue where the `metadata` value could not be obtained for a document of `tomotopy.GDMRModel`.
    * `tomotopy.HLDAModel.alpha` now shows the Dirichlet prior of the per-document depth distribution.
    * `tomotopy.LDAModel.global_step` has been added.
    * `tomotopy.MGLDAModel.get_count_by_topics()` now returns the word counts for both global and local topics.
    * `tomotopy.PAModel.alpha`, `tomotopy.PAModel.subalpha`, and `tomotopy.PAModel.get_count_by_super_topic()` have been added.

.. _EigenRand: https://github.com/bab2min/EigenRand

* 0.8.2 (2020-07-14)
    * New properties `tomotopy.DTModel.num_timepoints` and `tomotopy.DTModel.num_docs_by_timepoint` have been added.
    * Partially fixed a bug that caused different results across platforms even when the `seed` was the same.
4 changes: 3 additions & 1 deletion tomotopy/utils.py
@@ -289,7 +289,9 @@ def __call__(self, raw:str, user_data=None):
----------
stemmer : Callable[str, str]
    A callable object used to stem words. If this value is `None`, stemming is not used.
pattern : str
    A regular expression pattern used to extract tokens.
An example of stemming with SimpleTokenizer and NLTK is shown below.
.. include:: ./auto_labeling_code_with_porter.rst"""
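
The docstring above describes two parameters: a `pattern` regular expression that extracts tokens and an optional `stemmer` callable. The snippet below is a minimal standalone sketch of that idea using only the standard library. It is not tomotopy's implementation: the function `regex_tokenize`, its default pattern, the lowercasing step, and the toy `strip_plural` stemmer are all hypothetical names and choices made for illustration.

```python
import re
from typing import Callable, Iterator, Optional


def regex_tokenize(raw: str,
                   pattern: str = r"[A-Za-z]+",
                   stemmer: Optional[Callable[[str], str]] = None) -> Iterator[str]:
    """Yield lowercased tokens matched by `pattern`, optionally stemmed.

    Hypothetical helper for illustration; not part of tomotopy.
    """
    token_re = re.compile(pattern)
    for match in token_re.finditer(raw):
        word = match.group().lower()
        # Apply the stemmer only when one was supplied (None disables stemming).
        yield stemmer(word) if stemmer is not None else word


def strip_plural(word: str) -> str:
    # Toy stand-in for a real stemmer such as NLTK's PorterStemmer().stem.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


text = "Topic models find topics in 2020 documents."
tokens = list(regex_tokenize(text))
stemmed = list(regex_tokenize(text, stemmer=strip_plural))
print(tokens)
print(stemmed)
```

With the assumed default pattern, digits and punctuation are dropped because they never match `[A-Za-z]+`. Passing a different `pattern` changes what counts as a token without touching the stemming step.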