28 Jul 17:27

Alvant

8c5371c

v0.9.0 Latest

New

Datasets are now also available on HuggingFace: https://huggingface.co/TopicNet.

Fix

Dataset downloading via the dataset manager (that is, the datasets site URL is fixed; issue, pr).
Thetaless regularizer behavior for some "extreme modality cases" (issue, pr, related pr).
Dataset class internals for newer pandas, also "pandas + numpy" collab in some scores (issue1, issue2, pr).
Update some demo notebooks (issue1, pr1; issue2, pr2).
Fix some ARTM-master-related-stuff, freeze "ARTM-compatible" protobuf version for easier installation (related issue, pr1, pr2).
Freeze all other package versions in order to facilitate the work of archaeologists in the future (pr, and minor pr-s: pr1, pr2, pr3, pr4).
"Lick" Readme: add more links, fix some "formatting irregularities" (issue1, issue2, pr1, pr2).
Add long project description for proper setup (pr), which is now displayed on TopicNet's PyPI page: https://pypi.org/project/topicnet.

Change

Dummy model now stores all score values (not only the last one; pr).
A frozen score can now be added to a topic model (this allows loading a model after incomplete score saving; pr).
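
The two changes above can be pictured with a small standalone sketch. It does not use TopicNet's classes: `FrozenScoreSketch` and `DummyModelSketch` are hypothetical names, and the numbers are made up for illustration. The idea shown is only that a model stand-in keeps the full score history, and that a frozen score replays stored values without being recomputable.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FrozenScoreSketch:
    """Read-only record of score values computed earlier (hypothetical class)."""
    name: str
    value: List[float] = field(default_factory=list)

    def update(self, new_value: float) -> None:
        # A frozen score only holds stored history; it cannot be recomputed.
        raise RuntimeError(f"score '{self.name}' is frozen and cannot be updated")


@dataclass
class DummyModelSketch:
    """Lightweight stand-in for a trained model (hypothetical class)."""
    scores: Dict[str, FrozenScoreSketch] = field(default_factory=dict)

    def add_score(self, score: FrozenScoreSketch) -> None:
        self.scores[score.name] = score

    def last_values(self) -> Dict[str, float]:
        # The last value is still available, but it is no longer the only one kept.
        return {name: s.value[-1] for name, s in self.scores.items() if s.value}


model = DummyModelSketch()
# Illustrative (made-up) score history, one value per training iteration.
model.add_score(FrozenScoreSketch("perplexity", [950.0, 720.5, 688.1]))
history = model.scores["perplexity"].value
```

Here `history` holds every recorded value, while `last_values()` gives the view that only the final value would have provided.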

06 Jul 23:51

Alvant

v0.8.0

cab4c5a

v0.8.0

Fixed

The check of whether a score is out of control in controller_cube.py
Separate/multithread mode for cubes in config_parser.py
TopicNet is finally Mac-installable using Pip 🎉

Changed

From now on the library is compatible with Python 3.7 or higher (not Python 3.6)
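
For scripts that depend on the library, the version requirement can be checked up front. This is a minimal sketch using only the standard library; the helper name `is_supported` is an assumption made for illustration and is not part of TopicNet.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in these release notes


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not is_supported():
    print("This Python is older than 3.7; TopicNet v0.8.0 or later will not work here.")
```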

New

Ability to define a score controller for any custom score
Experiment's restore_mode: if something happens during the computation, one can resume the process and proceed from the last completed cube
Dataset's dictionary can be altered using recipes
Thetaless regularizer is now more user-friendly: only the dataset is required as input, not the n_dw matrix (the notebook is also updated)
Score's should_compute: scores may be computed not necessarily on every iteration
Score's precomputed_data: scores may share some data between each other (e.g. one score calculates something and makes the result available for other scores)
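
The last two items can be illustrated with a toy sketch. Only the names should_compute and precomputed_data come from the notes above; the class `ScoreSketch`, the function signatures, and the training loop are assumptions made for illustration and do not reproduce TopicNet's actual API.

```python
from typing import Callable, Dict, List, Optional

sum_calls = 0  # counts how often the expensive step really runs


def shared_total(data: List[float], precomputed_data: Dict[str, float]) -> float:
    """Compute the sum once per iteration and publish it for other scores."""
    global sum_calls
    if "total" not in precomputed_data:
        sum_calls += 1
        precomputed_data["total"] = sum(data)
    return precomputed_data["total"]


class ScoreSketch:
    """Toy score (hypothetical class, not TopicNet's base score)."""

    def __init__(self, name: str,
                 compute: Callable[[List[float], Dict[str, float]], float],
                 should_compute: Optional[Callable[[int], bool]] = None):
        self.name = name
        self._compute = compute
        # Default behaviour: compute on every iteration.
        self._should_compute = should_compute or (lambda iteration: True)
        self.value: List[float] = []

    def update(self, iteration: int, data: List[float],
               precomputed_data: Dict[str, float]) -> None:
        if self._should_compute(iteration):
            self.value.append(self._compute(data, precomputed_data))


def mean_score(data, precomputed_data):
    return shared_total(data, precomputed_data) / len(data)


def total_score(data, precomputed_data):
    return shared_total(data, precomputed_data)


data = [1.0, 2.0, 3.0, 4.0]
scores = [
    ScoreSketch("mean", mean_score),
    # Computed only on even iterations.
    ScoreSketch("total", total_score, should_compute=lambda i: i % 2 == 0),
]
for iteration in range(4):
    precomputed_data: Dict[str, float] = {}  # fresh shared storage per iteration
    for score in scores:
        score.update(iteration, data, precomputed_data)
```

In this sketch the second score is evaluated on half of the iterations, and the sum is computed once per iteration even when both scores need it.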

19 May 12:03

TatyanaGreenkina

v0.7.1

f7c2b55

v0.7.1

Reworked top_tokens_viewer and top_documents_viewer
Added WNTM recipe
Added dataset_cooc
Reworked dataset
Sped up get_possible_modalities
Reworked write_vw
Added new regularizer thetaless and demo of its usage and benefits

15 Apr 12:44

Evgeny-Egorov-Projects

v0.7.0

468f8a8

Datasets, Recipes, Viewers Update and Rework

Various changes, as can be seen in the release commit description

10 Mar 09:02

Alvant

v0.6.1

5dadfaa

v0.6.1

Fixed

The recipes package topicnet.cooking_machine.recipes is now included in the distribution on PyPI: setup.py updated, project rebuilt and uploaded.

22 Feb 13:25

Alvant

v0.6.0

988b70d

v0.6.0

New

Added demo notebooks:
- one: some comparison of TopicNet with Gensim library
- two: example of analysis of the 20 Newsgroups dataset, more examples of how one can conduct topic modeling with the help of TopicNet and ARTM
- three: more about 20 Newsgroups dataset analysis before actual topic modeling

Fixed

Improved top tokens html display by TopTokensViewer
Fixed TopicNet installation via pip: now all the necessary packages should be installed automatically.
So the command pip install topicnet should work just fine... for Linux :)

Future Plans

Add new regularizers
Add more ways to control TopicModel's training process with the model.fit() function
Make the library installable with pip also for Windows and Mac without BigARTM preinstalled

27 Dec 10:30

Evgeny-Egorov-Projects

v0.5.0

d8e3be2

Functionality update

Enable end-to-end pip install topicnet (without BigARTM preinstalled) for Linux users.
Added recipe import to the library
Support of custom regularizers
Experimental feature that allows changing regularization coefficients during model training
Documentation updated
Various code improvements