Releases: machine-intelligence-laboratory/TopicNet
v0.9.0
New
- Datasets are now also available on HuggingFace: https://huggingface.co/TopicNet.
Fix
- Datasets downloading via dataset manager (that is, fix datasets site URL address; issue, pr).
- Thetaless regularizer behavior for some "extreme modality cases" (issue, pr, related pr).
- Dataset class internals for newer pandas, also "pandas + numpy" collab in some scores (issue1, issue2, pr).
- Update some demo notebooks (issue1, pr1; issue2, pr2).
- Fix some ARTM-master-related-stuff, freeze "ARTM-compatible" protobuf version for easier installation (related issue, pr1, pr2).
- Freeze all other package versions in order to facilitate the work of archaeologists in the future (pr, and minor pr-s: pr1, pr2, pr3, pr4).
- "Lick" Readme: add more links, fix some "formatting irregularities" (issue1, issue2, pr1, pr2).
- Add long project description for proper setup (pr), which is now displayed on TopicNet's PyPI page: https://pypi.org/project/topicnet.
Change
- Dummy model now stores all score values (not only the last one; pr).
- Frozen score can now be added to topic model (this allows loading a model after incomplete score saving; pr).
v0.8.0
Fixed
- The check for whether the score is out of control in controller_cube.py
- Separate/multithread mode for cubes in config_parser.py
- TopicNet is finally Mac-installable using Pip 🎉
Changed
- From now on the library is compatible with Python 3.7 or higher (not Python 3.6)
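A project that depends on TopicNet could guard against unsupported interpreters with a check like the one below. This is only a minimal sketch: the helper name is_supported_python is hypothetical and is not part of TopicNet.

```python
import sys

# Minimum interpreter version stated in the release note above.
MIN_PYTHON = (3, 7)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is at least MIN_PYTHON.

    Hypothetical helper for illustration; not a TopicNet function.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= MIN_PYTHON
```

Calling is_supported_python() with no arguments checks the running interpreter; passing a tuple such as (3, 6, 9) checks an arbitrary version.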
New
- Ability to define a score controller for any custom score
- Experiment's restore_mode: if something happens during the computation, one can resume the process and proceed from last completed cube
- Dataset's dictionary can be altered using recipes
- Thetaless regularizer is now more user-friendly: only dataset is required as input, not n_dw matrix (the notebook is also updated)
- Score's should_compute: scores may be computed not necessarily on every iteration
- Score's precomputed_data: scores may share some data with each other (e.g. one score calculates something and makes the result available for other scores)
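The should_compute and precomputed_data items above can be pictured with a toy training loop. The sketch below is not TopicNet's API: the ToyScore class, its compute_every parameter, and the layout of the shared dictionary are all assumptions made for illustration. It only shows the general idea of skipping score computation on some iterations and of one score reusing a value that another score has already cached.

```python
class ToyScore:
    """Hypothetical score used for illustration; NOT a TopicNet class."""

    def __init__(self, name, compute_every=1):
        self.name = name
        self.compute_every = compute_every  # assumed parameter
        self.values = []

    def should_compute(self, iteration):
        # Compute only on every `compute_every`-th iteration (0-based).
        return iteration % self.compute_every == 0

    def call(self, iteration, precomputed_data):
        # The "expensive" part is cached under a shared key,
        # so any other score can reuse it instead of recomputing.
        key = ("square", iteration)
        if key not in precomputed_data:
            precomputed_data[key] = iteration ** 2
        value = precomputed_data[key]
        self.values.append(value)
        return value


def run(scores, num_iterations):
    """Toy loop: one dictionary is shared between all scores."""
    precomputed_data = {}
    for iteration in range(num_iterations):
        for score in scores:
            if score.should_compute(iteration):
                score.call(iteration, precomputed_data)
    return precomputed_data


every = ToyScore("every")
sparse = ToyScore("sparse", compute_every=3)
cache = run([every, sparse], num_iterations=6)
```

Here the second score is evaluated only on iterations 0 and 3, and on those iterations it reads the value the first score has already stored in the shared dictionary.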
v0.7.1
- Reworked top_tokens_viewer and top_documents_viewer
- Added WNTM recipe
- Added dataset_cooc
- Reworked dataset
- Sped up get_possible_modalities
- Reworked write_vw
- Added new regularizer thetaless and demo of its usage and benefits
Datasets, Recipes, Viewers Update and Rework
Various changes, as can be seen in release commit description
v0.6.1
v0.6.0
New
- Added demo notebooks:
Fixed
- Improved top tokens html display by TopTokensViewer
- Fixed TopicNet installation via pip: now all the necessary packages should be installed automatically. So the command pip install topicnet should work just fine... for Linux :)
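After running the command, one way to confirm from Python that the package can be found is a lookup with the standard library's importlib. The sketch below is an illustration only; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located.

    Hypothetical helper; it does not import the package itself.
    """
    return importlib.util.find_spec(package_name) is not None


print("topicnet installed:", is_installed("topicnet"))
```

Because find_spec only locates the package, this does not verify that BigARTM or other dependencies load correctly; an actual import is still the definitive test.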
Future Plans
- Add new regularizers
- Add more abilities to control TopicModel's training process with the model.fit() function
- Make the library installable with pip also for Windows and Mac without BigARTM preinstalled
Functionality update
- Enable end-to-end pip install topicnet (without BigARTM preinstalled) for Linux users.
- Adding recipe import into the library
- Support of custom regularizers
- Experimental feature allowing to change regularization coefficients during model training
- Documentation updated
- Various code improvements
v0.4.1
Recipe demo-notebook and minor changes in cubes added.
v0.4.0
Optimized memory consumption
Recipes and queue in the multithreading part updated.
Performance update
Fixed print lengths