Viewers

The viewers module extracts information from a topic model that helps estimate the model's quality. Its advantage is a unified calling infrastructure for the topic model, which makes the routine and tedious task of extracting this information easy.

The module currently contains the following viewers:

base_viewer (BaseViewer)

Module responsible for base infrastructure.

document_cluster (DocumentClusterViewer)

Module for visualizing the documents of a collection. It may be slow for large document collections because it uses the TSNE algorithm from the sklearn library.

Visualisation of reduced document embeddings, colored according to their topic, produced by DocumentClusterViewer.

spectrum (TopicSpectrumViewer)

Module containing heuristics for solving the travelling salesman problem (TSP) in order to arrange topics so that the total distance of the spectrum is minimized.

Each point on the plot represents a topic. The viewer computes a route between topics in which each topic is connected to a similar one, and so on, forming a circle.
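The idea of arranging topics along a closed route can be illustrated with a minimal sketch. It uses the nearest-neighbour heuristic, which is only one of many possible TSP heuristics and is not necessarily the one TopicSpectrumViewer implements. The function names and the toy 2-D topic coordinates are hypothetical.

```python
import math


def nearest_neighbour_tour(distances):
    """Greedy TSP heuristic: start at topic 0 and always move to the
    closest topic that has not been visited yet."""
    n = len(distances)
    if n == 0:
        return []
    tour = [0]
    unvisited = set(range(1, n))
    while unvisited:
        last = tour[-1]
        # Ties are broken by the smaller topic index to keep the result deterministic.
        nxt = min(unvisited, key=lambda j: (distances[last][j], j))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(distances, tour):
    """Total length of the closed route (it returns to the first topic)."""
    return sum(
        distances[tour[i]][tour[(i + 1) % len(tour)]]
        for i in range(len(tour))
    )


# Hypothetical topics represented as 2-D points (illustration only).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
distances = [[math.dist(p, q) for q in points] for p in points]

spectrum = nearest_neighbour_tour(distances)
print(spectrum, tour_length(distances, spectrum))
```

In this toy case the greedy route is shorter than visiting the topics in their original order, which is the effect the spectrum arrangement aims for.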

top_documents_viewer (TopDocumentsViewer)

Module with functions that work with dataset document collections.

The viewer shows fragments of top documents corresponding to some topic.
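As a rough illustration of what "top documents for a topic" means, the sketch below ranks documents by their topic probability and prints a short fragment of each. It is not the TopDocumentsViewer API: the `theta` and `texts` dictionaries, the function names, and the fragment length are all assumptions made for the example.

```python
def top_documents(theta, topic, num_top=2):
    """Return ids of the `num_top` documents with the highest p(topic | document)."""
    ranked = sorted(theta, key=lambda doc: (-theta[doc][topic], doc))
    return ranked[:num_top]


def fragment(text, max_chars=20):
    """Return the beginning of a document, cut to at most `max_chars` characters."""
    return text if len(text) <= max_chars else text[:max_chars] + "..."


# Hypothetical document-topic probabilities: document id -> [p(topic 0), p(topic 1)].
theta = {
    "doc_a": [0.9, 0.1],
    "doc_b": [0.2, 0.8],
    "doc_c": [0.6, 0.4],
}
# Hypothetical raw texts of the same documents.
texts = {
    "doc_a": "Cats and dogs are popular pets around the world.",
    "doc_b": "Engines convert fuel into motion.",
    "doc_c": "Pets travel by car.",
}

for doc in top_documents(theta, topic=0):
    print(doc, "->", fragment(texts[doc]))
```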

top_similar_documents_viewer (TopSimilarDocumentsViewer)

Module containing a class for finding documents similar to a given one. This viewer helps to estimate the homogeneity of the clusters given by the model.

A document from the text collection (on top) and the documents nearest to it according to the topic model. The viewer currently outputs only document names, but a picture like this one is not difficult to produce.
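A minimal sketch of the underlying idea: documents are compared through their topic distributions, and the closest ones are returned by name. Euclidean distance is an assumption chosen for simplicity; the actual viewer may use a different metric. The names `nearest_documents` and `theta` are hypothetical.

```python
import math


def nearest_documents(theta, query, num_top=2):
    """Return ids of the documents whose topic distributions are closest
    to that of `query` (Euclidean distance, assumed for illustration)."""
    others = [doc for doc in theta if doc != query]
    others.sort(key=lambda doc: (math.dist(theta[query], theta[doc]), doc))
    return others[:num_top]


# Hypothetical document-topic probabilities: document id -> [p(topic 0), p(topic 1)].
theta = {
    "doc_a": [0.9, 0.1],
    "doc_b": [0.2, 0.8],
    "doc_c": [0.8, 0.2],
    "doc_d": [0.1, 0.9],
}

print(nearest_documents(theta, "doc_a"))
```

If the nearest documents of a query mostly share its dominant topic, as in this toy data, the clusters given by the model can be considered reasonably homogeneous.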

top_tokens_viewer (TopTokensViewer)

Module with a class for displaying the most relevant tokens in each topic of the model.

Output of the TopTokensViewer. A score is calculated for every token in the topic; the score function can be specified when the viewer is initialized.
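The effect of choosing a score function can be sketched as follows. This is an illustration only, not the TopTokensViewer interface: `phi`, `top_tokens`, and both score functions are hypothetical names, and the "relative" score is just one example of an alternative to raw probability.

```python
def probability_score(phi, token, topic):
    """Plain p(token | topic)."""
    return phi[token][topic]


def relative_score(phi, token, topic):
    """Share of the token's total probability mass that falls into `topic`.
    Favors tokens that are specific to the topic over common words."""
    total = sum(phi[token])
    return phi[token][topic] / total if total else 0.0


def top_tokens(phi, topic, num_top=2, score=probability_score):
    """Return the `num_top` tokens with the highest score in `topic`."""
    ranked = sorted(phi, key=lambda token: (-score(phi, token, topic), token))
    return ranked[:num_top]


# Hypothetical token-topic matrix: token -> [p(token | topic 0), p(token | topic 1)].
phi = {
    "the": [0.5, 0.5],
    "cat": [0.3, 0.05],
    "dog": [0.2, 0.05],
    "car": [0.0, 0.4],
}

print(top_tokens(phi, topic=0))
print(top_tokens(phi, topic=0, score=relative_score))
```

With the plain probability score a frequent but uninformative token such as "the" ranks first, while the relative score pushes topic-specific tokens to the top.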

topic_mapping (TopicMapViewer)

Module for comparing topics between two different models trained on the same collection.

The mapping between topics of two models (currently only topic names are displayed).
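One simple way to build such a mapping is to match each topic of the first model to the closest topic of the second model. The sketch below does this with Euclidean distance between token distributions; both the metric and the greedy (not necessarily one-to-one) matching are assumptions, and TopicMapViewer may work differently. All names in the code are hypothetical.

```python
import math


def map_topics(topics_a, topics_b):
    """For every topic of model A, find the name of the closest topic of model B."""
    mapping = {}
    for name_a, vec_a in topics_a.items():
        mapping[name_a] = min(
            topics_b,
            key=lambda name_b: (math.dist(vec_a, topics_b[name_b]), name_b),
        )
    return mapping


# Hypothetical topics of two models over the same three-token vocabulary:
# topic name -> [p(token 0 | topic), p(token 1 | topic), p(token 2 | topic)].
topics_a = {"topic_0": [0.7, 0.2, 0.1], "topic_1": [0.1, 0.1, 0.8]}
topics_b = {"topic_x": [0.0, 0.2, 0.8], "topic_y": [0.6, 0.3, 0.1]}

print(map_topics(topics_a, topics_b))
```

Because both models are trained on the same collection, their topics share a vocabulary, which is what makes the token distributions directly comparable.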

Deprecated

  • initial_doc_to_topic_viewer - first edition of TopDocumentsViewer

  • tokens_viewer - first edition of TopTokensViewer