In Keras
Overall Idea:
- Convolve over character embeddings with different kernel sizes
- Concat them to get the char-word embedding
- Pass them through a Dense layer with Residual connection
- Optionally concat them with separate word embedding
- Pass sequence of obtained word embeddings through a LSTM encoder
- Train with a contrastive loss function (see References; a sketch of the full pipeline follows the TODO list below)
- TODO: Add loading utils
- TODO: Add preprocessing and padding utils
- TODO: Add batching utils
- TODO: Add model training code
- TODO: Add model continue-training code
- TODO: Test the similarity implementation on the Quora duplicate question pairs dataset
- TODO: Test the classification implementation on the Kaggle Toxic Comment Classification dataset
- TODO: Tune hyperparameters and try variations of the architecture
- TODO: Accept hyperparameters via argparse
- TODO: Add TensorBoard and tfdbg support
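End to end, the pipeline above looks roughly like the following tf.keras sketch. All dimensions, kernel sizes, and layer choices here are illustrative assumptions, not the repository's actual `ClassifierModel`/`SimilarityModel` internals:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed shapes and sizes, for illustration only.
MAX_SEQ_LEN, MAX_WORD_LEN = 32, 16        # words per sample, chars per word
CHARSET_SIZE, VOCAB_SIZE = 100, 10000
CHAR_EMB_DIM, WORD_EMB_DIM = 32, 64
KERNEL_SIZES, FILTERS = (2, 3, 4), 32

# Character ids for every word position: (seq_len, word_len).
char_ids = layers.Input(shape=(MAX_SEQ_LEN, MAX_WORD_LEN), dtype="int32")
char_emb = layers.Embedding(CHARSET_SIZE, CHAR_EMB_DIM)(char_ids)  # (B, S, W, C)

# 1. Convolve over each word's characters with several kernel sizes,
#    max-pool over the character axis, and concatenate the results.
pooled = []
for k in KERNEL_SIZES:
    conv = layers.TimeDistributed(
        layers.Conv1D(FILTERS, k, padding="same", activation="relu"))(char_emb)
    pooled.append(layers.TimeDistributed(layers.GlobalMaxPooling1D())(conv))
char_word_emb = layers.Concatenate()(pooled)          # (B, S, FILTERS * len(KERNEL_SIZES))

# 2. Dense layer with a residual connection around it.
proj = layers.Dense(FILTERS * len(KERNEL_SIZES), activation="relu")(char_word_emb)
char_word_emb = layers.Add()([char_word_emb, proj])

# 3. Optionally concatenate a separate word-level embedding.
word_ids = layers.Input(shape=(MAX_SEQ_LEN,), dtype="int32")
word_emb = layers.Embedding(VOCAB_SIZE, WORD_EMB_DIM)(word_ids)
full_emb = layers.Concatenate()([char_word_emb, word_emb])

# 4. Encode the resulting word-embedding sequence with an LSTM.
encoding = layers.LSTM(128)(full_emb)

encoder = Model(inputs=[char_ids, word_ids], outputs=encoding)
```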
Example usage:

```python
from model import ClassifierModel, SimilarityModel

# Multi-label classifier over 5 classes, with a single character kernel
# size of 3 and a unidirectional LSTM encoder.
classifier = ClassifierModel(vocab_size=10000,
                             charset_size=100,
                             num_classes=5,
                             mode=ClassifierModel.MULTILABEL,
                             char_kernel_sizes=(3,),
                             encoder_hidden_units=128,
                             bidirectional=False)
classifier.compile_model()

# Similarity model trained with one negative sample per positive pair.
similarity_model = SimilarityModel(vocab_size=10000,
                                   charset_size=100,
                                   num_negative_samples=1)
similarity_model.compile_model()
```
References:
- Encoder architecture heavily inspired by:
- Loss function taken from:
- Other contrastive loss functions to try:
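The exact loss is specified by the references above. As a placeholder, here is a generic sketch of a sampled contrastive loss, where a shared encoder scores a query against one positive and several negatives via cosine similarity and softmax cross-entropy; the function name, shapes, and formulation are assumptions, not the repository's implementation:

```python
import tensorflow as tf

def contrastive_softmax_loss(query, positive, negatives):
    """Generic sampled-softmax contrastive loss (a sketch, not the repo's loss).

    query:     (batch, dim) encodings of the anchor sentences
    positive:  (batch, dim) encodings of the matching sentences
    negatives: (batch, num_neg, dim) encodings of non-matching sentences
    """
    q = tf.math.l2_normalize(query, axis=-1)
    p = tf.math.l2_normalize(positive, axis=-1)
    n = tf.math.l2_normalize(negatives, axis=-1)

    pos_score = tf.reduce_sum(q * p, axis=-1, keepdims=True)  # (batch, 1)
    neg_score = tf.einsum('bd,bkd->bk', q, n)                  # (batch, num_neg)
    logits = tf.concat([pos_score, neg_score], axis=-1)        # (batch, 1 + num_neg)

    # The correct "class" is always index 0: the positive pair.
    labels = tf.zeros(tf.shape(logits)[0], dtype=tf.int32)
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels,
                                                       logits=logits))
```

With `num_negative_samples=1`, as in the usage example above, this reduces to a binary choice between the true pair and a single sampled distractor per batch element.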