A causal intervention framework to learn robust and interpretable character representations inside subword-based language models
Updated Jul 10, 2023 (Jupyter Notebook)
Generates new sentences by learning text at the character level with an RNN. A collection of Shakespeare's works was used as training data.
Retro-style tokenization for language models
This repository contains the code and the PLODv2 dataset for training character-level language models (CLMs) for abbreviation and long-form detection, released with our LREC-COLING 2024 publication.
Name generation using an RNN. The model was trained to generate Indian names. Built with Keras.
Recurrent neural network for building a character-level language model and its application to generating new dinosaur names
In this project, I worked with a small corpus consisting of simple sentences. I tokenized the words using n-grams from the NLTK library and performed word-level and character-level one-hot encoding. Additionally, I utilized the Keras Tokenizer to tokenize the sentences and implemented word embedding using the Embedding layer. For sentiment analysis
Text article generator using a character-level LSTM network.
An implementation of "Character-level Convolutional Networks for Text Classification" in TensorFlow. See https://arxiv.org/pdf/1509.01626.pdf.
Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING-2020).
Build a character-level language model to generate new dinosaur names
Sequence Models coding assignments
Lyrics generation using LSTM, word2vec analysis, and more
Notebooks for the programming assignments of the Sequence Models course by deeplearning.ai on Coursera (May 2020)
A diacritization model for the Arabic language, built and trained using Tashkeela, the Arabic diacritization corpus on Kaggle.
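Several of the projects listed above start from the same step: turning text into per-character vectors (one entry mentions character-level one-hot encoding explicitly). A minimal sketch of that step in plain Python follows. The function names `build_vocab` and `one_hot_encode` and the toy corpus are hypothetical, chosen for illustration; none of them come from any of the listed repositories.

```python
def build_vocab(text):
    """Map each distinct character to an integer index, in sorted order."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def one_hot_encode(text, vocab):
    """Return one row per character, with a single 1 at that character's index."""
    rows = []
    for ch in text:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1  # raises KeyError for characters outside the vocabulary
        rows.append(row)
    return rows


# Hypothetical toy corpus; a real project would read a text file instead.
corpus = "dino"
vocab = build_vocab(corpus)
encoded = one_hot_encode("dodo", vocab)

for ch, row in zip("dodo", encoded):
    print(ch, row)
```

Each row has length `len(vocab)`, so the encoding of a string of `n` characters is an `n` by `len(vocab)` matrix, which is the usual input shape for a character-level RNN or LSTM before any embedding layer is applied.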