This project requires Python and the following Python libraries installed:
NumPy, pandas, matplotlib, scikit-learn, spaCy, XGBoost, NLTK, gensim, asyncio
You will also need software installed to run a Jupyter Notebook.
The exact package versions are listed in requirements.txt; to install them, run the following command in a terminal: pip install -r requirements.txt
If you do not have Python 3 installed yet, it is highly recommended that you install the Anaconda distribution of Python 3, which already includes the above packages and more.
DataCleaning.ipynb -- data cleaning: filters for useful information and for clothing categories, and removes HTML markup, digits, punctuation, etc.
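A minimal sketch of this kind of cleaning, assuming the descriptions live in a pandas DataFrame with a `description` column (the column name and regular expressions are illustrative, not the notebook's actual code):

```python
import re
import pandas as pd

def clean_description(text):
    text = re.sub(r"<[^>]+>", " ", text)   # strip HTML tags
    text = re.sub(r"\d+", " ", text)       # drop digits
    text = re.sub(r"[^\w\s]", " ", text)   # drop punctuation
    return re.sub(r"\s+", " ", text).strip().lower()

df = pd.DataFrame({"description": ["<p>Silk midi dress, 100% silk!</p>"]})
df["clean_description"] = df["description"].apply(clean_description)
print(df["clean_description"].iloc[0])     # "silk midi dress silk"
```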
DataPreprocessing.ipynb -- removes stop words, then tokenizes and lemmatizes the descriptions.
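A sketch of this step using spaCy (which is in the dependency list); it assumes the `en_core_web_sm` model has been downloaded and is not necessarily the exact pipeline the notebook uses:

```python
import spacy

# Only tokenization, lemmatization and stop-word flags are needed here
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])

def preprocess(text):
    doc = nlp(text)
    return [tok.lemma_.lower() for tok in doc
            if not tok.is_stop and not tok.is_punct and not tok.is_space]

print(preprocess("A tailored blazer crafted from the finest Italian wools"))
```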
DataLabelling.ipynb -- labels the data with pre-defined style-brand lists from the LCF groups, plus manual labelling with Labelbox.
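A hypothetical illustration of the rule-based part (labelling by a style-to-brand mapping); the style names and brands below are made up, and the real lists come from the LCF groups:

```python
# Toy style -> brand mapping; entries are placeholders, not the real lists
style_brands = {
    "minimalist": {"cos", "jil sander"},
    "streetwear": {"off-white", "supreme"},
}

def label_by_brand(brand):
    brand = brand.lower()
    return [style for style, brands in style_brands.items() if brand in brands]

print(label_by_brand("Jil Sander"))  # ['minimalist']
```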
TFIDFs.ipynb -- uses gensim to compute TF-IDF vectors, which are used both to weight the word embeddings and to train models directly.
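A short gensim TF-IDF sketch over a toy tokenized corpus (the corpus and variable names are placeholders):

```python
from gensim.corpora import Dictionary
from gensim.models import TfidfModel

tokenized = [["silk", "midi", "dress"], ["wool", "tailored", "blazer"]]  # toy corpus

dictionary = Dictionary(tokenized)
bow_corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]
tfidf = TfidfModel(bow_corpus)

tfidf_corpus = [tfidf[bow] for bow in bow_corpus]  # list of (token_id, weight) pairs
print(tfidf_corpus[0])
```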
Word2Vec.ipynb -- uses gensim to train word2vec embeddings of different dimensions and weights them with the TF-IDF vectors. Each description embedding is the average of its TF-IDF-weighted word embeddings.
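A sketch of the TF-IDF-weighted averaging idea with gensim Word2Vec; the toy corpus and hyperparameters are illustrative, not the notebook's settings:

```python
import numpy as np
from gensim.corpora import Dictionary
from gensim.models import TfidfModel, Word2Vec

tokenized = [["silk", "midi", "dress"], ["wool", "tailored", "blazer"]]

w2v = Word2Vec(sentences=tokenized, vector_size=100, window=5, min_count=1, epochs=20)
dictionary = Dictionary(tokenized)
tfidf = TfidfModel([dictionary.doc2bow(t) for t in tokenized])

def description_vector(tokens):
    # Weighted average of word vectors, weights taken from the TF-IDF model
    weights = dict(tfidf[dictionary.doc2bow(tokens)])
    vec, total = np.zeros(w2v.vector_size), 0.0
    for tok in tokens:
        tok_id = dictionary.token2id.get(tok)
        if tok_id is not None and tok in w2v.wv:
            vec += weights.get(tok_id, 0.0) * w2v.wv[tok]
            total += weights.get(tok_id, 0.0)
    return vec / total if total else vec

print(description_vector(tokenized[0]).shape)  # (100,)
```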
Doc2Vec.ipynb -- uses gensim to learn document/description vectors directly.
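A minimal gensim Doc2Vec sketch on a toy corpus (hyperparameters are placeholders):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

tokenized = [["silk", "midi", "dress"], ["wool", "tailored", "blazer"]]
tagged = [TaggedDocument(words=toks, tags=[i]) for i, toks in enumerate(tokenized)]

d2v = Doc2Vec(tagged, vector_size=100, min_count=1, epochs=40)

print(d2v.dv[0].shape)                              # vector for the first description
print(d2v.infer_vector(["linen", "shirt"]).shape)   # vector for an unseen description
```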
FastText.ipynb -- uses gensim to train fastText word embeddings, then represents each description as a TF-IDF-weighted average of its word embeddings.
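A short gensim FastText sketch; the description vectors would then be built with the same TF-IDF-weighted averaging as in the Word2Vec sketch above (parameters are illustrative):

```python
from gensim.models import FastText

tokenized = [["silk", "midi", "dress"], ["wool", "tailored", "blazer"]]
ft = FastText(sentences=tokenized, vector_size=100, window=5, min_count=1, epochs=20)

# Subword information lets FastText embed unseen or misspelled tokens as well
print(ft.wv["dress"].shape)
print(ft.wv["dresss"].shape)  # out-of-vocabulary token still gets a vector
```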
pretrainedembeddings.ipynb -- uses pre-trained Google News word embeddings to represent the descriptions.
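A sketch of loading the pre-trained Google News vectors through gensim's downloader (a large download) and averaging them per description; whether the notebook uses the downloader or a local binary file is an assumption:

```python
import numpy as np
import gensim.downloader as api

gnews = api.load("word2vec-google-news-300")  # KeyedVectors, 300 dimensions

def description_vector(tokens):
    vecs = [gnews[tok] for tok in tokens if tok in gnews]
    return np.mean(vecs, axis=0) if vecs else np.zeros(300)

print(description_vector(["silk", "midi", "dress"]).shape)  # (300,)
```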
kmeans.ipynb, kmeans_nap.ipynb, kmeans_nap_googlenews.ipynb -- use the k-means clustering algorithm to cluster the description embeddings.
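A minimal scikit-learn KMeans sketch; `X` stands in for the description embedding matrix produced by the earlier notebooks, and the cluster count is a placeholder:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))          # placeholder for (n_descriptions, embedding_dim)

kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])               # cluster id per description
print(kmeans.cluster_centers_.shape)     # (10, 100)
```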
dissimilar_desc_kmeans.ipynb -- uses the clustering results to find the most dissimilar products/styles.
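One way this could work, offered as an assumption rather than a description of the notebook: score each description by its distance to its own cluster centroid and treat the largest distances as the most dissimilar items.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))          # placeholder description embeddings
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)

# Distance of each description to its own cluster centroid; the largest
# distances point at the items least similar to the rest of their cluster
distances = np.linalg.norm(X - kmeans.cluster_centers_[kmeans.labels_], axis=1)
most_dissimilar = np.argsort(distances)[::-1][:20]   # indices of the 20 biggest outliers
print(most_dissimilar)
```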
PseudoLabelling.ipynb -- uses pseudo-labelling to assign approximate labels to unlabelled data, starting from the existing labels.
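A generic pseudo-labelling sketch with scikit-learn: train on the labelled subset, predict on unlabelled data, and keep only high-confidence predictions as approximate labels. The classifier, threshold, and data shapes are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labelled = rng.normal(size=(100, 50))              # placeholder labelled embeddings
y_labelled = rng.integers(0, 3, 100)                 # placeholder labels
X_unlabelled = rng.normal(size=(1000, 50))           # placeholder unlabelled embeddings

clf = LogisticRegression(max_iter=1000).fit(X_labelled, y_labelled)
probs = clf.predict_proba(X_unlabelled)

confident = probs.max(axis=1) >= 0.9                 # keep confident predictions only
pseudo_labels = probs.argmax(axis=1)[confident]
X_pseudo = X_unlabelled[confident]
print(f"{confident.sum()} pseudo-labelled examples added")
```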
multilabel.ipynb -- uses the approximate labels for multi-label classification.
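A sketch of multi-label classification with scikit-learn's `MultiLabelBinarizer` and `OneVsRestClassifier`; the styles, embeddings, and base classifier are placeholders, not the notebook's actual setup:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

styles = [["minimalist"], ["streetwear", "sporty"], ["minimalist", "classic"]]
Y = MultiLabelBinarizer().fit_transform(styles)      # one binary column per style

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 100))                        # placeholder description embeddings

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(clf.predict(X))                                # binary matrix, one row per item
```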
In a terminal or command window, navigate to the top-level project directory (the one that contains this README) and run one of the following commands:
ipython notebook notebookname.ipynb
or
jupyter notebook notebookname.ipynb
This will open the Jupyter Notebook software and the project file in your browser.
The data used in this project was scraped from shopstyle.com, net-a-porter.com, and farfetch.com.