Project Name

A short description of the project. (e.g. Exploring the iris dataset with scikit-learn.)

Features

pipenv for managing packages and virtualenvs.
Batteries included: pandas, numpy, and scipy come preinstalled.
Machine learning and statistical modelling with statsmodels and scikit-learn.
Plotting with matplotlib and seaborn.
Data testing and validation with pandera.
The jupyter metapackage.
Two utility packages: openpyxl for reading Excel file formats, and ipykernel for VS Code's Jupyter functionality.
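
Once the environment is installed, a quick way to confirm the packages above are available is to look them up without importing them. The following is a minimal sketch, not part of the template: the function name `missing_modules` and the list `TEMPLATE_MODULES` are hypothetical, and the import names are assumed from the package list (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names assumed from the feature list above (hypothetical constant).
# Note that scikit-learn is imported as "sklearn".
TEMPLATE_MODULES = [
    "pandas", "numpy", "scipy", "statsmodels", "sklearn",
    "matplotlib", "seaborn", "pandera", "openpyxl", "ipykernel",
]


def missing_modules(names=TEMPLATE_MODULES):
    """Return the module names in `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All modules found." if not missing else f"Missing: {missing}")
```

Run it inside the pipenv environment; an empty result suggests the install step completed.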

Quickstart

The only prerequisites for working with this template are git and pipenv.

Within the project folder, run the following to install all dependencies listed in the Pipfile:

pipenv install

Next, still within the project folder, run the following to activate the virtualenv:

pipenv shell

Lastly, to ensure the repository-level .gitconfig is included:

git config --local include.path ../.gitconfig

Directory structure

.
├── notebooks            # Directory for all notebook files.
│   ├── *.ipynb
│   └── README.md
├── data                 # Directory for all input and output data files.
│   ├── input/
│   │    └── .gitignore
│   ├── output/
│   │    └── .gitignore
│   └── README.md
├── .gitignore           # Standard python .gitignore
├── .gitattributes
├── .gitconfig
├── Pipfile              # The Pipfile for reproducing the package environment
├── Pipfile.lock
└── README.md            # This README
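
Given this layout, a notebook in notebooks/ sits one level below the project root, so data paths can be built relative to the parent directory. This is a minimal sketch under that assumption; the helper name `data_dirs` is hypothetical and not part of the template.

```python
from pathlib import Path


def data_dirs(project_root):
    """Return the input and output data directories under `project_root`."""
    root = Path(project_root).resolve()
    return {
        "input": root / "data" / "input",
        "output": root / "data" / "output",
    }


# Assumes the working directory is notebooks/, so the project root is one level up.
dirs = data_dirs(Path.cwd().parent)
```

Inside a notebook, `dirs["input"] / "some_file.csv"` would then point into data/input/ regardless of where the repository is cloned.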

Safety

This repo uses a .gitattributes and a .gitconfig to strip all *.ipynb cell outputs within the notebooks/ directory; here's a link to how this is done.
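
To illustrate what cleaning cell outputs means, here is a minimal Python sketch that removes outputs and execution counts from a notebook loaded as a dictionary. It is not the filter this repository configures (that lives in .gitconfig and is not reproduced here); the function name `strip_outputs` is hypothetical, and the sketch assumes the nbformat 4 layout, where each code cell carries `outputs` and `execution_count` fields.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict (nbformat 4 assumed) without code cell outputs."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny in-memory notebook with one executed code cell and one markdown cell.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_outputs(example)
```

The source of each cell is left untouched, so only code and markdown end up in the commit.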

The data/input and data/output folders each contain a .gitignore listing common data formats, including several Excel extensions, csv, parquet, hdf, json, and txt, among others. If you work with less common formats, add their extensions there.

This ensures that only code, and not data, is committed to the repository.
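
Before committing, it can help to check whether any data file uses an extension the .gitignore files do not cover. The sketch below compares file names against a set of extensions; the set `ASSUMED_IGNORED` is a hypothetical subset chosen for illustration, not the actual contents of the template's .gitignore files, and `uncovered_files` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Hypothetical subset of ignored extensions; check the real .gitignore files.
ASSUMED_IGNORED = {".csv", ".json", ".parquet", ".txt", ".xls", ".xlsx"}


def uncovered_files(filenames, ignored=ASSUMED_IGNORED):
    """Return the file names whose extension is not in `ignored`."""
    result = []
    for name in filenames:
        path = PurePosixPath(name)
        if path.name in {".gitignore", "README.md"}:
            continue  # files the template itself keeps in data/
        if path.suffix.lower() not in ignored:
            result.append(name)
    return result
```

Any name this returns is a candidate for a new line in the relevant .gitignore.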

Resources

See this link for notebook file naming conventions that may work for you.

The following are unofficial third-party resources I've found helpful; feel free to add any you come across!

awesome-python-data-science - A curated list of packages and resources related to Python data science and its subfields.

paper-19/jupyter-template

Folders and files

Latest commit

History

Repository files navigation

Project Name

Features

Quickstart

Directory structure

Safety

Resources

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages