🎉 Thanks so much for considering contributing to this project! 🎉
A quick note before we get started: this project and everyone participating in it is governed by our code of conduct. By participating you are expected to uphold this code, so please make yourself familiar with it.
There are many ways to participate in this project, whether you're a seasoned GitHub user who is ready to 📝 create an issue or 🎣 make a pull request, or you're new here and don't have an account. Read on to find out more.
Since this project is about collecting resources, the simplest and most helpful contribution you can make is to add one or more examples to the repository's main data file, which is the csv file in the data folder.
To contribute by adding an example to the csv:
- take note of the 💅 repo style guide
- fork and clone the repo, and make a new branch (see instructions in 🎣 make a pull request)
- add your contributions to the csv file (data/archaeology-machine-learning-data.csv)
Important
Add your contributions to the next available row of the csv file. We have a GitHub Action which automatically sorts the csv after each commit to keep it neat and tidy 🧹🤖
Simple contributions of this kind are hugely appreciated and make all the difference ✨
- post a message on our 👋 community introductions issue
- report a mistake or error in the repository contents by 📝 creating an issue
- help us resolve any open issues relating to errors (you may need to 🎣 make a pull request)
- check out the current 🐢 milestones we're working towards
- click into a milestone to see the open issues
- help us resolve any open issues (you may need to 🎣 make a pull request)
- read our project 🗺️ roadmap
- in the 🗺️ project vision and roadmap milestone, comment on an open issue or 📝 create a new issue explaining your idea
- help us resolve any open issues relating to project planning (you may need to 🎣 make a pull request)
- get in touch via email at [email protected]
This repository's content is written in GitHub Flavored Markdown.
As mentioned on the README, this repository aims to simplify the navigation of machine learning research and is based on a hierarchy of information which goes from the most general way of describing a method to the most specific:
application area —> task —> model/algorithm
We recognise that this isn't going to cover all the details of every example, but it will provide enough information for the community to learn, explore, and build. When choosing how to add your example, think generally about the main aim, task and technique used.
This is what the csv structure looks like:
task | author(s) | year | application area | data type | technique | paper | code | data |
---|---|---|---|---|---|---|---|---|
... | ... | ... | ... | ... | ... | ... | ... | ... |
And here are some specific guidelines for filling it in:
- task = the [machine learning task] for [thing being analysed], e.g.:
  - [image classification] for [hollow roads]
  - [named entity recognition] for [archaeological text]
  - [classification] for [soil geochemistry]
  - [regression] for [stable isotope analysis]
- author(s) = author(s) last name (+ et al if 3 or more authors)
- year = year of publication/creation
- application area = this column is important as it categorises all the data on the README. Choose the overall domain of the example from the existing column values, e.g.:
  - chemical analysis
  - natural language processing
  - site prospection/monitoring
  - spatial predictive modelling
  - OR add new areas as needed (see next section)
- data type = the kind of data that the example uses, e.g.:
  - lidar visualisations
  - DEM
  - english language text
  - strontium
- technique = the name of the main machine learning model/algorithm used. Use acronyms if you can to keep it short; there will be a reference section to explain them, e.g.:
  - R-CNN
  - BERT
  - SVM
- paper = link to paper if published (DOI preferred)
  - if not published, add a relevant link for information about the example
- code = link to the code or model (DOI preferred)
- data = link to the dataset (DOI preferred)
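If you'd rather build your row with a script than edit the csv by hand, the guidelines above can be sketched in Python roughly like this. It's only an illustration, not part of the repository: the helper names (`format_task`, `format_authors`, `append_example`) are hypothetical, the column order is assumed to match the table above, joining two authors with "and" is our assumption, and the authors and links in the example row are invented placeholders.

```python
import csv

# Column order as shown in the table above; assumed to match the csv header row.
COLUMNS = ["task", "author(s)", "year", "application area", "data type",
           "technique", "paper", "code", "data"]


def format_task(ml_task, subject):
    """Build the task value: '<machine learning task> for <thing being analysed>'."""
    return f"{ml_task} for {subject}"


def format_authors(last_names):
    """Last names only; 'et al' once there are 3 or more authors."""
    if len(last_names) >= 3:
        return f"{last_names[0]} et al"
    # Assumption: one or two authors are joined with 'and'.
    return " and ".join(last_names)


def append_example(csv_path, row):
    """Append one example as the next available row of the csv."""
    missing = [c for c in COLUMNS if c not in row]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        csv.DictWriter(f, fieldnames=COLUMNS).writerow(row)


# Placeholder example: the authors and links below are invented for illustration.
example_row = {
    "task": format_task("image classification", "hollow roads"),
    "author(s)": format_authors(["Smith", "Jones", "Lee"]),
    "year": "2020",
    "application area": "site prospection/monitoring",
    "data type": "lidar visualisations",
    "technique": "R-CNN",
    "paper": "https://example.org/paper",
    "code": "https://example.org/code",
    "data": "https://example.org/data",
}
```

Calling `append_example("data/archaeology-machine-learning-data.csv", example_row)` from the repository root would add the row at the end of the file. This assumes the file already ends with a newline, so it's worth checking the last line afterwards.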
If the example you're adding doesn't have its domain represented, simply enter a new value for the overall domain of the example in the application area column of the csv.
If you feel like it, you can also choose an emoji to represent the section on the README. Find the emoji mapping section in the file build/update-csv-and-readme.py and add your new area and emoji:
    emoji_mapping = {
        'chemical analysis': '⚛️',
        'natural language processing': '📚️',
        'new area': '📊',  # your new area and its emoji
        'site prospection/monitoring': '🛰️',
        'spatial predictive modelling': '🌏',
        # insert new areas in the dict in alphabetical order
    }
This file is our tidy-up-and-README-making script, which runs automatically after each change to the csv file 🧹🤖.
To manage changes to the project's content we use GitHub's standard workflow, in which contributors request that their changes be pulled into the main project content (hence, a pull request!).
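To give a feel for what a tidy-up step like this could involve, here is a minimal Python sketch. It is not the code in build/update-csv-and-readme.py: the sort order (application area, then year, then author), the fallback emoji and the heading format are all assumptions, and `sort_rows` and `readme_headings` are hypothetical names.

```python
# Mapping copied from the section above (without the placeholder 'new area' entry).
emoji_mapping = {
    "chemical analysis": "⚛️",
    "natural language processing": "📚️",
    "site prospection/monitoring": "🛰️",
    "spatial predictive modelling": "🌏",
}

DEFAULT_EMOJI = "📊"  # hypothetical fallback for areas with no emoji yet


def sort_rows(rows):
    """Sort examples by application area, then year, then author(s) (assumed order)."""
    return sorted(
        rows,
        key=lambda r: (r["application area"], r["year"], r["author(s)"]),
    )


def readme_headings(rows):
    """Return one '<emoji> <area>' heading per application area, alphabetically."""
    areas = sorted({r["application area"] for r in rows})
    return [f"{emoji_mapping.get(area, DEFAULT_EMOJI)} {area}" for area in areas]
```

In this sketch an area without an entry in `emoji_mapping` still gets a heading, which matches the idea that choosing an emoji is optional.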
To make a pull request:
- create a fork of this repository on GitHub
- clone your fork of this repository to create a local copy on your computer
- create a new branch in your local copy for each significant change
- commit the changes in that branch
- push that branch to your fork on GitHub
- submit a pull request from that branch to the master repository
- if you receive feedback, make changes in your local clone and push them to your branch on GitHub (the pull request will update automatically)
- pull requests will be merged after at least one other person has reviewed the pull request and approved it
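The local part of the steps above (clone, branch, commit, push) comes down to a handful of git commands. The sketch below lists them with a small Python helper; it only builds the commands and runs nothing. The helper name `pull_request_commands`, the fork URL, the branch name and the commit message are all placeholders, and forking and opening the pull request itself still happen on GitHub.

```python
import shlex

CSV_PATH = "data/archaeology-machine-learning-data.csv"


def pull_request_commands(fork_url, branch, message):
    """List the git commands for the local steps above; nothing is executed."""
    return [
        ["git", "clone", fork_url],               # local copy of your fork
        ["git", "checkout", "-b", branch],        # run inside the cloned folder
        ["git", "add", CSV_PATH],                 # stage the edited csv
        ["git", "commit", "-m", message],         # commit the change on the branch
        ["git", "push", "-u", "origin", branch],  # publish the branch to your fork
    ]


if __name__ == "__main__":
    # Hypothetical fork URL and branch name, for illustration only.
    for cmd in pull_request_commands(
        "https://github.com/<your-username>/<repo-name>.git",
        "add-new-example",
        "add new example to csv",
    ):
        print(shlex.join(cmd))
```

Printing the commands first lets you read them through before typing them into a terminal yourself.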
For lots more helpful info, check out GitHub's guidance on collaborating with pull requests.
This document is adapted from the contributor guidelines for Open Life Science.