👋 welcome to the archaeology machine learning repository
📖 introduction to the project
Machine learning (ML) methods offer new ways of approaching archaeological research questions, and interest in applying them continues to grow.
This repository collects resources relating to the application of ML methods to archaeological data, aiming to:
- provide an overview of the ways ML is being applied in archaeology
- spark new ideas whilst reducing duplication of work
- encourage the sharing of code, data, and other resources
- make resources more FAIR (Findable, Accessible, Interoperable, and Reusable)
By doing this, we hope to help practitioners learn about, critically apply, and contribute to conversations about ML in archaeology.
Check out our 🗺️ roadmap for an overview of what we're working on, or go straight to the ✅ contributor guidelines.
Please cite the project if you've found it useful. Releases are made at regular intervals and archived on Zenodo.
ML case studies (split by application area)
📊 datasets
📖 glossary of technique names

| task | authors | year | data type | technique | paper | code | data |
| --- | --- | --- | --- | --- | --- | --- | --- |
| segmentation for carved reliefs | Ji et al. | 2023 | RGB images [digital photos], depth map, soft-edge images | CNN [DenseNet121] | paper | n/a | n/a |
| classification for ceramic elemental analysis | Ruschioni et al. | 2023 | X-ray fluorescence | LR, LDA, MLP, SVM, DT, RF, NB, KNN | paper | code | data |
| classification for ceramic sherds | Helden et al. | 2022 | RGB images [smartphone photos], synthetic data | CNN [VGG19, MobileNetV2, ResNet50V2, InceptionV3] | paper | models | data |
| classification for multiple artefact types | Resler et al. | 2021 | RGB images [digital camera photos] | CNN [EfficientNetB3], KNN | paper | n/a | data |
| classification for ceramic petrography | Lyons | 2021 | RGB images [microscope photos] | CNN [VGG19, ResNet50] | paper | n/a | n/a |
| object detection for rock carvings | Tsigkas et al. | 2020 | RGB images [digital camera photos] | CNN [YOLOv2, TinyYOLOv2] | paper | n/a | n/a |
| classification for lithics | Grove and Blinkhorn | 2020 | lithic types, period | NN | paper | code | data |
| classification for ceramic elemental analysis | Charalambous et al. | 2016 | X-ray fluorescence | KNN, DT, LVQ | paper | n/a | n/a |

| task | authors | year | data type | technique | paper | code | data |
| --- | --- | --- | --- | --- | --- | --- | --- |
| classification for multi-cell phytoliths | Berganzo-Besga et al. | 2022 | RGB images [microscope photos] | CNN [VGG19, ResNet50V2] | paper | code | n/a |
| classification for contexts | Vos et al. | 2021 | geochemistry, phytolith type and quantity | DT | paper | n/a | data |
| classification for starch granules | Arráiz et al. | 2016 | morphometric and optical measurements | RF | paper | n/a | n/a |

📚️ natural language processing

| task | authors | year | data type | technique | paper | code | data |
| --- | --- | --- | --- | --- | --- | --- | --- |
| masked language modelling for archaeological text | Brandsen | 2023 | English language | BERT | paper | model | n/a |
| named entity recognition for archaeological text | Brandsen | 2023 | English language | BERT | paper | model | n/a |
| masked language modelling for archaeological text | Brandsen | 2023 | Dutch language | BERT | paper | model | n/a |
| named entity recognition for archaeological text | Brandsen | 2023 | Dutch language | BERT | paper | model | data |
| masked language modelling for archaeological text | Brandsen | 2023 | German language | BERT | paper | model | n/a |
| named entity recognition for archaeological text | Brandsen | 2023 | German language | BERT | paper | model | n/a |
| restoration/attribution for ancient Greek inscriptions | Assael et al. | 2022 | transcribed inscriptions, place, time | transformer | paper | code | data |
| transliteration and segmentation of cuneiform characters | Gordin et al. | 2020 | encoded Unicode cuneiform | bidirectional LSTM | paper | code | data |

🛰️ remote sensing feature detection

| task | authors | year | data type | technique | paper | code | data |
| --- | --- | --- | --- | --- | --- | --- | --- |
| transfer learning between geographic areas | Sech et al. | 2023 | lidar visualisations [e2MSTP] | CNN [U-Net, DeepLabv3+, ResNet, EfficientNet, SegFormer] | paper | n/a | n/a |
| segmentation for mounds on maps | Berganzo-Besga et al. | 2023 | RGB images [historical maps], synthetic data | CNN [Mask R-CNN] | paper | n/a | on request |
| segmentation for field systems | Küçükdemirci et al. | 2022 | lidar DTMs | CNN [U-Net] | paper | n/a | n/a |
| classification for hollow roads | Verschoof-van der Vaart and Landauer | 2021 | lidar visualisations [local relief model, openness], lidar DTM | CNN [ResNet34] | paper | n/a | n/a |
| classification for land use | Mboga et al. | 2020 | panchromatic images [historical aerial photographs] | CNN [FCN-ATR-SKIP, U-Net] | paper | n/a | n/a |
| classification for war landforms | de Matos-Machado et al. | 2019 | morphometric measurements | SOM, HAC | paper | n/a | n/a |
| object detection for mining pits | Gallwey et al. | 2019 | lidar DSM | CNN [U-Net] | paper | model | n/a |
| object detection for multiple classes | Verschoof-van der Vaart and Lambers | 2019 | lidar visualisations [simple local relief model] | CNN [Faster R-CNN] | paper | n/a | n/a |

🌏 spatial predictive modelling

| task | authors | year | data type | technique | paper | code | data |
| --- | --- | --- | --- | --- | --- | --- | --- |
| regression for Neolithic sites | Li et al. | 2023 | topography, hydrology | RF | paper | n/a | n/a |
| regression for Neolithic sites | Li et al. | 2023 | topography, hydrology | RF | paper | n/a | n/a |
| classification for site dating | Reese | 2021 | ceramic types, dendrochronology dates | NN | paper | code | data |
| regression for Roman sites | Castiello and Tonini | 2021 | soil, topography | RF | paper | n/a | n/a |
| regression for Formative period sites | Yaworsky et al. | 2020 | environmental, topography | MaxEnt, RF | paper | code | data |
| regression for strontium isoscapes | Bataille et al. | 2020 | strontium, coordinates, geology, climate, environmental, anthropogenic | RF | paper | code | data |
| regression for strontium isoscapes | Funck et al. | 2020 | strontium, coordinates, geology, climate, environmental | RF | paper | n/a | data |
| classification for habitat suitability | Jones et al. | 2019 | climate, topography | RF | paper | n/a | n/a |
| regression for strontium isoscapes | Bataille et al. | 2018 | strontium, geology, climate, environmental, topographic | RF | paper | code | n/a |
| classification for soil geochemistry | Oonk and Spijker | 2015 | soil geochemistry | KNN, SVM, NN | paper | n/a | n/a |


📊 datasets

| task | authors | year | data type | technique | paper | code | data |
| --- | --- | --- | --- | --- | --- | --- | --- |
| proposed null dataset for lithics | Eren et al. | 2023 | to be confirmed; qualitative and quantitative information from naturally fractured rocks | n/a | paper | n/a | n/a |
| dataset for named entity recognition | Brandsen et al. | 2020 | Dutch language | named entity recognition | paper | n/a | data |
| dataset for Maya site detection | Kokalj et al. | 2023 | lidar visualisations [multiple], lidar canopy height, SAR, optical satellite | object recognition, object detection, semantic segmentation | paper | n/a | data |


📖 glossary of technique names

| acronym | technique |
| --- | --- |
| BERT | bidirectional encoder representations from transformers |
| CNN | convolutional neural network |
| DT | decision tree |
| HAC | hierarchical agglomerative clustering |
| KNN | k-nearest neighbours |
| LDA | linear discriminant analysis |
| LR | logistic regression |
| LSTM | long short-term memory network |
| LVQ | learning vector quantisation |
| MaxEnt | maximum entropy |
| MLP | multi-layer perceptron |
| NB | naive Bayes |
| NN | neural network |
| RF | random forest |
| SOM | self-organizing map |
| SVM | support vector machine |
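
The glossary can also be used programmatically to spell out the acronyms in a "technique" cell. The snippet below is a minimal sketch, not part of the repository: the `GLOSSARY` dictionary is copied by hand from the table above, and the function name `expand_techniques` is hypothetical. It assumes that text in square brackets names specific architectures (for example `[VGG19, ResNet50]`) and can be ignored when looking up acronyms.

```python
import re

# Acronyms and expansions copied by hand from the glossary table above.
GLOSSARY = {
    "BERT": "bidirectional encoder representations from transformers",
    "CNN": "convolutional neural network",
    "DT": "decision tree",
    "HAC": "hierarchical agglomerative clustering",
    "KNN": "k-nearest neighbours",
    "LDA": "linear discriminant analysis",
    "LR": "logistic regression",
    "LSTM": "long short-term memory network",
    "LVQ": "learning vector quantisation",
    "MaxEnt": "maximum entropy",
    "MLP": "multi-layer perceptron",
    "NB": "naive Bayes",
    "NN": "neural network",
    "RF": "random forest",
    "SOM": "self-organizing map",
    "SVM": "support vector machine",
}


def expand_techniques(technique: str) -> list[str]:
    """Return the glossary expansion of each acronym in a technique cell.

    Hypothetical helper: bracketed architecture names are dropped, and
    tokens that are not glossary acronyms are skipped.
    """
    # Remove bracketed lists such as "[VGG19, ResNet50]".
    without_models = re.sub(r"\[[^\]]*\]", "", technique)
    expansions = []
    # Split on commas and whitespace, keeping the original order.
    for token in re.split(r"[,\s]+", without_models):
        if token in GLOSSARY:
            expansions.append(GLOSSARY[token])
    return expansions
```

For example, `expand_techniques("CNN [EfficientNetB3], KNN")` yields the expansions for CNN and KNN in that order, while a cell such as `transformer`, which contains no glossary acronym, yields an empty list.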