The goal of this project is to train a model that identifies metastatic cancer in small image patches taken from larger digital pathology scans. The project is part of CU Boulder's MS-DS program (course DTSA-5511).
The PatchCamelyon (PCam) benchmark is an image classification dataset of 327,680 color images (96x96 pixels) extracted from histopathologic scans of lymph node sections, each labeled with a binary indicator of whether metastatic tissue is present. It serves as a middle-ground benchmark for machine learning models: larger than CIFAR-10, smaller than ImageNet, and trainable on a single GPU. The version hosted on Kaggle is slightly modified from the original PCam release: duplicate images have been removed.
Link to competition and data: Kaggle. (n.d.). Histopathologic cancer detection. Retrieved January 5, 2025, from https://www.kaggle.com/c/histopathologic-cancer-detection/overview.
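Because each patch carries a binary label, a natural first check is the class balance. The following is a minimal sketch, not the project's actual code. It assumes the labels come as a CSV with `id` and `label` columns, and the helper name `label_summary` and the in-memory sample rows are hypothetical, used only for illustration.

```python
import csv
import io
from collections import Counter


def label_summary(csv_file):
    """Count binary labels in a CSV that has `id` and `label` columns.

    `csv_file` is any open text file object. The column names are an
    assumption about the label file layout.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[int(row["label"])] += 1
    total = sum(counts.values())
    return {
        "total": total,
        "negative": counts[0],
        "positive": counts[1],
        # Guard against an empty file to avoid dividing by zero.
        "positive_fraction": counts[1] / total if total else 0.0,
    }


# Tiny made-up stand-in for the real label file (illustrative rows only).
sample = io.StringIO("id,label\na1,0\nb2,1\nc3,0\nd4,1\ne5,0\n")
summary = label_summary(sample)
print(summary)
```

With the real data, the same function could be called on the downloaded label file, for example `with open("train_labels.csv", newline="") as f: label_summary(f)`, where the file name is assumed here. A strongly skewed positive fraction would suggest using stratified splits or class weights during training.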