Spatial omics datasets

Here you can find all the datasets needed to run the example notebooks, already converted to the Zarr file format.

If you want to convert additional datasets, check out the scripts available in the spatialdata sandbox.

| Technology | Sample | File size | Filename (spatialdata-sandbox) | Download data | Work with data remotely (see note below) | License |
| --- | --- | --- | --- | --- | --- | --- |
| Visium HD | Mouse intestine [1] | 1 GB | visium_hd_3.0.0_id | .zarr.zip | S3 | CCA |
| Visium | Breast cancer [2] | 1.5 GB | visium_associated_xenium_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [2] | 2.8 GB | xenium_rep1_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [2] | 3.7 GB | xenium_rep2_io | .zarr.zip | S3 | CCA |
| CyCIF (MCMICRO output) | Small lung adenocarcinoma [3] | 250 MB | mcmicro_io | .zarr.zip | S3 | CC BY-NC 4.0 DEED |
| MERFISH | Mouse brain [4] | 50 MB | merfish | .zarr.zip | S3 | CC0 1.0 DEED |
| MIBI-TOF | Colorectal carcinoma [5] | 25 MB | mibitof | .zarr.zip | S3 | CC BY 4.0 DEED |
| Imaging Mass Cytometry (Steinbock output) | 4 different cancers (SCCHN, BCC, NSCLC, CRC) [6][7][8] | 820 MB | steinbock_io | .zarr.zip | S3 | CC BY 4.0 DEED |

For the three breast cancer datasets (one Visium and two Xenium), we also provide versions in which all three are aligned to a common coordinate system and the cell-type information described in our paper has been added to annotate the Xenium cells.

| Technology | Sample | File size | Filename (spatialdata-sandbox) | Download data | Work with data remotely (see note below) | License |
| --- | --- | --- | --- | --- | --- | --- |
| Visium | Breast cancer [2] | 1.5 GB | visium_associated_xenium_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [2] | 2.8 GB | xenium_rep1_io | .zarr.zip | S3 | CCA |
| Xenium | Breast cancer [2] | 3.7 GB | xenium_rep2_io | .zarr.zip | S3 | CCA |
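
A downloaded `.zarr.zip` file has to be unpacked before it can be opened as a Zarr store. The snippet below is a minimal sketch using only the Python standard library. The helper name `unpack_zarr_zip` and the example paths are hypothetical, and it assumes each archive contains one or more top-level `*.zarr` directories, which this README does not state.

```python
import zipfile
from pathlib import Path


def unpack_zarr_zip(archive, dest_dir):
    """Extract a zipped Zarr store and return the unpacked *.zarr directories.

    Assumption (not confirmed by the README): the archive holds one or
    more top-level directories whose names end in ".zarr".
    """
    archive = Path(archive)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
        # Collect the first path component of every archive member.
        top_level = {name.split("/", 1)[0] for name in zf.namelist() if name}

    # Keep only the entries that look like Zarr stores.
    return sorted(dest_dir / name for name in top_level if name.endswith(".zarr"))


# Example usage (hypothetical local file name):
# stores = unpack_zarr_zip("merfish.zarr.zip", "data")
# print(stores)
```

If the `spatialdata` package is installed, an unpacked store can then be loaded with `spatialdata.read_zarr(path)`.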

Note on S3 storage: opening the S3 URLs directly in a web browser will not work; the URLs must be treated as Zarr stores. For example, if you append .zgroup to any of the URLs above, you will be able to see that file.
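
The check described in the note can be sketched in Python as follows. The helper names (`zgroup_url`, `zarr_format`, `fetch_zgroup`) and the placeholder URL are hypothetical. The sketch assumes the stores use the Zarr v2 layout, in which a group's `.zgroup` file is a small JSON document with a `zarr_format` key.

```python
import json
import urllib.request


def zgroup_url(store_url: str) -> str:
    """Build the URL of the root .zgroup file of a Zarr store."""
    # Tolerate store URLs given with or without a trailing slash.
    return store_url.rstrip("/") + "/.zgroup"


def zarr_format(zgroup_text: str) -> int:
    """Return the Zarr format version recorded in a .zgroup JSON document."""
    return json.loads(zgroup_text)["zarr_format"]


def fetch_zgroup(store_url: str) -> str:
    """Download the root .zgroup file of a remote store as text."""
    with urllib.request.urlopen(zgroup_url(store_url), timeout=30) as response:
        return response.read().decode("utf-8")


# Example usage; replace the placeholder with one of the S3 URLs above:
# text = fetch_zgroup("https://example.org/path/to/dataset.zarr/")
# print(zarr_format(text))
```

A successful response confirms that the URL points at the root of a Zarr group, even though the bare store URL shows nothing useful in a browser.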

License abbreviations

  • CCA: Creative Commons Attribution
  • CC0 1.0 DEED: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
  • CC BY 4.0 DEED: Creative Commons Attribution 4.0 International
  • CC BY-NC 4.0 DEED: Creative Commons Attribution-NonCommercial 4.0 International

The data retains the license of the original published data.

Artificial datasets

Also, here you can find additional datasets and resources for methods developers.

References

If you use the datasets, please cite the original sources and double-check their licenses.

Footnotes

  1. From https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-mouse-intestine

  2. Janesick, A. et al. High resolution mapping of the breast cancer tumor microenvironment using integrated single cell, spatial and in situ analysis of FFPE tissue. bioRxiv 2022.10.06.510405 (2022) doi:10.1101/2022.10.06.510405.

  3. Schapiro, D. et al. MCMICRO: A scalable, modular image-processing pipeline for multiplexed tissue imaging. Cold Spring Harbor Laboratory 2021.03.15.435473 (2021) doi:10.1101/2021.03.15.435473.

  4. Moffitt, J. R. et al. Molecular, spatial, and functional single-cell profiling of the hypothalamic preoptic region. Science 362, (2018).

  5. Hartmann, F. J. et al. Single-cell metabolic profiling of human cytotoxic T cells. Nat. Biotechnol. (2020) doi:10.1038/s41587-020-0651-8.

  6. Windhager, J., Bodenmiller, B. & Eling, N. An end-to-end workflow for multiplexed image processing and analysis. bioRxiv 2021.11.12.468357 (2021) doi:10.1101/2021.11.12.468357.

  7. Eling, N. & Windhager, J. Example imaging mass cytometry raw data. (2022). doi:10.5281/zenodo.5949116.

  8. Eling, N. & Windhager, J. steinbock results of IMC example data. (2022). doi:10.5281/zenodo.7412972.