Integrating ATAC-seq data from multiple species with a consensus peak set #1756
Unanswered
jenellewallace
asked this question in
Q&A
Replies: 1 comment
-
I found a workaround that allows integration across species. I would still love feedback on whether this is the recommended way to do it, but here is what I did in case it helps anyone else (only the modified steps are shown):
#Make objects for each species and store in a list called multi_species
#Subset to consensus peaks
multi_species_con <- list()
for (sp in species) {
  multi_species_con[[sp]] <- multi_species[[sp]]
  DefaultAssay(multi_species_con[[sp]]) <- 'peaks_consensus'
  multi_species_con[[sp]][["peaks_celltypes"]] <- NULL #remove to use less memory
  multi_species_con[[sp]] <- subset(multi_species_con[[sp]], features = rownames(multi_species_con[[sp]])[1:num_con_peaks])
}
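The renaming step that follows assigns human coordinates by position, so it relies on the first num_con_peaks rows being the same consensus peaks, in the same order, in every species. Below is a minimal base-R sketch of that check. It assumes each species has a peak table with a `name` column holding the consensus names (as in `all_peaks$name` from the original question); `peak_tables` and `check_consensus_order` are hypothetical names, and the toy coordinates are made up for illustration.

```r
# Hypothetical helper: verify that the first n consensus peak names match,
# in order, across all species. `peak_tables` is a named list of data frames,
# each with a `name` column (e.g. "Peak_1", "Peak_2", ...).
check_consensus_order <- function(peak_tables, n) {
  ref <- peak_tables[[1]]$name[seq_len(n)]
  vapply(
    peak_tables,
    function(tab) identical(tab$name[seq_len(n)], ref),
    logical(1)
  )
}

# Toy example with made-up coordinates (not real data).
peak_tables <- list(
  human = data.frame(
    name = c("Peak_1", "Peak_2", "Peak_3"),
    coord = c("chr1-100-600", "chr1-900-1400", "chr2-50-550"),
    stringsAsFactors = FALSE
  ),
  chimp = data.frame(
    name = c("Peak_1", "Peak_2", "Peak_3"),
    coord = c("chr1-120-620", "chr1-950-1450", "chr2a-40-540"),
    stringsAsFactors = FALSE
  ),
  macaque = data.frame(
    name = c("Peak_2", "Peak_1", "Peak_3"), # deliberately out of order
    coord = c("chr1-800-1300", "chr1-90-590", "chr12-60-560"),
    stringsAsFactors = FALSE
  )
)

order_ok <- check_consensus_order(peak_tables, n = 3)
print(order_ok)
```

If any species comes back FALSE, a positional rename would silently attach the wrong human coordinates to some peaks, so the rows would need to be reordered first.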
#Rename peaks to human coords and process each species
for (sp in species) {
  Annotation(multi_species_con[[sp]]) <- Annotation(multi_species_con$human)
  #rename counts
  rownames(multi_species_con[[sp]]@assays$peaks_consensus@counts)[1:num_con_peaks] <- rownames(multi_species_con$human@assays$peaks_consensus@counts)[1:num_con_peaks]
  #rename data
  rownames(multi_species_con[[sp]]@assays$peaks_consensus@data)[1:num_con_peaks] <- rownames(multi_species_con$human@assays$peaks_consensus@data)[1:num_con_peaks]
  #rename meta.features
  rownames(multi_species_con[[sp]]@assays$peaks_consensus@meta.features)[1:num_con_peaks] <- rownames(multi_species_con$human@assays$peaks_consensus@meta.features)[1:num_con_peaks]
  multi_species_con[[sp]] <- RunTFIDF(multi_species_con[[sp]])
  multi_species_con[[sp]] <- FindTopFeatures(multi_species_con[[sp]], min.cutoff = 'q0') #need to use all features or else they won't be the same across species
  multi_species_con[[sp]] <- RunSVD(multi_species_con[[sp]])
  multi_species_con[[sp]] <- RunUMAP(object = multi_species_con[[sp]], reduction = 'lsi', dims = 2:30)
  multi_species_con[[sp]] <- FindNeighbors(object = multi_species_con[[sp]], reduction = 'lsi', dims = 2:30)
}
#Merge and integrate following the ATAC integration vignette. In FindIntegrationAnchors I increased k.anchor to 20 to improve the integration, as it did not look very good with the default value of 5.
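For readers who want to see the merge and integration step spelled out, here is a rough sketch of how it might look. It is not the poster's exact code: it assumes a Seurat v4-style API with Signac installed, follows the general shape of the Signac ATAC integration vignette, and wraps everything in a hypothetical helper named `integrate_species`. The `k.anchor = 20` default mirrors the value mentioned above; the `dims` values are carried over from the per-species steps.

```r
# Sketch only: assumes Seurat (v4-style API) and Signac are installed and that
# every object in `objs` shares the same peak names in its default assay.
# `integrate_species` is a hypothetical helper, not part of either package.
integrate_species <- function(objs, k.anchor = 20, dims = 2:30) {
  stopifnot(
    requireNamespace("Seurat", quietly = TRUE),
    requireNamespace("Signac", quietly = TRUE)
  )

  # Merge all species and recompute LSI on the merged object
  combined <- merge(objs[[1]], y = unname(objs[-1]), add.cell.ids = names(objs))
  combined <- Signac::RunTFIDF(combined)
  combined <- Signac::FindTopFeatures(combined, min.cutoff = "q0")
  combined <- Signac::RunSVD(combined)

  # Find anchors between species using reciprocal LSI projection
  anchors <- Seurat::FindIntegrationAnchors(
    object.list = unname(objs),
    anchor.features = rownames(objs[[1]]),
    reduction = "rlsi",
    dims = dims,
    k.anchor = k.anchor
  )

  # Integrate the LSI embeddings of the merged object
  integrated <- Seurat::IntegrateEmbeddings(
    anchorset = anchors,
    reductions = combined[["lsi"]],
    new.reduction.name = "integrated_lsi",
    dims.to.integrate = 1:30
  )

  # Visualise on the integrated embedding
  Seurat::RunUMAP(integrated, reduction = "integrated_lsi", dims = dims)
}
```

With the list built above, a call would look like `integrated <- integrate_species(multi_species_con)`. Whether 20 is a sensible `k.anchor` for other datasets is something to check by inspecting the resulting embedding.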
-
Hello, I am trying to integrate ATAC-seq data from human, chimp, and macaque. I have defined a consensus peak set across all three species using my own pipeline, so I now have consensus peaks named Peak_1, Peak_2, etc. in every species, each corresponding to the location of the peak in that species' own coordinates. I would appreciate some help in figuring out how to set this up in Signac, as the integration vignettes only seem to cover integrating datasets from a single species. Is there a way to name the peaks with their consensus names? I assume this would be necessary in order to merge the objects and use the integration pipeline. When I tried this I got an error (the Peak_1, etc. names are stored in all_peaks$name):
I was able to create the ChromatinAssay when I left the rownames as is (chromosome coordinates), but this means the same peak will have a different name in each species and then I'm not sure how to do the integration. Any advice would be greatly appreciated!
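One way to give the same peak the same name in every species, in line with the workaround posted in the reply, is to keep coordinate-style rownames but use a single reference species' coordinates as the shared label. The base-R sketch below illustrates the idea on toy count matrices before any Signac object is built. It matches rows through the consensus names rather than by position; `rename_to_reference`, the `coord` column, and all coordinates are hypothetical and made up for illustration.

```r
# Hypothetical helper: reorder a species' count matrix into the reference
# consensus order and label its rows with the reference coordinates.
# `species_peaks` and `ref_peaks` are data frames with columns `name`
# (consensus name, e.g. "Peak_1") and `coord` (that species' coordinates).
rename_to_reference <- function(counts, species_peaks, ref_peaks) {
  # Consensus name of each row, looked up from the species' own coordinates
  row_names <- species_peaks$name[match(rownames(counts), species_peaks$coord)]
  # Row order that matches the reference table's consensus order
  idx <- match(ref_peaks$name, row_names)
  stopifnot(!anyNA(idx)) # every consensus peak must be present
  out <- counts[idx, , drop = FALSE]
  rownames(out) <- ref_peaks$coord
  out
}

# Toy peak tables (made-up coordinates); chimp rows are in a different order.
human_peaks <- data.frame(
  name = c("Peak_1", "Peak_2"),
  coord = c("chr1-100-600", "chr2-50-550"),
  stringsAsFactors = FALSE
)
chimp_peaks <- data.frame(
  name = c("Peak_2", "Peak_1"),
  coord = c("chr2a-40-540", "chr1-120-620"),
  stringsAsFactors = FALSE
)

# Toy chimp count matrix with chimp coordinates as rownames
chimp_counts <- matrix(
  c(5, 0, 1, 3),
  nrow = 2,
  dimnames = list(chimp_peaks$coord, c("cell_a", "cell_b"))
)

renamed <- rename_to_reference(chimp_counts, chimp_peaks, human_peaks)
print(renamed)
```

After this step the chimp matrix carries human coordinates in human consensus order, so its rownames line up with the human matrix.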