Issue resolving conflicts with resolve() for Baysor segmentation #152
Comments
Hello @NJNataren, the error you have is very likely related to baysor 0.7.0 (can you confirm you have this version?) which is a recent version of baysor that introduced many breaking changes. Recently, we updated the CLI and the Snakemake pipeline to support this new version, but not the API tutorial. Actually, it's also fixed in the API, but not released yet: I expect to release Since
Thanks @quentinblampey, that is indeed the Baysor version I am using!
Hello @NJNataren, if you need results really soon, maybe you can downgrade baysor to
Great suggestion, thanks!
Hi, first of all, thanks for developing this amazing-looking package, I'm really excited to start using it!
I am having an issue using the API version. I am working in a conda environment (python=3.10), which I run as a kernel in JupyterLab from another environment.
I am just trying to familiarise myself with the workflow using the toy dataset and following your code in the API tutorial (https://gustaveroussy.github.io/sopa/tutorials/api_usage/).
When I tried running Baysor on the patches, I initially used your code:
for patch_index in valid_indices:
    command = f"""
    cd {baysor_temp_dir}/{patch_index}
    {baysor_executable_path} run --save-polygons GeoJSON -c config.toml transcripts.csv
    """
    subprocess.run(command, shell=True)
But I received the `cannot find --save-polygons` error, so I then switched to `--polygon-format=FeatureCollection`:
for patch_index in valid_indices:
    command = f"""
    cd {baysor_temp_dir}/{patch_index}
    {baysor_executable_path} run --polygon-format=FeatureCollection -c config.toml transcripts.csv
    """
    subprocess.run(command, shell=True)
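Editor's note: the two commands above differ only in the polygon-output flag. A minimal sketch of selecting the flag from the Baysor version could look like the following. The helper names `polygon_flags`, `build_command`, and `run_patch` are hypothetical, and the 0.7.0 cutoff is an assumption based only on what is reported in this thread, not on the Baysor documentation.

```python
import subprocess
from pathlib import Path


def polygon_flags(baysor_version: str) -> list[str]:
    """Return the polygon-output flags for a Baysor version string such as "0.7.0".

    Assumption: 0.7.0 and later use the renamed flag reported in this issue.
    """
    version = tuple(int(part) for part in baysor_version.split(".")[:3])
    if version >= (0, 7, 0):
        return ["--polygon-format=FeatureCollection"]
    return ["--save-polygons", "GeoJSON"]


def build_command(executable: str, baysor_version: str) -> list[str]:
    """Build the argument list for one patch, mirroring the commands in this post."""
    return [
        executable,
        "run",
        *polygon_flags(baysor_version),
        "-c",
        "config.toml",
        "transcripts.csv",
    ]


def run_patch(executable: str, baysor_version: str, patch_dir: str) -> None:
    """Run Baysor inside one patch directory and raise if it exits with an error."""
    subprocess.run(
        build_command(executable, baysor_version),
        cwd=Path(patch_dir),
        check=True,
    )
```

Passing an argument list with `cwd=` avoids the `cd` inside a shell string, and `check=True` surfaces a failed run (such as an unrecognised flag) as a Python exception instead of letting the loop continue silently.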
This completed successfully. When I then try to resolve conflicts using your code
from sopa.segmentation.transcripts import resolve
resolve(sdata, baysor_temp_dir, gene_column, min_area=10)
I get the error message below
`[INFO] (sopa.segmentation.transcripts) Cells whose area is less than 10 microns^2 will be removed
Reading transcript-segmentation outputs: 0%| | 0/1 [00:00<?, ?it/s]
FileNotFoundError Traceback (most recent call last)
Cell In[15], line 3
1 from sopa.segmentation.transcripts import resolve
----> 3 resolve(sdata, baysor_temp_dir, gene_column, min_area=10)
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:45, in resolve(sdata, temp_dir, gene_column, patches_dirs, min_area, shapes_key)
42 if min_area > 0:
43 log.info(f"Cells whose area is less than {min_area} microns^2 will be removed")
---> 45 patches_cells, adatas = _read_all_segmented_patches(temp_dir, min_area, patches_dirs)
46 geo_df, cells_indices, new_ids = _resolve_patches(patches_cells, adatas)
48 image_key, _ = get_spatial_image(sdata, return_key=True)
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:140, in _read_all_segmented_patches(temp_dir, min_area, patches_dirs)
137 if patches_dirs is None or not len(patches_dirs):
138 patches_dirs = [subdir for subdir in Path(temp_dir).iterdir() if subdir.is_dir()]
--> 140 outs = [
141 _read_one_segmented_patch(path, min_area)
142 for path in tqdm(patches_dirs, desc="Reading transcript-segmentation outputs")
143 ]
145 patches_cells, adatas = zip(*outs)
147 return patches_cells, adatas
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:141, in <listcomp>(.0)
137 if patches_dirs is None or not len(patches_dirs):
138 patches_dirs = [subdir for subdir in Path(temp_dir).iterdir() if subdir.is_dir()]
140 outs = [
--> 141 _read_one_segmented_patch(path, min_area)
142 for path in tqdm(patches_dirs, desc="Reading transcript-segmentation outputs")
143 ]
145 patches_cells, adatas = zip(*outs)
147 return patches_cells, adatas
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:112, in _read_one_segmented_patch(directory, min_area, min_vertices)
109 cells_num = pd.Series(adata.obs["CellID"].astype(int), index=adata.obs_names)
110 del adata.obs["CellID"]
--> 112 with open(directory / "segmentation_polygons.json") as f:
113 polygons_dict = json.load(f)
114 polygons_dict = {c["cell"]: c for c in polygons_dict["geometries"]}
FileNotFoundError: [Errno 2] No such file or directory: 'tuto.zarr/.sopa_cache/baysor/0/segmentation_polygons.json'`
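Editor's note: the traceback above shows `resolve()` iterating over the subdirectories of the temporary directory and opening `segmentation_polygons.json` in each one. A quick pre-check, sketched here with a hypothetical helper `check_patch_outputs`, lists the patch directories that lack that file together with the JSON files they do contain.

```python
from pathlib import Path


def check_patch_outputs(temp_dir, expected="segmentation_polygons.json"):
    """Map each patch directory missing `expected` to the JSON files it contains."""
    report = {}
    for patch_dir in sorted(Path(temp_dir).iterdir()):
        if patch_dir.is_dir() and not (patch_dir / expected).exists():
            # Record what is there instead, to make a renamed output easy to spot
            report[patch_dir.name] = sorted(p.name for p in patch_dir.glob("*.json"))
    return report
```

Calling `check_patch_outputs(baysor_temp_dir)` before `resolve()` returns an empty dictionary when every patch directory has the expected file.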
I think this happens because the output file is called segmentation_polygons_2d.json, not segmentation_polygons.json as expected by your package. However, if I then manually rename it to segmentation_polygons.json, I get the following error message:
`[INFO] (sopa.segmentation.transcripts) Cells whose area is less than 10 microns^2 will be removed
Reading transcript-segmentation outputs: 0%| | 0/1 [00:00<?, ?it/s]
KeyError Traceback (most recent call last)
Cell In[17], line 3
1 from sopa.segmentation.transcripts import resolve
----> 3 resolve(sdata, baysor_temp_dir, gene_column, min_area=10)
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:45, in resolve(sdata, temp_dir, gene_column, patches_dirs, min_area, shapes_key)
42 if min_area > 0:
43 log.info(f"Cells whose area is less than {min_area} microns^2 will be removed")
---> 45 patches_cells, adatas = _read_all_segmented_patches(temp_dir, min_area, patches_dirs)
46 geo_df, cells_indices, new_ids = _resolve_patches(patches_cells, adatas)
48 image_key, _ = get_spatial_image(sdata, return_key=True)
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:140, in _read_all_segmented_patches(temp_dir, min_area, patches_dirs)
137 if patches_dirs is None or not len(patches_dirs):
138 patches_dirs = [subdir for subdir in Path(temp_dir).iterdir() if subdir.is_dir()]
--> 140 outs = [
141 _read_one_segmented_patch(path, min_area)
142 for path in tqdm(patches_dirs, desc="Reading transcript-segmentation outputs")
143 ]
145 patches_cells, adatas = zip(*outs)
147 return patches_cells, adatas
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:141, in <listcomp>(.0)
137 if patches_dirs is None or not len(patches_dirs):
138 patches_dirs = [subdir for subdir in Path(temp_dir).iterdir() if subdir.is_dir()]
140 outs = [
--> 141 _read_one_segmented_patch(path, min_area)
142 for path in tqdm(patches_dirs, desc="Reading transcript-segmentation outputs")
143 ]
145 patches_cells, adatas = zip(*outs)
147 return patches_cells, adatas
File ~/miniforge3/envs/xenium/lib/python3.10/site-packages/sopa/segmentation/transcripts.py:114, in _read_one_segmented_patch(directory, min_area, min_vertices)
112 with open(directory / "segmentation_polygons.json") as f:
113 polygons_dict = json.load(f)
--> 114 polygons_dict = {c["cell"]: c for c in polygons_dict["geometries"]}
116 cells_num = cells_num[cells_num.map(lambda num: len(polygons_dict[num]["coordinates"][0]) >= min_vertices)]
118 gdf = gpd.GeoDataFrame(index=cells_num.index, geometry=[shape(polygons_dict[cell_num]) for cell_num in cells_num])
KeyError: 'geometries'`
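Editor's note: the `KeyError` is consistent with the file now being a GeoJSON FeatureCollection, which stores its items under a top-level `"features"` key rather than `"geometries"`. The sketch below normalises both layouts into `(cell_id, geometry)` pairs. The function name `extract_geometries` is hypothetical, the `"cell"` key is taken from the traceback above, and reading the cell identifier from each feature's `"id"` member is an assumption about the newer Baysor output that should be checked against an actual file.

```python
def extract_geometries(polygons: dict) -> list[tuple]:
    """Return (cell_id, geometry) pairs from either polygon layout.

    `polygons` is the dictionary obtained from json.load() on the polygons file.
    """
    if "geometries" in polygons:
        # Older layout, as read by the sopa code in the traceback
        return [(geom["cell"], geom) for geom in polygons["geometries"]]
    if polygons.get("type") == "FeatureCollection":
        # Standard GeoJSON: each feature wraps its geometry.
        # Assumption: the cell identifier is stored in the feature's "id" member.
        return [(feat.get("id"), feat["geometry"]) for feat in polygons["features"]]
    raise ValueError(f"Unrecognised polygon layout with keys: {sorted(polygons)}")
```

This only illustrates why renaming the file is not enough; it is not the fix applied in sopa itself.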
I am likely missing something obvious, but I am not sure how to proceed. I want to eventually use this API to process some Xenium Spatial data, but I would like to know I can get this working with the toy test set. Any help would be appreciated!