AttributeError: Can only use .str accessor with string values! #3261

Open
2 of 3 tasks
DUAN-GAO opened this issue Sep 26, 2024 · 3 comments
Labels
Bug 🐛 Needs info❔ More information needed

Comments

DUAN-GAO commented Sep 26, 2024

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the main branch of scanpy.

What happened?

Dear scanpy team and fellow researchers,
I downloaded some scRNA-seq data from https://zenodo.org/records/3357167.
When I tried to use anndata.AnnData.concatenate to combine two read-count datasets, I got the error below. I checked their dimensions: Baron_human is [2133, 22758] and Segerstolpe is [8569, 17500], so they clearly have different sets of annotated genes.

Could you help? Many thanks!

Minimal code sample

all_adata = anndata.AnnData.concatenate(train_adata,test_adata)

Error output

  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 245, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!

Versions

>>> scanpy.logging.print_versions()
-----
anndata     0.8.0
scanpy      1.9.3
-----
CIForm              NA
PIL                 9.1.0
astunparse          1.6.3
cffi                1.15.1
colorama            0.4.6
cycler              0.10.0
cython_runtime      NA
dateutil            2.8.2
google              NA
h5py                3.11.0
igraph              0.10.4
joblib              1.2.0
kiwisolver          1.4.2
leidenalg           0.9.1
llvmlite            0.39.1
matplotlib          3.5.2
mpl_toolkits        NA
natsort             8.3.1
nt                  NA
numba               0.56.4
numpy               1.23.5
opt_einsum          v3.3.0
packaging           21.3
pandas              2.2.3
plotly              5.13.1
psutil              5.9.4
pyparsing           3.0.9
pytz                2022.1
scipy               1.10.0
session_info        1.0.0
six                 1.16.0
sklearn             1.2.1
texttable           1.6.7
threadpoolctl       3.1.0
torch               1.13.1+cpu
tqdm                4.64.1
typing_extensions   NA
yaml                6.0
zoneinfo            NA
zope                NA
-----
Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]
Windows-10-10.0.19041-SP0
-----
Session information updated at 2024-09-26 11:05
DUAN-GAO added the Bug 🐛 and Triage 🩺 (This issue needs to be triaged by a maintainer) labels on Sep 26, 2024
flying-sheep added the Needs info❔ (More information needed) label and removed the Triage 🩺 (This issue needs to be triaged by a maintainer) label on Sep 26, 2024
Member

flying-sheep commented Sep 26, 2024

Hi, please create a minimal reproducible example, including a complete stack trace of the error.

Mentioning which data set you used and the error message alone does not help us figure out what happened, as we lack the context of both the steps you took that led up to the error and where exactly in the code the error happened.
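
One way to collect a complete stack trace for such a report is to wrap the failing step and format the exception with the standard library. This is only a sketch: `capture_traceback` and `failing_step` are hypothetical names, and the failing call is a stand-in for the real concatenation step.

```python
import traceback


def capture_traceback(fn):
    """Run fn() and return the full traceback text if it raises, else None."""
    try:
        fn()
    except Exception:
        # format_exc() returns the same text Python would print to stderr.
        return traceback.format_exc()
    return None


def failing_step():
    # Stand-in for the step that fails; raises AttributeError on purpose.
    return "abc".missing_method()


report = capture_traceback(failing_step)
print(report)
```

Pasting the whole `report` text, together with the exact code that produced it, gives readers both the steps and the location of the failure.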

Author

DUAN-GAO commented Oct 8, 2024

I think the reason is that Baron_human and Segerstolpe have different dimensions, which I printed out along with the error below. If so, how can I fix it? Many thanks.
code

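import scanpy as sc

# Local paths and file stems for the two pancreas datasets
x_Traindata_path = 'F:/迅雷下载/Intra-dataset/Pancreatic_data/Baron_human/'
Train_name = 'Baron'
Testdata_path = 'F:/迅雷下载/Intra-dataset/Pancreatic_data/Segerstolpe/'
Testdata_name = 'Segerstolpe'

# Read each count matrix from CSV
test_adata = sc.read_csv(Testdata_path + Testdata_name + ".csv")
train_adata = sc.read_csv(x_Traindata_path + Train_name + ".csv")

# Note: this binds the AnnData class itself, not an instance
all_adata = sc.AnnData

# Called on the class, so train_adata is passed as `self` with no other object to concatenate
all_adata = all_adata.concatenate(train_adata)
all_adata = all_adata.concatenate(test_adata)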
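
Before concatenating, it can help to quantify how much the two gene sets actually overlap. The sketch below uses plain pandas `Index` set operations; the gene names are hypothetical stand-ins for `train_adata.var_names` and `test_adata.var_names`, which are assumed to be pandas indexes of gene identifiers.

```python
import pandas as pd

# Hypothetical gene names standing in for the var_names of each dataset.
train_genes = pd.Index(["INS", "GCG", "SST", "PPY"])
test_genes = pd.Index(["GCG", "SST", "KRT19"])

shared = train_genes.intersection(test_genes)    # genes present in both
only_train = train_genes.difference(test_genes)  # genes missing from the test set
only_test = test_genes.difference(train_genes)   # genes missing from the training set

print(len(shared), len(only_train), len(only_test))  # 2 2 1
```

A small `shared` set relative to either dataset would suggest that the gene identifiers use different naming schemes rather than that the genes truly differ.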

error

AnnData object with n_obs × n_vars = 2133 × 22757
AnnData object with n_obs × n_vars = 8569 × 17499

  File "<stdin>", line 1, in <module>
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\anndata\_core\anndata.py", line 1806, in concatenate
    out.var.columns.str.extract(pat, expand=False)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\accessor.py", line 224, in __get__
    accessor_obj = self._accessor(obj)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 191, in __init__
    self._inferred_dtype = self._validate(data)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 245, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!
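
The last frames of the traceback come from pandas: the `.str` accessor is requested on `out.var.columns`, and pandas refuses because that index does not hold strings. The following sketch reproduces only this pandas behavior in isolation, assuming integer labels as one example of a non-string index; it is not a diagnosis of why the column index in this report is non-string, and the regular expression is illustrative.

```python
import pandas as pd

pattern = r"-(0|1)$"  # illustrative suffix pattern, not taken from anndata

# With string labels, the accessor works and extracts the trailing suffix.
string_cols = pd.Index(["gene_ids-0", "gene_ids-1", "symbol"])
suffixes = string_cols.str.extract(pattern, expand=False)
print(list(suffixes))

# With non-string labels, merely accessing .str raises AttributeError.
numeric_cols = pd.Index([0, 1, 2])
try:
    numeric_cols.str.extract(pattern, expand=False)
    message = None
except AttributeError as err:
    message = str(err)
print(message)
```

Printing `type(adata.var.columns)` and its `dtype` for each input object is a quick way to see whether the column labels are strings before any concatenation is attempted.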

Author

DUAN-GAO commented Oct 8, 2024

Also
scanpy == 1.10.0
