Error when running EasySearch: only results for "Identity" column #9
Comments
Update! I figured out I was getting the issue above when removing the "Bender et al lines" from the id_to_study.txt file. Could you let me know how to get the latest pre-aligned version of OAS? I am running the command from the issue above:
Hi Laura, thank you for using KA-Search and highlighting this issue! Some time ago we decided to remove parts of the Bender 2020 study from OAS because we suspect some of the human sequences contain mouse sequences. However, because this would break the public pre-processed OAS for KA-Search, we updated the kasearch code to flag when user queries match Bender 2020 sequences. Such matches are returned without metadata, as that metadata is no longer in OAS. Unfortunately, we left a sequence that matches Bender 2020 sequences as the example sequence; this has now been changed (#10). For convenience, you can create your own pre-aligned version of OAS using the prepareOASdb.ipynb notebook. This will take some time and resources (~1 day on 20 CPUs), but you will then have an up-to-date pre-aligned version of OAS. I hope this helps; otherwise, please let me know if you have any other issues.
Hi Tobias, thank you so much for your fast answer!
Hi Tobias! I managed to download all human heavy sequences (IGH) from OAS, which are now sitting in a folder as .csv.gz.
Since all my .csv.gz files are in one folder, I think it cannot find them? Just wondering what I am doing wrong - I've been trying to run ka-search for a long time! Any help would be greatly appreciated :) Thank you so much!
I figured it out :) Indeed, the input files need to be in the OASdb folder structure, so local_oas_path/unpaired/Heavy/Human/ in my case. Then it works nicely :)
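For anyone hitting the same problem, here is a minimal sketch of moving a flat folder of downloaded .csv.gz files into that layout. It uses only the Python standard library and does not call kasearch. The function name `arrange_oas_files` and its parameters are hypothetical, and the `unpaired/Heavy/Human` sub-path is taken from the comment above, so adjust the chain and species for other downloads.

```python
import shutil
from pathlib import Path


def arrange_oas_files(flat_dir, local_oas_path, chain="Heavy", species="Human"):
    """Move every .csv.gz file in flat_dir into
    local_oas_path/unpaired/<chain>/<species>/ and return the new paths.

    Hypothetical helper: the folder layout is assumed from the thread,
    not taken from the kasearch documentation.
    """
    target = Path(local_oas_path) / "unpaired" / chain / species
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() materialises the listing before any file is moved
    for src in sorted(Path(flat_dir).glob("*.csv.gz")):
        dest = target / src.name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Calling `arrange_oas_files("oas_downloads", "local_oas_path")` (both paths are placeholders) would leave the files under `local_oas_path/unpaired/Heavy/Human/`, and files with other extensions stay where they were.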
Dear kasearch team,
First of all, thanks for all your work, kasearch is really promising!! I'm really hoping I can get it running soon.
I'm trying to run EasySearch on the sample sequence. I downloaded the publication dataset into this folder: /researchers/laura.twomey/Tools/omics_tools/kasearch/oasdb_20230111/
But I get this error:
I'm using: