linclust.sh: line 76: Segmentation fault #927

Open
sinamajidian opened this issue Jan 15, 2025 · 3 comments
@sinamajidian

sinamajidian commented Jan 15, 2025

Hi team!
We use mmseqs easy-linclust in FastOMA for comparative genomics and orthology inference. Recently, some users have been hitting a segmentation fault with mmseqs (like here).

Expected Behavior

When I use easy-linclust with mmseqs v14.7e284, it works well and generates the clusters reported in _cluster.tsv and _all_seqs.fasta.

conda install -c conda-forge -c bioconda mmseqs2=14.7e284
mmseqs easy-linclust --threads 2 singleton_unmapped.fa singleton_unmapped tmp_linclust  > log.out 2>&1

Current Behavior

When I use the latest mmseqs2 from conda (16.747c6), I get a segmentation fault.

I also built from the GitHub source (0898eb9). The build was successful, but I get the same segmentation fault.

Compute score and coverage
Query database size: 61247 type: Aminoacid
Target database size: 61247 type: Aminoacid
Calculation of alignments
[=========================tmp_linclust/758994687944913325/clu_tmp/4818287843092703793/linclust.sh: line 76: 409503 Segmentation fault      (core dumped) $RUNNER "$MMSEQS" "${ALIGN_MODULE}" "$INPUT" "$INPUT" "$RESULTDB" "${TMP_PATH}/aln" ${ALIGNMENT_PAR}
Error: Alignment step died
Error: Search died

Steps to Reproduce (for bugs)

I started each time from new folders.

try1) Failed with version 16.747c6

conda create -n mms python=3.12
conda activate mms
conda install -c conda-forge -c bioconda mmseqs2
mmseqs easy-linclust --threads 2 singleton_unmapped.fa singleton_unmapped tmp_linclust  > log.out 2>&1

try2) Failed with 0898eb9

$ date
Wed Jan 15 15:08:46 CET 2025
conda create -n mms14 python=3.12
conda activate mms14
conda install conda-forge::cmake

git clone git@github.com:soedinglab/MMseqs2.git
cd MMseqs2
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=. ..
make -j8
make install 

 $ ./MMseqs2/build/bin/mmseqs 
...
MMseqs2 Version: 0898eb901272f318bd099a4b7e56d221bbb050cc
mmseqs easy-linclust --threads 2 singleton_unmapped.fa singleton_unmapped tmp_linclust  > log.out 2>&1 
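
When comparing runs across builds, it can help to record the version string programmatically. The following is a minimal sketch, assuming the banner line has the form shown above (`MMseqs2 Version: <string>`). The function name `parse_mmseqs_version` is hypothetical and not part of MMseqs2; it only parses text that has already been captured from the `mmseqs` banner.

```python
import re
from typing import Optional

# Matches the banner line printed by mmseqs when run without arguments.
# The optional "2" also covers the "MMseqs Version:" spelling used in the
# issue template.
_VERSION_RE = re.compile(r"^MMseqs2? Version:\s*(\S+)", re.MULTILINE)


def parse_mmseqs_version(output: str) -> Optional[str]:
    """Return the version/commit string from mmseqs banner text, or None."""
    match = _VERSION_RE.search(output)
    return match.group(1) if match else None
```

The returned string can then be logged next to each run so that a failing and a working setup are easy to tell apart.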

MMseqs Output (for bugs)

successful_14.7e284.tar.gz
failed_16.747c6_089eb_.tar.gz

Context

This is the fasta file
singleton_unmapped.fa.zip

Your Environment

I'm using a university cluster login node with 48 CPUs.

  • Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters):
    0898eb901272f318bd099a4b7e56d221bbb050cc

  • Which MMseqs version was used (Statically-compiled, self-compiled, Homebrew, etc.):
    I used the latest conda package and also compiled it myself from the latest GitHub source.

  • For self-compiled and Homebrew: Compiler and Cmake versions used and their invocation:

$ cmake --version
cmake version 3.31.4
$ g++ --version
g++ (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3)
 $ gcc --version
gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3)
  • Server specifications (especially CPU support for AVX2/SSE and amount of system memory):
$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         43 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  48
  On-line CPU(s) list:   0-47
Vendor ID:               AuthenticAMD
  Model name:            AMD EPYC 7402 24-Core Processor
..
$  grep -o 'avx[^ ]*' /proc/cpuinfo | head -2
avx
avx2
$  grep -o 'sse[^ ]*' /proc/cpuinfo | head -2
sse
sse2

$ cat /proc/meminfo
MemTotal:       527942460 kB
MemFree:         5523840 kB
MemAvailable:   425549692 kB
  • Operating system and version:
$ cat /etc/os-release 
NAME="Red Hat Enterprise Linux"
VERSION="9.4 (Plow)"
ID="rhel"
ID_LIKE="fedora"
..

The previous github issue here is related to this.

@martin-steinegger
Member

martin-steinegger commented Jan 18, 2025

Thank you for the report. This should be fixed with commit 492297b.

The problem was introduced by changes to our GPU database code. Your sequence *RRTVALGFHPTNPLQFP...RKGLNH*TALTLLVP*QFENLFGPCR, which begins with *, triggered the GPU sequence mapping code.
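
To check whether an input file contains records of this kind, a small scan is enough. This is a minimal sketch, not part of MMseqs2: the helper names `read_fasta` and `leading_stop_records` are hypothetical, and it assumes plain FASTA input where each record starts with a `>` header line.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def leading_stop_records(lines: Iterable[str]) -> List[str]:
    """Return headers of records whose sequence begins with '*'."""
    return [h for h, seq in read_fasta(lines) if seq.startswith("*")]
```

For example, `leading_stop_records(open("singleton_unmapped.fa"))` would list the headers of any sequences in the attached file that start with `*`.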

@sinamajidian
Author

Great, thanks!
I can confirm that this solved the issue.
Looking forward to having it on Conda too.

Best,
Sina

@martin-steinegger
Member

It's on the way. We made a new release: https://github.com/soedinglab/MMseqs2/releases/tag/17-b804f
