Test switching from threads to processes #792

Open
wpietri opened this issue Jan 14, 2025 · 0 comments

We had a number of problems with runs hanging during the 1.0 push. A common issue had a stack trace like this:

Thread 0x000079a1b2a006c0 (most recent call first):
File "/usr/local/lib/python3.10/ssl.py", line 1165 in read
File "/usr/local/lib/python3.10/ssl.py", line 1292 in recv
File "/venv/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 128 in read
File "/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 217 in _receive_event
File "/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 177 in _receive_response_headers
File "/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 106 in handle_request
File "/venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 103 in handle_request
File "/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 236 in handle_request
File "/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 236 in handle_request
File "/venv/lib/python3.10/site-packages/httpx/_client.py", line 1027 in _send_single_request
File "/venv/lib/python3.10/site-packages/httpx/_client.py", line 991 in _send_handling_redirects
File "/venv/lib/python3.10/site-packages/httpx/_client.py", line 954 in _send_handling_auth
File "/venv/lib/python3.10/site-packages/httpx/_client.py", line 926 in send
File "/venv/lib/python3.10/site-packages/mistralai/basesdk.py", line 218 in do
File "/venv/lib/python3.10/site-packages/mistralai/utils/retries.py", line 72 in do_request
File "/venv/lib/python3.10/site-packages/mistralai/utils/retries.py", line 176 in retry_with_backoff
File "/venv/lib/python3.10/site-packages/mistralai/utils/retries.py", line 104 in retry
File "/venv/lib/python3.10/site-packages/mistralai/basesdk.py", line 255 in do_request
File "/venv/lib/python3.10/site-packages/mistralai/chat.py", line 122 in complete
File "/venv/lib/python3.10/site-packages/modelgauge/suts/mistral_client.py", line 54 in request
File "/venv/lib/python3.10/site-packages/modelgauge/suts/mistral_sut.py", line 64 in evaluate
File "/venv/lib/python3.10/site-packages/modelbench/benchmark_runner.py", line 300 in handle_item
File "/venv/lib/python3.10/site-packages/modelgauge/pipeline.py", line 211 in run
File "/usr/local/lib/python3.10/threading.py", line 953 in run
File "/usr/local/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/local/lib/python3.10/threading.py", line 973 in _bootstrap

This was discussed here and elsewhere around then.

A core problem is that Python's thread management is limited: you can't kill a thread, so a worker stuck in a blocking call like the one above stays stuck. Although there are other possible solutions to this particular problem, the only thing that eliminates a whole class of problems is to switch to subprocesses, which the parent can terminate when they hang. But that may introduce other problems, so we should try it out first.
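
As a rough illustration of why subprocesses help here, the sketch below runs work in a child process and gives up on it after a deadline. It is not modelbench's code: `run_with_deadline` and its parameters are hypothetical names, and it uses only the standard library's `subprocess.run`, which kills the child and waits for it when the timeout expires.

```python
import subprocess
import sys
from typing import Optional


def run_with_deadline(code: str, timeout_s: float) -> Optional[str]:
    """Run a Python snippet in a child process (hypothetical helper).

    Returns the child's stdout, or None if it did not finish in time.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child and waited for it,
        # so nothing is left hanging, unlike a stuck thread.
        return None
    return completed.stdout


if __name__ == "__main__":
    # A well-behaved child returns its output.
    print(run_with_deadline("print('ok')", timeout_s=30))
    # A child that blocks too long is killed instead of hanging the run.
    print(run_with_deadline("import time; time.sleep(60)", timeout_s=1))
```

A real worker would need to pass items and results between processes (for example via queues or pipes), and that is one place the "other problems" could show up: arguments must be serializable, and per-process startup and memory costs are higher than for threads.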
