add doc to explain multithreading #1154
base: main
@@ -0,0 +1,179 @@
---
title: Threading in Streamlit
slug: /develop/concepts/design/multithreading
---

# Multithreading in Streamlit

Multithreading is a common technique to improve the efficiency of computer programs. It's a way for processors to multitask. Streamlit uses threads within its architecture, which can make it difficult for app developers to include their own multithreaded processes. Streamlit does not officially support multithreading in app code, but this guide provides information on how it can be accomplished.

## Prerequisites

- You should have a basic understanding of Streamlit's [architecture](/develop/concepts/architecture/architecture).

## Threads created by Streamlit

Streamlit creates two types of threads in Python:

- The **server thread** runs the Tornado web (HTTP + WebSocket) server.
- A **script thread** runs page code, with one thread for each script run in a session.

When a user connects to your app, this creates a new session and runs a script thread to initialize the app for that user. As the script thread runs, it renders elements in the user's browser tab and reports state back to the server. When the user interacts with the app, another script thread runs, re-rendering the elements in the browser tab and updating state on the server.

This is a simplified illustration to show how Streamlit works:

![Each user session uses script threads to communicate between the user's front end and the Streamlit server.](/images/concepts/Streamlit-threading.svg)

## `streamlit.errors.NoSessionContext`

Many Streamlit commands, including `st.session_state`, expect to be called from a script thread. When Streamlit is running as expected, such commands use the `ScriptRunContext` attached to the script thread to ensure they work within the intended session and update the correct user's view. When those Streamlit commands can't find any `ScriptRunContext`, they raise a `streamlit.errors.NoSessionContext` exception. Depending on your logger settings, you may also see a console message identifying a thread by name and warning, "missing ScriptRunContext!"
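
The idea of a context that travels with a thread can be pictured with a small standard-library sketch. This is an analogy only and not Streamlit's implementation: `attach_context`, `require_context`, `NoContextError`, and the attribute name are hypothetical names invented for this illustration. A context object is attached to one thread, and a pretend command succeeds only when it runs in a thread that carries that context.

```python
import threading

# Hypothetical attribute name, chosen only for this illustration.
_CTX_ATTR = "example_run_context"


class NoContextError(RuntimeError):
    """Stand-in for the error raised when no context is attached."""


def attach_context(thread, ctx):
    """Attach a context object to a thread and return the thread."""
    setattr(thread, _CTX_ATTR, ctx)
    return thread


def require_context():
    """Return the context attached to the current thread, or raise."""
    current = threading.current_thread()
    ctx = getattr(current, _CTX_ATTR, None)
    if ctx is None:
        raise NoContextError(f"Thread {current.name!r}: missing context")
    return ctx


def command(outcomes, label):
    """A pretend command that only works when a context is attached."""
    try:
        ctx = require_context()
        outcomes[label] = f"ok in session {ctx['session_id']}"
    except NoContextError:
        outcomes[label] = "no context"


outcomes = {}
bare = threading.Thread(target=command, args=(outcomes, "bare"))
attached = attach_context(
    threading.Thread(target=command, args=(outcomes, "attached")),
    {"session_id": "abc"},
)
for thread in (bare, attached):
    thread.start()
for thread in (bare, attached):
    thread.join()
print(outcomes)
```

The thread without an attached context takes the error path. In the same spirit, a Streamlit command called from a plain custom thread has no way to tell which session it belongs to.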

## Creating custom threads

When you work with IO-heavy operations like remote queries or data loading, you may need to mitigate delays. A general programming strategy is to create threads and let them work concurrently. However, if you do this in a Streamlit app, these custom threads may have difficulty interacting with your Streamlit server.

This section introduces two patterns to let you create custom threads in your Streamlit app. These are only patterns to provide a starting point rather than complete solutions.

### Option 1: Do not use Streamlit commands within a custom thread

If you don't call Streamlit commands from a custom thread, you can avoid the problem entirely. Fortunately, Python's `threading` module provides ways to start a thread and collect its result from another thread.

In the following example, five custom threads are created from the script thread. After the threads are finished running, their results are displayed in the app.

```python
import streamlit as st
import time
from threading import Thread


class WorkerThread(Thread):
    def __init__(self, delay):
        super().__init__()
        self.delay = delay
        self.return_value = None

    def run(self):
        # Runs in the custom thread; no Streamlit commands are called here.
        start_time = time.time()
        time.sleep(self.delay)
        end_time = time.time()
        self.return_value = f"start: {start_time}, end: {end_time}"


delays = [5, 4, 3, 2, 1]
threads = [WorkerThread(delay) for delay in delays]
for thread in threads:
    thread.start()
# Wait for every custom thread to finish before displaying anything.
for thread in threads:
    thread.join()
# Back in the script thread: display the collected results.
for i, thread in enumerate(threads):
    st.header(f"Thread {i}")
    st.write(thread.return_value)

st.button("Rerun")
```

<Cloud name="doc-multithreading-no-st-commands-batched" height="700px" />
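
As a possible alternative to subclassing `Thread`, the same batched pattern can be sketched with `concurrent.futures.ThreadPoolExecutor` from the standard library. This is a minimal sketch and not part of the example above: `timed_sleep` and `run_batched` are hypothetical helper names, the delays are shortened, and `print` stands in for the `st.write` call that the script thread would make in an app.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_sleep(delay):
    """Sleep for `delay` seconds and report when the work started and ended."""
    start_time = time.time()
    time.sleep(delay)
    end_time = time.time()
    return f"start: {start_time}, end: {end_time}"


def run_batched(delays):
    """Run one worker per delay and return results in the order of `delays`."""
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        # executor.map yields results in input order, not completion order.
        return list(executor.map(timed_sleep, delays))


results = run_batched([0.3, 0.2, 0.1])
for i, result in enumerate(results):
    # In an app, the script thread would call st.write(result) here.
    print(f"Thread {i}: {result}")
```

Because the worker function only returns a value, no Streamlit command runs outside the calling thread. The `with` block does not exit until every worker has finished.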

If you want to display results in your app as various custom threads finish running, use containers. In the following example, five custom threads are created similarly to the previous example. However, five containers are initialized before running the custom threads and a `while` loop is used to display results as they become available. Since the Streamlit `write` command is called outside of the custom threads, this does not raise an exception.

```python
import streamlit as st
import time
from threading import Thread


class WorkerThread(Thread):
    def __init__(self, delay):
        super().__init__()
        self.delay = delay
        self.return_value = None

    def run(self):
        start_time = time.time()
        time.sleep(self.delay)
        end_time = time.time()
        self.return_value = f"start: {start_time}, end: {end_time}"


delays = [5, 4, 3, 2, 1]
# Create one container per thread before any thread starts.
result_containers = []
for i, delay in enumerate(delays):
    st.header(f"Thread {i}")
    result_containers.append(st.container())

threads = [WorkerThread(delay) for delay in delays]
for thread in threads:
    thread.start()
thread_lives = [True] * len(threads)

# Poll from the script thread and write each result once its thread is done.
while any(thread_lives):
    for i, thread in enumerate(threads):
        if thread_lives[i] and not thread.is_alive():
            result_containers[i].write(thread.return_value)
            thread_lives[i] = False
    time.sleep(0.5)

for thread in threads:
    thread.join()

st.button("Rerun")
```

<Cloud name="doc-multithreading-no-st-commands-iterative" height="700px" />
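
The polling loop above could also be replaced by `concurrent.futures.as_completed`, which hands back each result as soon as its worker finishes. The following is a sketch under assumptions rather than a drop-in replacement: `timed_sleep`, `run_iterative`, and the `on_result` callback are hypothetical names, and a plain list stands in for the containers. In an app, the callback might be something like `lambda i, result: result_containers[i].write(result)`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def timed_sleep(delay):
    """Sleep for `delay` seconds and report when the work started and ended."""
    start_time = time.time()
    time.sleep(delay)
    end_time = time.time()
    return f"start: {start_time}, end: {end_time}"


def run_iterative(delays, on_result):
    """Call on_result(index, result) in the calling thread as workers finish."""
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        future_to_index = {
            executor.submit(timed_sleep, delay): i
            for i, delay in enumerate(delays)
        }
        # as_completed is iterated here, in the calling thread, so the
        # callback never runs inside a worker thread.
        for future in as_completed(future_to_index):
            on_result(future_to_index[future], future.result())


finished = []
run_iterative([0.3, 0.2, 0.1], lambda i, result: finished.append((i, result)))
for i, result in finished:
    print(f"Thread {i}: {result}")
```

The callback runs in whichever thread calls `run_iterative`. If that is the script thread, the Streamlit commands inside the callback still execute there.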

### Option 2: Expose `ScriptRunContext` to the thread

If you want to call Streamlit commands from within your custom threads, you must attach the correct `ScriptRunContext` to the thread.

<Warning>

- This is not officially supported and may change in a future version of Streamlit.
- This may not work with all Streamlit commands.

  Review comment: Do we know exhaustively which commands these might be, or should this just be a generic warning to cover the possibilities?

- Ensure custom threads do not outlive the script thread owning the `ScriptRunContext`. Leaking of `ScriptRunContext` may cause security vulnerabilities, fatal errors, or unexpected behavior.

</Warning>

In the following example, a custom thread with `ScriptRunContext` attached can call `st.write` without a warning.

```python
import streamlit as st
from streamlit.runtime.scriptrunner import add_script_run_ctx, get_script_run_ctx
# Review comment: nit: Before we release this, we might want to double-check
# if it might be better to expose
# Review comment: If there's any hope of doing so relatively quickly/easily,
# that'd be great!
# Review comment: I don't know if it's relatively quickly/easily. Probably
# needs some discussion with the product (cc @jrieke). I think in the best
# case, we can semi-officially expose it to a stable namespace (e.g.,
# Review comment: I do have a warning included in the section:
# https://deploy-preview-1154--streamlit-docs.netlify.app/develop/concepts/design/multithreading#option-2-expose-scriptruncontext-to-the-thread
# I'll bring this up in office hours to see if there are any other concerns
# before publishing. My biggest question is if I should include a little more
# careful handling of the exposed thread context. The warning states that
# custom threads should not outlive the script thread from whence they came,
# but the example doesn't actually do enough to prevent that since it does
# nothing to handle an interrupted script run.
# Review comment: Yeah definitely needs a bit of discussion and thought about
# how we do that API. I know Joshua wanted to do it but I never really deeply
# looked into multithreading so far, so it doesn't really make sense to do
# something ad-hoc right now. I think it's fine mentioning the internal API in
# a guide but we should definitely put in a disclaimer that it's an internal,
# unstable API and will change in the future.
import time
from threading import Thread


class WorkerThread(Thread):
    def __init__(self, delay, target):
        super().__init__()
        self.delay = delay
        self.target = target

    def run(self):
        # runs in custom thread, but can call Streamlit APIs
        start_time = time.time()
        time.sleep(self.delay)
        end_time = time.time()
        self.target.write(f"start: {start_time}, end: {end_time}")


delays = [5, 4, 3, 2, 1]
result_containers = []
for i, delay in enumerate(delays):
    st.header(f"Thread {i}")
    result_containers.append(st.container())

threads = [
    WorkerThread(delay, container)
    for delay, container in zip(delays, result_containers)
]
for thread in threads:
    add_script_run_ctx(thread, get_script_run_ctx())
    thread.start()

for thread in threads:
    thread.join()
# Review comment on lines +159 to +174: (Commenting for engineering review
# later in the week when people are back from the holidays): Should we be
# storing the threads in Session State and running a check for threads at the
# top of the script? And/or disabling widgets when threading? Adding
# try-except to the Streamlit commands in the threads? Although the script
# works like this, if a user reruns the app before the page finishes loading,
# that'd be an issue. Hence this is very fragile, right?

st.button("Rerun")
```

<Cloud name="doc-multithreading-expose-context" height="700px" />
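
The warning asks that custom threads not outlive the script thread that owns the `ScriptRunContext`. One general shape for that, sketched here with the standard library only, is a shared stop signal plus a `try`/`finally` that always joins the workers. Everything in this sketch is an assumption for illustration: `StoppableWorker`, `run_with_cleanup`, and `interrupted_body` are hypothetical names, and the `RuntimeError` merely simulates an interruption. Whether and where a rerun actually interrupts a script thread is not covered by this guide, so treat this as a starting point rather than a verified fix.

```python
import threading


class StoppableWorker(threading.Thread):
    """Worker that waits on a shared stop event instead of sleeping blindly."""

    def __init__(self, delay, stop_event):
        super().__init__()
        self.delay = delay
        self.stop_event = stop_event
        self.completed = False

    def run(self):
        # Event.wait returns True if the event was set before the timeout
        # (stop requested) and False if the full delay elapsed.
        stopped_early = self.stop_event.wait(timeout=self.delay)
        self.completed = not stopped_early


def run_with_cleanup(workers, stop_event, body):
    """Start the workers, run `body`, then always stop and join the workers."""
    for worker in workers:
        worker.start()
    try:
        body()
    finally:
        # Runs even if `body` raises, so no worker outlives this call.
        stop_event.set()
        for worker in workers:
            worker.join()


def interrupted_body():
    """Simulate the calling code being interrupted partway through."""
    raise RuntimeError("simulated interruption")


stop_event = threading.Event()
workers = [StoppableWorker(delay, stop_event) for delay in (5, 5)]
try:
    run_with_cleanup(workers, stop_event, interrupted_body)
except RuntimeError:
    print("body was interrupted; workers were stopped and joined")

print([worker.is_alive() for worker in workers])
```

The workers only stop promptly because they wait on the event. A worker blocked in `time.sleep` or a long network call would not see the signal until that call returned.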

Review comment: Should this in fact be "All Streamlit commands?" @lukasmasuch

Review comment: Hmm, not 100% sure. I believe a lot of commands will not raise a `NoSessionContext` if there is no `ScriptRunContext`. This is to support bare execution (execute a Streamlit app script with pure python). But calling these commands with the context won't do anything. But I believe all Streamlit commands require a `ScriptRunContext` to be fully functional.