Dealing with lambda cold starts in the real-time data pipeline #312
lucylu-coveo commented Oct 16, 2024
The blog post is also available as a Google Doc: https://docs.google.com/document/d/1N_3vd_IJpnqAVrtTufF1AiGTbgBEmDxHvbs144nq5lU/edit?tab=t.0
_posts/2024-10-22-dealing-with-lambda-cold-start-in-real-time-data-pipeline.md
## 2. We kept all the old version of Lambda functions
Since SnapSart only works with published versions of Lambda, we started publishing versions after enabling SnapStart (before that we had been only using the unpublished $LATEST version). As we didn’t delete old versions (and Lambda doesn’t provide a configurable feature to delete old versions), we ended up with over 50 versions of a single Lambda function. We weren’t aware that all these versions were being re-initialized periodically by Lambda service, until we received a notification about an initialization failure for a very old version. Although this behavior was mentioned in this [documentation](https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html#snapstart-pricing) under SnapStart pricing section, we overlooked it after seeing the statement “there's no additional cost for SnapStart”. To mitigate this issue and to reduce costs associated with re-initializations, we created the following bash script to keep only the latest two versions of Lambda function. The previous version is kept in case we need to roll back. This script is triggered automatically after a new version is published.
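The post's own script is not part of this hunk. As a rough illustration of the cleanup described above, here is a minimal sketch that assumes the AWS CLI is installed and configured. The function names `versions_to_delete` and `prune_old_versions` and the default of keeping two versions are hypothetical choices made for this example, not the author's implementation.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the default of 2 kept versions are
# illustrative assumptions, not the script from the post.

# Print the numeric versions to delete, one per line, keeping the
# $1 highest-numbered ones. Non-numeric entries such as $LATEST are ignored.
versions_to_delete() {
  local keep="$1"
  shift
  printf '%s\n' "$@" \
    | grep -E '^[0-9]+$' \
    | sort -n \
    | awk -v keep="$keep" '{ v[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print v[i] }'
}

# Delete every published version of a function except the newest ones.
# Usage: prune_old_versions <function-name> [versions-to-keep]
prune_old_versions() {
  local function_name="$1"
  local keep="${2:-2}"
  local versions version

  # Returns all versions (including $LATEST) as whitespace-separated text.
  versions="$(aws lambda list-versions-by-function \
    --function-name "$function_name" \
    --query 'Versions[].Version' \
    --output text)"

  # Word splitting of $versions is intentional here.
  for version in $(versions_to_delete "$keep" $versions); do
    echo "Deleting ${function_name}:${version}"
    aws lambda delete-function \
      --function-name "$function_name" \
      --qualifier "$version"
  done
}
```

Calling `prune_old_versions my-function 2` (with a placeholder function name) after publishing would leave the current and previous versions in place. Note that Lambda refuses to delete a version that an alias still points to, so a real pipeline would probably need to handle that case.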
"This script is triggered automatically after a new version is published."
Improved: "This script is automatically triggered after a new version is published."
Hmmmm I prefer Lucy's version here 🤷🏼 We need a native speaker's input here. CC @mpayne-coveo
The first one sounds more correct to me, but they both sound valid
I won't argue too much, but just for fun, here's what ChatGPT thinks about this:
`
The improved sentence, “This script is automatically triggered after a new version is published,” is better primarily because of clarity, flow, and emphasis:
Clarity and Readability: When “automatically” is placed directly after “is,” it’s immediately clear that automation is the focus. This placement eliminates any chance of misinterpreting what “automatically” is modifying, making the sentence clearer on first read.
Natural Flow: Placing “automatically” right after “is” aligns with the natural rhythm of English, where adverbs often come directly after auxiliary verbs (like “is,” “was,” “has”). This structure feels more fluid and effortless to read.
Emphasis: Putting “automatically” earlier in the sentence highlights the automation aspect, making it a more prominent detail. Since this is likely an important characteristic, bringing it forward can subtly enhance its importance.
In professional or technical writing, concise, naturally flowing sentences are often preferred as they increase comprehension and readability.
`
A few comments; otherwise an amazing post, as always.
Since SnapSart only works with published versions of Lambda, we started publishing versions after enabling SnapStart (before that we had been only using the unpublished $LATEST version).
I would make the part in parentheses a separate sentence:
Since SnapStart only works with published versions of Lambda, we started publishing versions after enabling SnapStart. (Before that, we had been using only the unpublished $LATEST version).
I also think the placement of the word "only" in that sentence could be better (I moved it in my "suggestion" above). I'm not 100% sure that is actually better; I'm not a native English speaker 😅.
👌🏼