Feature/fetch on build #1
Conversation
This was also my initial idea for building the site from Sessionize data: a dynamic JS-driven "data" file that pulls data from Sessionize at build time. However, I made a separate step for fetching the data and writing it to json files because the Sessionize API documentation requests that we cache the data, and it also seemed like a good idea to have the data committed to the repo.

Although the live build approach in this PR should work very well from when the schedule is published up through the day of the event (updating to reflect changes, etc.), once the event has passed, having Sessionize as a dependency of the build process seems like an extra possible point of failure, since the data likely won't be changing at that point. I think we ought to have a config setting for build_mode that allows us to easily change the data source from Sessionize to json files in the repo (rough sketch below).

Also, I wonder how the Sessionize-dependent build process would respond to a failed API request? I think the build would probably fail, but since we're not explicitly checking, maybe it would build the site with blank schedule, sessions, and speakers pages?

Side note: This being the 2018 site with no need for updates from Sessionize, I'm considering this PR as intended for the 2019 repo once we add sessions, speakers, and schedule pages.
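Something like this is roughly what I have in mind. Just a sketch: `BUILD_MODE`, the data file path, and the Sessionize endpoint id are placeholders, not anything that exists in the repo yet.

```js
// _data/schedule.js: hypothetical data file that switches between live Sessionize
// data and json files committed to the repo
const fs = require("fs");
const fetch = require("node-fetch");

module.exports = async function () {
    if (process.env.BUILD_MODE !== "sessionize") {
        // past events: read the cached data committed to the repo
        return JSON.parse(fs.readFileSync("data/sessionize.json", "utf8"));
    }

    // live mode: fetch fresh data and fail the build loudly on a bad response
    const response = await fetch("https://sessionize.com/api/v2/YOUR_EVENT_ID/view/All");
    if (!response.ok) {
        throw new Error(`Sessionize request failed: ${response.status}`);
    }
    return response.json();
};
```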
@zekefarwell, yeah, great points, the resilience of the build if the API is down is probably pretty important for the live site. This pivot was mostly focused on how we would manage any incremental changes in the 2019 Sessionize data and pull them into the site, in essence caching them as part of the daily build rather than having every single user hit the API on every page request.

But yeah, for past events with static, un-changing data, we should probably never do a fresh hit on the API, so we'd need an environment variable somewhere to toggle between .json and sessionize, or make a commit each season to winterize everything and wrap it up. The trick is, I want eleventy to own the build process, and it seems like that's going to have a bias toward saving things in the data directory.

Let's leave this PR on hold for now.
@KyleMit, I spent some time thinking about the problem yesterday. If we were hosting this site on a server instead of Netlify, I think I would set up the Sessionize fetch as a cron job running daily or every few hours: fetch the latest data from Sessionize, write it to json files, and commit any changes to the repo.
The build process itself would be triggered by webhook whenever there's a commit to master.
Although the build process wouldn't directly fetch data from Sessionize, it would be triggered whenever the fetch cron job committed changes to the repo. The fetch job could even run really frequently, and the site would only rebuild if there were actual data changes committed to the repo.

Unfortunately, since we aren't running a server we can't just add a cron job. I definitely found ways to trigger a Netlify build on a schedule, but not a custom process. Perhaps this would be a good use case for a serverless function like an AWS Lambda or Azure Function? I found an example of someone doing pretty much what I've described with a Lambda.

Or maybe this idea is overly complex and we should just have a simple way to toggle the data source from Sessionize to data files. We could run the site in Sessionize build mode for August and September (or starting whenever the schedule is published), and then toggle it over to data file mode after the event is over. We'd need to be able to run the Sessionize data fetch script manually and commit the data files to the repo before switching it over, of course.

To deal with potential Sessionize API failure we could make sure the Netlify build fails if data doesn't come back. This would mean a Sessionize failure would prevent us from building the site for an unrelated change, but failure does seem pretty unlikely. We'd also want some sort of cron job to trigger at least a daily build. Seems like there's a fairly straightforward Zapier-Netlify integration for that.
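For the server-hosted version, the cron-driven fetch step might look roughly like this. Again, just a sketch: the endpoint id, file path, and branch are placeholders, and it assumes the git CLI is available on the box.

```js
// fetch-sessionize.js: rough sketch of the cron-driven fetch job (server scenario);
// the endpoint id, file path, and branch are placeholders
const fs = require("fs");
const { execSync } = require("child_process");
const fetch = require("node-fetch");

async function run() {
    const response = await fetch("https://sessionize.com/api/v2/YOUR_EVENT_ID/view/All");
    if (!response.ok) {
        throw new Error(`Sessionize request failed: ${response.status}`);
    }

    const json = JSON.stringify(await response.json(), null, 2);
    fs.writeFileSync("data/sessionize.json", json);

    // only commit (and therefore only trigger a rebuild) if the data actually changed
    const status = execSync("git status --porcelain data/sessionize.json").toString();
    if (status.trim()) {
        execSync("git add data/sessionize.json");
        execSync('git commit -m "Update sessionize data"');
        execSync("git push origin master");
    }
}

run().catch((err) => {
    console.error(err);
    process.exit(1);
});
```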
Love watching this conversation and it looks like you are having fun with the challenges! :)
Hot Take: According to Sarah Drasner of @netlify, we can use Netlify Functions, which are available on the free tier & allow for 100 hours of runtime and 125k requests per month!
Yeah, I was looking at the Netlify Functions feature the other day. Looks like it could work nicely. Sounds like they are just using AWS Lambda, but it's convenient that we can manage it right from the Netlify admin without setting up an account with a separate service! The piece I haven't quite wrapped my head around is how we can get the function to commit to the repo. Seems like in the context of a serverless function the git binary wouldn't be available to use. I've seen suggestions that maybe the GitHub API can be used instead, though?
Yeah, Netlify Functions is only really replacing the AWS Lambda portion of things and allowing us to run serverless compute on command so we can schedule a cron job. That's probably preferable only in that we can commit the source code directly inside of our repository and minimize another set of third party dependencies & access credentials.

The second half is, once we have that compute, our persistence layer is just using GitHub / .json files, so we need an API to read/write directly to git/GitHub. We can generate a scoped Personal Access Token and use the GitHub API. Octokit provides wrappers for the GitHub API, including a REST API client for JavaScript, which would look kinda like this:

```js
// create client (@octokit/rest v16-style constructor, matching the calls below)
const Octokit = require("@octokit/rest");

const octokit = new Octokit({
    auth: process.env.GITHUB_TOKEN
});

// default params shared by every request
const PARAMS = {
    owner: "vtcodecamp",
    repo: "2018.vtcodecamp.org",
    committer: {
        name: "some-bot",
        email: "[email protected]"
    }
};

// read file (also gives us the blob sha required to update it);
// the awaits below assume we're inside an async function
const readFileParams = { ...PARAMS, path: "sessionize.json" };
const { data } = await octokit.repos.getContents(readFileParams);

// `content` holds the freshly fetched sessionize data
const writeFileParams = {
    ...PARAMS,
    path: "sessionize.json",
    message: "Update sessionize data.",
    sha: data.sha,
    content: Buffer.from(JSON.stringify(content, null, 2)).toString("base64")
};

// write file
await octokit.repos.updateFile(writeFileParams);
```
Proof of concept seems to be going okay, pushed to feature/fetch_sync. Please welcome our new bot overlords!
Sweet - now we have a Netlify Functions endpoint, which will execute the data fetch and commit the results back to the repo. All of that will get executed every time we hit the following address: https://vtcodecamp2018.netlify.com/.netlify/functions/fetchData

Then I set up a Zapier account (same credentials as the Netlify account), which offers a set of If This Then That actions and triggers, and set up a daily request to the Netlify Functions URL, which will kick off the data cache build.
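For anyone following along, the function behind that endpoint is shaped roughly like this. This is a sketch rather than the actual file on the branch: the endpoint id is a placeholder and the Octokit commit step is elided.

```js
// functions/fetchData.js: a sketch of the Netlify Function behind the endpoint;
// the endpoint id is a placeholder and the commit step is elided for brevity
const fetch = require("node-fetch");

exports.handler = async function (event, context) {
    try {
        const response = await fetch("https://sessionize.com/api/v2/YOUR_EVENT_ID/view/All");
        if (!response.ok) {
            throw new Error(`Sessionize responded with ${response.status}`);
        }
        const data = await response.json();

        // ...commit `data` back to the repo via the Octokit calls shown earlier...

        return { statusCode: 200, body: "Sessionize data refreshed" };
    } catch (err) {
        // surface failures to whatever scheduled the request (e.g. the Zapier zap)
        return { statusCode: 500, body: `Fetch failed: ${err.message}` };
    }
};
```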
Closed by PR #2
This looks awesome, @KyleMit! Thanks for digging into this process. I'm pretty busy with a home remodeling project this week, but I'll make some time to review this and make sure I understand it all soon.
So now the whole data fetch process piggybacks on the sessionize fetch call. Ignoring any static `.json` files for now, we can build up all the applicable objects/arrays in-memory and return them on the global `data` property.

Rather than having to own the `fs.writeFile` part as an interim step of fetching the data and loading it into the data directory, we can skip straight to the data directory. The actual pages don't care where/when the data came from.

And just so we have the source available if we ever wanted to get back to the raw json info, we can leverage the `debug.11ty.js` page and have 11ty handle the file IO for us and preserve the original json data.
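For reference, a minimal sketch of what that `debug.11ty.js` page could look like, assuming the Sessionize data is exposed through a global data file named `_data/sessionize.js` (that file name and the permalink are assumptions):

```js
// debug.11ty.js: hypothetical 11ty JavaScript template that re-serializes the fetched
// Sessionize data so the raw json is preserved in the build output
class Debug {
    data() {
        return {
            // write the output to a .json file instead of an .html page
            permalink: "/debug/sessionize.json",
            eleventyExcludeFromCollections: true
        };
    }

    render(data) {
        // assumes a global data file named _data/sessionize.js exposes `sessionize`
        return JSON.stringify(data.sessionize, null, 2);
    }
}

module.exports = Debug;
```

That way 11ty owns the file IO: the raw json is just another page in the build output, generated from the same in-memory data the schedule, sessions, and speakers pages use.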