
Feature/fetch on build #1

Closed
wants to merge 3 commits into from

Conversation

@KyleMit (Member) commented Jun 14, 2019

So now the whole data fetch process piggybacks on the sessionize fetch call:

// global data file, e.g. _data/sessionize.js
const fetch = require('node-fetch'); // fetch isn't global in node, so pull it in

module.exports = async function() {
    const response = await fetch('https://sessionize.com/api/v2/bm8zoh0m/view/all');
    const sessionize = await response.json();

    const { levels, formats, categories, speakers, sessions } = sessionize;

    // do any data mapping / ETL

    return { levels, formats, categories, speakers, sessions };
};

Ignoring any static .json files for now, we can build up all the applicable objects/arrays in memory and return them on the global data property.

Rather than having to own the fs.writeFile part as an interim step between fetching the data and loading it into the data directory, we can skip straight to the data directory. The actual pages don't care where/when the data came from.

And just so we have the source available if we ever want to get back to the raw json info, we can leverage the debug.11ty.js page and have 11ty handle the file IO for us, preserving the original json data.
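
For reference, the debug page can be a tiny JavaScript template. A minimal sketch, assuming the global data file is _data/sessionize.js (so templates see it as data.sessionize) and a made-up permalink:

// debug.11ty.js
class Debug {
    data() {
        return {
            // hypothetical output path for the preserved raw json
            permalink: "/debug/sessionize.json"
        };
    }

    render(data) {
        // 11ty writes whatever we return, preserving the original json
        return JSON.stringify(data.sessionize, null, 2);
    }
}

module.exports = Debug;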

@zekefarwell (Contributor) commented Jun 15, 2019

This was also my initial idea for building the site from Sessionize data – a dynamic JS-driven "data" file that pulls data from Sessionize at build time. However, I made a separate step for fetching the data and writing it to json files because the Sessionize API documentation requests that we cache the data, and it also seemed like a good idea to have the data committed to the repo.

Although the live build approach in this PR should work very well in the lead-up to the event once the schedule is published (updating to reflect changes, etc.), having Sessionize as a dependency of the build process seems like an extra possible point of failure after the event has passed, since the data likely won't be changing at that point. I think we ought to have a config setting for build_mode that lets us easily change the data source from Sessionize to json files in the repo.

Also, I wonder how the Sessionize-dependent build process would respond to a failed API request? I think the build would probably fail, but since we're not explicitly checking, maybe it would build the site with blank schedule, sessions, and speakers pages?
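
Something like this sketch could cover both the build_mode toggle and the explicit failure check (BUILD_MODE and the fallback file path are hypothetical names):

// _data/sessionize.js
const fetch = require("node-fetch");

module.exports = async function() {
    // BUILD_MODE=static skips Sessionize entirely and uses the committed json
    if (process.env.BUILD_MODE === "static") {
        return require("./sessionize.json");
    }

    const response = await fetch("https://sessionize.com/api/v2/bm8zoh0m/view/all");

    // fail the build loudly rather than rendering blank pages
    if (!response.ok) {
        throw new Error(`Sessionize request failed: ${response.status}`);
    }

    return await response.json();
};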

Side note: This being the 2018 site with no need for updates from Sessionize, I'm considering this PR as intended for the 2019 repo once we add sessions, speakers, and schedule pages.

@KyleMit (Member, Author) commented Jun 15, 2019

@zekefarwell, yeah, great points. The resilience of the build when the API is down is probably pretty important for the live site. This pivot was mostly focused on how we would manage any incremental changes in the 2019 sessionize data and pull them into the site, in essence caching them as part of the daily build rather than having every single user hit the API on every page request.

But yeah, for past events with static, un-changing data, we should probably never do a fresh hit on the API, so we'd need an environment variable somewhere to toggle between .json and sessionize, or make a commit each season to winterize everything and wrap it up.

The trick is, I want eleventy to own the build process, and it seems like that's going to have a bias toward saving things in the _site directory, not the _data directory. I'll have to think a bit more about what could kick off a distinct build step, or how to update the netlify files on build.

Let's leave this PR on hold for now.

@zekefarwell (Contributor) commented

@KyleMit, I spent some time thinking about the problem yesterday. If we were hosting this site on a server instead of Netlify, I think I would set up the Sessionize fetch as a cron job running daily or every few hours, doing the following:

  1. Fetch data from Sessionize and write as JSON files in _data directory
  2. If Sessionize fetch fails, leave data files untouched
  3. If data files have changes git commit them and git push back to repo

The build process would be triggered by webhook whenever there's a commit to master:

  1. git pull changes from repo
  2. Run eleventy

Although the build process wouldn't directly fetch data from Sessionize, it would be triggered whenever the fetch cron job committed changes to the repo. The fetch job could even run really frequently and the site would only rebuild if there were actual data changes committed to the repo.
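
In Node terms, the cron job might look something like this (a sketch; assumes it runs inside a clone of the repo with push access, and the file name is hypothetical):

// fetch-and-commit.js
const fs = require("fs");
const { execSync } = require("child_process");
const fetch = require("node-fetch");

(async () => {
    const response = await fetch("https://sessionize.com/api/v2/bm8zoh0m/view/all");
    if (!response.ok) return; // fetch failed: leave the data files untouched

    const data = await response.json();
    fs.writeFileSync("_data/sessionize.json", JSON.stringify(data, null, 2));

    // only commit and push when the data actually changed;
    // the push is what triggers the build webhook
    if (execSync("git status --porcelain _data/").toString().trim() !== "") {
        execSync("git add _data/");
        execSync('git commit -m "Update sessionize data"');
        execSync("git push origin master");
    }
})();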

Unfortunately, since we aren't running a server, we can't just add a cron job. I did find ways to trigger a Netlify build on a schedule, but not a custom process. Perhaps this would be a good use case for a serverless function like an AWS Lambda or Azure Function? I've seen an example of someone doing pretty much what I've described with a Lambda.

Or maybe this idea is overly complex and we should just have a simple way to toggle the data source from Sessionize to data files. We could run the site in Sessionize build mode for August and September (or starting whenever the schedule is published), and then toggle it over to data file mode after the event is over. We'd need to be able to run the Sessionize data fetch script manually and commit the data files to the repo before switching it over, of course.

To deal with a potential Sessionize API failure, we could make sure the Netlify build fails if data doesn't come back. This would mean a Sessionize failure would prevent us from building the site for an unrelated change, but failure does seem pretty unlikely. We'd also want some sort of cron job to trigger at least a daily build. Seems like there's a fairly straightforward Zapier-Netlify integration for that.

@julielerman commented

Love watching this conversation and it looks like you are having fun with the challenges! :)

@KyleMit (Member, Author) commented Jun 17, 2019

Hot Take: According to Sarah Drasner of @netlify, we can use Netlify Functions, which are available on the free tier and allow for 100 hours of runtime and 125k requests per month!

@zekefarwell (Contributor) commented

Yeah, I was looking at the Netlify Functions feature the other day. Looks like it could work nicely. Sounds like they are just using AWS Lambda, but it's convenient that we can manage it right from the Netlify admin without setting up an account with a separate service! The piece I haven't quite wrapped my head around is how we can get the function to commit to the repo. Seems like in the context of a serverless function the git binary wouldn't be available to use. I've seen suggestions that maybe the GitHub API can be used instead, though?

@KyleMit (Member, Author) commented Jun 17, 2019

Yeah, Netlify Functions is only really replacing the AWS Lambda portion of things, allowing us to run serverless compute on command so we can schedule a cron job. That's probably preferable only in that we can commit the source code directly inside of our repository and minimize another set of third party dependencies & access credentials.

The second half is, once we have that compute, our persistence layer is just GitHub / .json files, so we need an API to read/write directly to git/GitHub.

We can generate a scoped Personal Access Token and use the GitHub API.

Octokit provides wrappers for the GitHub API, including a REST API client for JavaScript, which would look kinda like this:

const Octokit = require("@octokit/rest");

// create client
const octokit = new Octokit({
  auth: process.env.GITHUB_TOKEN
});

// default params
const PARAMS = {
  owner: "vtcodecamp",
  repo: "2018.vtcodecamp.org",
  committer: {
    name: "some-bot",
    email: "[email protected]"
  }
};

// read file (the response includes the blob sha we need in order to update it)
const readFileParams = { ...PARAMS, path: "sessionize.json" };
const { data } = await octokit.repos.getContents(readFileParams);

// write file (content must be base64 encoded; `content` is the freshly fetched data)
const writeFileParams = {
  ...PARAMS,
  path: "sessionize.json",
  message: "Update sessionize data.",
  content: Buffer.from(JSON.stringify(content, null, 2)).toString('base64'),
  sha: data.sha // tells GitHub which version we're replacing
};
await octokit.repos.updateFile(writeFileParams);

@KyleMit (Member, Author) commented Jun 17, 2019

Proof of concept seems to be going okay; pushed to feature/fetch_sync.

Please welcome our new bot overlords:

(screenshot)

// todo

  • Sharing env variables / access token
  • Set up call to fetchData.js via Netlify Functions
  • Set up fetch schedule (should auto trigger a build when a commit on master occurs)
  • Previously we would manually run node fetchData.js, get local file updates, and commit/push changes. Now we publish changes directly against GitHub, which means they don't actually exist locally because we've skipped that step, so we might get needless merge conflicts. Maybe we'd need a separate repo just for caching data, but then we're not saving ourselves much resilience
  • Make sure we push to the correct branch (if we need support for PR preview branches)
  • See if we can bundle file changes into a single commit
  • Compare before and after file changes and only update if we need to (see the sketch below)
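
For that last item, a sketch of the compare-before-write check, reusing the Octokit client and PARAMS from above (the `sessionize` variable stands in for the freshly fetched data):

// read the current file; getContents returns content base64 encoded, plus its sha
const { data: existing } = await octokit.repos.getContents({ ...PARAMS, path: "sessionize.json" });
const existingJson = Buffer.from(existing.content, "base64").toString("utf8");
const newJson = JSON.stringify(sessionize, null, 2);

// only commit when the serialized data actually changed
if (newJson !== existingJson) {
    await octokit.repos.updateFile({
        ...PARAMS,
        path: "sessionize.json",
        message: "Update sessionize data.",
        content: Buffer.from(newJson).toString("base64"),
        sha: existing.sha
    });
}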

@KyleMit (Member, Author) commented Jun 18, 2019

Sweet - now we have a Netlify Functions endpoint, which will execute fetchData.js, which calls Sessionize, grabs the data, compares it to the existing GitHub json data files, and writes any new file content if it's different.

All of that will get executed every time we hit the following address:

https://vtcodecamp2018.netlify.com/.netlify/functions/fetchData
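
Under the hood, a Netlify function is just a Lambda-style handler. A minimal sketch, assuming the sync logic is exported from fetchData.js (the export name is hypothetical):

// functions/fetchData.js
const { fetchData } = require("../fetchData"); // hypothetical export

exports.handler = async (event, context) => {
    try {
        await fetchData();
        return { statusCode: 200, body: "sessionize data synced" };
    } catch (err) {
        return { statusCode: 500, body: `sync failed: ${err.message}` };
    }
};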

Then I set up a Zapier account (same credentials as the Netlify account), which offers a set of If This Then That actions and triggers:

(screenshot)

And set up a daily request to the Netlify Functions URL, which will kick off the data cache build.

@KyleMit mentioned this pull request Jun 18, 2019
@KyleMit (Member, Author) commented Jun 18, 2019

Closed by PR #2

@KyleMit closed this Jun 18, 2019
@zekefarwell (Contributor) commented

This looks awesome, @KyleMit! Thanks for digging into this process. I'm pretty busy with a home remodeling project this week, but I'll make some time to review this and make sure I understand it all soon.
