Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Hasura input validation for tables storing user-submitted HTML #2551

Merged
merged 9 commits into from
Dec 18, 2023

Conversation

DafyddLlyr
Copy link
Contributor

@DafyddLlyr DafyddLlyr commented Dec 11, 2023

What does this PR do?

  • Attempts to resolve the issue of sanitation of user-submitted HTML in a generalised method using Hasura input validation
  • Adds a new API endpoint to handle input validation
  • Adds validation to global_settings and flows tables (this is not available on a per-column basis)
  • Adds very basic frontend feedback (a toast message)

Apologies for such a big PR... 🙏

How does this work?

Essentially, prior to opening a database transaction, Hasura makes a request to an endpoint which can run business logic to validate a payload.

This endpoint does not modify the submitted payload, it merely accepts or rejects it. This is different to the previous approach of sanitising and accepting changes.

Please see diagram below which demonstrates the data flow -

sequenceDiagram
  participant PlanX as PlanX Editor
  participant Hasura as Hasura GraphQL API
  participant API as PlanX API
  participant Middleware as Validation Middleware
  participant Controller as Controller
  participant DB as PostgreSQL

  PlanX ->> Hasura: User-submitted content
  activate Hasura

  Hasura ->> API: Input validation request
  activate API

  activate API
  API ->> Middleware: Validate against Zod schema
  activate Middleware
  alt Valid
    Middleware ->> Controller: Transformed payload
    activate Controller
    Controller ->> Hasura: 200 OK
    deactivate Controller
    Hasura ->> DB: Write to Postgres
    activate DB
    DB ->> Hasura: Success
    deactivate DB
    Hasura ->> PlanX: 200 OK (Empty)
  else Invalid
    Middleware ->> Controller: Transformed payload
    deactivate Middleware
    activate Controller
    Controller ->> Hasura: 400 Bad Request
    deactivate Controller
  deactivate API
    Hasura ->> PlanX: 400 Bad Request (Invalid HTML content)
  end

  deactivate Hasura
Loading

Why input validation? Why not Hasura actions?

This approach is fairly re-usable and generalisable so it can be used, re-used, and expanded upon.

Our initial crack at this in #2488 is pretty specific to implementation and has a few gotchas / edge cases / footguns around permissions which are covered by comments etc. If we needed to do this for multiple tables things could get quite messy.

This approach means there's a single bit of logic for user-submitted HTML which can be used by multiple tables.

The real catch here is that it's just validation - it doesn't sanitise the input and proceed. Realistically this is likely fine for unsafe HTML!

I had initially considered Hasura Actions as a viable method as this could handle transforming data and then writing to the database but I ultimately decided against it as -

  • It would require some custom code per-table
  • We don't actually need sanitised HTML in the DB, rejecting it is fine
  • It just seemed more complex all around

Testing

Screen.Recording.2023-12-13.at.08.53.34.mov
  • Updates to global_settings works for clean content, is rejected for unsafe content
  • Updates to flows works for clean content, is rejected for unsafe content (flow.data still updates via ShareDB)
  • The same validation check happens via the GraphQL API directly

Next steps...

image

Copy link

github-actions bot commented Dec 11, 2023

🤖 Hasura Change Summary compared a subset of table metadata including permissions:

Updated Tables (2)

  • public.flows permissions:

    insert select update delete
    api / /
    platformAdmin / /
    teamEditor / /
  • public.global_settings permissions:

    insert select update delete
    platformAdmin / /

@DafyddLlyr DafyddLlyr force-pushed the dp/postgres-input-validation-sanitation branch from 4876ee1 to 812165d Compare December 11, 2023 19:33
Copy link

github-actions bot commented Dec 11, 2023

Removed vultr server and associated DNS entries

@@ -53,6 +55,12 @@ router.post(
sanitiseApplicationDataController,
);

router.post(
"/webhooks/hasura/validate-input/jsonb/clean-html",
Copy link
Contributor Author

@DafyddLlyr DafyddLlyr Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to thoughts on the URL path here - I figured having a column type parameter would handle us possibly expanding this in future, e.g -

  • /webhooks/hasura/validate-input/text/clean-html
  • /webhooks/hasura/validate-input/number/is-uprn

});

describe("isCleanHTML() helper function", () => {
const dirtyHTML = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* Recursively iterate over an object, checking is validator callback condition is met
* Fails fast - will return false once first invalid value is encountered
*/
export const isObjectValid = (
Copy link
Contributor Author

@DafyddLlyr DafyddLlyr Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially hoping to just use some lodash method to get all nested values from an object so I didn't have to write this (and then write tests for it!).

I don't think there is a method for this however - a few requests on the lodash repo suggest using flat but a whole library for this seems a bit excessive.

If there's a simpler library method or approach for this that I'm just missing please let me know 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still feels readable to me, agree a whole library feels like overkill at this stage!

@DafyddLlyr DafyddLlyr force-pushed the dp/postgres-input-validation-sanitation branch from c20cd16 to c3d809c Compare December 12, 2023 17:28
Comment on lines +28 to +32
type isCleanJSONBSchema = z.ZodType<
IsCleanJSONBRequest,
z.ZodTypeDef,
HasuraValidateInputRequest
>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using .transform() to join two schemas requires this pretty gnarly type.

Docs: https://zod.dev/?id=zodtype-with-zodeffects

@DafyddLlyr DafyddLlyr force-pushed the dp/postgres-input-validation-sanitation branch from c3d809c to b806764 Compare December 12, 2023 18:23
@@ -290,6 +290,15 @@
- id
- version
- data
validate_input:
Copy link
Contributor Author

@DafyddLlyr DafyddLlyr Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are repeated per table, per role, per operation (insert & update).

Now that we've updated Hasura it might be worth re-looking at inherited roles as a way of simplifying a few things.

@DafyddLlyr DafyddLlyr requested a review from a team December 12, 2023 18:26
@DafyddLlyr DafyddLlyr marked this pull request as ready for review December 12, 2023 18:26
@DafyddLlyr DafyddLlyr changed the title feat(wip): Hasura input validation for tables storing user-submitted HTML feat: Hasura input validation for tables storing user-submitted HTML Dec 12, 2023
Copy link
Member

@jessicamcinchak jessicamcinchak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me - really nice to see immediate use cases for these new features like input validation 🎉

`<p>This is <b>bold</b> text and <i>italic</i> text.</p>`,
`<div><p>This content is inside a div.</p></div>`,
`<h2>Subheading</h2><p>This is a paragraph under the subheading.</p>`,
`<div><h1>Main Title</h1><p>This is a <b>nested</b> paragraph with a <i>nested</i> element.</p></div>`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on what we know about Planx content / rich text editing - should we have a clean example with a valid <img> tag too just to be safe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really good shout - test case added 👍

* Recursively iterate over an object, checking is validator callback condition is met
* Fails fast - will return false once first invalid value is encountered
*/
export const isObjectValid = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still feels readable to me, agree a whole library feels like overkill at this stage!

@DafyddLlyr DafyddLlyr merged commit fc08523 into main Dec 18, 2023
12 checks passed
@DafyddLlyr DafyddLlyr deleted the dp/postgres-input-validation-sanitation branch December 18, 2023 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants