-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Persist and serve user preferences #1184
Persist and serve user preferences #1184
Conversation
I took the liberty of renaming the PR so it captures what it does, feel free to edit again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In retrospect, we should have chatted more about backend on friday! But taking some time to think about it allowed all these thoughts to percolate. It's an opinionated code base, and I wanted to unpack some of that.
Coupling of Backend and Frontend
We talked about some of the reasons backend and frontend are split into separate repos and developed semi-independently some of the time(e.g. some API changes have no bearing on frontend or some frontend redesigns need no or very little backend changes).
But we also talked about how they're tightly coupled, and how there is a lot baked into Backend that is specifically built for our Frontend. Some users seem to use Central without the frontend..... I was going to write that no one uses a custom frontend with backend, but that's not exactly true... some users do use their own custom frontends, but in that case, they mainly use the data API parts of backend, not the frontend API parts of backend. If someone else makes their own frontend, it'll have it's own separate backend/db/whatever in addition to talking to Central. (So many frontends/backends in this last paragraph 😅)
Basically... while the ODK team usually puts a lot of effort into making certain things within the ODK ecosystem generic and interchangeable (e.g. non-Central backends working with Collect, or just getting things to be compatible across combinations of Central/Collect/Android/web-based forms/versions of forms/etc.), the frontend-parts of backend don't really fall under that umbrella and are allowed to be super specific.
We always release Central with Backend and Frontend in sync, and there's generally enough communication along the way to keep them in sync during development in between releases, too. If someone is working on something full stack that touches both parts, they are thoughtful about the timing of getting both separate parts merged in. A frequent pattern is either working on and merging the backend first, or working on both parts in parallel, getting something functional, refining the backend and merging it, then refining the frontend and merging it. If there's a change in backend that breaks something in frontend, we try to get the frontend fix done ahead of time or asap.
What that means for this specific user preferences PR... is that it's okay and likely desirable for backend to have a deeper, structured understanding of the user preferences frontend needs to work with. See below! This is all up for discussion and pushback, we're just trying to have folks on the Central team in agreement about what's getting stored and how.
Structure of User Preferences
In the frontend code, I'm seeing preferences with the keys projectSortMode
and formTrashCollapsed
. In the backend, could there be a more explicit link between a project and a project-specific preference? We have an Actees table that holds acteeId
s for projects, forms, users, the site (*
)
The issue talks about site-wide preference and project-specific preferences, and in addition to imagining multiple preferences at each level, we could also possibly have form-specific or dataset-specific user preferences in the future. (I know only the 2 of project-sort-order and form-trash-collapsing are in scope for this PR.)
What if we had a user_preferences
table with something like
actorId
(pointed to the actor record of a user)acteeId
(pointed to a project actee ID or*
for site-wide)preferences
(jsonb)
and the API response returned the combination of multiple rows for a given user?
Or it could go further (taking inspiration from config) and store a actorId, acteeId, key, value, setAt
? Splitting every preference for every user into its own row might be overkill, though, or maybe it's not much worse than already splitting things out by actee/project if there are only ever a small number of preferences per actee.
On the other hand, maybe a generic approach is simpler in the long run and quicker to implement.
I think the desire to at least split it by actee (site and per project) is that through inspecting the database and code, we'd have a better idea of what the preferences were about. We don't really want our users to be able to store random stuff in their database through this endpoint. And we'd be able to tidy things up in the future if we needed to. E.g. preferences about a deleted project or form could also be deleted. Not that you can delete projects now (only archive) or have user preferences about forms...
Tests
I was reflecting on the config tests because they are the closest, but they are also not quite the right match. It's an acceptable intro to the kinds of tests we see for a lot of our API endpoints, though!
https://github.com/getodk/central-backend/blob/master/test/integration/api/config.js
Tests in that file should inspire tests in a user-preferences file:
-
should reject if user can't do a thing
- doesn't apply here because endpoint is about setting one's own preferences?
- any user can set their own preferences? what if they are using the api and authenticated through an app user token or something, though? i can't remember how the auth is plumbed together in those cases right now.
-
should reject for unknown config / user preferences?
- maybe we should check that they're setting a known preference key on a known resource (site, project, etc.) so we have a better idea of the shape of what we're storing?
- possible new test: reject for unknown resource (e.g. project)?
-
should set the config / should patch the config
- frontend is only going to send one kind of request? it shouldn't care if its setting new configs or not
- test cases within this scenario (adding totally new user preferences, adding to existing prefernces, updating existing preferennce)
-
should there be a way to delete a preference? (unsetting site-wide configs was something we had to do, but I don't know if it's necessary here for user preferences. could add it later if it becomes a thing.)
Plan
PUT PUT db tables:
DELETE /user-preferences/project/3/formTrashCollapse |
4d8a013
to
817eaca
Compare
82fbf29
to
1b2af23
Compare
1b2af23
to
907d1e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core of this PR is looking really good. I like where the database organization of the user preferences ended up. I hope its working well for frontend! It seems like it would be easy to extend to include form-specific preferences if we ever needed those.
My main comments are about tests and seeing if there are things we can hold off on including in this PR (unless you think they're really important).
.expect(400); // project 123 doesn't exist | ||
})); | ||
|
||
it('PUT/DELETE preference storage operations', testService(async (service) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see more/smaller tests!
In this big test, the main themes i'm getting out of it are:
- "it should store site-wide and project-specific preferences of varying complexity"
- "it should not return preferences that do not belong to the user"
- "it should group project preferences by project"
I think you should keep that test, or maybe split it up into 2 or 3 tests, but maybe there could also be stupidly basic tests like:
- "it allows a site-wide preference to be set"
- "it allows a project-specific preference to be set"
- "it allows a preference to be deleted"
- "it returns the (non blank) preferences in the extended metadata for the current user"
And one of our original concerns that the design of the api alleviates:
- "it allows a preference to be modified without overwriting another preference""
I'd also be interested in tests of the failure cases
- no propertyValue supplied
- empty string value supplied
- (anything else that tests the database constraints or other error handling in the resources)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would've liked smaller tests too! Yet the way the DB transaction control works in this test framework, combined with how the GET-end of the API is an aggregate (and thus to be fully tested, needs some stuff to aggregate over...) just leads down the path of incrementally building up a whole bunch of DB state in one transaction. Unless I'm missing something that's just the nature of this thing. If I want to test the aggregation I need to do a whole bunch of PUTs. While I'm at that, I might as well test those PUTs (and then I may as well test whether DELETEd properties don't appear in the aggregate). Ideally I'd have some transaction control so I can declare DB savepoints to roll back to after an expected failure.
I'll have a look to see if there's something to be done about that!
On to your points:
- "it should store site-wide and project-specific preferences of varying complexity"
- "it should not return preferences that do not belong to the user"
- "it should group project preferences by project"
Yes, it indeed tests for that, plus it tests whether deletion and overwriting work.
And one of our original concerns that the design of the api alleviates:
- "it allows a preference to be modified without overwriting another preference"
The atomicity boundary in the API is by HTTP resource, and those map 1:1 to the DB primary keys. So we follow HTTP semantics (intuitively understood by API consumers as well), and those are already tested during the incremental buildup of DB state...
I'd also be interested in tests of the failure cases
- no propertyValue supplied
You mean {"propertyValue": }
? That'd be invalid json, the framework catches that, and I shouldn't be testing the framework here. But I did find that a null
propertyValue was not working (for now I'm blaming (our version of) slonik for not doing the right thing for sql.json(null)
), fixed in e514b52, with regression test ;-)
I've added some informative error messages (and tests for those), in ed4b92d.
- empty string value supplied
An empty json-string (thus {"propertyValue": ""}
if I understand you correctly) is valid json, nothing special about it, and thus a valid preference propertyvalue per the contract (yet to be documented, but as I mentioned: value and semantics of the propertyValue
are completely up to the frontend). At some point we need to trust the DB, DB driver (*cough* e514b52), and framework to pass things through correctly, including 0-length strings.
- (anything else that tests the database constraints or other error handling in the resources)
There's a test that triggers a foreign key violation (well... until someone creates a fixture for a project with ID 123
...). The propertyName nonzero length DB check constraint is impossible to violate via the HTTP API because you can't supply a zero-length pathname component. I suppose I could test a PUT
to /v1/user-preferences/site//
, but what would I actually be testing at that point? The Express framework's URL path normalization perhaps? Theoretically they could change it and it would be nice if when they do (would they? It'd cause mayhem) it'd break our tests, but then I'd say that if that is a reasonable expectation then it's worth testing for their URL path normalization explicitly rather than testing for it as a side effect of testing for zero-length propertyName URLs.
I'll have a look at chopping up the tests into more it()
s, but as I mentioned, I have to build up state one way or another (either with many PUTs, or by writing to the DB directly, but that has no place in an API test, yet I still want to test the /users/current
API endpoint with that DB content...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is pretty much the situation for all Central tests.
...just leads down the path of incrementally building up a whole bunch of DB state in one transaction. Unless I'm missing something that's just the nature of this thing. If I want to test the aggregation I need to do a whole bunch of PUTs....
Send a bunch of requests about forms/submissions to build up the state. If it gets super repetitive, we make special setup functions for specific batches of tests, but it's still making the requests for every test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding some of the smaller tests! e.g.
- validates the request body stringently
- can store a JS null propertyValue
- should not allow storing preferences for nonexistent projects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can still see an advantage of smaller tests so i can just look at an individual test and grok the 1 thing its trying to do instead of having to follow the whole journey, but eh, it's fine for now!
// Yet the string 'null' (as distinct from the *jsonb* string '"null"' one would get with sql.json('null') !) | ||
// gets properly casted by PostgreSQL to a jsonb null (as distinct from an SQL NULL), so we use that in this case. | ||
const preparedPropertyValue = (propertyValue === null) ? 'null': sql.json(propertyValue); | ||
const values = [userId, propertyName, preparedPropertyValue] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if you wanted to trim the propertyName
. I noticed i'm able to make properties named ' ' and ' '.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 No, in the end I think that would be bad. I don't think I should treat the property name as anything else but an opaque token, as it's just an identifier. Spaces in the URL path component for the property name (encoded though of course) are valid.
Tangentially, Github markdown squished (trimmed) your ' '
and ' '
whitespace property name examples. That's sort of illustrative: it's because Markdown is for presentation, and it is documented to do that. Yet for the property names in the DB and API I don't see a benefit in constraining the property name (beyond refusing a zero-length string). They're chosen by developers, and of course a developer can choose a bad property name (including ' '
) just like they can choose bad variable names, but presumably we'd catch that in review, just like we catch bad variable names.
// { expected: "list of expected properties", actual: "list of provided properties" } | ||
unexpectedProperties: problem(400.33, ({ expected, actual }) => `Expected properties: (${expected.join(', ')}). Got (${actual.join(', ')}).`), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there another existing Problem we could use instead? I see this only comes up if you pass propertyValue
AND extra stuff like {'propertyValue': 'meow', 'cats': ' '}
. Maybe we just accept propertyValue
and ignore the rest.
I'm also ok leaving it as is with this new Problem because there is a test for it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did look at the existing Problems, and they had their problems. There's unexpectedAttributes
but its comments and problem string then proceed to talk about arguments rather than attributes. I didn't feel like fixing that because due to the ambiguity some use sites will use it "because it's about arguments" and some others will use it "because it's about attributes" so that would require unraveling the original intent at each use site.
I could probably have lived with using the existing unexpectedAttributes
error even though its name implies we're talking about "attributes" rather than "properties" ("properties" is what we're dealing with in json), I'm not that zealous, but I drew the line at "arguments" as in unexpectedAttributes
's error text. Because talking about "arguments" will make for a confusing error message in the context of posting a JSON body, and I value a precise and appropriate error message more than I value saving a line of code in this case ;-)
.expect(400); // project 123 doesn't exist | ||
})); | ||
|
||
it('PUT/DELETE preference storage operations', testService(async (service) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is pretty much the situation for all Central tests.
...just leads down the path of incrementally building up a whole bunch of DB state in one transaction. Unless I'm missing something that's just the nature of this thing. If I want to test the aggregation I need to do a whole bunch of PUTs....
Send a bunch of requests about forms/submissions to build up the state. If it gets super repetitive, we make special setup functions for specific batches of tests, but it's still making the requests for every test.
.expect(400); // project 123 doesn't exist | ||
})); | ||
|
||
it('PUT/DELETE preference storage operations', testService(async (service) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding some of the smaller tests! e.g.
- validates the request body stringently
- can store a JS null propertyValue
- should not allow storing preferences for nonexistent projects
.expect(400); // project 123 doesn't exist | ||
})); | ||
|
||
it('PUT/DELETE preference storage operations', testService(async (service) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can still see an advantage of smaller tests so i can just look at an individual test and grok the 1 thing its trying to do instead of having to follow the whole journey, but eh, it's fine for now!
const { testService } = require('../setup'); | ||
|
||
describe('api: user-preferences', () => { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was playing around with this new endpoint (with a notebook and pyodk) and saw that i could make properties named
and whitespace
and so on. I think the property values being pretty unrestricted is fine but having prop
be different from prop
seems potentially problematic.
Would love to see a test showing propertyNames get trimmed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please see here for my opinion on property names. Otherwise, what's next? somepropertyname
and sօmepropertyname
are also distinct labels (if your font doesn't show a subtle difference, just look at the bytes!) but could easily be confused, shouldn't I prevent that kind of pitfall too then in order to keep everyone safe 🦸? I'm happy to do the work, it sounds like fun, but is it actually a real requirement? Note also that any check should be duplicated and implementations synced in the frontend, because ideally that's where you'd want to get dinged during development when you're inventing a bad 🦹 property name... 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were talking XForm spec (Forms, Submissions, Datasets, Entities), we'd want to (and do) get very clear about what counts as the same or different values, because we want to be as clear and as nice to form designers as possible and not have something not work as expected because the casing or whitespace was subtly different.
But you're right that in this context, we've talked about how this user preferences endpoint is mainly for Frontend to use, so it makes sense to keep it simple and use the property name as is.
The one thing I could do to skip building up state this way is to mass-create state via the DB, but then the test will fail if the DB layout would change, and these tests are supposed to test the API and not the DB specifics, as that is an implementation detail. In terms of grokkability, the comments I've added in 4f6f7ba should help? |
We definitely want to test with the API and not build up database state directly because of the DB changing. And some of the core logic is so complicated in the database it would be / is not fun to manually build it up. I have done a bit of that for trying to test certain database migrations, which is tricky because you can't actually use the API if the DB schema is changing mid-test. This file is one of my pain points; I'm glad to have been able to test some of these migrations but the whole situation feels a little off. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woohoo, let's merge it!
This is a companion to getodk/central-frontend#1024, let's hold high level discussions in there for clarity (but simple on-point code comments etc can go in here)
Related: issue getodk/central#689
Related: PR getodk/central-frontend#1024