-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding a new view can break previous requests when embedding with column or FK as target #2277
Comments
Adding a view of a junction will also make existing many-to-many embeds fail. For example, adding this to our fixtures: create view users_tasks_view as
select * from users_tasks; Results in 12 test failures. For example this one: GET /users?select=id,tasks(id)&order=id.asc&tasks.order=id.asc So we can take the same approach for junction views - they will only be considered if they're explicit: GET /users?select=id,tasks!users_tasks_view(id)&order=id.asc&tasks.order=id.asc The downside is that we'd break this previous use case #1736 For this we can add a legacy config. |
This would still allow breaking existing queries when adding a view, namely those queries that use the column name for embedding - and then a view with the name of the column is added later on. This change would also create more differences between tables and views - and lead to a situation where self-embedding would suddenly not be supported the same on views compared to tables anymore. Also migrating to schema isolation step-by-step as mentioned in PostgREST/postgrest-docs#531 would be made harder, because some queries might need adjustment at the same time.
Please don't. Cluttering the code with legacy config options is hardly an improvement. Let's instead make sure that we change it in a way that everyone can migrate to easily. Let's break down on a higher level why adding new views can break previous requests. There's two reasons for that:
Note that both situations where an existing query is broken by adding a view can also happen when adding a column or a FK to a table - so those are not really specific to views. Given this and the down-sides regarding schema isolation, I suggest to find a way that does not depend on view vs. table - but rather something like #2238 (comment). |
Hey Wolfgang,
That is #2238 right? My answer here applies then - telling the user to change the view name is fine. It's also a rare case.
Right, self join on views would not be supported. Maybe I can make an exception in that case.
True. I guess it's a chicken and egg problem.
I'd say 1 is much more impactful and surprising for users(some will just give up supabase/postgrest-js#229). Specially considering that we recommend views for more complex queries. So it's a fair trade. A middle ground would be having a way to mark views for full embedding inference - views that work with col/fk as target. ALTER ROLE authenticator SET pgrst.fully_inferred_views = 'a_view, another_view' That would allow users to migrate, we can warn about this setting too. It's one more config, but I don't see another way around it. |
Not sure what you mean by this, you mean:
Wouldn't that be a much worse breaking change? Also as I mentioned there, users naturally reach for col as target - so we can't break that.
That's already done btw: #2262. But it's not enough to get to a stable embedding.
Can an embed break when adding columns? Regarding adding more FKs - that is the only way reasonable and acceptable for us to break embeddings. More FKs mean more relationships. I was actually thinking to add a warning for this in the docs. Our embeddings are foreign key joins after all. But we should not break requests when adding views, foreign key joins on these is more of an artificial concept. |
Actually, the above is not 100% true if the client used the table name as target: GET /projects?select=*,clients(*) If the If the client used: GET /projects?select=*,client_id(*) In that case, yes, changing that to a view won't work. The config I proposed would solve that though. |
Also note that on #2272 the hint mode is being changed to the above form, which is the stricter one. We can actually recommend that one now for even less chance of breaking requests. Before that wasn't possible, because as mentioned in the docs https://postgrest.org/en/latest/api.html#hint-disambiguation In the |
And I think we should break it. Using columns for embedding might feel natural to some (it doesn't for me - at all) - but really it's just the root cause of all the ambiguities when embedding. No o2m/m2o embedding that has the relation as target can be broken by adding another view - because the relation is already explicitly specified. Those can only be broken by adding new FKs and that's certainly something that would be expected as you rightly point out. The same for m2m relationships - as long as those are given as Relationships are defined by junction tables and foreign keys - as you can tell from our haskell typing which models relationships this way. And I think we should limit ourselves to just those and throw columns out. This would make embedding so much more consistent overall, much easier to understand, much less error prone. I'd say:
Yes, this might feel like a "big" breaking change. But really anything we do here will be a breaking change. And once you start thinking in terms of "relations" and "foreign keys" instead of "columns" when embedding, it will not be hard at all. After all an embedding is just a join - and users certainly understand they need a target relation for a join. One more argument against columns: Using columns does not allow using multi-column FKs as relationships. So it's not a complete feature anyway. PostgREST has made a decision a long time ago, that there is no "simple" route->PK mapping like |
I think we're going in circles here. The above was already solved a long time with FK-as-target, which we would also drop with your proposal.
I've seen many of these cases(e.g. supabase/supabase#6274), not offering the col-as-target would break those and not offer a path forward - this is what I consider a "big" breaking change.
What I can do to make it more understandable is to deprioritize the col-fk-as-target on the docs and prefer the I do see a future where col-fk-as-target support is removed, but this change should not be so abrupt. |
Good point. So maybe the above breaking change I proposed is not needed, it's reasonable to fail in that case I believe. Also, the adding of junction views is more of an artificial problem, I've never saw them reported. The new strict m2m junctions also reduce likelyhood.
I can also prioritize the "relation!junction" form in the docs and leave the old form as a deprioritized shortcut.
No need to support that now or in the future, flattening should cover those cases #1233 (comment). In fact, flattening(as ideated on the issue) would be able to express any m2m relationship - somewhat making the |
From #2238 (comment) In light of the opposition to the proposal - the above solution would solve the issue without a breaking change. Though I'd have to see how feasible is to implement it(would probably be less efficient). |
Ah, I didn't comment on that, because it seemed like the idea was thrown away already: But I would be even more opposed to this, than to the other suggestion. And that's because with this proposal a working request could not only fail by adding a new view - but it could suddenly return a different result (without failing with an error code) when adding a new table relationship that would otherwise conflict with an inferred view relationship, which is already used. So assume you use an existing inferred view relationship and you're adding a new table relationship that would cause a conflict: Suddenly, because table wins vs view you're now embedding a different relation compared to before, without getting an error. That's something we should absolutely avoid - and that's why I think we should avoid all "ranking" based approaches like this one. All the different options need to have the same priority. |
Having experienced incorrect embedding very briefly a year or so ago (primarily from duck-typing), I'm curious what truly would be a solution to this with the current auto-detection that is in place? I've realized that being as explicit as possible in our queries has been the safest way as possible to avoid this scenario and always using hints. Is it even possible to create a solution that guarantees not breaking current query embeds when adding in new tables/views down the line? |
It is certainly possible, but the problem at hand is 2-dimensional and there are many solutions. One dimension is the number of embeddings we detect / the types of embeddings we allow. The other dimension is our query syntax. The more embeddings we allow, the more verbose (or less flexible) our syntax needs to be for such a guarantee. The fewer embeddings, the simpler (or more flexible) the syntax can be. Right now we have a simple syntax paired with many embeddings. That's trouble. I think there's also a third dimension somehow and that's something I'd call "level of symmetry" within the whole feature. That's basically whether embedding is supported the same across all types of relations (tables, views), types of relationships (o2m, m2o, m2m, self-embedding) and all different syntaxes (using column names / fk names / relation names in target / hint). We can approach this from two angles:
There are also ideas that would reduce the level of symmetry to try to keep a balance between a backwards compatible syntax and a reduction of ambiguities. This is what Steve is aiming to do in #2272. There will be breaking changes introduced by that as well - but those will be harder to explain. Some will be resolvable by changing syntax. Some will be resolvable by changing your schema. Some won't be resolvable. That's the nature of the asymmetric solution. I think we should strive for a solution that creates as much symmetry as possible, because this is easier to understand well for users and has other benefits (full support for schema isolation would be one). At the same time we should try to only introduce breaking changes that can be solved by adapting the request you make. We should avoid breaking changes that force you to change your schema - or completely change the way how to query the api in certain scenarios (e.g. forcing you to use "something else" where an embedding worked before). |
@bolerodan IIRC you use columns as hints - e.g. Right now by using the |
@wolfgangwalther We haven't been following that criteria for breaking changes. Adapting the request you make involves client updates, which is a much harder change(making sure many clients download the new JS lib version, invalidating cache, etc). In contrast, changing your schema is a much easier change. For example, renaming a constraint(simple change in the db) allowed a migration path from the old duck-typing bolerodan mentioned. Renaming views is also a simple change(which could be done to avoid a breaking change for col-fk-as-target). Renaming a table/column name would be out of the question of course. Ideally this should be scoped to the |
Interesting take. For me, it's quite the opposite. I can't avoid all breaking changes to my api. To be able to handle any breaking changes to my api cleanly, I need to use proper api versioning mechanisms. And that's not just the breaking changes we're talking about here (introduced by PostgREST), but also all breaking changes introduced by my own api development. If I don't have api versioning mechanisms in place, I can never do a proper upgrade. We suggest to use schema-based api versioning somewhere. The procedure is as follows:
This is a "rolling upgrade" of a server/client architecture without downtime. You can automate this in Continuous Deployment setups, too. Once you have this in place, a change to a client's request is really simple.
Renaming a constraint would be a breaking change to your API, which would force those clients that already use the previous constraint name to update their requests.
Renaming a view would be a breaking change to your API, which would force those clients that already use the previous view name to update their requests.
Renaming a column name (in a relation in the api schema) would be a breaking change to your API, which would force those clients that already use the previous column name to update their requests. Renaming things does not avoid breaking changes to a client's request at all. |
We have always avoided telling users to execute that step.
Once upon a time, there was duck-typing. Assuming GET /projects?select=*,client(*) That is, requesting "clients" without the "s" at the end. That thing caused many more ambiguities that were hard to understand so it was removed. The migration path for that was to rename the autogenerated constraint name(say like
So yeah, the above is just plain false. |
Hm, maybe it's not renaming, so if you used a Actually I've noted that the above could lead to the same situation as #2238. So not sure it's infallible as a migration path. |
This is still a good guide that we should document. Managing an additional postgrest instances plus proxy config is definitely not as simple as changing the |
But that was only possible, because FKs were not used at all in the exposed API before. That's not the case here. We're talking about renaming things that are exposed already.
I wasn't specific enough, I guess. Let me rephrase: Renaming things that are already part of the API does not avoid breaking changes ... etc. |
Full schema isolation support is still there as always, the I'd also argue that disambiguation will be easier to understand with this change, since we no longer have the case mentioned in Another win is that we can now document the self-relationship embedding, without solving #1643 it was not stable enough to document it, but now it is. That being said, #2272 is not a "perfect solution" by any means, but it's does solve 3 bugs in our issue tracker - a win I'd say. |
Let's see how this goes in the next release. There's still a way to offer backward compatibility with this change, but it would require manually specifying relationships: ALTER DATABASE your_db -- could also be ALTER ROLE authenticator
SET pgrst.your_schema.my_view = $$
{
"relationships": [
{ "target": "arbitrary_name"
, "foreign_table": "other"
, "join_columns": ["my_view.col1", "other.col2"]
, "cardinality": ''o2m"}
]
}
$$ Then, it would be possible to: GET /my_view?select=*,arbitrary_name(*) We should try to keep the "low-code" approach though. |
Actually implementing the above is not a bad idea for this release. It not only provides a way for users to migrate but also workarounds for other issues like #1907 Since it provides a way to do embeds without hints, it can also solve the case where there is a legitimate FK ambiguity, without patching clients. It would be capable of overriding existing inferred relationships. |
In hindsight, I wonder if the server/db side config for resolving embeds would have been a better choice than the client side hints. Not only it would have kept our syntax simpler, but also allows us to provide better metadata(typings, openapi, etc) since the embeds are static and not resolved at runtime. @bolerodan Curious on your thoughts on the above proposal |
@steve-chavez could you elaborate a bit more about this proposal? |
@bolerodan Sorry for the late reply. I gave it more thought and will implement #2144 to offer a way forward from this breaking change. With #2144 any embedding can be resolved unambiguosly. |
@steve-chavez cool sounds good. I've been keeping my eye on that thread to see how it materializes (hah) on that front. |
As exemplified in the docs: https://postgrest.org/en/stable/api.html#hint-disambiguation
Whenever a view is added it can break previous embeds - which is surprising behavior for most users.
This is specially bad considering we generate a lot of view relationships: #2262 (comment). Note that we generate view relationships recursively.
Proposal
^ Will continue working even when adding more
addresses
views.(From https://postgrest.org/en/stable/api.html#embedding-disambiguation)
Instead, views should be explicit in the request:
(the above will get orders)
We had some use cases before doing this #1230, so it's an organic use case.
This will solve #2238 and #1643
The text was updated successfully, but these errors were encountered: