-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Model multi-byline contributor as strings instead of Tags #254
Conversation
3: optional list<Tag> contributors; | ||
3: optional list<string> contributorIds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we okay to drop the old field given there are no multi-byline articles in PROD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ok because it's an optional field and not used, but will have to check the thrift docs to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I look at this, I do think it might be better to comment out
3: optional list<Tag> contributors;
and add
4: optional list<string> contributorIds;
in order to record the fact that it has changed, even though it was never used.
Let's think on it a bit before committing either way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for looking into this.
4: optional list<string> contributorIds;
Did you intend to use the field ID 4
here? Can we reuse 3
? Otherwise perhaps it should be 9
to avoid clashes with other fields?
Let's think on it a bit before committing either way?
For sure - I appreciate the need for caution here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi again @JustinPinner, I've discussed this with @davidfurey and we agree your approach above is needed, so I've made the change
@JustinPinner has published a preview version of this PR with release workflow run #75, based on commit f4d3730: 26.0.0-PREVIEW.model-contributors-as-strings-not-tags.2024-10-30T1748.f4d37308 Want to make another preview release?Click 'Run workflow' in the GitHub UI, specifying the model-contributors-as-strings-not-tags branch, or use the GitHub CLI command: gh workflow run release.yml --ref model-contributors-as-strings-not-tags Want to make a full release after this PR is merged?Click 'Run workflow' in the GitHub UI, leaving the branch as the default, or use the GitHub CLI command: gh workflow run release.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
What does this change?
Previously, we assumed multi-byline contributors would be modelled as tag objects in CAPI (#247).
However, for rendering purposes, it's actually sufficient to model them as IDs (e.g.
["profile/owen-jones"]
). When we render the multi-byline element, these IDs will be mapped to the contributor tags on the article level, which contain all the relevant metadata (image URL, etc.)Example CAPI response (simplified):
To this end, Porter already writes contributors to elasticsearch as IDs (https://github.com/guardian/content-api/pull/2879).
This mismatch between Porter and Concierge only came to light recently when testing a branch of flexible-content which began sending contributors data to the Kinesis stream (https://github.com/guardian/flexible-content/pull/4966). Defining the type of contributors as tags imposed a requirement on Concierge to transform those IDs into tag objects, which caused Concierge to explode.
Emily provided a more complete explanation of the problem here: #247 (comment)
The fix is to model contributors as IDs instead of tags.
I have renamed this field
contributorIds
to make this distinction clearer.Next steps...
contributors
tocontributorIds
https://github.com/guardian/content-api/pull/2929