Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model multi-byline contributor as strings instead of Tags #254

Merged
merged 2 commits into from
Nov 13, 2024

Conversation

simonbyford
Copy link
Contributor

@simonbyford simonbyford commented Oct 29, 2024

What does this change?

Previously, we assumed multi-byline contributors would be modelled as tag objects in CAPI (#247).

However, for rendering purposes, it's actually sufficient to model them as IDs (e.g. ["profile/owen-jones"]). When we render the multi-byline element, these IDs will be mapped to the contributor tags on the article level, which contain all the relevant metadata (image URL, etc.)

Example CAPI response (simplified):

{
  "blocks": {
    "body": [
      {
        "elements": [
          {
            "type": "list",
            "listTypeData": {
              "type": "multi-byline"
              "items": [
                {
                  "title": "Labour is the only party serious about building homes",
                  "bio": "David Miliband is CEO of the International Rescue Committee",
                  "elements": [ ... ],
                  "contributorIds": ["profile/davidmiliband"],
                  "bylineHtml": "<a href=\"profile/davidmiliband\">David Miliband</a>"
                }
              ]
            }
          }
        ]
      }
    ]
  }
  "tags": [
    {
        "id": "profile/davidmiliband",
        "type": "contributor",
        "bylineImageUrl": "https://uploads.guim.co.uk/2023/09/15/David_Miliband.jpg",
        // etc.
    },
  ]
}

To this end, Porter already writes contributors to elasticsearch as IDs (https://github.com/guardian/content-api/pull/2879).

This mismatch between Porter and Concierge only came to light recently when testing a branch of flexible-content which began sending contributors data to the Kinesis stream (https://github.com/guardian/flexible-content/pull/4966). Defining the type of contributors as tags imposed a requirement on Concierge to transform those IDs into tag objects, which caused Concierge to explode.

Emily provided a more complete explanation of the problem here: #247 (comment)

The fix is to model contributors as IDs instead of tags.

I have renamed this field contributorIds to make this distinction clearer.

Next steps...

  1. Update Concierge to use this model https://github.com/guardian/content-api/pull/2928
  2. Update Porter to:
  1. Update Composer to send contributor data to the Kinesis stream https://github.com/guardian/flexible-content/pull/4966

@simonbyford simonbyford requested a review from a team as a code owner October 29, 2024 11:07
3: optional list<Tag> contributors;
3: optional list<string> contributorIds;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we okay to drop the old field given there are no multi-byline articles in PROD?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok because it's an optional field and not used, but will have to check the thrift docs to make sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I look at this, I do think it might be better to comment out
3: optional list<Tag> contributors;
and add
4: optional list<string> contributorIds;
in order to record the fact that it has changed, even though it was never used.

Let's think on it a bit before committing either way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for looking into this.

4: optional list<string> contributorIds;

Did you intend to use the field ID 4 here? Can we reuse 3? Otherwise perhaps it should be 9 to avoid clashes with other fields?

Let's think on it a bit before committing either way?

For sure - I appreciate the need for caution here

Copy link
Contributor Author

@simonbyford simonbyford Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again @JustinPinner, I've discussed this with @davidfurey and we agree your approach above is needed, so I've made the change

@gu-scala-library-release
Copy link
Contributor

@JustinPinner has published a preview version of this PR with release workflow run #75, based on commit f4d3730:

26.0.0-PREVIEW.model-contributors-as-strings-not-tags.2024-10-30T1748.f4d37308

Want to make another preview release?

Click 'Run workflow' in the GitHub UI, specifying the model-contributors-as-strings-not-tags branch, or use the GitHub CLI command:

gh workflow run release.yml --ref model-contributors-as-strings-not-tags

Want to make a full release after this PR is merged?

Click 'Run workflow' in the GitHub UI, leaving the branch as the default, or use the GitHub CLI command:

gh workflow run release.yml

Copy link
Member

@JustinPinner JustinPinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@simonbyford simonbyford merged commit 6bc3cd7 into main Nov 13, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants