Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve API discoverability/clarity for devs with S3ByteArrayFetching #65

Merged
merged 1 commit into from
Dec 13, 2024

Conversation

rtyley
Copy link
Member

@rtyley rtyley commented Nov 27, 2024

Introducing:

  • New type alias S3ByteArrayFetching for Fetching[ObjectId, Array[Byte]] - when developers encounter a method needing S3ByteArrayFetching, the type alias now documents how to create that precise thing.
  • New method S3ObjectFetching.byteArraysWith(s3AsyncClient) as a convenience to create a S3ByteArrayFetching for AWS SDK v2, removing the need to know about the Byte transformer, or the need to map with _.asByteArray()

This prompted by this PR (aiming to support the PR and make the changes in it easier to adopt):

How calling code changes

Given calling code that already has "com.gu.etag-caching" %% "aws-s3-sdk-v2" as a dependency, and has this setup:

import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching
val s3AsyncClient: software.amazon.awssdk.services.s3.S3AsyncClient = ??? // AWS SDK v2

...the code to get a Fetching[ObjectId, Array[Byte]] looks like this:

Before

S3ObjectFetching(s3AsyncClient, Bytes).mapResponse(_.asByteArray())

After

S3ObjectFetching.byteArraysWith(s3AsyncClient)

@rtyley rtyley force-pushed the add-convenience-for-fetching-s3-byte-arrays branch 4 times, most recently from 8c415ab to 86dcb2b Compare December 13, 2024 08:43
@rtyley rtyley changed the title Add convenience for fetching S3 byte arrays Add S3ByteArrayFetching for simpler fetching of byte arrays Dec 13, 2024
@rtyley rtyley force-pushed the add-convenience-for-fetching-s3-byte-arrays branch 4 times, most recently from a461e5d to e294eae Compare December 13, 2024 12:19
@rtyley rtyley changed the title Add S3ByteArrayFetching for simpler fetching of byte arrays Improve API discoverability/clarity for devs with S3ByteArrayFetching Dec 13, 2024
Introducing:

* **New type alias `S3ByteArrayFetching`** for `Fetching[ObjectId, Array[Byte]]` - when developers encounter a method needing `S3ByteArrayFetching`, the type alias now documents how to create that precise thing.
* **New method `S3ObjectFetching.byteArraysWith(s3AsyncClient)`** as a convenience to create a `S3ByteArrayFetching` for AWS SDK v2, removing the need to know about the `Byte` transformer, or the need to map with `_.asByteArray()`

This prompted by this PR (aiming to support the PR and make the changes in it easier to adopt):

* guardian/facia-scala-client#287

## How calling code changes

Given calling code that already has `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` as a dependency, and has this setup:

```scala
import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching
val s3AsyncClient: software.amazon.awssdk.services.s3.S3AsyncClient = ??? // AWS SDK v2
```

...the code to get a `Fetching[ObjectId, Array[Byte]]` looks like this:

#### Before

```scala
S3ObjectFetching(s3AsyncClient, Bytes).mapResponse(_.asByteArray())
```

#### After

```scala
S3ObjectFetching.byteArraysWith(s3AsyncClient)
```
@rtyley rtyley force-pushed the add-convenience-for-fetching-s3-byte-arrays branch from e294eae to a635ab5 Compare December 13, 2024 12:23
@gu-scala-library-release
Copy link
Contributor

@rtyley has published a preview version of this PR with release workflow run #123, based on commit a635ab5:

6.0.0-PREVIEW.add-convenience-for-fetching-s3-byte-arrays.2024-12-13T1224.a635ab54

Want to make another preview release?

Click 'Run workflow' in the GitHub UI, specifying the add-convenience-for-fetching-s3-byte-arrays branch, or use the GitHub CLI command:

gh workflow run release.yml --ref add-convenience-for-fetching-s3-byte-arrays

Want to make a full release after this PR is merged?

Click 'Run workflow' in the GitHub UI, leaving the branch as the default, or use the GitHub CLI command:

gh workflow run release.yml

@rtyley rtyley marked this pull request as ready for review December 13, 2024 18:38
@rtyley rtyley merged commit 390a678 into main Dec 13, 2024
8 checks passed
@rtyley rtyley deleted the add-convenience-for-fetching-s3-byte-arrays branch December 13, 2024 18:38
rtyley added a commit to guardian/facia-scala-client that referenced this pull request Jan 8, 2025
This change adds these improvements:

* Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching))
* Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286

The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338.

### Usage

```scala
import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching
import com.gu.facia.client.{ApiClient, Environment}
import software.amazon.awssdk.services.s3.S3AsyncClient

val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build()

val apiClient = ApiClient.withCaching(
  "facia-tool-store",
  Environment.Prod,
  S3ObjectFetching.byteArraysWith(s3AsyncClient)
)
```

_PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_

### Independence from AWS SDK version (v1 vs v2)

Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in dependency on either SDK version.

For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version.

Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, which is the new home of `com.gu.facia.client.ApiClient`, which is now a trait, with 2 constructor methods that provide different implementations:

* **`ApiClient()`**  - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour
* **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (ie with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65)

### Solved problems

* **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid excessive logging that would occur in the Facia client, due to it typically trying to access collections that aren't yet persisted: #32.
rtyley added a commit to guardian/facia-scala-client that referenced this pull request Jan 8, 2025
This change adds these improvements:

* Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching))
* Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286

The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338.

### Usage

```scala
import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching
import com.gu.facia.client.{ApiClient, Environment}
import software.amazon.awssdk.services.s3.S3AsyncClient

val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build()

val apiClient = ApiClient.withCaching(
  "facia-tool-store",
  Environment.Prod,
  S3ObjectFetching.byteArraysWith(s3AsyncClient)
)
```

_PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_

### Independence from AWS SDK version (v1 vs v2)

Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in dependency on either SDK version.

For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version.

Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, which is the new home of `com.gu.facia.client.ApiClient`, which is now a trait, with 2 constructor methods that provide different implementations:

* **`ApiClient()`**  - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour
* **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (ie with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65)

### Solved problems

* **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid excessive logging that would occur in the Facia client, due to it typically trying to access collections that aren't yet persisted: #32.
rtyley added a commit to guardian/frontend that referenced this pull request Jan 12, 2025
With guardian/facia-scala-client#287, the Facia client
has been updated so that it can take advantage of ETag-caching, using the
https://github.com/guardian/etag-caching library already used in Frontend since
August 2023 with #26338.

As some changes were necessary in the `etag-caching` library to facilitate this,
we need to update the version of `etag-caching` used by Frontend to v7.0.0.

We're also upgrading the S3 client (used by Frontend's instance of Facia's
`ApiClient`) to AWS SDK v2, as `etag-caching` already provides a convenient
implementation of `S3ByteArrayFetching` (see guardian/etag-caching#65)
for AWS SDK v2 - if we continued using AWS SDK v1 here in Frontend, we'd have
to create an implementation for it.

## See also

* guardian/maintaining-scala-projects#9
* guardian/facia-scala-client#287
rtyley added a commit to guardian/frontend that referenced this pull request Jan 16, 2025
With guardian/facia-scala-client#287, the Facia client
has been updated so that it can take advantage of ETag-caching, using the
https://github.com/guardian/etag-caching library already used in Frontend since
August 2023 with #26338.

As some changes were necessary in the `etag-caching` library to facilitate this,
we need to update the version of `etag-caching` used by Frontend to v7.0.0.

We're also upgrading the S3 client (used by Frontend's instance of Facia's
`ApiClient`) to AWS SDK v2, as `etag-caching` already provides a convenient
implementation of `S3ByteArrayFetching` (see guardian/etag-caching#65)
for AWS SDK v2 - if we continued using AWS SDK v1 here in Frontend, we'd have
to create an implementation for it.

## See also

* guardian/maintaining-scala-projects#9
* guardian/facia-scala-client#287
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant