Skip to content

Merge branch 'main' into shared-responsibility
mchv authored Apr 17, 2024
2 parents e316251 + 18441d3 commit fb4244d
Showing 32 changed files with 343 additions and 352 deletions.
6 changes: 6 additions & 0 deletions .alexrc.js
@@ -0,0 +1,6 @@
exports.profanitySureness = 1;
exports.allow = [
  "steward-stewardess", // Exclude this rule as we get false positives from references to Scala Steward
  "special",
  "actor-actress", // Used in Security, as in "threat actor"
];
29 changes: 29 additions & 0 deletions .github/workflows/inclusion.yml
@@ -0,0 +1,29 @@
# Find full documentation here https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions
name: Inclusive Language

on:
  pull_request:

  # Manual invocation.
  workflow_dispatch:

  push:
    branches:
      - main

jobs:
  inclusion-lint:
    timeout-minutes: 15
    runs-on: ubuntu-latest

    # See https://docs.github.com/en/actions/security-guides/automatic-token-authentication#permissions-for-the-github_token
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4

      - name: Run inclusion
        run: npx alex -q *.md
18 changes: 9 additions & 9 deletions AWS-costs.md
@@ -3,24 +3,24 @@ AWS Costs

### Trusted Advisor

Use the [Trusted Advisor](https://console.aws.amazon.com/trustedadvisor/home?#/dashboard) to identify instances that you can potentially downgrade to a smaller instance size or terminate. Trusted Advisor is a native AWS resource available to you when your account has Enterprise support. It gives recommendations for cost savings opportunities and also provides availability, security, and fault tolerance recommendations. Even simple tunings in CPU usage and provisioned IOPS can add up to significant savings.
Use the [Trusted Advisor](https://console.aws.amazon.com/trustedadvisor/home?#/dashboard) to identify instances that you can potentially downgrade to a smaller instance size or terminate. Trusted Advisor is a native AWS resource available to you when your account has Enterprise support. It gives recommendations for cost savings opportunities and also provides availability, security, and fault tolerance recommendations. Even the simplest tunings, such as to CPU usage and provisioned IOPS, can add up to significant savings.

On the TA dashboard, click on **Low Utilization Amazon EC2 Instances** and sort the low utilisation instances table by the highest **Estimated Monthly Savings**.

### Billing & Cost management
You can use the [Bills](https://console.aws.amazon.com/billing/home?region=eu-west-1#/bill) and [Cost explorer](https://console.aws.amazon.com/billing/home?region=eu-west-1#/bill) to understand the breakdown of your AWS usage and possibly identify services you didn’t know you were using.

### Unattached Volumes
Volumes available but not in used costs the same price. You can easily find them in the [EC2 console](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#Volumes:state=available;sort=size) under Volumes section by filtering by state (available).
Volumes that are available but not in use cost the same as attached volumes. You can find them in the [EC2 console](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#Volumes:state=available;sort=size) under the Volumes section by filtering by state (available).

### Unused AMIs
Unused AMIs cost money. You can easily clean them up using the [AMI cleanup tool](https://github.com/guardian/deploy-tools-platform/tree/master/cleanup)
Unused AMIs cost money. You can clean them up using the [AMI cleanup tool](https://github.com/guardian/deploy-tools-platform/tree/master/cleanup).

### Unattached EIPs
Unattached Elastic IP addresses costs money. You can easily find them using the trust advisor, or looking at your bills as they are free if they are attached (so in use).
Unattached Elastic IP addresses cost money. You can find them using Trusted Advisor, or by looking at your bills, as they are free while attached (so in use).

### DynamoDB
It’s very easy to overcommit the reserved capacity on this service. You should frequently review the reserved capacity of all your dynamodb tables.
You should frequently review the reserved capacity of all your DynamoDB tables to make sure it's not over-committed.
The easiest way to do this is to select the Metrics tab and check the Provisioned vs. Consumed write and read capacity graphs, then use the Capacity tab to adjust the provisioned capacity accordingly.
Make sure the table capacity can handle traffic spikes. Use the time range on the graphs to see the past week's usage.
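As a sketch, the check amounts to comparing consumed capacity against what is provisioned; the function names and the 20% threshold here are illustrative, not an AWS API:

```javascript
// Flag a table whose consumed capacity is well below what is provisioned.
// The 20% threshold is an arbitrary illustration, not an AWS recommendation.
function utilisation(consumed, provisioned) {
  return consumed / provisioned;
}

function looksOverProvisioned(consumed, provisioned, threshold = 0.2) {
  return utilisation(consumed, provisioned) < threshold;
}

// e.g. a table provisioned for 100 WCU but averaging 5 consumed WCU
console.log(looksOverProvisioned(5, 100)); // true
```

Bear in mind that a table which looks over-provisioned on average may still need headroom for spikes, which is why the graphs over a longer time range matter.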

@@ -38,7 +38,7 @@ Lower storage price, higher access price. Interesting for backups, for instance.

* [Reduce Redundancy Storage](https://aws.amazon.com/s3/reduced-redundancy/)

Lower storage price, reduced redundancy. Interesting for easy reproducible data or non critical data such as logs for instance.
Lower storage price, reduced redundancy. Interesting for reproducible data or non-critical data such as logs.

* Glacier

@@ -51,9 +51,9 @@ Another useful feature to manage your buckets is the possibility to set [lifecyc
S3’s multipart upload feature accelerates the uploading of large objects by allowing you to split them into logical parts that can be uploaded in parallel. However, if you initiate a multipart upload but never finish it, the in-progress upload occupies storage space and will incur storage charges.
These uploads are not visible when you list the contents of a bucket through the console or the standard API; you have to use a dedicated command to see them.

There is 2 easy ways to solve this now and prevent it to happen in the future:
There are two ways to solve this now and prevent it from happening in the future:

* a [simple script](https://gist.github.com/mchv/9dccbd9245287b26e34ab78bad43ea6c) that can list them with size and potentially delete existing (based on [AWS API](http://docs.aws.amazon.com/cli/latest/reference/s3api/list-parts.html?highlight=list%20parts))
* a [script](https://gist.github.com/mchv/9dccbd9245287b26e34ab78bad43ea6c) that can list them with their size and optionally delete them (based on the [AWS API](http://docs.aws.amazon.com/cli/latest/reference/s3api/list-parts.html?highlight=list%20parts))
* [Add a lifecycle rule](https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/) to each bucket to automatically delete incomplete multipart uploads after a few days ([official AWS doc](http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html#mpu-abort-incomplete-mpu-lifecycle-config))

An example of how to cloud-form the lifecycle rule:
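A minimal sketch of such a rule (the bucket name is hypothetical and the seven-day window an arbitrary choice):

```yaml
MyBucket:
  Type: AWS::S3::Bucket
  Properties:
    BucketName: example-bucket # hypothetical
    LifecycleConfiguration:
      Rules:
        - Id: AbortIncompleteMultipartUploads
          Status: Enabled
          AbortIncompleteMultipartUpload:
            DaysAfterInitiation: 7
```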
@@ -81,7 +81,7 @@ You can see savings of over `50%` on reserved instances vs. on-demand instances.
[More info on reserving instances](https://aws.amazon.com/ec2/purchasing-options/reserved-instances/getting-started/).

Reservations are set to a particular AWS region and to a particular instance type.
Therefore after making a reservation you are committing to run that particular region/instances combination until the reservation period finishes or you will swipe off all the financial benefits.
Therefore, after making a reservation you are committing to run that particular region/instance-type combination until the reservation period finishes, or you will wipe out the financial benefits.
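As a rough illustration of the trade-off (the prices below are made up, not real AWS rates):

```javascript
// Compare a year of on-demand usage against an upfront-plus-discounted-rate
// reservation. All prices are illustrative.
const HOURS_PER_YEAR = 8760;

function onDemandYearlyCost(hourlyRate) {
  return hourlyRate * HOURS_PER_YEAR;
}

function reservedYearlyCost(upfront, discountedHourlyRate) {
  return upfront + discountedHourlyRate * HOURS_PER_YEAR;
}

// e.g. $0.10/h on demand vs. $200 upfront + $0.025/h reserved
const onDemand = onDemandYearlyCost(0.1);        // 876
const reserved = reservedYearlyCost(200, 0.025); // ≈ 419
console.log(((1 - reserved / onDemand) * 100).toFixed(1) + "% saved");
```

If the reserved instance sits unused for part of the year, the upfront cost still applies, which is why the commitment to a region/instance-type combination matters.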

### Spot Instances

4 changes: 2 additions & 2 deletions AWS-lambda-metrics.md
@@ -3,15 +3,15 @@ Metrics for Lambdas
* AWS Embedded Metrics are an ideal solution for generating metrics for Lambda functions that will track historical data.
* They are a method for capturing Cloudwatch metrics as part of a logging request.
* This is good because it avoids the financial and performance cost of making a putMetricData() request.
* It also makes it easy to find the point at which the metric is updated in both the logs and in the code itself.
* It also makes it easier to find the point at which the metric is updated in both the logs and in the code itself.
* This does not work at all for our EC2 apps as their logs do not pass through Cloudwatch.
* [This pull request](https://github.com/guardian/mobile-n10n/pull/696) gives a working example of how to embed metrics in your logging request
* [This document](https://docs.google.com/document/d/1cL_t5NhO8J9Bwiu4rghoGh8i_um_sXDyKuq4COhdLEc/edit?usp=sharing) gives a good summary of why AWS embedded metrics are so useful
* Full details can be found in the [AWS Documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html), but here are the highlights:
* To use AWS Embedded metrics, logs must be in JSON format.
* A metric is embedded in a JSON logging request by adding a root node named “_aws” to the start of the log request.
* The metric details are defined within this "_aws" node.
* The following code snippet shows a simple logging request updating a single metric:
* The following code snippet shows a logging request updating a single metric:

```json
{"_aws": {
  "Timestamp": 1712345678901,
  "CloudWatchMetrics": [{
    "Namespace": "ExampleNamespace",
    "Dimensions": [["Stage"]],
    "Metrics": [{"Name": "ExampleMetric", "Unit": "Count"}]
  }]
},
"Stage": "PROD",
"ExampleMetric": 1}
```
8 changes: 4 additions & 4 deletions AWS.md
@@ -41,14 +41,14 @@ VPC

* To follow best practice for VPCs, ensure you have a single CDK-generated VPC in your account that is used to house your applications. You can find the docs for it [here](https://github.com/guardian/cdk/blob/main/src/constructs/vpc/vpc.ts#L32-L59).
* While generally discouraged, in some exceptional cases, such as security-sensitive services, you may want to use the construct to generate further VPCs in order to isolate specific applications. It is worth discussing with DevX Security and InfoSec if you think you have a service that requires this.
* Avoid using the default VPC - The default VPC is designed to make it easy to get up and running but with many negative tradeoffs:
* Avoid using the default VPC - The default VPC is designed to get you up and running quickly, but with many negative tradeoffs:
- It lacks the proper security and auditing controls.
- Network Access Control Lists (NACLs) are unrestricted.
- The default VPC does not enable flow logs. Flow logs allow users to track network flows in the VPC for auditing and troubleshooting purposes.
- No tagging
- The default VPC enables the assignment of public addresses in public subnets by default. This is a security issue, as a small mistake in setup could then allow the instance to be reachable from the Internet.
* The account should be allocated a block of our IP address space to support peering. Often you may not know you need peering up front, so better to plan for it just in case. See [here](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-basics.html) for more info on AWS peering rules.
* The account should be allocated a block of our IP address space to support peering. Often you may not know you need peering up front, so better to plan for it regardless. See [here](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-basics.html) for more info on AWS peering rules.
* If it is likely that AWS resources will need to communicate with our on-prem infrastructure, then contact the networking team to request a CIDR allocation for the VPC.
* Ensure you have added the correct [Gateway Endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-gateway.html) for the AWS services being accessed from your private subnets to avoid incurring unnecessary networking costs.
* Security of the VPC and security groups must be considered. See [here](https://github.com/guardian/security-recommendations/blob/main/recommendations/aws.md#vpc--security-groups) for details.
@@ -116,7 +116,7 @@ and the function does one or more of the following:

This started happening after a change in how the event loop works between NodeJS 8 and 10. The method AWS uses to freeze the lambda runtime after it has not been invoked for a while may not work correctly in the cases above.

The workaround is simple (if a little silly). Wrap your root handler in a setTimeout:
The workaround is to wrap your root handler in a setTimeout:

```javascript
exports.handler = function (event, context, callback) {
  setTimeout(function () {
    // ... your actual handler logic goes here, calling callback when done ...
    callback(null, "done");
  }, 0);
};
```

@@ -145,7 +145,7 @@ Your lambda will get triggered multiple times if you trigger it synchronously using
#### Details
[`--cli-read-timeout`](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-options.html#:~:text=cli%2Dread%2Dtimeout) is a general CLI param that applies to all subcommands and determines how long it will wait for data to be read from a socket. It seems to default to 60 seconds.

In the case of a synchronously executed long-running lambda, this timeout can be exceeded. The first lambda invocation "fails" (though not in a way that is visible in any lambda metrics or logs), and the CLI will abort the request and retry. The first lambda invocation hasn't really failed though - it will continue to run, possibly successfully - it's just that the CLI client that initiated it has stopped waiting for a response.
In the case of a synchronously executed long-running lambda, this timeout can be exceeded. The first lambda invocation "fails" (though not in a way that is visible in any lambda metrics or logs), and the CLI will abort the request and retry. The first lambda invocation hasn't really failed though - it will continue to run, possibly successfully - but the CLI client that initiated it has stopped waiting for a response.

Setting `--cli-read-timeout` to `0` removes the timeout and makes the socket read wait indefinitely, meaning the CLI command will block until the lambda completes or times out.
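For example, invoking a long-running lambda synchronously without the client-side timeout (the function name is hypothetical):

```shell
aws lambda invoke \
  --cli-read-timeout 0 \
  --function-name my-long-running-lambda \
  --payload '{}' \
  response.json
```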

