Add GCP datalake example (#130)
Signed-off-by: Jim Enright <[email protected]>
jimright authored Nov 20, 2023
1 parent 6d83599 commit bcf6e21
Showing 7 changed files with 291 additions and 0 deletions.
17 changes: 17 additions & 0 deletions public-cloud/gcp/datalake/.gitignore
@@ -0,0 +1,17 @@
# Copyright 2023 Cloudera, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

ansible-navigator.log
runs
context
64 changes: 64 additions & 0 deletions public-cloud/gcp/datalake/README.md
@@ -0,0 +1,64 @@
# CDP Public Cloud - Environment and Datalake on GCP Base Example

> Constructs a CDP Public Cloud Environment and Datalake. Uses Ansible to generate the GCP infrastructure and CDP artifacts, including an SSH key, a cross-account Service Account, and GCS buckets.

## Requirements

To run, you need:

* Docker (or a Docker alternative)
* GCP Service Account provisioning credentials (set via `GCP_SERVICE_ACCOUNT_FILE`)
* CDP credentials (set via `CDP_PROFILE`)

## Set Up

First, set up your `ansible-navigator` (aka `cdp-navigator`) environment by following the instructions in the [NAVIGATOR document](https://github.com/cloudera-labs/cldr-runner/blob/main/NAVIGATOR.md) in `cloudera-labs/cldr-runner`.

Then, clone this project and change your working directory.

```bash
git clone https://github.com/cloudera-labs/cloudera-deploy.git; cd cloudera-deploy/public-cloud/gcp/datalake
```

## Configure

Set the required environment variables:

```bash
export GCP_SERVICE_ACCOUNT_FILE=absolute-path-to-service-account-file
export CDP_PROFILE=your-cdp-profile
```
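
Before launching, it can help to confirm that both variables are usable. The sketch below is a hypothetical preflight helper (`check_prereqs` is not part of `cloudera-deploy`). It assumes the service account file must be an absolute, readable path, since `ansible-navigator.yml` mounts that path into the container at the same location.

```bash
# Hypothetical preflight helper; not part of cloudera-deploy.
# Returns 0 when both required variables look usable, 1 otherwise.
check_prereqs() {
  if [ -z "${GCP_SERVICE_ACCOUNT_FILE:-}" ]; then
    echo "GCP_SERVICE_ACCOUNT_FILE is not set" >&2
    return 1
  fi
  # The path is mounted as-is into the container, so it must be absolute.
  case "${GCP_SERVICE_ACCOUNT_FILE}" in
    /*) ;;
    *)
      echo "GCP_SERVICE_ACCOUNT_FILE must be an absolute path" >&2
      return 1
      ;;
  esac
  if [ ! -r "${GCP_SERVICE_ACCOUNT_FILE}" ]; then
    echo "Cannot read ${GCP_SERVICE_ACCOUNT_FILE}" >&2
    return 1
  fi
  if [ -z "${CDP_PROFILE:-}" ]; then
    echo "CDP_PROFILE is not set" >&2
    return 1
  fi
  return 0
}
```

Calling `check_prereqs` after the two `export` lines and before the playbook run should surface a missing or relative path early, rather than as a container mount error.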

Tweak the `definition.yml` parameters to your liking. Notably, you should add and/or change:

```yaml
name_prefix: ex01 # Keep this short (4-7 characters)
admin_password: "BadPass@1" # 1 upper, 1 special, 1 number, 8-64 chars.
infra_region: us-east1
gcp_project_id: gcp-project-id # GCP Project ID
```

> [!NOTE]
> You can override these parameters with any typical Ansible _extra variables_ flags, e.g. `-e admin_password=my_password`. See the [cldr-runner FAQ](https://github.com/cloudera-labs/cldr-runner/blob/main/FAQ.md#how-do-i-add-extra-variables-and-tags-to-ansible-navigator) for details.
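
The password rule in the `admin_password` comment above (1 upper, 1 special, 1 number, 8-64 chars) can be checked locally before a run. This is a minimal sketch of that comment only: `valid_admin_password` is a hypothetical helper, "special" is assumed to mean any non-alphanumeric character, and CDP's own validation may differ.

```bash
# Hypothetical helper mirroring the admin_password comment:
# 8-64 characters, at least one uppercase letter, one digit,
# and one special (assumed: non-alphanumeric) character.
valid_admin_password() {
  pw="$1"
  len=${#pw}
  if [ "$len" -lt 8 ] || [ "$len" -gt 64 ]; then
    return 1
  fi
  case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[![:alnum:]]*) ;; *) return 1 ;; esac
  return 0
}
```

For example, `valid_admin_password "BadPass@1" && echo ok` accepts the sample value from the snippet above.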

### SSH Keys

This definition will create a new SSH keypair on the host in your `~/.ssh` directory if you do not specify an SSH public key.

If you wish to use an existing SSH key, set `public_key_file` to the key's local path.

## Execute

Set up the CDP Public Cloud Environment and Datalake by running the setup playbook:

```bash
ansible-navigator run main.yml
```

## Tear Down

Tear down the CDP Public Cloud Environment and Datalake by running the teardown playbook:

```bash
ansible-navigator run teardown.yml
```
61 changes: 61 additions & 0 deletions public-cloud/gcp/datalake/ansible-navigator.yml
@@ -0,0 +1,61 @@
---

# Copyright 2023 Cloudera, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

ansible-navigator:
  playbook-artifact:
    save-as: "runs/{playbook_name}-{time_stamp}.json"

  ansible-runner:
    artifact-dir: runs
    rotate-artifacts-count: 3

  logging:
    level: debug
    append: False

  ansible:
    inventory:
      entries:
        - inventory.ini

  execution-environment:
    container-engine: docker
    enabled: True
    environment-variables:
      pass:
        - GCP_SERVICE_ACCOUNT_FILE
        - CDP_PROFILE
      set:
        ANSIBLE_CALLBACK_WHITELIST: "ansible.posix.profile_tasks"
        ANSIBLE_GATHERING: "smart"
        ANSIBLE_DEPRECATION_WARNINGS: False
        ANSIBLE_HOST_KEY_CHECKING: False
        ANSIBLE_SSH_RETRIES: 10
        GCP_AUTH_KIND: "serviceaccount"
    image: ghcr.io/cloudera-labs/cldr-runner:gcp-latest
    pull:
      policy: missing
    volume-mounts:
      - src: "${GCP_SERVICE_ACCOUNT_FILE}"
        dest: "${GCP_SERVICE_ACCOUNT_FILE}"
      - src: "~/.cdp"
        dest: "/runner/.cdp"
        options: "Z"
      - src: "~/.ssh"
        dest: "/runner/.ssh"
        options: "Z"
    container-options:
      - "--network=host"
44 changes: 44 additions & 0 deletions public-cloud/gcp/datalake/definition.yml
@@ -0,0 +1,44 @@
---

# Copyright 2023 Cloudera, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

################################################################################
# Global variables
################################################################################
# Either define here or override using _extra variables_ in the CLI or AWX.
# For example, '-e name_prefix=basex'
# name_prefix: # You must specify a name prefix
# admin_password: # You must specify an admin password
infra_region: us-east1 # CSP region for infra
infra_type: gcp # CSP

#gcp_project_id: # You must specify a GCP Project ID

# Limit to the caller/controller
allowed_cidrs: "{{ lookup('ansible.builtin.url', 'https://api.ipify.org', wantlist=True) | product(['32']) | map('join', '/') | list }}"

################################################################################
# CDP Environment and Datalake variables
################################################################################
env:
  tunnel: no
  public_endpoint_access: yes

infra:
  gcp:
    project: "{{ gcp_project_id }}"
  vpc:
    extra_cidr: "{{ allowed_cidrs }}"
    extra_ports: [22, 443]
15 changes: 15 additions & 0 deletions public-cloud/gcp/datalake/inventory.ini
@@ -0,0 +1,15 @@
# Copyright 2023 Cloudera, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

localhost ansible_connection=local ansible_python_interpreter="{{ ansible_playbook_python }}"
45 changes: 45 additions & 0 deletions public-cloud/gcp/datalake/main.yml
@@ -0,0 +1,45 @@
---

# Copyright 2023 Cloudera, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

- name: Set up the cloudera-deploy variables
  hosts: localhost
  connection: local
  gather_facts: yes
  tasks:
    - name: Read definition variables
      ansible.builtin.include_role:
        name: cloudera.exe.init_deployment
        public: yes
      when: init__completed is undefined

    - name: Initialization of GCP deployment
      block:
        - name: GCloud Auth using the Service Account
          command: >
            gcloud auth activate-service-account
            --key-file={{ lookup('env', 'GCP_SERVICE_ACCOUNT_FILE') }}
        - name: Set the GCP project for GCloud
          command: >
            gcloud config set project {{ gcp_project_id }}
  tags:
    - always

- name: Set up CDP Public Cloud infrastructure (Ansible-based)
  ansible.builtin.import_playbook: cloudera.exe.pbc_infra_setup.yml

- name: Set up CDP Public Cloud (Env and DL example)
  ansible.builtin.import_playbook: cloudera.exe.pbc_setup.yml
45 changes: 45 additions & 0 deletions public-cloud/gcp/datalake/teardown.yml
@@ -0,0 +1,45 @@
---

# Copyright 2023 Cloudera, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

- name: Set up the cloudera-deploy variables
  hosts: localhost
  connection: local
  gather_facts: yes
  tasks:
    - name: Read definition variables
      ansible.builtin.include_role:
        name: cloudera.exe.init_deployment
        public: yes
      when: init__completed is undefined

    - name: Initialization of GCP deployment
      block:
        - name: GCloud Auth using the Service Account
          command: >
            gcloud auth activate-service-account
            --key-file={{ lookup('env', 'GCP_SERVICE_ACCOUNT_FILE') }}
        - name: Set the GCP project for GCloud
          command: >
            gcloud config set project {{ gcp_project_id }}
  tags:
    - always

- name: Tear down CDP Public Cloud (Env and DL example)
  ansible.builtin.import_playbook: cloudera.exe.pbc_teardown.yml

- name: Tear down CDP Public Cloud infrastructure (Ansible-based)
  ansible.builtin.import_playbook: cloudera.exe.pbc_infra_teardown.yml
