NVIDIA Jetson Nano and NVIDIA Jetson AGX Xavier for Kubernetes (K8s) and machine learning (ML) for smart IoT
Experimenting with arm64-based NVIDIA Jetson (Nano and AGX Xavier) edge devices running Kubernetes (K8s) for machine learning (ML), including Jupyter Notebooks, TensorFlow training and TensorFlow Serving, using CUDA for smart IoT.
Author: Helmut Hoffer von Ankershoffen né Oertel
Hints:
- Assumes an NVIDIA Jetson Nano, TX1, TX2 or AGX Xavier as edge device.
- Assumes a macOS workstation for development, such as a MacBook Pro.
- Assumes access to a bare-metal Kubernetes cluster the Jetson devices can join e.g. set up using Project Max.
- Assumes basic knowledge of Ansible, Docker and Kubernetes.
Goals:
- Evaluate feasibility and complexity of Kubernetes + machine learning on edge devices for future smart IoT projects
- Provide patterns to the Jetson community on how to use Ansible and Kubernetes with Jetson devices for automation and orchestration
- Lower the entry barrier for machine learning students by using cheap edge devices available offline with this starter instead of requiring commercial clouds available online
- Provide a modern Jupyter based infrastructure for students of the Stanford CS229 course using Octave as kernel
- Remove some personal rust regarding deep learning, multi-ARM ;-) bandits and artificial intelligence in general - and have fun
Features:
- basics: Prepare hardware including shared shopping list
- basics: Automatically provision requirements on macOS device for development
- basics: Manually provision root OS using NVIDIA JetPack and balenaEtcher on Nanos
- basics: Works with latest JetPack 4.2.1 and default container runtime
- basics: Works with official NVIDIA Container Runtime, NGC and the `nvcr.io/nvidia/l4t-base` base image provided by NVIDIA
- basics: Automatically setup secure ssh access
- basics: Automatically setup sudo
- basics: Automatically setup basic packages for development, DevOps and CloudOps
- basics: Automatically setup LXDE for decreased memory consumption
- basics: Automatically setup VNC for headless access
- basics: Automatically setup RDP for headless access
- basics: Automatically setup performance mode including persistent setting of max frequencies for CPU+GPU+EMC to boost throughput
- basics: Automatically build custom kernel as required by Docker + Kubernetes + Weave networking and allowing boot from USB 3.0 / SSD drive
- basics: Automatically provision Jetson Xaviers (in addition to Jetson Nanos) including automatic setup of guest VM for building custom kernel and rootfs and headless oem-setup using a USB-C/serial connection from the guest VM to the Xavier device
- basics: Automatically setup NVMe / SSD drive and use for Docker container images and volumes used by Kubernetes on Jetson Xaviers as part of provisioning
- k8s: Automatically join Kubernetes cluster `max` as worker node labeled `jetson:true` and `jetson_model:[nano,xavier]` - see Project Max regarding `max` and the scheduling sketch after this feature list
- k8s: Automatically build and deploy CUDA deviceQuery as pod in Kubernetes cluster to validate access to GPU and correct labeling of jetson based Kubernetes nodes
- k8s: Build and deploy using Skaffold, a custom remote builder for Jetson devices and kustomize for configurable Kubernetes deployments without the need to write Go templates in Helm for Tiller
- k8s: Integrate Container Structure Tests into workflows
- ml: Automatically repack CUDNN, TensorRT and support libraries including python bindings that are bundled with JetPack on Jetson host as deb packages using dpkg-repack as part of provisioning for later use in Docker builds
- ml: Automatically build Docker base image `jetson/[nano,xavier]/ml-base` including CUDA, CUDNN, TensorRT, TensorFlow and Anaconda for arm64
- ml: Automatically build and deploy Jupyter server for data science with support for CUDA accelerated Octave, Seaborn, TensorFlow and Keras as pod in Kubernetes cluster running on Jetson node
- ml: Automatically build Docker base image `jetson/[nano,xavier]/tensorflow-serving-base` using bazel and the latest TensorFlow core including support for CUDA capabilities of Jetson edge devices
- ml: Automatically build and deploy simple Python/FastAPI based webservice as facade of TensorFlow Serving for getting predictions incl. health check for K8s probes, interactive OAS3 API documentation, request/response validation and access of TensorFlow Serving via REST and alternatively gRPC
- ml: Provide variants of Docker images and Kubernetes deployments for Jetson AGX Xavier using auto-activation of Skaffold profiles and Kustomize overlays
- ml: Build out end-to-end example application using S3 compatible shared model storage provided by minio for deploying models trained using TensorFlow for inference using TensorFlow Serving
- ml: Automatically setup machine learning workflows using Kubeflow
- optional: Semi-automatically setup USB 3.0 / SSD boot drive given existing installation on micro SD card on Jetson Nanos
- optional: Automatically cross-build arm64 Docker images on macOS for Jetson devices using buildkit and buildx
- optional: Automatically setup firewall on host using `ufw` for basic security
- community: Publish versioned images on Docker Hub per release and provide Skaffold profiles to pull from Docker Hub instead of having to build images before deploy in Kubernetes cluster
- community: Author a series of blog posts explaining how to set up ML in Kubernetes on Jetson devices based on this starter, giving more context and explaining the toolset used
- ml: Deploy Polarize.AI ml training and inference tiers on Jetson nodes (separate project)
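For orientation, the following is a minimal sketch - not taken from the repository - of how the node labels mentioned in the k8s features above can be used to pin a deployment to Jetson nodes; the image reference is a placeholder:

```yaml
# Sketch: schedule a pod only on Jetson Nano nodes via the labels described above.
# The image reference is a placeholder, not an image published by this project.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: device-query
spec:
  replicas: 1
  selector:
    matchLabels:
      app: device-query
  template:
    metadata:
      labels:
        app: device-query
    spec:
      nodeSelector:
        jetson: "true"     # only Jetson nodes
        jetson_model: nano # or: xavier
      containers:
        - name: device-query
          image: registry.local/jetson/nano/device-query # placeholder
```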
Shopping list (Jetson Nano):
- Jetson Nano Developer Kit - ca. $110
- Power supply - ca. $14
- Jumper - ca. $12 - you need only one jumper
- micro SDHC card - ca. $10
- Ethernet cable - ca. $4
- optional: M.2/SATA SSD disk - ca. $49
- optional: USB 3.0 to M.2/SATA enclosure - ca. $12
Total ca. $210 including options.
Hints:
- Assumes a USB mouse, USB keyboard, HDMI cable and monitor are available, as required for the initial boot of the root OS
- Assumes an existing bare-metal Kubernetes cluster including an Ethernet switch to connect to - have a look at Project Max for how to build one with mini PCs or Project Ceil for the Raspberry Pi variant.
- Except for inserting the SSD into its enclosure and shorting one jumper (as described below) no assembly is required.
Shopping list (Jetson AGX Xavier):
- Jetson Xavier Developer Kit - ca. $712
- M.2/NVMe SSD disk - ca. $65
Total ca. $767.
Hints:
- A USB mouse, USB keyboard, HDMI cable and monitor are not required for the initial boot of the root OS as headless oem-setup is supported
- Assumes an existing bare-metal Kubernetes cluster including an Ethernet switch to connect to - have a look at Project Max for how to build one with mini PCs or Project Ceil for the Raspberry Pi variant.
- Except for installing the NVMe SSD (as described below) no assembly is required.
As the eMMC soldered onto the Xavier board is only 32 GB, an SSD is required to provide adequate disk space for Docker images and volumes.
- Install the NVMe SSD by connecting it to the M.2 port of the daughterboard as described in this article - you can skip the section about setting up the installed SSD in the referenced article as integration (mounting, creation of filesystem, syncing of files etc.) is done automatically during provisioning as described in Part 2 below.
- Provisioning as described below will automatically integrate the SSD to provide the `/var/lib/docker` directory
Bootstrap (for all devices): Automatically provision requirements on macOS device for development
- Execute `make bootstrap-environment` to install requirements on your macOS device and set up hostnames such as `nano-one.local` in your `/etc/hosts`
Hints:
- During bootstrap you will have to enter the password of the current user on your macOS device multiple times to allow software installation
- The system / security settings of your macOS device must allow installation of software coming from outside of the macOS App Store
- `make` is used as a facade for all workflows triggered from the development workstation - execute `make help` on your macOS device to list all targets and see the `Makefile` in this directory for details
- Ansible is used for configuration management on macOS and edge devices - all Ansible roles are idempotent and can thus be executed repeatedly, e.g. after you change the configuration in `workflow/provision/group_vars/all.yml` - see the sketch after these hints
- Have a look at `workflow/requirements/macOS/ansible/requirements.yml` for a list of packages and applications installed
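To illustrate the idempotency mentioned above: Ansible tasks declare a desired state rather than an action, so re-running them converges without side effects. A minimal sketch (the package names are examples, not taken from the repository):

```yaml
# Minimal idempotent Ansible task - the second run changes nothing because
# the desired state (packages present) is already reached.
- name: Ensure basic development packages are present
  apt:
    name:
      - git
      - htop
    state: present
  become: yes
```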
Part 1 (for Jetson Nanos): Manually flash root OS, create `provision` account as part of oem-setup and automatically establish secure access
1. Execute `make nano-image-download` on your macOS device to download and unzip the NVIDIA JetPack image into `workflow/provision/image/`
2. Start the `balenaEtcher` application and flash your micro SD card with the downloaded image
3. Wire the Nano up with the Ethernet switch of your Kubernetes cluster
4. Insert the designated micro SD card in your NVIDIA Jetson Nano and power up
5. Create a user account with username `provision` and "Administrator" rights via the UI and set `nano-one` as hostname - wait until the login screen appears
6. Execute `make setup-access-secure` and enter the password you set for the `provision` user in the step above when asked - passwordless ssh access and sudo will be set up (see the sketch after these steps)
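Conceptually, what `make setup-access-secure` automates boils down to Ansible tasks like the following sketch - the modules are standard Ansible, but the actual tasks in the repository may differ:

```yaml
# Conceptual sketch of establishing passwordless ssh and sudo for user provision.
- name: Authorize the local ssh public key for user provision
  authorized_key:
    user: provision
    key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"

- name: Allow passwordless sudo for user provision
  lineinfile:
    path: /etc/sudoers.d/provision
    line: "provision ALL=(ALL) NOPASSWD:ALL"
    create: yes
    mode: "0440"
    validate: "visudo -cf %s"
  become: yes
```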
Hints:
- The `balenaEtcher` application was installed as part of bootstrap - see above
- Step 5 requires wiring up your Nano with a USB mouse, keyboard and monitor - after that you can unplug them and use the Nano with headless access via ssh, VNC, RDP or http
- In case of a Jetson Nano you might experience intermittent operation - make sure to provide an adequate power supply - the best option is to put a jumper on pins J48 and use a DC barrel jack power supply (with 5.5mm OD / 2.1mm ID / 9.5mm length, center pin positive) that can supply 5V at 4A - see NVIDIA DevTalk for a schema of the board layout.
- Depending on how the DHCP server of your cluster is configured you might have to adapt the IP address of `nano-one.local` in `workflow/requirements/generic/ansible/playbook.yml` (see the sketch after these hints) - run `make requirements-hosts` after updating the IP address, which in turn updates the `/etc/hosts` file of your Mac
- SSH into your Nano using `make nano-one-ssh` - your ssh public key was uploaded in step 6 above so no password is asked for
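As a purely hypothetical illustration of the kind of change meant in the DHCP hint above - the actual structure of `workflow/requirements/generic/ansible/playbook.yml` may differ - a hostname-to-IP mapping could look like:

```yaml
# Hypothetical excerpt - adapt the IP to what your DHCP server assigned.
hosts_entries:
  - name: nano-one.local
    ip: 192.168.1.101
```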
Part 1 (for Jetson AGX Xaviers): Provision guest VM (Ubuntu), build custom kernel and rootfs in guest VM, flash Xavier, create `provision` account on Xavier as part of oem-setup and automatically establish secure access
1. Execute `make guest-sdk-manager-download` on your macOS device and follow the instructions shown to download the NVIDIA SDK Manager installer into `workflow/guest/download/` - if you fail to download the NVIDIA SDK Manager you will be instructed in the next step on how to do it
2. Execute `make guest-build` to automatically a) create, boot and provision an Ubuntu guest VM on your macOS device using Vagrant, VirtualBox and Ansible and b) build a custom kernel and rootfs inside the guest VM for flashing the Xavier device - the Linux kernel is built with requirements for Kubernetes + Weave networking, such as activating IP sets and the Open vSwitch datapath module, and SDK components are added to the rootfs for automatic installation during provisioning (see Part 2). You will be prompted to download SDK components via the NVIDIA SDK Manager that was automatically installed during provisioning of the guest VM - please do as instructed on-screen
3. Execute `make guest-flash` to flash the Xavier with the custom kernel and rootfs - before execution, wire up the Xavier with your macOS device using USB-C and enter recovery mode by powering up and pressing the buttons as described in the printed user manual that was part of your Jetson AGX Xavier shipment
4. Execute `make guest-oem-setup` to start the headless oem-setup process and follow the on-screen instructions to set up your locale and timezone, create a user account called `provision` and set an initial password - press the reset button of your Xavier after flashing before triggering the oem-setup
5. Execute `make setup-access-secure` and enter the password you set for the user `provision` in the step above when asked - passwordless ssh access and sudo will be set up
Hints:
- There is no need to wire up your Xavier with a USB mouse, keyboard and monitor for oem-setup as headless oem-setup is implemented as described in step 4
- Depending on how the DHCP server of your cluster is configured you might have to adapt the IP address of `xavier-one.local` in `workflow/requirements/generic/ansible/playbook.yml` - you can check the assigned IP after step 4 by logging in as user `provision` with the password you set, executing `ifconfig eth0 | grep 'inet'` and checking the IP address shown - run `make requirements-hosts` after updating the IP address, which in turn updates the `/etc/hosts` file of your Mac
- SSH into your Xavier using `make xavier-one-ssh` - your ssh public key was uploaded in step 5 above so no password is asked for
Part 2 (for all Jetson devices): Automatically provision tools, services, kernel and Kubernetes on Jetson host
- Execute `make provision` - amongst others, services will be provisioned, the kernel will be compiled (on Nanos only) and the Kubernetes cluster will be joined
Hints:
- You might want to check the settings in `workflow/provision/group_vars/all.yml`
- You might want to use `make provision-nanos` and `make provision-xaviers` to provision one type of Jetson model only
- For Nanos the Linux kernel is built with requirements for Kubernetes + Weave networking on device during provisioning - such as activating IP sets and the Open vSwitch datapath module - thus initial provisioning takes ca. one hour - for Xaviers a custom kernel is built in the guest VM and flashed in Part 1 (see above)
- Execute `kubectl get nodes` to check that your edge devices joined your cluster and are ready - see the sketch after these hints for the labels to expect
- To easily inspect the Kubernetes cluster in depth execute the lovely `click` from the terminal, which was installed as part of bootstrap - enter `nodes` to list the nodes with the Nano being one of them - see Click for details
- VNC into your Nano by starting the VNC Viewer application, which was installed as part of bootstrap, and connect to `nano-one.local:5901` or `xavier-one.local:5901` respectively - the password is `secret`
- For Xaviers the NVMe SSD is automatically integrated during provisioning to provide `/var/lib/docker` - for Nanos you can optionally use a SATA SSD as boot device as described below
- If you want to provision step by step execute `make help | grep "provision-"` and run the desired make target, e.g. `make provision-kernel`
- Provisioning is implemented idempotently, thus it is safe to repeat provisioning as a whole or individual steps
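Once the devices have joined, each node should carry the labels described in the features above; an illustrative excerpt of what `kubectl get node nano-one -o yaml` should show (only the label section matters here):

```yaml
# Illustrative excerpt of the node object - the labels enable Jetson-specific scheduling.
apiVersion: v1
kind: Node
metadata:
  name: nano-one
  labels:
    jetson: "true"
    jetson_model: nano
```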
Part 3 (for all devices): Automatically build Docker images and deploy services to Kubernetes cluster
- Execute `make ml-base-build-and-test` once to build the Docker base image `jetson/[nano,xavier]/ml-base`, test it via container structure tests and push it to the private registry of your cluster - the image, amongst others, includes CUDA, CUDNN, TensorRT, TensorFlow, python bindings and Anaconda - have a look at the directory `workflow/deploy/ml-base` for details - most images below derive from this image
- Execute `make device-query-deploy` to build and deploy a pod into the Kubernetes cluster that queries CUDA capabilities, thus validating GPU and Tensor Core access from inside Docker and correct labeling of Jetson/arm64 based Kubernetes nodes - execute `make device-query-log-show` to show the result after deploying
- Execute `make jupyter-deploy` to build and deploy a Jupyter server as a Kubernetes pod running on the Nano, supporting CUDA accelerated TensorFlow + Keras and including Octave as an alternative Jupyter kernel in addition to IPython - execute `make jupyter-open` to open a browser tab pointing to the Jupyter server and run the bundled TensorFlow Jupyter notebooks for deep learning
- Execute `make tensorflow-serving-base-build-and-test` once to build the TensorFlow Serving base image `jetson/[nano,xavier]/tensorflow-serving-base`, test it via container structure tests and push it to the private registry of your cluster - have a look at the directory `workflow/deploy/tensorflow-serving-base` for details - the tensorflow-serving image below derives from this image
- Execute `make tensorflow-serving-deploy` to build and deploy TensorFlow Serving plus a Python/FastAPI based webservice for getting predictions as a Kubernetes pod running on the Nano - execute `make tensorflow-serving-docs-open` to open browser tabs pointing to the interactive OAS3 documentation of the webservice API; execute `make tensorflow-serving-health-check` to check the health as used in the K8s readiness and liveness probes (see the sketch after these steps); execute `make tensorflow-serving-predict` to get predictions
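The health check used by the K8s probes maps onto standard readiness and liveness probes; a minimal sketch, assuming a `/health` endpoint on port 8080 (path and port are assumptions - the actual values are defined in `workflow/deploy/tensorflow-serving`):

```yaml
# Sketch of a pod spec excerpt - endpoint path, port and image are assumed.
containers:
  - name: webservice
    image: registry.local/jetson/nano/tensorflow-serving # placeholder
    ports:
      - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
```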
Hints:
- To target Xavier devices for build, test, deploy and publish execute `export JETSON_MODEL=xavier` in your shell before executing the `make ...` commands, which will auto-activate the matching Skaffold profiles (see below) - if `JETSON_MODEL` is not set the Nanos will be targeted
- All deployments automatically create a private Kubernetes namespace using the pattern `jetson-$deployment` - e.g. `jetson-jupyter` for the Jupyter deployment - thus easing inspection in the Kubernetes dashboard, `click` or similar
- All deployments provide a target for deletion such as `make device-query-delete` which will automatically delete the respective namespace, persistent volumes, pods, services, ingress and loadbalancer on the cluster
- All builds on the Jetson device run as user `build` which was created during provisioning - use `make nano-one-ssh-build` or `make xavier-one-ssh-build` to login as user `build` and monitor intermediate results
- Remote building on the Jetson device is implemented using Skaffold and a custom builder - have a look at `workflow/deploy/device-query/skaffold.yaml` and `workflow/deploy/tools/builder` for the approach
- Skaffold supports a nice build, deploy, tail, watch cycle - execute `make device-query-dev` as an example
- Containers mount the devices `/dev/nv*` at runtime to access the GPU and Tensor Cores - see `workflow/deploy/device-query/kustomize/base/deployment.yaml` for details
- Kubernetes deployments are defined using kustomize - you can thus define kustomize overlays for deployments on other clusters or scale out easily
- Kustomize overlays can be easily referenced using Skaffold profiles - have a look at `workflow/deploy/device-query/skaffold.yaml` and `workflow/deploy/device-query/kustomize/overlays/xavier` for an example - in this case the `xavier` profile is auto-activated respecting the `JETSON_MODEL` environment variable (see above), with the profile in turn activating the `xavier` Kustomize overlay - a condensed sketch follows after these hints
- All containers provide targets for Google container structure tests - execute `make device-query-build-and-test` as an example
- For easier consumption all container images are published on Docker Hub - if you want to publish your own, create a file called `.docker-hub.auth` in this directory (see `docker-hub.auth.template`) and execute the appropriate make target, e.g. `make ml-base-publish`
- If you did not use Project Max to provision your bare-metal Kubernetes cluster make sure your cluster provides a DNS and DHCP server (e.g. dnsmasq), firewalling, VPN and a private Docker registry as well as support for Kubernetes features such as persistent volumes, ingress and loadbalancing as required during build and in deployments - adapt occurrences of the name `max-one` accordingly to wire up with your infrastructure
- The webservice of `tensorflow-serving` accesses TensorFlow Serving via its REST API or alternatively via the Python bindings of the gRPC API - have a look at the directory `workflow/deploy/tensorflow-serving/src/webservice` for details of the implementation
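To make the profile auto-activation tangible, here is a condensed sketch of how an environment variable can activate a Skaffold profile that switches the Kustomize overlay - see `workflow/deploy/device-query/skaffold.yaml` for the real configuration, which may differ in detail:

```yaml
# Condensed sketch: JETSON_MODEL=xavier auto-activates the xavier profile,
# which in turn deploys the xavier Kustomize overlay instead of the base.
apiVersion: skaffold/v1
kind: Config
deploy:
  kustomize:
    path: kustomize/base # default targets the Nanos
profiles:
  - name: xavier
    activation:
      - env: JETSON_MODEL=xavier
    deploy:
      kustomize:
        path: kustomize/overlays/xavier
```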
Optional (for Jetson Nanos): Setup swap
- Execute `make provision-swap`
Hints:
- You can configure the required swap size in `workflow/provision/group_vars/all.yml` - see the hypothetical sketch below
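The exact key name is not reproduced here; as a hypothetical illustration, the setting in `workflow/provision/group_vars/all.yml` could look like:

```yaml
# Hypothetical excerpt - the actual key name in all.yml may differ.
swap:
  size_mb: 4096
```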
Optional (for Jetson Nanos): Semi-automatically setup USB 3.0 / SSD boot drive
1. Wire up your SATA SSD with one of the USB 3.0 ports, unplug all other block devices connected via USB ports and reboot via `make nano-one-reboot`
2. Set the `ssd.id_serial_short` of the SSD in `workflow/provision/group_vars` given the info provided by executing `make nano-one-ssd-id-serial-short-show` (see the sketch after these steps)
3. Execute `make nano-one-ssd-prepare` to assign the stable device name `/dev/ssd`, wipe and partition the SSD, create an ext4 filesystem and sync the micro SD card to the SSD
4. Set the `ssd.uuid` of the SSD in `workflow/provision/group_vars` given the info provided by executing `make nano-one-ssd-uuid-show`
5. Execute `make nano-one-ssd-activate` to configure the boot menu to use the SSD as the default boot device and reboot
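The `ssd.id_serial_short` and `ssd.uuid` settings referenced in steps 2 and 4 live in `workflow/provision/group_vars`; an illustrative excerpt with placeholder values:

```yaml
# Illustrative excerpt - replace the placeholders with the output of
# `make nano-one-ssd-id-serial-short-show` and `make nano-one-ssd-uuid-show`.
ssd:
  id_serial_short: "0123456789ABCDEF"
  uuid: "2c97915f-1a42-4f6e-9a8b-0d0e0f101112"
```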
Hints:
- A SATA SSD connected via USB 3.0 is more durable and typically faster by a factor of three relative to a micro SD card - tested using a WD Blue M.2 SSD versus a SanDisk Extreme microSD card
- The micro SD card is mounted as `/mnt/mmc` after step 5 in case you want to update the kernel later, which now resides in `/mnt/mmc/boot/Image`
- You can wire up additional block devices via USB again after step 5 as the UUID is used for referencing the SSD designated for boot
Optional (for all devices): Cross-build arm64 Docker images on macOS (wip)
- Execute `make l4t-deploy` to cross-build a Docker image on macOS using buildkit based on the official base image from NVIDIA and deploy - functionality is identical to `device-query` - see above
Hints:
- Have a look at the custom Skaffold builder for Mac in `workflow/deploy/l4t/builder.mac` on how this is achieved
- A `daemon.json` switching on the experimental Docker features required for this was automatically installed on your macOS device as part of bootstrap - see above
Optional (for all devices): Setup firewall on host using `ufw` for basic security
- Execute `make provision-firewall` - see the sketch below for what such a step can look like
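As a sketch of what such a step can provision using Ansible's `ufw` module - the rules shown are examples, not the repository's actual rule set:

```yaml
# Example only - allow ssh, then enable ufw with a default deny policy.
- name: Allow incoming ssh
  ufw:
    rule: allow
    port: "22"
    proto: tcp
  become: yes

- name: Enable ufw and deny all other incoming traffic
  ufw:
    state: enabled
    policy: deny
    direction: incoming
  become: yes
```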
References:
- https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit (intro)
- https://developer.nvidia.com/embedded/jetpack (jetpack)
- https://blog.hackster.io/getting-started-with-the-nvidia-jetson-nano-developer-kit-43aa7c298797 (jetpack,vnc)
- https://devtalk.nvidia.com/default/topic/1051327/jetson-nano-jetpack-4-2-firewall-broken-possible-kernel-compilation-issue-missing-iptables-modules/ (jetpack,firewall,ufw,bug)
- https://devtalk.nvidia.com/default/topic/1052748/jetson-nano/egx-nvidia-docker-runtime-on-nano/ (docker,nvidia,missing)
- https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-Container-Runtime-on-Jetson#mount-plugins (docker,nvidia)
- https://blog.hypriot.com/post/nvidia-jetson-nano-build-kernel-docker-optimized/ (docker,workaround)
- https://github.com/Technica-Corporation/Tegra-Docker (docker,workaround)
- https://medium.com/@jerry_liang/deploy-gpu-enabled-kubernetes-pod-on-nvidia-jetson-nano-ce738e3bcda9 (k8s)
- https://gist.github.com/buptliuwei/8a340cc151507cb48a071cda04e1f882 (k8s)
- https://github.com/dusty-nv/jetson-inference/ (ml)
- https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html (tensorflow)
- https://devtalk.nvidia.com/default/topic/1043951/jetson-agx-xavier/docker-gpu-acceleration-on-jetson-agx-for-ubuntu-18-04-image/post/5296647/#5296647 (docker,tensorflow)
- https://towardsdatascience.com/deploying-keras-models-using-tensorflow-serving-and-flask-508ba00f1037 (tensorflow,keras,serving)
- https://jkjung-avt.github.io/build-tensorflow-1.8.0/ (bazel,build)
- https://oracle.github.io/graphpipe/#/ (tensorflow,serving,graphpipe)
- https://towardsdatascience.com/how-to-deploy-jupyter-notebooks-as-components-of-a-kubeflow-ml-pipeline-part-2-b1df77f4e5b3 (kubeflow,jupyter)
- https://rapids.ai/start.html (rapids ai)
- https://syonyk.blogspot.com/2019/04/nvidia-jetson-nano-desktop-use-kernel-builds.html (boot,usb)
- https://kubeedge.io/en/ (kube,edge)
- https://towardsdatascience.com/bringing-the-best-out-of-jupyter-notebooks-for-data-science-f0871519ca29 (jupyter,data-science)
- https://github.com/jetsistant/docker-cuda-jetpack/blob/master/Dockerfile.base (cuda,deb,alternative-approach)
- https://docs.bazel.build/versions/master/user-manual.html (bazel,manual)
- https://docs.bitnami.com/google/how-to/enable-nvidia-gpu-tensorflow-serving/ (tf-serving,build,cuda)
- https://github.com/tensorflow/serving/blob/master/tensorflow_serving/tools/docker/Dockerfile.devel-gpu (tf-serving,docker)
- https://www.kubeflow.org/docs/components/serving/tfserving_new/ (kubeflow,tf-serving)
- https://www.seldon.io/open-source/ (seldon,ml)
- https://www.electronicdesign.com/industrial-automation/nvidia-egx-spreads-ai-cloud-edge (egx,edge-ai)
- https://min.io/ (s3,lambda)
- https://github.com/NVAITC/ai-lab (ai-lab,container)
- https://www.altoros.com/blog/kubeflow-automating-deployment-of-tensorflow-models-on-kubernetes/ (kubeflow)