404
Not Found
diff --git a/v0.15/404.html b/v0.15/404.html index 7e698bb7fc..13ba9a532b 100644 --- a/v0.15/404.html +++ b/v0.15/404.html @@ -1 +1 @@ -
Not Found
Not Found
You can reach us via the following channels:
This is a SIG-node subproject, hosted under the Kubernetes SIGs organization in GitHub. The project was established in 2016 and was migrated to Kubernetes SIGs in 2018.
This is open source software released under the Apache 2.0 License.
NFD offers two variants of the container image. Released container images are\navailable for x86_64 and Arm64 architectures.
\n\nThe default is a minimal image based on\nscratch\nand only supports running statically linked binaries.
\n\nFor backwards compatibility a container image tag with suffix -minimal
\n(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.6-minimal
) is provided.
This image is based on debian:bookworm-slim\nand contains a full Linux system for running shell-based nfd-worker hooks and\ndoing live debugging and diagnosis of the NFD images.
\n\nThe container image tag has suffix -full
\n(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.6-full
).
Welcome to Node Feature Discovery – a Kubernetes add-on for detecting hardware\nfeatures and system configuration!
\n\nContinue to:
\n\nIntroduction for more details on the\nproject.
\nQuick start for quick step-by-step\ninstructions on how to get NFD running on your cluster.
\n$ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6\n namespace/node-feature-discovery created\n serviceaccount/nfd-master created\n clusterrole.rbac.authorization.k8s.io/nfd-master created\n clusterrolebinding.rbac.authorization.k8s.io/nfd-master created\n configmap/nfd-worker-conf created\n service/nfd-master created\n deployment.apps/nfd-master created\n daemonset.apps/nfd-worker created\n\n$ kubectl -n node-feature-discovery get all\n NAME READY STATUS RESTARTS AGE\n pod/nfd-master-555458dbbc-sxg6w 1/1 Running 0 56s\n pod/nfd-worker-mjg9f 1/1 Running 0 17s\n...\n\n$ kubectl get nodes -o json | jq '.items[].metadata.labels'\n {\n \"kubernetes.io/arch\": \"amd64\",\n \"kubernetes.io/os\": \"linux\",\n \"feature.node.kubernetes.io/cpu-cpuid.ADX\": \"true\",\n \"feature.node.kubernetes.io/cpu-cpuid.AESNI\": \"true\",\n...\n\n
This software enables node feature discovery for Kubernetes. It detects\nhardware features available on each node in a Kubernetes cluster, and\nadvertises those features using node labels and optionally node extended\nresources, annotations and node taints. Node Feature Discovery is compatible\nwith any recent version of Kubernetes (v1.21+).
\n\nNFD consists of four software components:
\n\nNFD-Master is the daemon responsible for communication towards the Kubernetes\nAPI. That is, it receives labeling requests from the worker and modifies node\nobjects accordingly.
\n\nNFD-Worker is a daemon responsible for feature detection. It then communicates\nthe information to nfd-master which does the actual node labeling. One\ninstance of nfd-worker is supposed to be running on each node of the cluster.
\n\nNFD-Topology-Updater is a daemon responsible for examining allocated\nresources on a worker node to account for resources available to be allocated\nto new pods on a per-zone basis (where a zone can be a NUMA node). It then\ncreates or updates a\nNodeResourceTopology custom\nresource object specific to this node. One instance of nfd-topology-updater is\nsupposed to be running on each node of the cluster.
\n\nNFD-GC is a daemon responsible for cleaning obsolete\nNodeFeature and\nNodeResourceTopology objects.
\n\nOne instance of nfd-gc is supposed to be running in the cluster.
\n\nFeature discovery is divided into domain-specific feature sources:
\n\nEach feature source is responsible for detecting a set of features which, in\nturn, are turned into node feature labels. Feature labels are prefixed with\nfeature.node.kubernetes.io/
and also contain the name of the feature source.\nNon-standard user-specific feature labels can be created with the local and\ncustom feature sources.
An overview of the default feature labels:
\n\n{\n \"feature.node.kubernetes.io/cpu-<feature-name>\": \"true\",\n \"feature.node.kubernetes.io/custom-<feature-name>\": \"true\",\n \"feature.node.kubernetes.io/kernel-<feature name>\": \"<feature value>\",\n \"feature.node.kubernetes.io/memory-<feature-name>\": \"true\",\n \"feature.node.kubernetes.io/network-<feature-name>\": \"true\",\n \"feature.node.kubernetes.io/pci-<device label>.present\": \"true\",\n \"feature.node.kubernetes.io/storage-<feature-name>\": \"true\",\n \"feature.node.kubernetes.io/system-<feature name>\": \"<feature value>\",\n \"feature.node.kubernetes.io/usb-<device label>.present\": \"<feature value>\",\n \"feature.node.kubernetes.io/<file name>-<feature name>\": \"<feature value>\"\n}\n
NFD also annotates nodes it is running on:
\n\nAnnotation | \nDescription | \n
---|---|
[<instance>.]nfd.node.kubernetes.io/feature-labels | \nComma-separated list of node labels managed by NFD. NFD uses this internally so must not be edited by users. | \n
[<instance>.]nfd.node.kubernetes.io/feature-annotations | \nComma-separated list of node annotations managed by NFD. NFD uses this internally so must not be edited by users. | \n
[<instance>.]nfd.node.kubernetes.io/extended-resources | \nComma-separated list of node extended resources managed by NFD. NFD uses this internally so must not be edited by users. | \n
[<instance>.]nfd.node.kubernetes.io/taints | \nComma-separated list of node taints managed by NFD. NFD uses this internally so must not be edited by users. | \n
\n\n\nNOTE: the
\n-instance
\ncommand line flag affects the annotation names
Inapplicable annotations are not created; for example,\nnfd.node.kubernetes.io/extended-resources
is only placed if some extended\nresources were created by NFD.
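As a rough illustration of how the instance name changes the annotation keys, the sketch below composes a key following the `[<instance>.]nfd.node.kubernetes.io/<name>` pattern from the table above. The helper name `nfd_annotation` is hypothetical and not part of NFD.

```shell
# Hypothetical helper (not part of NFD): compose an NFD annotation key.
# $1 = instance name (may be empty), $2 = annotation base name.
nfd_annotation() {
  instance=$1
  name=$2
  # An empty instance yields the plain key; otherwise "<instance>." is prepended.
  printf '%s\n' "${instance:+$instance.}nfd.node.kubernetes.io/$name"
}

nfd_annotation "" feature-labels        # no -instance flag given
nfd_annotation network feature-labels   # nfd-master -instance=network
```

The first call prints the default key and the second prints the key prefixed with `network.`.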
NFD makes use of some Kubernetes Custom Resources.
\n\nNodeFeatures\nis used for representing node features and requesting node labels to be\ngenerated.
\n\nNFD-Master uses NodeFeatureRules\nfor custom labeling of nodes.
\n\nNFD-Topology-Updater creates\nNodeResourceTopology objects\nthat describe the hardware topology of node resources.
\n","dir":"/get-started/","name":"introduction.md","path":"get-started/introduction.md","url":"/get-started/introduction.html"},{"title":"Master cmdline reference","layout":"default","sort":1,"content":"To quickly view available command line flags execute nfd-master -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 nfd-master -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -prune
flag is a sub-command like option for cleaning up the cluster. It\ncauses nfd-master to remove all NFD related labels, annotations and extended\nresources from all Node objects of the cluster and exit.
The -port
flag specifies the TCP port that nfd-master listens for incoming requests.
Default: 8080
\n\nExample:
\n\nnfd-master -port=443\n
The -metrics
flag specifies the port on which to expose\nPrometheus metrics. Setting this to 0 disables the\nmetrics server on nfd-master.
Default: 8081
\n\nExample:
\n\nnfd-master -metrics=12345\n
The -instance
flag makes it possible to run multiple NFD deployments in\nparallel. In practice, it separates the node annotations between deployments so\nthat each of them can store metadata independently. The instance name must\nstart and end with an alphanumeric character and may only contain alphanumeric\ncharacters, -
, _
or .
.
Default: empty
\n\nExample:
\n\nnfd-master -instance=network\n
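The naming rule for instance names can be checked before deploying. The following is a minimal sketch, assuming the rule is exactly as worded above (alphanumeric first and last character, with `-`, `_` and `.` allowed in between); the function name `valid_instance_name` is hypothetical, and the regular expression is an interpretation of the prose, not code taken from nfd-master.

```shell
# Hypothetical helper (not part of NFD): check a non-empty -instance name
# against the naming rule described in the text.
valid_instance_name() {
  printf '%s' "$1" |
    grep -Eq '^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$'
}

for name in network gpu_1.test -bad bad.; do
  if valid_instance_name "$name"; then
    printf '%s: ok\n' "$name"
  else
    printf '%s: rejected\n' "$name"
  fi
done
```

The empty default is not covered by this check, since an empty value simply means that no instance prefix is used.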
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -ca-file
is one of the three flags (together with -cert-file
and\n-key-file
) controlling master-worker mutual TLS authentication on the\nnfd-master side. This flag specifies the TLS root certificate that is used for\nauthenticating incoming connections. NFD-Worker side needs to have matching key\nand cert files configured for the incoming requests to be accepted.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and -key-file
Example:
\n\nnfd-master -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/master.crt -key-file=/opt/nfd/master.key\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -cert-file
is one of the three flags (together with -ca-file
and\n-key-file
) controlling master-worker mutual TLS authentication on the\nnfd-master side. This flag specifies the TLS certificate presented for\nauthenticating outgoing traffic towards nfd-worker.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-ca-file
and -key-file
Example:
\n\nnfd-master -cert-file=/opt/nfd/master.crt -key-file=/opt/nfd/master.key -ca-file=/opt/nfd/ca.crt\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -key-file
is one of the three flags (together with -ca-file
and\n-cert-file
) controlling master-worker mutual TLS authentication on the\nnfd-master side. This flag specifies the private key corresponding to the given\ncertificate file (-cert-file
) that is used for authenticating outgoing\ntraffic.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and -ca-file
Example:
\n\nnfd-master -key-file=/opt/nfd/master.key -cert-file=/opt/nfd/master.crt -ca-file=/opt/nfd/ca.crt\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -verify-node-name
flag controls the NodeName based authorization of\nincoming requests and only has effect when mTLS authentication has been enabled\n(with -ca-file
, -cert-file
and -key-file
). If enabled, the worker node\nname of the incoming request must match the CN or a SAN in its TLS certificate. Thus,\nworkers are only able to label the node they are running on (or the node whose\ncertificate they present).
Node Name based authorization is disabled by default.
\n\nDefault: false
\n\nExample:
\n\nnfd-master -verify-node-name -ca-file=/opt/nfd/ca.crt \\\n -cert-file=/opt/nfd/master.crt -key-file=/opt/nfd/master.key\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -enable-nodefeature-api
flag enables/disables the\nNodeFeature CRD API for receiving\nfeature requests. This will also automatically disable/enable the gRPC\ninterface.
Default: true
\n\nExample:
\n\nnfd-master -enable-nodefeature-api=false\n
The -enable-leader-election
flag enables leader election for NFD-Master.\nIt is advised to turn on this flag when running more than one instance of\nNFD-Master.
This flag takes effect only when combined with -enable-nodefeature-api
flag.
Default: false
\n\nnfd-master -enable-nodefeature-api -enable-leader-election\n
The -enable-taints
flag enables/disables node tainting feature of NFD.
Default: false
\n\nExample:
\n\nnfd-master -enable-taints=true\n
The -no-publish
flag disables updates to the Node objects in the Kubernetes\nAPI server, effectively making it a “dry-run” flag for nfd-master. No Labels, Annotations or\nExtendedResources of nodes are updated.
Default: false
\n\nExample:
\n\nnfd-master -no-publish\n
The -crd-controller
flag specifies whether the NFD CRD API controller is\nenabled or not. The controller is responsible for processing\nNodeFeature and\nNodeFeatureRule objects.
Default: true
\n\nExample:
\n\nnfd-master -crd-controller=false\n
DEPRECATED: use -crd-controller
instead.
The -label-whitelist
specifies a regular expression for filtering feature\nlabels based on their name. Each label must match against the given regular\nexpression or it will not be published.
\n\n\nNOTE: The regular expression is only matched against the “basename” part\nof the label, i.e. the part of the name after ‘/’. The label namespace is\nomitted.
\n
Default: empty
\n\nExample:
\n\nnfd-master -label-whitelist='.*cpuid\\.'\n
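To preview which labels a whitelist expression would keep, the basename rule can be approximated in the shell. This is a sketch under stated assumptions: the helper `filter_labels` is hypothetical, it uses `grep -E` (POSIX extended regular expressions) rather than the regular expression engine nfd-master uses, and it assumes unanchored matching, so results may differ for more advanced patterns.

```shell
# Hypothetical helper (not part of NFD): read label names on stdin and print
# those whose basename (the part after '/') matches the given pattern.
filter_labels() {
  pattern=$1
  while IFS= read -r label; do
    base=${label#*/}   # drop the namespace, keep the basename
    if printf '%s\n' "$base" | grep -Eq -e "$pattern"; then
      printf '%s\n' "$label"
    fi
  done
}

printf '%s\n' \
  feature.node.kubernetes.io/cpu-cpuid.ADX \
  feature.node.kubernetes.io/kernel-version.major |
  filter_labels '.*cpuid\.'
```

With the pattern from the example above, only the `cpu-cpuid.ADX` label is printed.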
The -extra-label-ns
flag specifies a comma-separated list of allowed feature\nlabel namespaces. This option can be used to allow\nother vendor or application specific namespaces for custom labels from the\nlocal and custom feature sources, even though these labels were denied using\nthe -deny-label-ns
flag.
The same namespace control and this flag apply to Extended Resources (created\nwith -resource-labels
), too.
Default: empty
\n\nExample:
\n\nnfd-master -extra-label-ns=vendor-1.com,vendor-2.io\n
The -deny-label-ns
flag specifies a comma-separated list of excluded\nlabel namespaces. By default, nfd-master allows creating labels in all\nnamespaces, excluding kubernetes.io
namespace and its sub-namespaces\n(i.e. *.kubernetes.io
). However, you should note that\nkubernetes.io
and its sub-namespaces are always denied.\nFor example, nfd-master -deny-label-ns=\"\"
would still disallow\nkubernetes.io
and *.kubernetes.io
.\nThis option can be used to exclude some vendors or application specific\nnamespaces.\nNote that the namespaces feature.node.kubernetes.io
and profile.node.kubernetes.io
\nand their sub-namespaces are always allowed and cannot be denied.
Default: empty
\n\nExample:
\n\nnfd-master -deny-label-ns=*.vendor.com,vendor-2.io\n
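The wildcard form used in the example can be illustrated with a small matcher. This is only a sketch: the helper `ns_denied` is hypothetical, and it assumes that a `*.` entry matches sub-namespaces only (not the bare namespace) while other entries must match exactly. It does not model the always-denied `kubernetes.io` namespaces, the always-allowed `feature.node.kubernetes.io` and `profile.node.kubernetes.io` namespaces, or `-extra-label-ns`.

```shell
# Hypothetical helper (not part of NFD): succeed if namespace $1 matches an
# entry of the comma-separated deny list $2.
# The body runs in a subshell so the IFS and globbing changes do not leak.
ns_denied() (
  ns=$1
  IFS=,
  set -f                       # keep '*' in the list from being glob-expanded
  for entry in $2; do
    case $entry in
      '*.'*)
        suffix=${entry#?}      # "*.vendor.com" -> ".vendor.com"
        case $ns in *"$suffix") exit 0 ;; esac ;;
      *)
        [ "$ns" = "$entry" ] && exit 0 ;;
    esac
  done
  exit 1
)

deny='*.vendor.com,vendor-2.io'
for ns in gpu.vendor.com vendor.com vendor-2.io example.org; do
  if ns_denied "$ns" "$deny"; then
    printf '%s: denied\n' "$ns"
  else
    printf '%s: allowed\n' "$ns"
  fi
done
```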
DEPRECATED: NodeFeatureRule\nshould be used for managing extended resources in NFD.
\n\nThe -resource-labels
flag specifies a comma-separated list of features to be\nadvertised as extended resources instead of labels. Features that have integer\nvalues can be published as Extended Resources by listing them in this flag.
Default: empty
\n\nExample:
\n\nnfd-master -resource-labels=vendor-1.com/feature-1,vendor-2.io/feature-2\n
The -config
flag specifies the path of the nfd-master configuration file to\nuse.
Default: /etc/kubernetes/node-feature-discovery/nfd-master.conf
\n\nExample:
\n\nnfd-master -config=/opt/nfd/master.conf\n
The -options
flag may be used to specify and override configuration file\noptions directly from the command line. The required format is the same as in\nthe config file i.e. JSON or YAML. Configuration options specified via this\nflag will override those from the configuration file.
Default: empty
\n\nExample:
\n\nnfd-master -options='{\"noPublish\": true}'\n
The -nfd-api-parallelism
flag can be used to specify the maximum\nnumber of concurrent node updates.
It takes effect only when -enable-nodefeature-api
has been set.
Default: 10
\n\nExample:
\n\nnfd-master -nfd-api-parallelism=1\n
The following logging-related flags are inherited from the\nklog package.
\n\nIf true, adds the file directory to the header of the log messages.
\n\nDefault: false
\n\nLog to standard error as well as files.
\n\nDefault: false
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
\n\nLog to standard error instead of files
\n\nDefault: true
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
\n\nLogs at or above this threshold go to stderr.
\n\nDefault: 2
\n\nNumber for the log level verbosity.
\n\nDefault: 0
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n\nThe -resync-period
flag specifies the NFD API controller resync period.\nThe resync means nfd-master replaying all NodeFeature and NodeFeatureRule objects,\nthus effectively re-syncing all nodes in the cluster (i.e. ensuring labels, annotations,\nextended resources and taints are in place).\nOnly has effect when the NodeFeature\nCRD API has been enabled with -enable-nodefeature-api
.
Default: 1 hour.
\n\nExample:
\n\nnfd-master -resync-period=2h\n
Features are advertised as labels in the Kubernetes Node object.
\n\nLabel creation in nfd-worker is performed by a set of separate modules called\nlabel sources. The\ncore.labelSources
\nconfiguration option (or\n-label-sources
\nflag) of nfd-worker controls which sources to enable for label generation.
All built-in labels use the feature.node.kubernetes.io
label namespace and\nhave the following format.
feature.node.kubernetes.io/<feature> = <value>\n
\n\n\nNOTE: Consecutive runs of nfd-worker will update the labels on a given\nnode. If features are not discovered on a consecutive run, the corresponding\nlabel will be removed. This includes any restrictions placed on the\nconsecutive run, such as restricting discovered features with the\n
\n-label-whitelist
\nflag of nfd-master or\ncore.labelWhiteList
\noption of nfd-worker.
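Because features are published as ordinary node labels, they can be used in standard Kubernetes label selectors. The sketch below builds a selector string from feature names using the label format shown above; the helper `feature_selector` is hypothetical, and features given without a value are assumed to have the value `true`.

```shell
# Hypothetical helper (not part of NFD): turn feature names (optionally with
# "=<value>") into a comma-separated Kubernetes label selector.
feature_selector() {
  selector=
  for feature in "$@"; do
    case $feature in
      *=*) term="feature.node.kubernetes.io/$feature" ;;
      *)   term="feature.node.kubernetes.io/$feature=true" ;;
    esac
    selector=${selector:+$selector,}$term
  done
  printf '%s\n' "$selector"
}

feature_selector cpu-cpuid.AVX2 kernel-version.major=5
```

The resulting string could then be passed to the `-l` (label selector) option of `kubectl get nodes` to list matching nodes.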
Feature name | \nValue | \nDescription | \n
---|---|---|
cpu-cpuid.<cpuid-flag> | \n true | \nCPU capability is supported. NOTE: the capability might be supported but not enabled. | \n
cpu-hardware_multithreading | \n true | \nHardware multithreading, such as Intel HTT, enabled (number of logical CPUs is greater than physical CPUs) | \n
cpu-coprocessor.nx_gzip | \n true | \nNest Accelerator for GZIP is supported (Power). | \n
cpu-power.sst_bf.enabled | \n true | \nIntel SST-BF (Intel Speed Select Technology - Base frequency) enabled | \n
cpu-pstate.status | \n string | \nThe status of the Intel pstate driver when in use and enabled, either ‘active’ or ‘passive’. | \n
cpu-pstate.turbo | \n bool | \nSet to ‘true’ if turbo frequencies are enabled in Intel pstate driver, set to ‘false’ if they have been disabled. | \n
cpu-pstate.scaling_governor | \n string | \nThe value of the Intel pstate scaling_governor when in use, either ‘powersave’ or ‘performance’. | \n
cpu-cstate.enabled | \n bool | \nSet to ‘true’ if cstates are set in the intel_idle driver, otherwise set to ‘false’. Unset if intel_idle cpuidle driver is not active. | \n
cpu-rdt.<rdt-flag> | \n true | \nDEPRECATED Intel RDT capability is supported. See RDT flags for details. | \n
cpu-security.sgx.enabled | \n true | \nSet to ‘true’ if Intel SGX is enabled in BIOS (based on a non-zero sum value of SGX EPC section sizes). | \n
cpu-security.se.enabled | \n true | \nSet to ‘true’ if IBM Secure Execution for Linux (IBM Z & LinuxONE) is available and enabled (requires /sys/firmware/uv/prot_virt_host facility) | \n
cpu-security.tdx.enabled | \n true | \nSet to ‘true’ if Intel TDX is available on the host and has been enabled (requires /sys/module/kvm_intel/parameters/tdx ). | \n
cpu-security.tdx.protected | \n true | \nSet to ‘true’ if Intel TDX was used to start the guest node, based on the existence of the “TDX_GUEST” information as part of cpuid features. | \n
cpu-security.sev.enabled | \n true | \nSet to ‘true’ if AMD SEV is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev ). | \n
cpu-security.sev.es.enabled | \n true | \nSet to ‘true’ if AMD SEV-ES is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_es ). | \n
cpu-security.sev.snp.enabled | \n true | \nSet to ‘true’ if AMD SEV-SNP is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_snp ). | \n
cpu-model.vendor_id | \n string | \nComparable CPU vendor ID. | \n
cpu-model.family | \n int | \nCPU family. | \n
cpu-model.id | \n int | \nCPU model number. | \n
\n\n\nNOTE: the
\ncpu-rdt.<rdt-flag>
labels are deprecated and will be removed\nin a future release. They will remain available as features\nfor NodeFeatureRule to consume.\nSee customization guide\nfor details on how to use NodeFeatureRule objects to create labels.
The CPU label source is configurable, see\nworker configuration and\nsources.cpu
\nconfiguration options for details.
Flag | \nDescription | \n
---|---|
ADX | \nMulti-Precision Add-Carry Instruction Extensions (ADX) | \n
AESNI | \nAdvanced Encryption Standard (AES) New Instructions (AES-NI) | \n
APX_F | \nIntel Advanced Performance Extensions (APX) | \n
AVX10 | \nIntel Advanced Vector Extensions 10 (AVX10) | \n
AVX10_256, AVX10_512 | \nIntel AVX10 256-bit and 512-bit vector support | \n
AVX | \nAdvanced Vector Extensions (AVX) | \n
AVX2 | \nAdvanced Vector Extensions 2 (AVX2) | \n
AVXIFMA | \nAVX-IFMA instructions | \n
AVXVNNI | \nAVX (VEX encoded) VNNI neural network instructions | \n
AMXBF16 | \nAdvanced Matrix Extension, tile multiplication operations on BFLOAT16 numbers | \n
AMXINT8 | \nAdvanced Matrix Extension, tile multiplication operations on 8-bit integers | \n
AMXFP16 | \nAdvanced Matrix Extension, tile multiplication operations on FP16 numbers | \n
AMXTILE | \nAdvanced Matrix Extension, base tile architecture support | \n
AVX512BF16 | \nAVX-512 BFLOAT16 instructions | \n
AVX512BITALG | \nAVX-512 bit Algorithms | \n
AVX512BW | \nAVX-512 byte and word Instructions | \n
AVX512CD | \nAVX-512 conflict detection instructions | \n
AVX512DQ | \nAVX-512 doubleword and quadword instructions | \n
AVX512ER | \nAVX-512 exponential and reciprocal instructions | \n
AVX512F | \nAVX-512 foundation | \n
AVX512FP16 | \nAVX-512 FP16 instructions | \n
AVX512IFMA | \nAVX-512 integer fused multiply-add instructions | \n
AVX512PF | \nAVX-512 prefetch instructions | \n
AVX512VBMI | \nAVX-512 vector bit manipulation instructions | \n
AVX512VBMI2 | \nAVX-512 vector bit manipulation instructions, version 2 | \n
AVX512VL | \nAVX-512 vector length extensions | \n
AVX512VNNI | \nAVX-512 vector neural network instructions | \n
AVX512VP2INTERSECT | \nAVX-512 intersect for D/Q | \n
AVX512VPOPCNTDQ | \nAVX-512 vector population count doubleword and quadword | \n
AVXNECONVERT | \nAVX-NE-CONVERT instructions | \n
AVXVNNIINT8 | \nAVX-VNNI-INT8 instructions | \n
CMPCCXADD | \nCMPCCXADD instructions | \n
ENQCMD | \nEnqueue Command | \n
GFNI | \nGalois Field New Instructions | \n
HYPERVISOR | \nRunning under hypervisor | \n
MSRLIST | \nRead/Write List of Model Specific Registers | \n
PREFETCHI | \nPREFETCHIT0/1 instructions | \n
VAES | \nAVX-512 vector AES instructions | \n
VPCLMULQDQ | \nCarry-less multiplication quadword | \n
WRMSRNS | \nNon-Serializing Write to Model Specific Register | \n
By default, the following CPUID flags have been blacklisted: BMI1, BMI2, CLMUL,\nCMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED,\nRDTSCP, SGX, SSE, SSE2, SSE3, SSE4, SSE42, SSSE3 and TDX_GUEST. See\nsources.cpu
\nconfiguration options to change the behavior.
See the full list in github.com/klauspost/cpuid.
\n\nFlag | \nDescription | \n
---|---|
IDIVA | \nInteger divide instructions available in ARM mode | \n
IDIVT | \nInteger divide instructions available in Thumb mode | \n
THUMB | \nThumb instructions | \n
FASTMUL | \nFast multiplication | \n
VFP | \nVector floating point instruction extension (VFP) | \n
VFPv3 | \nVector floating point extension v3 | \n
VFPv4 | \nVector floating point extension v4 | \n
VFPD32 | \nVFP with 32 D-registers | \n
HALF | \nHalf-word loads and stores | \n
EDSP | \nDSP extensions | \n
NEON | \nNEON SIMD instructions | \n
LPAE | \nLarge Physical Address Extensions | \n
Flag | \nDescription | \n
---|---|
AES | \nAnnouncing the Advanced Encryption Standard | \n
EVSTRM | \nEvent Stream Frequency Features | \n
FPHP | \nHalf Precision (16-bit) Floating Point Data Processing Instructions | \n
ASIMDHP | \nHalf Precision (16-bit) ASIMD Data Processing Instructions | \n
ATOMICS | \nAtomic Instructions to the A64 | \n
ASIMRDM | \nSupport for Rounding Double Multiply Add/Subtract | \n
PMULL | \nOptional Cryptographic and CRC32 Instructions | \n
JSCVT | \nPerform Conversion to Match Javascript | \n
DCPOP | \nPersistent Memory Support | \n
Feature | \nValue | \nDescription | \n
---|---|---|
kernel-config.<option> | \n true | \nKernel config option is enabled (set ‘y’ or ‘m’). Default options are NO_HZ , NO_HZ_IDLE , NO_HZ_FULL and PREEMPT | \n
kernel-selinux.enabled | \n true | \nSELinux is enabled on the node | \n
kernel-version.full | \n string | \nFull kernel version as reported by /proc/sys/kernel/osrelease (e.g. ‘4.5.6-7-g123abcde’) | \n
kernel-version.major | \n string | \nFirst component of the kernel version (e.g. ‘4’) | \n
kernel-version.minor | \n string | \nSecond component of the kernel version (e.g. ‘5’) | \n
kernel-version.revision | \n string | \nThird component of the kernel version (e.g. ‘6’) | \n
The kernel label source is configurable, see\nworker configuration and\nsources.kernel
\nconfiguration options for details.
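The relationship between the `kernel-version.*` labels in the table above can be shown by splitting a version string by hand. This is a minimal sketch, not the parser nfd-worker uses: the helper `kernel_version_labels` is hypothetical and assumes the version starts with three dot-separated numeric components.

```shell
# Hypothetical helper (not part of NFD): print the kernel-version labels that
# correspond to a full kernel version string.
kernel_version_labels() {
  full=$1
  numeric=${full%%[!0-9.]*}   # "4.5.6-7-g123abcde" -> "4.5.6"
  major=${numeric%%.*}
  rest=${numeric#*.}
  minor=${rest%%.*}
  rest=${rest#*.}
  revision=${rest%%.*}
  prefix=feature.node.kubernetes.io/kernel-version
  printf '%s.full=%s\n' "$prefix" "$full"
  printf '%s.major=%s\n' "$prefix" "$major"
  printf '%s.minor=%s\n' "$prefix" "$minor"
  printf '%s.revision=%s\n' "$prefix" "$revision"
}

kernel_version_labels 4.5.6-7-g123abcde
```

On a Linux node, the argument could instead be read from `/proc/sys/kernel/osrelease`, the file named in the table.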
Feature | \nValue | \nDescription | \n
---|---|---|
memory-numa | \n true | \nMultiple memory nodes i.e. NUMA architecture detected | \n
memory-nv.present | \n true | \nNVDIMM device(s) are present | \n
memory-nv.dax | \n true | \nNVDIMM region(s) configured in DAX mode are present | \n
Feature | \nValue | \nDescription | \n
---|---|---|
network-sriov.capable | \n true | \nSingle Root Input/Output Virtualization (SR-IOV) enabled Network Interface Card(s) present | \n
network-sriov.configured | \n true | \nSR-IOV virtual functions have been configured | \n
Feature | \nValue | \nDescription | \n
---|---|---|
pci-<device label>.present | \n true | \nPCI device is detected | \n
pci-<device label>.sriov.capable | \n true | \nSingle Root Input/Output Virtualization (SR-IOV) enabled PCI device present | \n
<device label>
format is configurable and set to <class>_<vendor>
by\ndefault. For more details about configuration of the pci labels, see\nsources.pci
options\nand worker configuration\ninstructions.
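To make the device label format concrete, the sketch below joins device fields with underscores and wraps them into a `.present` label name. The helper `device_label` is hypothetical, and the hexadecimal class, vendor and device IDs are made-up example values, not output from a real node.

```shell
# Hypothetical helper (not part of NFD): build a "<source>-<device label>.present"
# label name from a source name and its device label fields.
# The body runs in a subshell so the IFS change does not leak.
device_label() (
  src=$1
  shift
  IFS=_                        # "$*" joins the remaining fields with '_'
  printf 'feature.node.kubernetes.io/%s-%s.present\n' "$src" "$*"
)

device_label pci 0300 10de        # <class>_<vendor>
device_label usb ff 1a6e 089a     # <class>_<vendor>_<device>
```

The first call follows the default pci field layout and the second the default usb layout described in this document.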
Feature | \nValue | \nDescription | \n
---|---|---|
usb-<device label>.present | \n true | \nUSB device is detected | \n
<device label>
format is configurable and set to\n<class>_<vendor>_<device>
by default. For more details about\nconfiguration of the usb labels, see\nsources.usb
options\nand worker configuration\ninstructions.
Feature | \nValue | \nDescription | \n
---|---|---|
storage-nonrotationaldisk | \n true | \nNon-rotational disk, like SSD, is present in the node | \n
Feature | \nValue | \nDescription | \n
---|---|---|
system-os_release.ID | \n string | \nOperating system identifier | \n
system-os_release.VERSION_ID | \n string | \nOperating system version identifier (e.g. ‘6.7’) | \n
system-os_release.VERSION_ID.major | \n string | \nFirst component of the OS version id (e.g. ‘6’) | \n
system-os_release.VERSION_ID.minor | \n string | \nSecond component of the OS version id (e.g. ‘7’) | \n
The custom label source is designed for creating\nuser defined labels. However, it has a few statically\ndefined built-in labels:
\n\nFeature | \nValue | \nDescription | \n
---|---|---|
custom-rdma.capable | \n true | \nThe node has an RDMA capable Network adapter | \n
custom-rdma.enabled | \n true | \nThe node has the needed RDMA modules loaded to run RDMA traffic | \n
NFD has many extension points for creating vendor and application specific\nlabels. See the customization guide for\ndetailed documentation.
\n\nNFD is able to create extended resources, see the\nNodeFeatureRule CRD and its\nextendedResources field for more\ndetails.
\n\nNote that NFD is not a replacement for the usage of device plugins.
\n\nAn example use-case for extended resources could be based on a custom feature\n(created e.g. with feature files) that\nexposes the node SGX EPC memory section size. This value will then be turned\ninto an extended resource of the node, allowing PODs to request that resource\nand the Kubernetes scheduler to schedule such PODs to only those nodes which\nhave a sufficient capacity of said resource left.
\n\n\n","dir":"/usage/","name":"features.md","path":"usage/features.md","url":"/usage/features.html"},{"title":"Deployment","layout":"default","sort":2,"content":"Node Feature Discovery can be deployed on any recent version of Kubernetes\n(v1.21+).
\n\nSee Image variants for description of the different NFD\ncontainer images available.
\n\nUsing Kustomize provides straightforward deployment with\nkubectl
integration and declarative customization.
Using Helm provides easy management of NFD deployments with nice\nconfiguration management and easy upgrades.
\n\nUsing Operator provides deployment and configuration management via\nCRDs.
\n","dir":"/deployment/","name":"index.md","path":"deployment/index.md","url":"/deployment/"},{"title":"Kustomize","layout":"default","sort":2,"content":"Kustomize can be used to\ndeploy NFD. Customization of the deployment is done by maintaining\ndeclarative overlays on top of the base overlays in NFD.
\n\nTo follow the deployment instructions here,\nkubectl v1.21 or\nlater is required.
\n\nThe kustomize overlays provided in the repo can be used directly:
\n\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6\n
This will create the required RBAC rules and deploy nfd-master (as a deployment) and\nnfd-worker (as a daemonset) in the node-feature-discovery
namespace.
\n\n\nNOTE: nfd-topology-updater is not deployed as part of the
\ndefault
\noverlay. Refer to the Master Worker Topologyupdater\nand Topologyupdater below.
Alternatively you can clone the repository and customize the deployment by\ncreating your own overlays. See kustomize for more information\nabout managing deployment configurations.
\n\nThe NFD repository hosts a set of overlays for different usages and deployment\nscenarios under\ndeployment/overlays:
- default: default deployment of nfd-worker as a daemonset, described above
- default-job: see Worker one-shot below
- master-worker-topologyupdater: see Master Worker Topologyupdater below
- topologyupdater: see Topology Updater below
- prometheus: see Metrics below
- prune: clean up the cluster after uninstallation, see Removing feature labels
- samples/cert-manager: an example for supplementing the default deployment with cert-manager for TLS authentication, see Automated TLS certificate management using cert-manager for details
- samples/custom-rules: an example for spicing up the default deployment with a separately managed configmap of custom labeling rules, see Custom feature source for more information about custom node labels

Feature discovery can alternatively be configured as a one-shot job.\nThe default-job overlay may be used to achieve this:
NUM_NODES=$(kubectl get no -o jsonpath='{.items[*].metadata.name}' | wc -w)\nkubectl kustomize https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default-job?ref=v0.15.6 | \\\n sed s\"/NUM_NODES/$NUM_NODES/\" | \\\n kubectl apply -f -\n
The example above launches as many jobs as there are non-master nodes. Note that\nthis approach does not guarantee running once on every node. For example,\ntainted or non-ready nodes, or some other reason in Job scheduling, may cause some\nnode(s) to run extra job instance(s) to satisfy the request.
\n\nNFD-Master, nfd-worker and nfd-topology-updater can be configured to be\ndeployed as separate pods. The master-worker-topologyupdater
overlay may be\nused to achieve this:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/master-worker-topologyupdater?ref=v0.15.6\n\n
To deploy just nfd-topology-updater (without nfd-master and nfd-worker)\nuse the topologyupdater
overlay:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.6\n\n
NFD-Topology-Updater can be configured along with the default
overlay\n(which deploys nfd-worker and nfd-master) where all the software components\nare deployed as separate pods:
\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.6\n\n
To allow the Prometheus operator\nto scrape metrics from node-feature-discovery,\nrun the following commands:
\n\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prometheus?ref=v0.15.6\n
The simplest way is to invoke kubectl delete
on the overlay that was used for\ndeployment. Beware that this will also delete the namespace that NFD is\nrunning in. For example, in case the default overlay from the repo was used:
kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6\n
Alternatively, you can delete the created objects one by one, depending on the type\nof deployment, for example:
\n\nNFD_NS=node-feature-discovery\nkubectl -n $NFD_NS delete ds nfd-worker\nkubectl -n $NFD_NS delete deploy nfd-master\nkubectl -n $NFD_NS delete svc nfd-master\nkubectl -n $NFD_NS delete sa nfd-master\nkubectl delete clusterrole nfd-master\nkubectl delete clusterrolebinding nfd-master\n
Minimal steps to deploy the latest released version of NFD in your cluster.
\n\nDeploy with kustomize – creates a new namespace, service and required RBAC\nrules and deploys nfd-master and nfd-worker daemons.
\n\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6\n
Wait until NFD master and NFD worker are running.
\n\n$ kubectl -n node-feature-discovery get ds,deploy\nNAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE\ndaemonset.apps/nfd-worker 2 2 2 2 2 <none> 10s\n\nNAME READY UP-TO-DATE AVAILABLE AGE\ndeployment.apps/nfd-master 1/1 1 1 17s\n\n
Check that NFD feature labels have been created
\n\n$ kubectl get no -o json | jq '.items[].metadata.labels'\n{\n \"kubernetes.io/arch\": \"amd64\",\n \"kubernetes.io/os\": \"linux\",\n \"feature.node.kubernetes.io/cpu-cpuid.ADX\": \"true\",\n \"feature.node.kubernetes.io/cpu-cpuid.AESNI\": \"true\",\n \"feature.node.kubernetes.io/cpu-cpuid.AVX\": \"true\",\n...\n
Create a pod targeting a distinguishing feature (select a valid feature from\nthe list printed on the previous step)
\n\n$ cat << EOF | kubectl apply -f -\napiVersion: v1\nkind: Pod\nmetadata:\n name: feature-dependent-pod\nspec:\n containers:\n - image: registry.k8s.io/pause\n name: pause\n nodeSelector:\n # Select a valid feature\n feature.node.kubernetes.io/cpu-cpuid.AESNI: 'true'\nEOF\npod/feature-dependent-pod created\n
See that the pod is running on a desired node
\n\n$ kubectl get po feature-dependent-pod -o wide\nNAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES\nfeature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 <none> <none>\n
To deploy nfd-topology-updater use the topologyupdater
kustomize\noverlay.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.6\n
Wait until nfd-topology-updater is running.
\n\n$ kubectl -n node-feature-discovery get ds\nNAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE\ndaemonset.apps/nfd-topology-updater 2 2 2 2 2 <none> 5s\n\n
Check that the NodeResourceTopology objects are created
\n\n$ kubectl get noderesourcetopologies.topology.node.k8s.io\nNAME AGE\nkind-control-plane 23s\nkind-worker 23s\n
To quickly view available command line flags execute nfd-worker -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 nfd-worker -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -config
flag specifies the path of the nfd-worker configuration file to\nuse.
Default: /etc/kubernetes/node-feature-discovery/nfd-worker.conf
\n\nExample:
\n\nnfd-worker -config=/opt/nfd/worker.conf\n
The -options
flag may be used to specify and override configuration file\noptions directly from the command line. The required format is the same as in\nthe config file i.e. JSON or YAML. Configuration options specified via this\nflag will override those from the configuration file:
Default: empty
\n\nExample:
\n\nnfd-worker -options='{\"sources\":{\"cpu\":{\"cpuid\":{\"attributeWhitelist\":[\"AVX\",\"AVX2\"]}}}}'\n
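The override behaviour described above can be pictured as a recursive merge of the `-options` value on top of the configuration file contents. The following is a minimal Python sketch under that assumption; `merge_options` and the sample `config_file` contents are hypothetical and are not taken from the nfd-worker implementation.

```python
import json


def merge_options(config: dict, overrides: dict) -> dict:
    """Return a copy of config with overrides applied on top, recursively."""
    merged = dict(config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_options(merged[key], value)
        else:
            # Scalars and lists from the command line replace the file value.
            merged[key] = value
    return merged


# Hypothetical configuration file contents (illustrative only).
config_file = {
    "core": {"sleepInterval": "60s"},
    "sources": {"cpu": {"cpuid": {"attributeWhitelist": ["AESNI"]}}},
}

# The same JSON value as in the -options example above.
options = json.loads(
    '{"sources":{"cpu":{"cpuid":{"attributeWhitelist":["AVX","AVX2"]}}}}'
)

effective = merge_options(config_file, options)
print(json.dumps(effective, indent=2))
```

In this sketch, settings that the command line does not mention (here `core.sleepInterval`) keep their configuration file values, while the whitelist is replaced as a whole.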
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed along with it.
\n
The -server
flag specifies the address of the nfd-master endpoint to\nconnect to.
Default: localhost:8080
\n\nExample:
\n\nnfd-worker -server=nfd-master.nfd.svc.cluster.local:443\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed along with it.
\n
The -ca-file
is one of the three flags (together with -cert-file
and\n-key-file
) controlling the mutual TLS authentication on the worker side.\nThis flag specifies the TLS root certificate that is used for verifying the\nauthenticity of nfd-master.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and -key-file
Example:
\n\nnfd-worker -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/worker.crt -key-file=/opt/nfd/worker.key\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed along with it.
\n
The -cert-file
is one of the three flags (together with -ca-file
and\n-key-file
) controlling mutual TLS authentication on the worker side. This\nflag specifies the TLS certificate presented for authenticating outgoing\nrequests.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-ca-file
and -key-file
Example:
\n\nnfd-worker -cert-file=/opt/nfd/worker.crt -key-file=/opt/nfd/worker.key -ca-file=/opt/nfd/ca.crt\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed along with it.
\n
The -key-file
is one of the three flags (together with -ca-file
and\n-cert-file
) controlling the mutual TLS authentication on the worker side.\nThis flag specifies the private key corresponding to the given certificate file\n(-cert-file
) that is used for authenticating outgoing requests.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and -ca-file
Example:
\n\nnfd-worker -key-file=/opt/nfd/worker.key -cert-file=/opt/nfd/worker.crt -ca-file=/opt/nfd/ca.crt\n
The -kubeconfig
flag specifies the kubeconfig to use for connecting to the\nKubernetes API server. It is only needed for manipulating\nNodeFeature objects, and thus the flag\nonly takes effect when\n-enable-nodefeature-api
is specified. An empty\nvalue (which is also the default) implies in-cluster kubeconfig.
Default: empty
\n\nExample:
\n\nnfd-worker -kubeconfig ${HOME}/.kube/config\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed along with it.
\n
The -server-name-override
flag specifies the common name (CN) to\nexpect from the nfd-master TLS certificate. This flag is mostly intended for\ndevelopment and debugging purposes.
Default: empty
\n\nExample:
\n\nnfd-worker -server-name-override=localhost\n
The -feature-sources
flag specifies a comma-separated list of enabled feature\nsources. A special value all
enables all sources. Prefixing a source name\nwith -
indicates that the source will be disabled instead - this is only\nmeaningful when used in conjunction with all
. This command line flag allows\ncompletely disabling the feature detection so that neither standard feature\nlabels are generated nor the raw feature data is available for custom rule\nprocessing. Consider using the core.featureSources
config file option,\ninstead, allowing dynamic configurability.
\n\n\nNOTE: This flag takes precedence over the
\ncore.featureSources
\nconfiguration file option.
Default: all
\n\nExample:
\n\nnfd-worker -feature-sources=all,-pci\n
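The selection rules described above (`all` enables everything, a leading `-` excludes a source) can be sketched in a few lines of Python. This is only an illustration of the documented semantics, not the nfd-worker code: `resolve_sources` is a hypothetical helper, the `available` list is an illustrative subset of source names, and silently ignoring unknown names is an assumption made for this sketch.

```python
def resolve_sources(spec: str, available: list[str]) -> list[str]:
    """Resolve a comma-separated source list such as "all,-pci"."""
    enabled: set[str] = set()
    disabled: set[str] = set()
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue
        if item == "all":
            # The special value enables every known source.
            enabled.update(available)
        elif item.startswith("-"):
            # A leading "-" marks the source as disabled.
            disabled.add(item[1:])
        else:
            enabled.add(item)
    # Keep the order of the available list; unknown names are ignored here.
    return [name for name in available if name in enabled and name not in disabled]


# Illustrative subset of source names (assumption for this sketch).
available = ["cpu", "kernel", "pci", "system", "local"]

print(resolve_sources("all,-pci", available))
print(resolve_sources("kernel,system,local", available))
```

The sketch also shows why a `-` prefix is only meaningful together with `all`: without `all`, nothing is enabled implicitly, so there is nothing for the exclusion to remove.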
The -label-sources
flag specifies a comma-separated list of enabled label\nsources. A special value all
enables all sources. Prefixing a source name\nwith -
indicates that the source will be disabled instead - this is only\nmeaningful when used in conjunction with all
. Consider using the\ncore.labelSources
config file option, instead, allowing dynamic\nconfigurability.
\n\n\nNOTE: This flag takes precedence over the
\ncore.labelSources
\nconfiguration file option.
Default: all
\n\nExample:
\n\nnfd-worker -label-sources=kernel,system,local\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed along with it.
\n
The -enable-nodefeature-api
flag enables/disables the\nNodeFeature CRD API\nfor communicating with nfd-master. When enabled, nfd-worker creates per-node\nNodeFeature objects that contain all discovered node features and the set of\nfeature labels to be created. Setting the flag to false will enable\ngRPC communication to nfd-master.
Default: true
\n\nExample:
\n\nnfd-worker -enable-nodefeature-api=false\n
The -metrics
flag specifies the port on which to expose\nPrometheus metrics. Setting this to 0 disables the\nmetrics server on nfd-worker.
Default: 8081
\n\nExample:
\n\nnfd-worker -metrics=12345\n
The -no-publish
flag disables all communication with the nfd-master and the\nKubernetes API server. It is effectively a “dry-run” flag for nfd-worker.\nNFD-Worker runs feature detection normally, but no labeling requests are sent\nto nfd-master and no NodeFeature objects are created or updated in the API\nserver.
\n\n\nNOTE: This flag takes precedence over the\n
\ncore.noPublish
\nconfiguration file option.
Default: false
\n\nExample:
\n\nnfd-worker -no-publish\n
The -oneshot
flag causes nfd-worker to exit after one pass of feature\ndetection.
Default: false
\n\nExample:
\n\nnfd-worker -oneshot -no-publish\n
The following logging-related flags are inherited from the\nklog package.
\n\n\n\n\nNOTE: The logger setup can also be specified via the
\ncore.klog
\nconfiguration file options. However, the command line flags take precedence\nover any corresponding config file options specified.
If true, adds the file directory to the header of the log messages.
\n\nDefault: false
\n\nLog to standard error as well as files.
\n\nDefault: false
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
\n\nLog to standard error instead of files
\n\nDefault: true
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
\n\nLogs at or above this threshold go to stderr.
\n\nDefault: 2
\n\nNumber for the log level verbosity.
\n\nDefault: 0
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n","dir":"/reference/","name":"worker-commandline-reference.md","path":"reference/worker-commandline-reference.md","url":"/reference/worker-commandline-reference.html"},{"title":"Using node labels","layout":"default","sort":2,"content":"Nodes with specific features can be targeted using the nodeSelector
field. The\nfollowing example shows how to target nodes with Intel TurboBoost enabled.
apiVersion: v1\nkind: Pod\nmetadata:\n labels:\n env: test\n name: golang-test\nspec:\n containers:\n - image: golang\n name: go1\n nodeSelector:\n feature.node.kubernetes.io/cpu-pstate.turbo: 'true'\n
For more details on targeting nodes, see\nnode selection.
\n","dir":"/usage/","name":"using-labels.md","path":"usage/using-labels.md","url":"/usage/using-labels.html"},{"title":"Helm","layout":"default","sort":3,"content":"Node Feature Discovery provides a Helm chart to manage its deployment.
\n\n\n\n\nNOTE: NFD is not ideal for other Helm charts to depend on as that may\nresult in multiple parallel NFD deployments in the same cluster which is not\nfully supported by the NFD Helm chart.
\n
The Helm package manager should be installed.
\n\nTo install the latest stable version:
\n\nexport NFD_NS=node-feature-discovery\nhelm repo add nfd https://kubernetes-sigs.github.io/node-feature-discovery/charts\nhelm repo update\nhelm install nfd/node-feature-discovery --namespace $NFD_NS --create-namespace --generate-name\n
To install the latest development version you need to clone the NFD Git\nrepository and install from there.
\n\ngit clone https://github.com/kubernetes-sigs/node-feature-discovery/\ncd node-feature-discovery/deployment/helm\nexport NFD_NS=node-feature-discovery\nhelm install node-feature-discovery ./node-feature-discovery/ --namespace $NFD_NS --create-namespace\n
See the configuration section below for instructions on how to\nalter the deployment parameters.
\n\nYou can override values from values.yaml
and provide a file with custom values:
export NFD_NS=node-feature-discovery\nhelm install nfd/node-feature-discovery -f <path/to/custom/values.yaml> --namespace $NFD_NS --create-namespace\n
To specify each parameter separately you can provide them to the helm install command:
\n\nexport NFD_NS=node-feature-discovery\nhelm install nfd/node-feature-discovery --set nameOverride=NFDinstance --set master.replicaCount=2 --namespace $NFD_NS --create-namespace\n
To uninstall the node-feature-discovery
deployment:
export NFD_NS=node-feature-discovery\nhelm uninstall node-feature-discovery --namespace $NFD_NS\n
The command removes all the Kubernetes components associated with the chart and\ndeletes the release.
\n\nTo tailor the deployment of Node Feature Discovery to your needs, the following\nchart parameters are available.
\n\nName | \nType | \nDefault | \nDescription | \n
---|---|---|---|
image.repository | \n string | \nregistry.k8s.io/nfd/node-feature-discovery | \n NFD image repository | \n
image.tag | \n string | \nv0.15.6 | \n NFD image tag | \n
image.pullPolicy | \n string | \nAlways | \n Image pull policy | \n
imagePullSecrets | \n list | \n[] | \nImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored. More info | \n
nameOverride | \n string | \n\n | Override the name of the chart | \n
fullnameOverride | \n string | \n\n | Override a default fully qualified app name | \n
tls.enable | \n bool | \nfalse | \nSpecifies whether to use TLS for communications between components. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
tls.certManager | \n bool | \nfalse | \nIf enabled, requires cert-manager to be installed and will automatically create the required TLS certificates. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
enableNodeFeatureApi | \n bool | \ntrue | \nEnable the NodeFeature CRD API for communicating node features. This will automatically disable the gRPC communication. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
prometheus.enable | \n bool | \nfalse | \nSpecifies whether to expose metrics using prometheus operator | \n
prometheus.labels | \n dict | \n{} | \nSpecifies labels for use with the prometheus operator to control how it is selected | \n
Metrics are configured to be exposed using the Prometheus operator APIs by\ndefault. To expose metrics using the Prometheus operator\nAPIs, you need to install the Prometheus operator in your cluster.
\n\nName | \nType | \nDefault | \ndescription | \n
---|---|---|---|
master.* | \n dict | \n\n | NFD master deployment configuration | \n
master.enable | \n bool | \ntrue | \nSpecifies whether nfd-master should be deployed | \n
master.port | \n integer | \n\n | Specifies the TCP port that nfd-master listens for incoming requests. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
master.metricsPort | \n integer | \n8081 | \nPort on which to expose metrics from components to prometheus operator | \n
master.instance | \n string | \n\n | Instance name. Used to separate annotation namespaces for multiple parallel deployments | \n
master.resyncPeriod | \n string | \n\n | NFD API controller resync period. | \n
master.extraLabelNs | \n array | \n[] | \nList of allowed extra label namespaces | \n
master.resourceLabels | \n array | \n[] | \nList of labels to be registered as extended resources | \n
master.enableTaints | \n bool | \nfalse | \nSpecifies whether to enable or disable node tainting | \n
master.crdController | \n bool | \nnull | \nSpecifies whether the NFD CRD API controller is enabled. If not set, controller will be enabled if master.instance is empty. | \n
master.featureRulesController | \n bool | \nnull | \nDEPRECATED: use master.crdController instead | \n
master.replicaCount | \n integer | \n1 | \nNumber of desired pods. This is a pointer to distinguish between explicit zero and not specified | \n
master.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
master.securityContext | \n dict | \n{} | \nContainer security settings | \n
master.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether a service account should be created | \n
master.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account | \n
master.serviceAccount.name | \n string | \n\n | The name of the service account to use. If not set and create is true, a name is generated using the fullname template | \n
master.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for nfd-master | \n
master.service.type | \n string | \nClusterIP | \nNFD master service type. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
master.service.port | \n integer | \n8080 | \nNFD master service port. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
master.resources | \n dict | \n{} | \nNFD master pod resources management | \n
master.nodeSelector | \n dict | \n{} | \nNFD master pod node selector | \n
master.tolerations | \n dict | \nScheduling to master node is disabled | \nNFD master pod tolerations | \n
master.annotations | \n dict | \n{} | \nNFD master pod annotations | \n
master.affinity | \n dict | \n\n | NFD master pod required node affinity | \n
master.deploymentAnnotations | \n dict | \n{} | \nNFD master deployment annotations | \n
master.nfdApiParallelism | \n integer | \n10 | \nSpecifies the maximum number of concurrent node updates. | \n
master.config | \n dict | \n\n | NFD master configuration | \n
Name | \nType | \nDefault | \ndescription | \n
---|---|---|---|
worker.* | \n dict | \n\n | NFD worker daemonset configuration | \n
worker.enable | \n bool | \ntrue | \nSpecifies whether nfd-worker should be deployed | \n
worker.metricsPort | \n integer | \n8081 | \nPort on which to expose metrics from components to prometheus operator | \n
worker.config | \n dict | \n\n | NFD worker configuration | \n
worker.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
worker.securityContext | \n dict | \n{} | \nContainer security settings | \n
worker.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether a service account for nfd-worker should be created | \n
worker.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account for nfd-worker | \n
worker.serviceAccount.name | \n string | \n\n | The name of the service account to use for nfd-worker. If not set and create is true, a name is generated using the fullname template (suffixed with -worker ) | \n
worker.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for nfd-worker | \n
worker.mountUsrSrc | \n bool | \nfalse | \nSpecifies whether to allow users to mount the hostpath /usr/src. Does not work on systems without /usr/src AND a read-only /usr | \n
worker.resources | \n dict | \n{} | \nNFD worker pod resources management | \n
worker.nodeSelector | \n dict | \n{} | \nNFD worker pod node selector | \n
worker.tolerations | \n dict | \n{} | \nNFD worker pod node tolerations | \n
worker.priorityClassName | \n string | \n\n | NFD worker pod priority class | \n
worker.annotations | \n dict | \n{} | \nNFD worker pod annotations | \n
worker.daemonsetAnnotations | \n dict | \n{} | \nNFD worker daemonset annotations | \n
Name | \nType | \nDefault | \ndescription | \n
---|---|---|---|
topologyUpdater.* | \n dict | \n\n | NFD Topology Updater configuration | \n
topologyUpdater.enable | \n bool | \nfalse | \nSpecifies whether the NFD Topology Updater should be created | \n
topologyUpdater.createCRDs | \n bool | \nfalse | \nSpecifies whether the NFD Topology Updater CRDs should be created | \n
topologyUpdater.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether the service account for topology updater should be created | \n
topologyUpdater.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account for topology updater | \n
topologyUpdater.serviceAccount.name | \n string | \n\n | The name of the service account for topology updater to use. If not set and create is true, a name is generated using the fullname template and -topology-updater suffix | \n
topologyUpdater.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for topology updater | \n
topologyUpdater.metricsPort | \n integer | \n8081 | \nPort on which to expose prometheus metrics | \n
topologyUpdater.kubeletConfigPath | \n string | \n”” | \nSpecifies the kubelet config host path | \n
topologyUpdater.kubeletPodResourcesSockPath | \n string | \n”” | \nSpecifies the kubelet sock path to read pod resources | \n
topologyUpdater.updateInterval | \n string | \n60s | \nTime to sleep between CR updates. Non-positive value implies no CR update. | \n
topologyUpdater.watchNamespace | \n string | \n* | \n Namespace to watch pods, * for all namespaces | \n
topologyUpdater.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
topologyUpdater.securityContext | \n dict | \n{} | \nContainer security settings | \n
topologyUpdater.resources | \n dict | \n{} | \nTopology updater pod resources management | \n
topologyUpdater.nodeSelector | \n dict | \n{} | \nTopology updater pod node selector | \n
topologyUpdater.tolerations | \n dict | \n{} | \nTopology updater pod node tolerations | \n
topologyUpdater.annotations | \n dict | \n{} | \nTopology updater pod annotations | \n
topologyUpdater.daemonsetAnnotations | \n dict | \n{} | \nTopology updater daemonset annotations | \n
topologyUpdater.affinity | \n dict | \n{} | \nTopology updater pod affinity | \n
topologyUpdater.config | \n dict | \n\n | Topology updater configuration | \n
topologyUpdater.podSetFingerprint | \n bool | \nfalse | \nEnables compute and report of pod fingerprint in NRT objects. | \n
topologyUpdater.kubeletStateDir | \n string | \n/var/lib/kubelet | \nSpecifies kubelet state directory path for watching state and checkpoint files. Empty value disables kubelet state tracking. | \n
Name | \nType | \nDefault | \ndescription | \n
---|---|---|---|
gc.* | \n dict | \n\n | NFD Garbage Collector configuration | \n
gc.enable | \n bool | \ntrue | \nSpecifies whether the NFD Garbage Collector should be created | \n
gc.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether the service account for garbage collector should be created | \n
gc.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account for garbage collector | \n
gc.serviceAccount.name | \n string | \n\n | The name of the service account for garbage collector to use. If not set and create is true, a name is generated using the fullname template and -gc suffix | \n
gc.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for garbage collector | \n
gc.interval | \n string | \n1h | \nTime between periodic garbage collector runs | \n
gc.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
gc.resources | \n dict | \n{} | \nGarbage collector pod resources management | \n
gc.metricsPort | \n integer | \n8081 | \nPort on which to serve Prometheus metrics | \n
gc.nodeSelector | \n dict | \n{} | \nGarbage collector pod node selector | \n
gc.tolerations | \n dict | \n{} | \nGarbage collector pod node tolerations | \n
gc.annotations | \n dict | \n{} | \nGarbage collector pod annotations | \n
gc.deploymentAnnotations | \n dict | \n{} | \nGarbage collector deployment annotations | \n
gc.affinity | \n dict | \n{} | \nGarbage collector pod affinity | \n
See the\nsample configuration file\nfor a full example configuration.
\n\nnoPublish
option disables updates to the Node objects in the Kubernetes\nAPI server, making it effectively a “dry-run” flag for nfd-master. No Labels, Annotations, Taints\nor ExtendedResources of nodes are updated.
Default: false
Example:
\n\nnoPublish: true\n
extraLabelNs
specifies a list of allowed feature\nlabel namespaces. This option can be used to allow\nother vendor or application specific namespaces for custom labels from the\nlocal and custom feature sources, even though these labels were denied using\nthe denyLabelNs
parameter.
The same namespace control and this option apply to Extended Resources (created\nwith resourceLabels
), too.
Default: empty
\n\nExample:
\n\nextraLabelNs: [\"added.ns.io\",\"added.kubernetes.io\"]\n
denyLabelNs
specifies a list of excluded\nlabel namespaces. By default, nfd-master allows creating labels in all\nnamespaces, excluding kubernetes.io
namespace and its sub-namespaces\n(i.e. *.kubernetes.io
). However, you should note that\nkubernetes.io
and its sub-namespaces are always denied.\nThis option can be used to exclude some vendors or application specific\nnamespaces.
Default: empty
\n\nExample:
\n\ndenyLabelNs: [\"denied.ns.io\",\"denied.kubernetes.io\"]\n
The autoDefaultNs
option controls the automatic prefixing of names. When set\nto true (the default in NFD version v0.15) nfd-master\nautomatically adds the default feature.node.kubernetes.io/
prefix to\nunprefixed labels, annotations and extended resources - this is also the\ndefault behavior in NFD v0.15 and earlier. When the option is set to false
,\nno prefix will be prepended to unprefixed names, effectively causing them to be\nfiltered out (as NFD does not allow unprefixed names of labels, annotations or\nextended resources). The default will be changed to false
in a future\nrelease.
For example, with the autoDefaultNs
set to true
, a NodeFeatureRule with
labels:\n foo: bar\n
Will turn into feature.node.kubernetes.io/foo=bar
node label. With\nautoDefaultNs
set to false
, no prefix is added and the label will be\nfiltered out.
Note that taint keys are not affected by this option.
\n\nDefault: true
Example:
\n\nautoDefaultNs: false\n
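The prefixing behaviour described above can be summarized with a small Python sketch. It is a simplified model of the documented behaviour, not the nfd-master implementation; `apply_auto_default_ns` is a hypothetical helper, and any further namespace filtering of already prefixed names is deliberately left out.

```python
from typing import Optional

DEFAULT_PREFIX = "feature.node.kubernetes.io/"


def apply_auto_default_ns(name: str, auto_default_ns: bool) -> Optional[str]:
    """Return the effective name, or None if the name would be filtered out."""
    if "/" in name:
        # Already prefixed: left untouched by this option.
        return name
    if auto_default_ns:
        # Unprefixed name: the default prefix is added automatically.
        return DEFAULT_PREFIX + name
    # Unprefixed names are not allowed, so the name is dropped.
    return None


print(apply_auto_default_ns("foo", True))
print(apply_auto_default_ns("foo", False))
print(apply_auto_default_ns("vendor.io/foo", False))
```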
DEPRECATED: NodeFeatureRule\nshould be used for managing extended resources in NFD.
\n\nThe resourceLabels
option specifies a list of features to be\nadvertised as extended resources instead of labels. Features that have integer\nvalues can be published as Extended Resources by listing them in this option.
Default: empty
\n\nExample:
\n\nresourceLabels: [\"vendor-1.com/feature-1\",\"vendor-2.io/feature-2\"]\n
enableTaints
enables/disables the node tainting feature of NFD.
Default: false
\n\nExample:
\n\nenableTaints: true\n
labelWhiteList
specifies a regular expression for filtering feature\nlabels based on their name. Each label must match against the given regular\nexpression or it will not be published.
\n\n\nNOTE: The regular expression is only matched against the “basename” part\nof the label, i.e. the part of the name after ‘/’. The label namespace is\nomitted.
\n
Default: empty
\n\nExample:
\n\nlabelWhiteList: \"foo\"\n
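To illustrate the basename-only matching described above, here is a minimal Python sketch. It assumes unanchored matching (the pattern may match anywhere in the basename), which is an assumption of this example rather than something stated in the text; `filter_labels` and the sample labels are hypothetical.

```python
import re


def filter_labels(labels: dict[str, str], pattern: str) -> dict[str, str]:
    """Keep only the labels whose basename matches the given pattern."""
    regex = re.compile(pattern)
    kept = {}
    for name, value in labels.items():
        # The basename is the part after "/"; the namespace is ignored.
        basename = name.rsplit("/", 1)[-1]
        if regex.search(basename):
            kept[name] = value
    return kept


# Hypothetical labels, for illustration only.
labels = {
    "feature.node.kubernetes.io/foo-enabled": "true",
    "feature.node.kubernetes.io/bar": "true",
    "foo.example.com/bar": "true",
}

print(filter_labels(labels, "foo"))
```

With the pattern `foo`, only the first label is kept: the third label contains `foo` in its namespace, but its basename is `bar`.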
The resyncPeriod
option specifies the NFD API controller resync period.\nResync means that nfd-master replays all NodeFeature and NodeFeatureRule objects,\nthus effectively re-syncing all nodes in the cluster (i.e. ensuring labels, annotations,\nextended resources and taints are in place).\nThis option only has effect when the NodeFeature\nCRD API has been enabled with -enable-nodefeature-api
.
Default: 1 hour.
\n\nExample:
\n\nresyncPeriod: 2h\n
The leaderElection
section exposes configuration to tweak leader election.
leaderElection.leaseDuration
is the duration that non-leader candidates will\nwait to force acquire leadership. This is measured against time of\nlast observed ack.
A client needs to wait a full LeaseDuration without observing a change to\nthe record before it can attempt to take over. When all clients are\nshut down and a new set of clients is started with different names against\nthe same leader record, they must wait the full LeaseDuration before\nattempting to acquire the lease. Thus LeaseDuration should be as short as\npossible (within your tolerance for clock skew rate) to avoid possible\nlong waits in this scenario.
\n\nDefault: 15 seconds.
\n\nExample:
\n\nleaderElection:\n leaseDuration: 15s\n
leaderElection.renewDeadline
is the duration that the acting master will retry\nrefreshing leadership before giving up.
This value has to be lower than leaseDuration and greater than retryPeriod*1.2.
\n\nDefault: 10 seconds.
\n\nExample:
\n\nleaderElection:\n renewDeadline: 10s\n
leaderElection.retryPeriod
is the duration the LeaderElector clients should wait\nbetween tries of actions.
It has to be greater than 0.
\n\nDefault: 2 seconds.
\n\nExample:
\n\nleaderElection:\n retryPeriod: 2s\n
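The constraints stated above for the three leader election durations can be checked with a few lines of Python before a configuration is rolled out. This is a sketch only: `check_leader_election` is a hypothetical helper, and durations are given as plain seconds rather than parsed from strings such as `15s`.

```python
def check_leader_election(
    lease_duration: float, renew_deadline: float, retry_period: float
) -> list[str]:
    """Return a list of violated constraints (empty if the values are consistent)."""
    problems = []
    if retry_period <= 0:
        problems.append("retryPeriod must be greater than 0")
    if renew_deadline >= lease_duration:
        problems.append("renewDeadline must be lower than leaseDuration")
    if renew_deadline <= retry_period * 1.2:
        problems.append("renewDeadline must be greater than retryPeriod*1.2")
    return problems


# The documented defaults: 15s lease, 10s renew deadline, 2s retry period.
print(check_leader_election(15, 10, 2))

# A renew deadline equal to the lease duration violates the first rule.
print(check_leader_election(10, 10, 2))
```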
The nfdApiParallelism
option can be used to specify the maximum\nnumber of concurrent node updates.
It takes effect only when -enable-nodefeature-api
has been set.
Default: 10
\n\nExample:
\n\nnfdApiParallelism: 1\n
The following options specify the logger configuration, most of which can be\ndynamically adjusted at run-time.
\n\n\n\n\nNOTE: The logger options can also be specified via command line flags\nwhich take precedence over any corresponding config file options.
\n
If true, adds the file directory to the header of the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nLog to standard error as well as files.
\n\nDefault: false
Run-time configurable: yes
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nRun-time configurable: yes
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
Run-time configurable: no
\n\nLog to standard error instead of files
\n\nDefault: true
Run-time configurable: yes
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
Run-time configurable: no
\n\nLogs at or above this threshold go to stderr (default 2)
\n\nRun-time configurable: yes
\n\nNumber for the log level verbosity.
\n\nDefault: 0
Run-time configurable: yes
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n\nRun-time configurable: yes
\n","dir":"/reference/","name":"master-configuration-reference.md","path":"reference/master-configuration-reference.md","url":"/reference/master-configuration-reference.html"},{"title":"Usage","layout":"default","sort":3,"content":"Usage instructions.
\n","dir":"/usage/","name":"index.md","path":"usage/index.md","url":"/usage/"},{"title":"NFD-Master","layout":"default","sort":3,"content":"NFD-Master is responsible for connecting to the Kubernetes API server and\nupdating node objects. More specifically, it modifies node labels, taints and\nextended resources based on requests from nfd-workers and 3rd party extensions.
\n\nThe NodeFeature Controller uses NodeFeature objects as\nthe input for the NodeFeatureRule\nprocessing pipeline. In addition, any labels listed in the NodeFeature object\nare created on the node (note the allowed\nlabel namespaces are controlled).
\n\nNFD-Master acts as the controller for\nNodeFeatureRule objects.\nIt applies the rules specified in NodeFeatureRule objects on raw feature data\nand creates node labels accordingly. The feature data used as the input is\nreceived from nfd-worker instances through\nNodeFeature objects.
\n\n\n\n\nNOTE: when gRPC (DEPRECATED) is used for communicating\nthe features (by setting the flag
\n-enable-nodefeature-api=false
on both\nnfd-master and nfd-worker, or via Helm values.enableNodeFeatureApi=false),\n(re-)labelling only happens when a request is received from nfd-worker.\nThat is, in practice rules are evaluated and labels for each node are created\non intervals specified by the\ncore.sleepInterval
\nconfiguration option of nfd-worker instances. This means that modification or\ncreation of NodeFeatureRule objects does not instantly cause the node\nlabels to be updated. Instead, the changes only become visible in node labels\nas nfd-worker instances send their labelling requests. This limitation is not\npresent when the gRPC interface is disabled\nand the NodeFeature API is used.
NFD-Master supports dynamic configuration through a configuration file. The\ndefault location is /etc/kubernetes/node-feature-discovery/nfd-master.conf
,\nbut this can be changed by specifying the -config
command line flag.\nThe configuration file is re-read whenever it is modified, which makes run-time\nre-configuration of nfd-master straightforward.
The master configuration file is read inside the container, so Volumes and\nVolumeMounts are needed to make your configuration available for NFD. The\npreferred method is to use a ConfigMap, which provides easy deployment and\nre-configurability.
\n\nThe provided nfd-master deployment templates create an empty configmap and\nmount it inside the nfd-master containers. In kustomize deployments,\nconfiguration can be edited with:
\n\nkubectl -n ${NFD_NS} edit configmap nfd-master-conf\n
In Helm deployments,\nMaster pod parameter\nmaster.config
can be used to edit the respective configuration.
See\nnfd-master configuration file reference\nfor more details.\nThe (empty-by-default)\nexample config\ncontains all available configuration options and can be used as a reference\nfor creating a configuration.
\n\nNFD-Master runs as a deployment. By default,\nit prefers running on the cluster’s master nodes but will run on worker\nnodes if no master nodes are found.
\n\nFor High Availability, you should increase the replica count of\nthe deployment object. You should also look into adding\ninter-pod\naffinity to prevent masters from running on the same node.\nHowever, note that inter-pod affinity is costly and is not recommended\nin bigger clusters.
\n\n\n\n\nNote: When NFD-Master is intended to run with more than one replica,\nit is advised to use
\n-enable-leader-election
flag. This flag turns on\nleader election for NFD-Master and lets only one replica act on changes\nin NodeFeature and NodeFeatureRule objects.
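
As a rough illustration of the High Availability advice, a kustomize-style patch could raise the replica count and pass the leader election flag. This is only a sketch: the Deployment name and namespace follow the default deployment, the container name `nfd-master` is an assumption, and a patch like this replaces the container's whole `args` list, so any arguments already set by your templates would need to be repeated.

```yaml
# Hypothetical strategic-merge patch for the nfd-master Deployment.
# The container name is assumed; verify it against your deployment templates.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfd-master
  namespace: node-feature-discovery
spec:
  replicas: 2                        # more than one replica for HA
  template:
    spec:
      containers:
      - name: nfd-master
        args:
        - '-enable-leader-election'  # only the elected replica acts on changes
```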
If you have RBAC authorization enabled (as is the default e.g. with clusters\ninitialized with kubeadm) you need to configure the appropriate ClusterRoles,\nClusterRoleBindings and a ServiceAccount for NFD to create node\nlabels. The provided template will configure these for you.
\n","dir":"/usage/","name":"nfd-master.md","path":"usage/nfd-master.md","url":"/usage/nfd-master.html"},{"title":"NFD Operator","layout":"default","sort":4,"content":"The Node Feature Discovery Operator automates installation,\nconfiguration and updates of NFD using a specific NodeFeatureDiscovery custom\nresource. This also provides good support for managing NFD as a dependency of\nother operators.
\n\nWhen deploying with the\nNode Feature Discovery Operator,\ninstalling it via\noperatorhub.io is recommended.
\n\nInstall the operator:
\n\nkubectl create -f https://operatorhub.io/install/nfd-operator.yaml\n
Create NodeFeatureDiscovery
object (in nfd
namespace here):
cat << EOF | kubectl apply -f -\napiVersion: v1\nkind: Namespace\nmetadata:\n name: nfd\n---\napiVersion: nfd.kubernetes.io/v1\nkind: NodeFeatureDiscovery\nmetadata:\n name: my-nfd-deployment\n namespace: nfd\nspec:\n operand:\n image: registry.k8s.io/nfd/node-feature-discovery:v0.15.6\n imagePullPolicy: IfNotPresent\nEOF\n
If you followed the deployment instructions above you can uninstall NFD with:
\n\nkubectl -n nfd delete NodeFeatureDiscovery my-nfd-deployment\n
Optionally, you can also remove the namespace:
\n\nkubectl delete ns nfd\n
See the node-feature-discovery-operator and OLM project\ndocumentation for instructions for uninstalling the operator and operator\nlifecycle manager, respectively.
\n\n\n","dir":"/deployment/","name":"operator.md","path":"deployment/operator.md","url":"/deployment/operator.html"},{"title":"Reference","layout":"default","sort":4,"content":"Command line and configuration reference.
\n","dir":"/reference/","name":"index.md","path":"reference/index.md","url":"/reference/"},{"title":"Worker config reference","layout":"default","sort":4,"content":"See the\nsample configuration file\nfor a full example configuration.
\n\nThe core
section contains common configuration settings that are not specific\nto any particular feature source.
core.sleepInterval
specifies the interval between consecutive passes of\nfeature (re-)detection, and thus also the interval between node re-labeling. A\nnon-positive value implies infinite sleep interval, i.e. no re-detection or\nre-labeling is done.
Default: 60s
Example:
\n\ncore:\n sleepInterval: 60s\n
core.featureSources
specifies the list of enabled feature sources. A special\nvalue all
enables all sources. Prefixing a source name with -
indicates\nthat the source will be disabled instead - this is only meaningful when used in\nconjunction with all
. This option allows completely disabling the feature\ndetection so that neither standard feature labels are generated nor the raw\nfeature data is available for custom rule processing.
Default: [all]
Example:
\n\ncore:\n # Enable all but cpu and local sources\n featureSources:\n - \"all\"\n - \"-cpu\"\n - \"-local\"\n
core:\n # Enable only cpu and local sources\n featureSources:\n - \"cpu\"\n - \"local\"\n
core.labelSources
specifies the list of enabled label sources. A special\nvalue all
enables all sources. Prefixing a source name with -
indicates\nthat the source will be disabled instead - this is only meaningful when used in\nconjunction with all
. This configuration option affects the generation of\nnode labels but not the actual discovery of the underlying feature data that is\nused e.g. in custom/NodeFeatureRule
rules.
\n\n\nNOTE: Overridden by the
\n-label-sources
command line flag and the\ncore.sources
configuration option (if either of them is specified).
Default: [all]
Example:
\n\ncore:\n # Enable all but cpu and system sources\n labelSources:\n - \"all\"\n - \"-cpu\"\n - \"-system\"\n
core:\n # Enable only cpu and system sources\n labelSources:\n - \"cpu\"\n - \"system\"\n
DEPRECATED: use core.labelSources
instead.
\n\n\nNOTE:
\ncore.sources
takes precedence over the core.labelSources
\nconfiguration file option.
core.labelWhiteList
specifies a regular expression for filtering feature\nlabels based on the label name. Non-matching labels are not published.
\n\n\nNOTE: The regular expression is only matched against the “basename” part\nof the label, i.e. the part of the name after ‘/’. The label prefix (or\nnamespace) is omitted.
\n
Default: null
Example:
\n\ncore:\n labelWhiteList: '^cpu-cpuid'\n
Setting core.noPublish
to true
disables all communication with the\nnfd-master and the Kubernetes API server. It is effectively a “dry-run” option.\nNFD-Worker runs feature detection normally, but no labeling requests are sent\nto nfd-master and no NodeFeature\nobjects are created or updated in the API server.
\n\n\nNOTE: Overridden by the\n
\n-no-publish
\ncommand line flag (if specified).
Default: false
Example:
\n\ncore:\n noPublish: true\n
The following options specify the logger configuration. Most of them can be\ndynamically adjusted at run-time.
\n\n\n\n\nNOTE: The logger options can also be specified via command line flags\nwhich take precedence over any corresponding config file options.
\n
If true, adds the file directory to the header of the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nLog to standard error as well as files.
\n\nDefault: false
Run-time configurable: yes
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nRun-time configurable: yes
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
Run-time configurable: no
\n\nLog to standard error instead of files
\n\nDefault: true
Run-time configurable: yes
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
Run-time configurable: no
\n\nLogs at or above this threshold go to stderr (default 2)
\n\nRun-time configurable: yes
\n\nNumber for the log level verbosity.
\n\nDefault: 0
Run-time configurable: yes
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n\nRun-time configurable: yes
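
For orientation, a minimal sketch of how the logger options might appear in the worker configuration file. The `core.klog` nesting and the option names `v` and `skipHeaders` are assumptions based on the upstream sample configuration and are not spelled out in the text above; check the sample file before relying on them.

```yaml
core:
  klog:
    # Assumed option names; both are described above as run-time configurable.
    v: 2               # log level verbosity (default 0)
    skipHeaders: true  # avoid header prefixes in the log messages
```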
\n\nThe sources
section contains feature source specific configuration parameters.
Prevent publishing cpuid features listed in this option.
\n\n\n\n\nNOTE: overridden by
\nsources.cpu.cpuid.attributeWhitelist
(if specified)
Default: [BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT,\nNX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SGXLC, SSE, SSE2, SSE3, SSE4.1,\nSSE4.2, SSSE3, TDX_GUEST]
Example:
\n\nsources:\n cpu:\n cpuid:\n attributeBlacklist: [MMX, MMXEXT]\n
Only publish the cpuid features listed in this option.
\n\n\n\n\nNOTE: takes precedence over
\nsources.cpu.cpuid.attributeBlacklist
Default: empty
\n\nExample:
\n\nsources:\n cpu:\n cpuid:\n attributeWhitelist: [AVX512BW, AVX512CD, AVX512DQ, AVX512F, AVX512VL]\n
Path of the kernel config file. If empty, NFD runs a search in the well-known\nstandard locations.
\n\nDefault: empty
\n\nExample:
\n\nsources:\n kernel:\n kconfigFile: \"/path/to/kconfig\"\n
Kernel configuration options to publish as feature labels.
\n\nDefault: [NO_HZ, NO_HZ_IDLE, NO_HZ_FULL, PREEMPT]
Example:
\n\nsources:\n kernel:\n configOpts: [NO_HZ, X86, DMI]\n
Configuration option to disable/enable hooks execution. Enabled by default.\nHooks are DEPRECATED since v0.12.0 release and support will be removed in a\nfuture release. Use\nfeature files instead.
\n\n\n\n\nNOTE: The default NFD container image only supports statically linked\nbinaries. Use the full image variant\nfor a slightly more extensive environment that additionally supports bash and\nperl runtimes.
\n
Related tracking issues:
\n\nDefault: false
\n\nExample:
\n\nsources:\n local:\n hooksEnabled: true\n
List of PCI device class IDs for which to\npublish a label. Can be specified as a main class only (e.g. 03
) or full\nclass-subclass combination (e.g. 0300
) - the former implies that all\nsubclasses are accepted. The format of the labels can be further configured\nwith deviceLabelFields.
Default: [\"03\", \"0b40\", \"12\"]
Example:
\n\nsources:\n pci:\n deviceClassWhitelist: [\"0200\", \"03\"]\n
The set of PCI ID fields to use when constructing the name of the feature\nlabel. Valid fields are class
, vendor
, device
, subsystem_vendor
and\nsubsystem_device
.
Default: [class, vendor]
Example:
\n\nsources:\n pci:\n deviceLabelFields: [class, vendor, device]\n
With the example config above NFD would publish labels like:\nfeature.node.kubernetes.io/pci-<class-id>_<vendor-id>_<device-id>.present=true
List of USB device class IDs for\nwhich to publish a feature label. The format of the labels can be further\nconfigured with deviceLabelFields.
\n\nDefault: [\"0e\", \"ef\", \"fe\", \"ff\"]
Example:
\n\nsources:\n usb:\n deviceClassWhitelist: [\"ef\", \"ff\"]\n
The set of USB ID fields from which to compose the name of the feature label.\nValid fields are class
, vendor
, device
and serial
.
Default: [class, vendor, device]
Example:
\n\nsources:\n usb:\n deviceLabelFields: [class, vendor]\n
With the example config above NFD would publish labels like:\nfeature.node.kubernetes.io/usb-<class-id>_<vendor-id>.present=true
List of rules to process in the custom feature source to create user-specific\nlabels. Refer to the documentation of the\ncustom feature source for\ndetails of the available rules and their configuration.
\n\nDefault: empty
\n\nExample:
\n\nsources:\n custom:\n - name: \"my custom rule\"\n labels:\n my-custom-feature: \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n e1000e: {op: Exists}\n - feature: pci.device\n matchExpressions:\n class: {op: In, value: [\"0200\"]}\n vendor: {op: In, value: [\"8086\"]}\n
NFD-Worker is preferably run as a Kubernetes DaemonSet. This assures\nre-labeling on regular intervals capturing changes in the system configuration\nand makes sure that new nodes are labeled as they are added to the cluster.\nWorker connects to the nfd-master service to advertise hardware features.
\n\nWhen run as a daemonset, nodes are re-labeled at a default interval of 60s.\nThis can be changed by using the\ncore.sleepInterval
\nconfig option.
The worker configuration file is watched and re-read on every change which\nprovides a mechanism of dynamic run-time reconfiguration. See\nworker configuration for more details.
\n\nNFD-Worker supports dynamic configuration through a configuration file. The\ndefault location is /etc/kubernetes/node-feature-discovery/nfd-worker.conf
,\nbut this can be changed by specifying the -config
command line flag.\nThe configuration file is re-read whenever it is modified, which makes run-time\nre-configuration of nfd-worker straightforward.
The worker configuration file is read inside the container, so Volumes and\nVolumeMounts are needed to make your configuration available for NFD. The\npreferred method is to use a ConfigMap, which provides easy deployment and\nre-configurability.
\n\nThe provided nfd-worker deployment templates create an empty configmap and\nmount it inside the nfd-worker containers. In kustomize deployments,\nconfiguration can be edited with:
\n\nkubectl -n ${NFD_NS} edit configmap nfd-worker-conf\n
In Helm deployments,\nWorker pod parameter\nworker.config
can be used to edit the respective configuration.
See\nnfd-worker configuration file reference\nfor more details.\nThe (empty-by-default)\nexample config\ncontains all available configuration options and can be used as a reference\nfor creating a configuration.
\n\nConfiguration options can also be specified via the -options
command line\nflag, in which case no mounts need to be used. The same format as in the config\nfile must be used, i.e. JSON (or YAML). For example:
-options='{\"sources\": { \"pci\": { \"deviceClassWhitelist\": [\"12\"] } } }'\n
Configuration options specified from the command line will override those read\nfrom the config file.
\n","dir":"/usage/","name":"nfd-worker.md","path":"usage/nfd-worker.md","url":"/usage/nfd-worker.html"},{"title":"TLS authentication","layout":"default","sort":5,"content":"\n\n\nDEPRECATED: this section only applies when the gRPC API is used, i.e.\nwhen the NodeFeature API is disabled via the
\n-enable-nodefeature-api=false
\nflag on both nfd-master and nfd-worker. The gRPC API is deprecated and will\nbe removed in a future release.
NFD supports mutual TLS authentication between the nfd-master and nfd-worker\ninstances. That is, nfd-worker and nfd-master both verify that the other end\npresents a valid certificate.
\n\nTLS authentication is enabled by specifying -ca-file
, -key-file
and\n-cert-file
args, on both the nfd-master and nfd-worker instances. The\ntemplate specs provided with NFD contain (commented out) example configuration\nfor enabling TLS authentication.
The Common Name (CN) of the nfd-master certificate must match the DNS name of\nthe nfd-master Service of the cluster. By default, nfd-master only checks that\nthe nfd-worker certificate has been signed by the specified root certificate (-ca-file).
\n\nAdditional hardening can be enabled by specifying -verify-node-name
in\nnfd-master args, in which case nfd-master verifies that the NodeName presented\nby nfd-worker matches the Common Name (CN) or a Subject Alternative Name (SAN)\nof its certificate. Note that -verify-node-name
complicates certificate\nmanagement and is not yet supported in the helm or kustomize deployment\nmethods.
cert-manager can be used to automate certificate\nmanagement between nfd-master and the nfd-worker pods.
\n\nThe NFD source code repository contains an example kustomize overlay and helm\nchart that can be used to deploy NFD with cert-manager supplied certificates\nenabled.
\n\nTo install cert-manager
itself, you can run:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml\n
Alternatively, you can refer to cert-manager documentation for other\ninstallation methods such as the Helm chart they provide.
\n\nTo use the kustomize overlay to install node-feature-discovery with TLS enabled,\nyou may use the following:
\n\nkubectl apply -k deployment/overlays/samples/cert-manager\n
To make use of the helm chart, override values.yaml
to enable both the\ntls.enabled
and tls.certManager
options. Note that if you do not enable\ntls.certManager
, helm will successfully install the application, but\ndeployment will wait until certificates are manually created, as demonstrated\nbelow.
See the sample installation commands in the Helm Deployment\nand Configuration sections above for how to either override\nindividual values, or provide a yaml file with which to override default\nvalues.
\n\nIf you do not wish to make use of cert-manager, the certificates can be\nmanually created and stored as secrets within the NFD namespace.
\n\nCreate a CA certificate
\n\nopenssl req -x509 -newkey rsa:4096 -keyout ca.key -nodes \\\n -subj \"/CN=nfd-ca\" -days 10000 -out ca.crt\n
Create a common openssl config file.
\n\ncat <<EOF > nfd-common.conf\n[ req ]\ndefault_bits = 4096\nprompt = no\ndefault_md = sha256\nreq_extensions = req_ext\ndistinguished_name = dn\n\n[ dn ]\nC = XX\nST = some-state\nL = some-city\nO = some-company\nOU = node-feature-discovery\n\n[ req_ext ]\nsubjectAltName = @alt_names\n\n[ v3_ext ]\nauthorityKeyIdentifier=keyid,issuer:always\nbasicConstraints=CA:FALSE\nkeyUsage=keyEncipherment,dataEncipherment\nextendedKeyUsage=serverAuth,clientAuth\nsubjectAltName=@alt_names\nEOF\n
Now, create the nfd-master certificate.
\n\ncat <<EOF > nfd-master.conf\n.include nfd-common.conf\n\n[ dn ]\nCN = nfd-master\n\n[ alt_names ]\nDNS.1 = nfd-master\nDNS.2 = nfd-master.node-feature-discovery.svc.cluster.local\nDNS.3 = localhost\nEOF\n\nopenssl req -new -newkey rsa:4096 -keyout nfd-master.key -nodes -out nfd-master.csr -config nfd-master.conf\n
Create certificates for nfd-worker and nfd-topology-updater
\n\ncat <<EOF > nfd-worker.conf\n.include nfd-common.conf\n\n[ dn ]\nCN = nfd-worker\n\n[ alt_names ]\nDNS.1 = nfd-worker\nDNS.2 = nfd-worker.node-feature-discovery.svc.cluster.local\nEOF\n\n# Config for topology updater is identical except for the DN and alt_names\nsed -e 's/worker/topology-updater/g' < nfd-worker.conf > nfd-topology-updater.conf\n\nopenssl req -new -newkey rsa:4096 -keyout nfd-worker.key -nodes -out nfd-worker.csr -config nfd-worker.conf\nopenssl req -new -newkey rsa:4096 -keyout nfd-topology-updater.key -nodes -out nfd-topology-updater.csr -config nfd-topology-updater.conf\n
Now, sign the certificates with the CA created earlier.
\n\nfor cert in nfd-master nfd-worker nfd-topology-updater; do\n echo signing $cert\n openssl x509 -req -in $cert.csr -CA ca.crt -CAkey ca.key \\\n -CAcreateserial -out $cert.crt -days 10000 \\\n -extensions v3_ext -extfile $cert.conf\ndone\n
Finally, turn these certificates into secrets.
\n\nfor cert in nfd-master nfd-worker nfd-topology-updater; do\n echo creating secret for $cert in node-feature-discovery namespace\n cat <<EOF | kubectl create -n node-feature-discovery -f -\n---\napiVersion: v1\nkind: Secret\ntype: kubernetes.io/tls\nmetadata:\n name: ${cert}-cert\ndata:\n ca.crt: $( cat ca.crt | base64 -w 0 )\n tls.crt: $( cat $cert.crt | base64 -w 0 )\n tls.key: $( cat $cert.key | base64 -w 0 )\nEOF\n\ndone\n
git clone https://github.com/kubernetes-sigs/node-feature-discovery\ncd node-feature-discovery\n
See customizing the build below for altering the\ncontainer image registry, for example.
\n\nmake\n
Optional; this example uses Docker.
\n\ndocker push <IMAGE_TAG>\n
The default architectures enabled for multi-arch builds are linux/amd64
\nand linux/arm64
. If more architectures are needed one can override the\nIMAGE_ALL_PLATFORMS
variable with a comma separated list of OS/ARCH
tuples.
make image-all\n
Currently docker
does not support loading of manifest-lists meaning the images\nare not shown when executing docker images
, see:\nbuildx issue #59.
make push-all\n
The resulting container image can be used in the same way on each arch by pulling\ne.g. node-feature-discovery:v0.15.6
without specifying the\narchitecture. The manifest-list will take care of providing the right\narchitecture image.
To use your published image from the step above instead of the\nregistry.k8s.io/nfd/node-feature-discovery
image, edit the image
\nattribute in the spec template(s) to the new location\n(<registry-name>/<image-name>[:<version>]
).
The yamls
makefile target generates a kustomization.yaml
matching your locally\nbuilt image and using the deployment/overlays/default
deployment. See\nbuild customization below for configurability, e.g.\nchanging the deployment namespace.
K8S_NAMESPACE=my-ns make yamls\nkubectl apply -k .\n
You can use alternative deployment methods by modifying the auto-generated\nkustomization file.
\n\nYou can also build the binaries locally
\n\nmake build\n
This will compile binaries under bin/
There are several Makefile variables that control the build process and the\nname of the resulting container image. The following are intended for\nbuild customization and can be specified via environment variables or\nmakefile overrides.
\n\nVariable | \nDescription | \nDefault value | \n
---|---|---|
HOSTMOUNT_PREFIX | \nPrefix of system directories for feature discovery (local builds) | \n/ (local builds) /host- (container builds) | \n
IMAGE_BUILD_CMD | \nCommand to build the image | \ndocker build | \n
IMAGE_BUILD_EXTRA_OPTS | \nExtra options to pass to build command | \nempty | \n
IMAGE_BUILDX_CMD | \nCommand to build and push multi-arch images with buildx | \nDOCKER_CLI_EXPERIMENTAL=enabled docker buildx build --platform=${IMAGE_ALL_PLATFORMS} --progress=auto --pull | \n
IMAGE_ALL_PLATFORMS | \nComma separated list of OS/ARCH tuples for multi-arch builds | \nlinux/amd64,linux/arm64 | \n
IMAGE_PUSH_CMD | \nCommand to push the image to remote registry | \ndocker push | \n
IMAGE_REGISTRY | \nContainer image registry to use | \nregistry.k8s.io/nfd | \n
IMAGE_TAG_NAME | \nContainer image tag name | \n<nfd version> | \n
IMAGE_EXTRA_TAG_NAMES | \nAdditional container image tag(s) to create when building image | \nempty | \n
K8S_NAMESPACE | \nnfd-master and nfd-worker namespace | \nnode-feature-discovery | \n
For example, to use a custom registry:
\n\nmake IMAGE_REGISTRY=<my custom registry uri>\n
To use a build tool other than Docker, there are two options:
\n\nvia environment
\n\n IMAGE_BUILD_CMD=\"buildah bud\" make\n
by overriding the variable value
\n\n make IMAGE_BUILD_CMD=\"buildah bud\"\n
Unit tests are automatically run as part of the container image build. You can\nalso run them manually in the source code tree by running:
\n\nmake test\n
End-to-end tests are built on top of the e2e test framework of Kubernetes, and\nthey require a cluster to run on. To run the tests on your test\ncluster, you need to specify the kubeconfig to be used:
\n\nmake e2e-test KUBECONFIG=$HOME/.kube/config\n
There are several environment variables that can be used to customize the\ne2e-tests:
\n\nVariable | \nDescription | \nDefault value | \n
---|---|---|
KUBECONFIG | \nKubeconfig for running e2e-tests | \nempty | \n
E2E_TEST_CONFIG | \nParameterization file of e2e-tests (see example) | \nempty | \n
E2E_PULL_IF_NOT_PRESENT | \nTrue-ish value makes the image pull policy IfNotPresent (to be used only in e2e tests) | \nfalse | \n
E2E_TEST_FULL_IMAGE | \nRun e2e-test also against the Full Image tag | \nfalse | \n
E2E_GINKGO_LABEL_FILTER | \nGinkgo label filter to use for running e2e tests | \nempty | \n
OPENSHIFT | \nNon-empty value enables OpenShift specific support (only affects e2e tests) | \nempty | \n
\n\n\nDEPRECATED: Running NFD locally is deprecated and will be removed in a\nfuture release. It depends on the gRPC API which is deprecated and will be\nremoved in a future release. To run NFD locally, use the\n
\n-enable-nodefeature-api=false
flag.
You can run NFD locally, either directly on your host OS or in containers, for\ntesting and development purposes. This may be useful e.g. for checking\nfeature detection.
\n\nWhen running as a standalone container labeling is expected to fail because\nKubernetes API is not available. Thus, it is recommended to use -no-publish
.\nAlso specify the -crd-controller=false
and -enable-nodefeature-api=false
\ncommand line flags to disable the CRD controller and enable gRPC. For example:
$ export NFD_CONTAINER_IMAGE=registry.k8s.io/nfd/node-feature-discovery:v0.15.6\n$ docker run --rm --name=nfd-test ${NFD_CONTAINER_IMAGE} nfd-master -no-publish -crd-controller=false -enable-nodefeature-api=false\n2019/02/01 14:48:21 Node Feature Discovery Master <NFD_VERSION>\n2019/02/01 14:48:21 gRPC server serving on port: 8080\n
To run nfd-worker as a “stand-alone” container you need to run it in the same\nnetwork namespace as the nfd-master container:
\n\n$ docker run --rm --network=container:nfd-test ${NFD_CONTAINER_IMAGE} nfd-worker -enable-nodefeature-api=false\n2019/02/01 14:48:56 Node Feature Discovery Worker <NFD_VERSION>\n...\n
If you just want to try out feature discovery without connecting to nfd-master,\npass the -no-publish
flag to nfd-worker.
\n\n\nNOTE: Some feature sources need certain directories and/or files from the\nhost mounted inside the NFD container. Thus, you need to provide Docker with\nthe correct
\n--volume
options for them to work correctly when run\nstand-alone directly with docker run
. See\nthe default deployment\nfor up-to-date information about the required volume mounts.
To run nfd-topology-updater as a “stand-alone” container\nyou need to run it with the -no-publish
flag to disable communication to\nthe Kubernetes apiserver.
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-topology-updater -no-publish\n2019/02/01 14:48:56 Node Feature Discovery Topology Updater <NFD_VERSION>\n...\n
If you just want to try out resource topology discovery without connecting to\nthe Kubernetes API, pass the -no-publish
flag to nfd-topology-updater.
\n\n\nNOTE: NFD topology updater needs certain directories and/or files from\nthe host mounted inside the NFD container. Thus, you need to provide Docker\nwith the correct
\n\n--volume
options for them to work correctly when\nrun stand-alone directly with docker run
. See\nthe template spec\nfor up-to-date information about the required volume mounts.\n\nThe PodResource API is a prerequisite for\nnfd-topology-updater. Before Kubernetes v1.23, the
\nkubelet
must be\nstarted with the following flag:\n--feature-gates=KubeletPodResourcesGetAllocatable=true
. Starting with\nKubernetes v1.23, the GetAllocatableResources
is enabled by default through\nKubeletPodResourcesGetAllocatable
feature gate.
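
On clusters older than v1.23 where the kubelet is driven by a configuration file rather than command line flags, the same feature gate can be set in the kubelet configuration. A minimal sketch, assuming the kubelet reads a `KubeletConfiguration` file:

```yaml
# Kubelet configuration file fragment; equivalent to passing
# --feature-gates=KubeletPodResourcesGetAllocatable=true on the command line.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletPodResourcesGetAllocatable: true
```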
Another option for building NFD locally is the Tilt tool, which can build container\nimages, push them to a local registry and reload your Kubernetes pods automatically.\nWhen using Tilt, you don’t have to build container images and re-deploy your pods\nmanually; Tilt takes care of it. The Tiltfile is the configuration file\nfor Tilt and is located at the root directory. To develop NFD with Tilt, follow\nthe steps below.
\n\nTo start up your Tilt development environment, run
\n\ntilt up\n
at the root of your local NFD codebase. Tilt will start a web interface on\nlocalhost, port 10350. From the web interface, you can see how the NFD worker\nand master are progressing and watch their build and runtime logs. Once your code changes\nare saved locally, Tilt will notice them, re-build the container image from the\ncurrent code, push the image to the registry and re-deploy the NFD pods with the latest\ncontainer image.
\n\nTo override environment variables used in the Tiltfile during image build,\nexport them in your current terminal before starting Tilt.
\n\nexport IMAGE_TAG_NAME=\"v1\"\ntilt up\n
This will override the default value (master
) of IMAGE_TAG_NAME
variable defined\nin the Tiltfile.
All documentation resides under the\ndocs\ndirectory in the source tree. It is designed to be served as a html site by\nGitHub Pages.
\n\nBuilding the documentation is containerized to fix the build\nenvironment. The recommended way for developing documentation is to run:
\n\nmake site-serve\n
This will build the documentation in a container and serve it under\nlocalhost:4000/ making it easy to verify the results.\nAny changes made to the docs/
will automatically trigger a rebuild and are\nreflected in the served content, which can be inspected with a browser refresh.
To just build the html documentation run:
\n\nmake site-build\n
This will generate html documentation under docs/_site/
.
To quickly view available command line flags execute nfd-topology-updater -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 \\\nnfd-topology-updater -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -config
flag specifies the path of the nfd-topology-updater\nconfiguration file to use.
Default: /etc/kubernetes/node-feature-discovery/nfd-topology-updater.conf
\n\nExample:
\n\nnfd-topology-updater -config=/opt/nfd/nfd-topology-updater.conf\n
The -no-publish
flag disables all communication with the nfd-master, making\nit a “dry-run” flag for nfd-topology-updater. NFD-Topology-Updater runs\nresource hardware topology detection normally, but no CR requests are sent to\nnfd-master.
Default: false
\n\nExample:
\n\nnfd-topology-updater -no-publish\n
The -oneshot
flag causes nfd-topology-updater to exit after one pass of\nresource hardware topology detection.
Default: false
\n\nExample:
\n\nnfd-topology-updater -oneshot -no-publish\n
The -metrics
flag specifies the port on which to expose\nPrometheus metrics. Setting this to 0 disables the\nmetrics server on nfd-topology-updater.
Default: 8081
\n\nExample:
\n\nnfd-topology-updater -metrics=12345\n
The -sleep-interval
specifies the interval between resource hardware\ntopology re-examination (and CR updates). Zero means no CR updates on an interval basis.
Default: 60s
\n\nExample:
\n\nnfd-topology-updater -sleep-interval=1h\n
The -watch-namespace
specifies the namespace to ensure that resource\nhardware topology examination only happens for the pods running in the\nspecified namespace. Pods that are not running in the specified namespace\nare not considered during resource accounting. This is particularly useful\nfor testing/debugging purposes. A “*” value means that all pods are\nconsidered during the accounting process.
Default: “*”
\n\nExample:
\n\nnfd-topology-updater -watch-namespace=rte\n
The -kubelet-config-uri
specifies the path to the Kubelet’s configuration.\nNote that the URI can be either a local host file or an HTTP endpoint.
Default: https://${NODE_ADDRESS}:10250/configz
Example:
\n\nnfd-topology-updater -kubelet-config-uri=file:///var/lib/kubelet/config.yaml\n
The -api-auth-token-file
specifies the path to the api auth token file\nwhich is used to retrieve Kubelet’s configuration from Kubelet secure port,\nonly taking effect when -kubelet-config-uri
is https.\nNote that this token file must bind to a role that has the get
capability to\nnodes/proxy
resources.
Default: /var/run/secrets/kubernetes.io/serviceaccount/token
Example:
\n\nnfd-topology-updater -api-auth-token-file=/var/run/secrets/kubernetes.io/serviceaccount/token\n
The -podresources-socket
specifies the path to the Unix socket where kubelet\nexports a gRPC service to enable discovery of in-use CPUs and devices, and to\nprovide metadata for them.
Default: /host-var/lib/kubelet/pod-resources/kubelet.sock
\n\nExample:
\n\nnfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock\n
Enables computing and reporting the pod set fingerprint in the NRT.\nA pod fingerprint is a compact representation of the “node state” regarding resources.
\n\nDefault: false
Example:
\n\nnfd-topology-updater -pods-fingerprint\n
The -kubelet-state-dir
specifies the path to the Kubelet state directory,\nwhere state and checkpoint files are stored.\nThe files are mounted read-only and cannot be changed by the updater.\nEnabled by default.\nPassing an empty string disables the watching.
Default: /host-var/lib/kubelet
\n\nExample:
\n\nnfd-topology-updater -kubelet-state-dir=/var/lib/kubelet\n
NFD-Topology-Updater is preferably run as a Kubernetes DaemonSet.\nThis assures re-examination on regular intervals\nand/or per pod life-cycle events, capturing changes in the allocated\nresources and hence the allocatable resources on a per-zone basis by updating\nNodeResourceTopology custom resources.\nIt makes sure that a new NodeResourceTopology instance is created for each new\nnode that gets added to the cluster.
\n\nBecause of the design and implementation of Kubernetes, only resources exclusively\nallocated to Guaranteed Quality of Service\npods will be accounted.\nThis includes\nCPU cores,\nmemory\nand\ndevices.
\n\nWhen run as a daemonset, nodes are re-examined for the allocated resources\n(to determine the information of the allocatable resources on a per-zone basis\nwhere a zone can be a NUMA node) at an interval specified using the\n-sleep-interval
\noption. The default sleep interval is set to 60s\nwhich is the value when no -sleep-interval is specified.\nThe re-examination can be disabled by setting the sleep-interval to 0.
Another option is to configure the updater to update\nthe allocated resources per pod life-cycle events.\nThe updater will monitor the checkpoint file stated in\n-kubelet-state-dir
\nand trigger an update for every change that occurs in the files.
In addition, it can avoid examining specific allocated resources\ngiven a configuration of resources to exclude via -excludeList
Kubelet PodResource API with the\nGetAllocatableResources functionality enabled is a\nprerequisite for nfd-topology-updater to be able to run (i.e. Kubernetes v1.21\nor later is required).
\n\nPrior to Kubernetes v1.23, the kubelet
must be started with\n--feature-gates=KubeletPodResourcesGetAllocatable=true
.
Starting from Kubernetes v1.23, the KubeletPodResourcesGetAllocatable
\nfeature gate is enabled by default.
NFD-Topology-Updater supports configuration through a configuration file. The\ndefault location is /etc/kubernetes/node-feature-discovery/nfd-topology-updater.conf
,\nbut this can be changed by specifying the -config
command line flag.
\n\n\nNOTE: unlike nfd-worker, dynamic configuration updates are not supported.
\n
The Topology-Updater configuration file is read inside the container,\nso Volumes and VolumeMounts are needed\nto make your configuration available for NFD.\nThe preferred method is to use a ConfigMap\nwhich provides easy deployment and re-configurability.
\n\nThe provided nfd-topology-updater deployment templates\ncreate an empty configmap\nand mount it inside the nfd-topology-updater containers.\nIn kustomize deployments, configuration can be edited with:
\n\nkubectl -n ${NFD_NS} edit configmap nfd-topology-updater-conf\n
In Helm deployments,\nTopology Updater parameters\ntopologyUpdater.config
can be used to edit the respective configuration.
See\nnfd-topology-updater configuration file reference\nfor more details.\nThe (empty-by-default)\nexample config\ncontains all available configuration options and can be used as a reference\nfor creating a configuration.
\n\n\n","dir":"/usage/","name":"nfd-topology-updater.md","path":"usage/nfd-topology-updater.md","url":"/usage/nfd-topology-updater.html"},{"title":"Contributing","layout":"default","sort":6,"content":"You can reach us via the following channels:
\n\nThis is a\nSIG-node\nsubproject, hosted under the\nKubernetes SIGs organization in Github.\nThe project was established in 2016 and was migrated to Kubernetes SIGs in 2018.
\n\nThis is open source software released under the Apache 2.0 License.
\n","dir":"/contributing/","name":"index.md","path":"contributing/index.md","url":"/contributing/"},{"title":"Uninstallation","layout":"default","sort":6,"content":"Follow the uninstallation instructions of the deployment method used\n(kustomize,\nhelm or\noperator).
\n\nNFD-Master has a special -prune
command line flag for removing all\nnfd-related node labels, annotations, extended resources and taints from the\ncluster.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.6\nkubectl -n node-feature-discovery wait job.batch/nfd-master --for=condition=complete && \\\n kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.6\n
\n\n","dir":"/deployment/","name":"uninstallation.md","path":"deployment/uninstallation.md","url":"/deployment/uninstallation.html"},{"title":"Topology-Updater config reference","layout":"default","sort":6,"content":"NOTE: You must run prune before removing the RBAC rules (serviceaccount,\nclusterrole and clusterrolebinding).
\n
See the\nsample configuration file\nfor a full example configuration.
\n\nThe excludeList
specifies a key-value map of allocated resources\nthat should not be examined by the topology-updater\nagent per node.\nEach key is a node name with a value as a list of resources\nthat should not be examined by the agent for that specific node.
Default: empty
\n\nExample:
\n\nexcludeList:\n nodeA: [hugepages-2Mi]\n nodeB: [memory]\n nodeC: [cpu, hugepages-2Mi]\n
excludeList.*
is a special value used to specify all nodes.\nA resource listed under this key is excluded from all nodes.
Default: empty
\n\nExample:
\n\nexcludeList:\n '*': [hugepages-2Mi]\n
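The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the topology-updater's code: the function name `excluded_resources` is hypothetical, and the sketch assumes that the per-node entry and the `*` entry are combined as a union.

```python
def excluded_resources(exclude_list, node_name):
    """Return the resources that would be skipped for node_name.

    Assumption for this sketch: entries under the node's own key and
    under the special '*' key are merged.
    """
    per_node = exclude_list.get(node_name, [])
    all_nodes = exclude_list.get("*", [])
    return set(per_node) | set(all_nodes)


# Hypothetical configuration mirroring the excludeList examples
exclude_list = {
    "nodeA": ["hugepages-2Mi"],
    "nodeC": ["cpu", "hugepages-2Mi"],
    "*": ["memory"],
}

print(sorted(excluded_resources(exclude_list, "nodeA")))
print(sorted(excluded_resources(exclude_list, "nodeZ")))
```

With this data, `nodeA` skips `hugepages-2Mi` and `memory`, while a node without its own key (`nodeZ`) only skips `memory`.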
NFD-GC (NFD Garbage-Collector) is preferably run as a Kubernetes deployment\nwith one replica. It makes sure that all\nNodeFeature and\nNodeResourceTopology objects\nhave corresponding nodes and removes stale objects for non-existent nodes.
\n\nThe daemon watches for Node deletion events and removes NodeFeature and\nNodeResourceTopology objects upon them. It also runs periodically to make sure\nno node delete event was missed and to remove any NodeFeature or\nNodeResourceTopology objects that were created without corresponding node. The\ndefault garbage collector interval is set to 1h which is the value when no\n-gc-interval is specified.
\n\nIn Helm deployments (see\ngarbage collector parameters)\nNFD-GC will only be deployed when enableNodeFeatureApi
or\ntopologyUpdater.enable
is set to true.
Metrics are configured to be exposed using prometheus operator\nAPIs by default. If you want to expose metrics using the prometheus operator\nAPIs you need to install the prometheus operator in your cluster.\nBy default NFD Master and Worker expose metrics on port 8081.
\n\nThe exposed metrics are
\n\nMetric | \nType | \nDescription | \n
---|---|---|
nfd_master_build_info | \n Gauge | \nVersion from which nfd-master was built | \n
nfd_worker_build_info | \n Gauge | \nVersion from which nfd-worker was built | \n
nfd_gc_build_info | \n Gauge | \nVersion from which nfd-gc was built | \n
nfd_topology_updater_build_info | \n Gauge | \nVersion from which nfd-topology-updater was built | \n
nfd_node_update_requests_total | \n Counter | \nNumber of node update requests received by the master over gRPC | \n
nfd_node_updates_total | \n Counter | \nNumber of nodes updated | \n
nfd_node_update_failures_total | \n Counter | \nNumber of node update failures | \n
nfd_node_labels_rejected_total | \n Counter | \nNumber of node labels rejected by nfd-master | \n
nfd_node_extendedresources_rejected_total | \n Counter | \nNumber of node extended resources rejected by nfd-master | \n
nfd_node_taints_rejected_total | \n Counter | \nNumber of node taints rejected by nfd-master | \n
nfd_nodefeaturerule_processing_duration_seconds | \n Histogram | \nTime taken to process NodeFeatureRule objects | \n
nfd_nodefeaturerule_processing_errors_total | \n Counter | \nNumber of errors encountered while processing NodeFeatureRule objects | \n
nfd_feature_discovery_duration_seconds | \n Histogram | \nTime taken to discover features on a node | \n
nfd_topology_updater_scan_errors_total | \n Counter | \nNumber of errors in scanning resource allocation of pods. | \n
nfd_gc_objects_deleted_total | \n Counter | \nNumber of NodeFeature and NodeResourceTopology objects garbage collected. | \n
nfd_gc_object_delete_failures_total | \n Counter | \nNumber of errors in deleting NodeFeature and NodeResourceTopology objects. | \n
To deploy NFD with metrics enabled using kustomize, you can use the\nprometheus overlay.
\n\nBy default metrics are enabled when deploying NFD via Helm. To enable Prometheus\nto scrape metrics from NFD, you need to pass the following values to Helm:
\n\n--set prometheus.enable=true\n
For more info on Helm deployment, see Helm.
\n\nIt is recommended to specify\n--set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
\nwhen deploying prometheus-operator via Helm to enable the prometheus-operator\nto scrape metrics from any PodMonitor.
Alternatively, set labels on the PodMonitor via the helm parameter prometheus.labels
\nto control which Prometheus instances will scrape this PodMonitor.
NFD contains an example Grafana dashboard. You can import\nexamples/grafana-dashboard.json
\nto your Grafana instance to visualize the NFD metrics.
To quickly view available command line flags execute nfd-gc -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 \\\nnfd-gc -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -gc-interval
specifies the interval between periodic garbage collector runs.
Default: 1h
\n\nExample:
\n\nnfd-gc -gc-interval=1h\n
NFD uses some Kubernetes custom resources.
\n\nNodeFeature is an NFD-specific custom resource for communicating node\nfeatures and node labeling requests. The nfd-master pod watches for NodeFeature\nobjects, labels nodes as specified and uses the listed features as input when\nevaluating NodeFeatureRules. NodeFeature objects can be\nused for implementing 3rd party extensions (see\ncustomization guide for more\ndetails).
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeature\nmetadata:\n labels:\n nfd.node.kubernetes.io/node-name: node-1\n name: node-1-vendor-features\nspec:\n features:\n instances:\n vendor.device:\n elements:\n - attributes:\n model: \"xpu-1\"\n memory: \"4000\"\n type: \"fast\"\n - attributes:\n model: \"xpu-2\"\n memory: \"16000\"\n type: \"slow\"\n labels:\n vendor-xpu-present: \"true\"\n
NodeFeatureRule is an NFD-specific custom resource that is designed for\nrule-based custom labeling of nodes. NFD-Master watches for NodeFeatureRule\nobjects in the cluster and labels nodes according to the rules within. Example use\ncases are application specific labeling in specific environments, or rules\ndistributed by hardware vendors to create specific labels for their\ndevices.
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: example-rule\nspec:\n rules:\n - name: \"example rule\"\n labels:\n \"example-custom-feature\": \"true\"\n # Label is created if all of the rules below match\n matchFeatures:\n # Match if \"veth\" kernel module is loaded\n - feature: kernel.loadedmodule\n matchExpressions:\n veth: {op: Exists}\n # Match if any PCI device with vendor 8086 exists in the system\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"8086\"]}\n
See the\nCustomization guide\nfor full documentation of the NodeFeatureRule resource and its usage.
\n\nThe\ndeployment/nodefeaturerule/samples/
\ndirectory contains sample NodeFeatureRule objects that replicate the built-in\ndefault feature labels generated by NFD. The sample rules can be used as a base\nto customize NFD feature labels. To use them in place of the NFD built-in\nlabels, the corresponding feature source(s) of nfd-worker should be disabled\nwith the\ncore.labelSources
\nconfiguration option.
When run with NFD-Topology-Updater, NFD creates NodeResourceTopology objects\ncorresponding to node resource hardware topology such as:
\n\napiVersion: topology.node.k8s.io/v1alpha1\nkind: NodeResourceTopology\nmetadata:\n name: node1\ntopologyPolicies: [\"SingleNUMANodeContainerLevel\"]\nzones:\n - name: node-0\n type: Node\n resources:\n - name: cpu\n capacity: 20\n allocatable: 16\n available: 10\n - name: vendor/nic1\n capacity: 3\n allocatable: 3\n available: 3\n - name: node-1\n type: Node\n resources:\n - name: cpu\n capacity: 30\n allocatable: 30\n available: 15\n - name: vendor/nic2\n capacity: 6\n allocatable: 6\n available: 6\n - name: node-2\n type: Node\n resources:\n - name: cpu\n capacity: 30\n allocatable: 30\n available: 15\n - name: vendor/nic1\n capacity: 3\n allocatable: 3\n available: 3\n
The NodeResourceTopology objects created by NFD can be used to gain insight\ninto the allocatable resources along with the granularity of those resources at\na per-zone level (represented by node-0, node-1 and node-2 in the above example) or can\nbe used by an external entity (e.g. topology-aware scheduler plugin) to take an\naction based on the gathered information.
\n\n\n","dir":"/usage/","name":"custom-resources.md","path":"usage/custom-resources.md","url":"/usage/custom-resources.html"},{"title":"Kubectl plugin cmdline reference","layout":"default","sort":8,"content":"To quickly view available command line flags execute kubectl nfd -help
.
Print usage and exit.
\n\nValidate a NodeFeatureRule file.
\n\nThe --nodefeature-file
flag specifies the path to the NodeFeatureRule file\nto validate.
Test a NodeFeatureRule file against a node without applying it.
\n\nThe --kubeconfig
flag specifies the path to the kubeconfig file to use for\nCLI requests.
The --namespace
flag specifies the namespace to use for CLI requests.\nDefault: default
.
The --nodename
flag specifies the name of the node to test the\nNodeFeatureRule against.
The --nodefeaturerule-file
flag specifies the path to the NodeFeatureRule file\nto test.
Process a NodeFeatureRule file against a NodeFeature file.
\n\nThe --nodefeaturerule-file
flag specifies the path to the NodeFeatureRule file\nto test.
The --nodefeature-file
flag specifies the path to the NodeFeature file to test.
NFD provides multiple extension points for vendor and application specific\nlabeling:
\n\nNodeFeature
objects can be\nused to communicate “raw” node features and node labeling requests to\nnfd-master.NodeFeatureRule
objects provide a way to\ndeploy custom labeling rules via the Kubernetes API.local
feature source of nfd-worker creates\nlabels by reading text files and executing hooks.custom
feature source of nfd-worker creates\nlabels based on user-specified rules.NodeFeature objects provide a way for 3rd party extensions to advertise custom\nfeatures, both as “raw” features that serve as input to\nNodeFeatureRule objects and as feature\nlabels directly.
\n\nNote that RBAC rules must be created for each extension for them to be able to\ncreate and manipulate NodeFeature objects in their namespace.
\n\nThe NodeFeature CRD API can be disabled with the\n-enable-nodefeature-api=false
command line flag. This flag must be specified\nfor both nfd-master and nfd-worker as it will enable the gRPC communication\nbetween them. Note that the gRPC API is DEPRECATED and will be removed in a\nfuture release, at which point the NodeFeature API cannot be disabled.
Consider the following referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeature\nmetadata:\n labels:\n nfd.node.kubernetes.io/node-name: node-1\n name: vendor-features-for-node-1\nspec:\n # Features for NodeFeatureRule matching\n features:\n flags:\n vendor.flags:\n elements:\n feature-x: {}\n feature-y: {}\n attributes:\n vendor.config:\n elements:\n setting-a: \"auto\"\n knob-b: \"123\"\n instances:\n vendor.devices:\n elements:\n - attributes:\n model: \"dev-1000\"\n vendor: \"acme\"\n - attributes:\n model: \"dev-2000\"\n vendor: \"acme\"\n # Labels to be created\n labels:\n vendor.io/feature.enabled: \"true\"\n
The object targets node named node-1
. It lists two “flag type” features under\nthe vendor.flags
domain, two “attribute type” features under the\nvendor.config
domain and two “instance type” features under the\nvendor.devices
domain. These features will not be directly affecting the node\nlabels but they will be used as input when the\nNodeFeatureRule
objects are evaluated.
In addition, the example directly requests the\nvendor.io/feature.enabled=true
node label to be created.
The nfd.node.kubernetes.io/node-name=<node-name>
label must be in place for each\nNodeFeature object as NFD uses it to determine the node which it is targeting.
Features are divided into three different types:
\n\nNodeFeatureRule
objects provide an easy way to create vendor or application\nspecific labels and taints. It uses a flexible rule-based mechanism for creating\nlabels and optionally taints based on node features.
Consider the following referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: my-sample-rule-object\nspec:\n rules:\n - name: \"my sample rule\"\n labels:\n \"feature.node.kubernetes.io/my-sample-feature\": \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n dummy: {op: Exists}\n - feature: kernel.config\n matchExpressions:\n X86: {op: In, value: [\"y\"]}\n
It specifies one rule which creates node label\nfeature.node.kubernetes.io/my-sample-feature=true
if both of the following\nconditions are true (matchFeatures
implements a logical AND over the\nmatchers):
dummy
network driver module has been loaded, and the X86 kernel config option is set to =y
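The logical AND over the matchers can be illustrated with a small Python sketch. It is a simplified model and not NFD's implementation: the function names and the dictionary layout of `features` are assumptions made for illustration, and only the `Exists` and `In` operators used in the example rule are covered.

```python
def expression_matches(elements, name, expr):
    """Evaluate one match expression against the elements of a feature."""
    op = expr["op"]
    if op == "Exists":
        return name in elements
    if op == "In":
        return elements.get(name) in expr["value"]
    raise ValueError(f"operator not covered by this sketch: {op}")


def rule_matches(features, match_features):
    """Return True only if every expression of every matcher term matches."""
    for term in match_features:
        elements = features.get(term["feature"], {})
        for name, expr in term["matchExpressions"].items():
            if not expression_matches(elements, name, expr):
                return False
    return True


# The matchFeatures section of the example rule
match_features = [
    {"feature": "kernel.loadedmodule",
     "matchExpressions": {"dummy": {"op": "Exists"}}},
    {"feature": "kernel.config",
     "matchExpressions": {"X86": {"op": "In", "value": ["y"]}}},
]

# Hypothetical discovered features of a node
node_features = {
    "kernel.loadedmodule": {"dummy": ""},
    "kernel.config": {"X86": "y"},
}

print(rule_matches(node_features, match_features))
```

If either the `dummy` module or the `X86=y` kernel config entry is missing from `node_features`, `rule_matches` returns `False` and the label would not be created.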
Create a NodeFeatureRule
with a yaml file:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/v0.15.6/examples/nodefeaturerule.yaml\n
Now, on X86 platforms the feature label appears after doing modprobe dummy
on\na system and correspondingly the label is removed after rmmod dummy
. Note a\nre-labeling delay up to the sleep-interval of nfd-worker (1 minute by default).
See Feature rule format for a detailed description of\navailable fields and how to write labeling rules.
\n\nThis feature is experimental.
\n\nIn some circumstances, it is desirable to keep nodes with specialized hardware\naway from running general workloads and instead leave them for workloads that\nneed the specialized hardware. One way to achieve this is to taint the nodes with\nthe specialized hardware and add corresponding tolerations to pods that require\nthe special hardware. NFD offers node tainting functionality which is disabled\nby default. Users can define one or more custom taints via the taints
field of\nthe NodeFeatureRule CR. The same rule-based mechanism is applied here and\nNFD taints only the nodes that match the rule.
To enable the tainting feature, the --enable-taints
flag needs to be set to true
.\nIf the flag --enable-taints
is set to false
(i.e. disabled), taints defined in\nthe NodeFeatureRule CR have no effect and will be ignored by the NFD master.
See documentation of the taints field for a detailed description of how\nto specify taints in the NodeFeatureRule object.
\n\n\n\n\nNOTE: Before enabling any taints, make sure to edit nfd-worker daemonset\nto tolerate the taints to be created. Otherwise, already running pods that do\nnot tolerate the taint are evicted immediately from the node including the\nnfd-worker pod.
\n
NFD-Worker has a special feature source named local
which is an integration\npoint for external feature detectors. It provides a mechanism for pluggable\nextensions, allowing the creation of new user-specific features and even\noverriding built-in labels.
The local
feature source has two methods for detecting features, feature\nfiles and hooks (deprecated). The features discovered by the local
source can\nfurther be used in label rules specified in\nNodeFeatureRule
objects and the\ncustom
feature source.
\n\n\nNOTE: Be careful when creating and/or updating hook or feature files\nwhile NFD is running. To avoid race conditions you should write\ninto a temporary file, and atomically create/update the original file by\ndoing a file rename operation. NFD ignores dot files,\nso a temporary file can be written to the same directory and renamed\n(
\n.my.feature
->my.feature
) once the file is complete. Both file names should\n(obviously) be unique for the given application.
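A minimal Python sketch of this write-then-rename pattern is shown below. The helper name `write_feature_file` is hypothetical, and the demo writes into a temporary directory instead of the real `features.d` directory so that it can be run anywhere.

```python
import os
import tempfile


def write_feature_file(directory, name, lines):
    """Write a feature file atomically via a dot-prefixed temporary file."""
    tmp_path = os.path.join(directory, "." + name)  # dot files are ignored by NFD
    final_path = os.path.join(directory, name)
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the content is on disk before renaming
    # Renaming within the same directory replaces the target in one step
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    # Stand-in directory for the demo; a real deployment would use features.d
    with tempfile.TemporaryDirectory() as features_dir:
        path = write_feature_file(
            features_dir, "my.feature", ["vendor.io/my-feature.3=456"]
        )
        with open(path, encoding="utf-8") as f:
            print(f.read(), end="")
```

Because the temporary file lives in the same directory as the final file, the rename stays on one filesystem, which is what makes the replacement atomic on POSIX systems.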
Consider a plaintext file\n/etc/kubernetes/node-feature-discovery/features.d/my-features
\nhaving the following contents (or alternatively a shell script\n/etc/kubernetes/node-feature-discovery/source.d/my-hook.sh
having the\nfollowing stdout output):
feature.node.kubernetes.io/my-feature.1\nfeature.node.kubernetes.io/my-feature.2=myvalue\nvendor.io/my-feature.3=456\n
This will translate into the following node labels:
\n\nfeature.node.kubernetes.io/my-feature.1: \"true\"\nfeature.node.kubernetes.io/my-feature.2: \"myvalue\"\nvendor.io/my-feature.3: \"456\"\n
The local
source reads files found in\n/etc/kubernetes/node-feature-discovery/features.d/
. File content is parsed\nand translated into node labels, see the input format below.
DEPRECATED The local
source executes hooks found in\n/etc/kubernetes/node-feature-discovery/source.d/
. The hook files must be\nexecutable and they are supposed to print all discovered features in stdout
.\nSince NFD v0.13 the default container image only supports statically linked ELF\nbinaries.
stderr
output of hooks is propagated to NFD log so it can be used for\ndebugging and logging.
NFD tries to execute any regular files found from the hooks directory.\nAny additional data files the hook might need (e.g. a configuration file)\nshould be placed in a separate directory to avoid NFD unnecessarily\ntrying to execute them. A subdirectory under the hooks directory can be used,\nfor example /etc/kubernetes/node-feature-discovery/source.d/conf/
.
\n\n\nNOTE: Hooks are being DEPRECATED and will be removed in a future release.\nStarting from release v0.14 hooks are disabled by default and can be enabled\nvia
\nsources.local.hooksEnabled
field in the worker configuration.
sources:\n local:\n hooksEnabled: true # hooks are disabled (false) by default since v0.14\n
\n\n\nNOTE: NFD will blindly run any executables placed/mounted in the hooks\ndirectory. It is the user’s responsibility to review the hooks for e.g.\npossible security implications.
\n\nNOTE: The full image variant\nprovides backwards-compatibility with older NFD versions by including a more\nexpanded environment, supporting bash and perl runtimes.
\n
The hook stdout and feature files are expected to contain features in simple\nkey-value pairs, separated by newlines:
\n\n# This is a comment\n<key>[=<value>]\n
The label value defaults to true
, if not specified.
Label namespace must be specified with <namespace>/<name>[=<value>]
.
\n\n\nNOTE: The feature file size limit is 64kB. The feature file will be\nignored if the size limit is exceeded.
\n
Comment lines (starting with #
) are ignored.
Adding the following line anywhere in a feature file defines the date when\nits content expires and is ignored:
\n\n# +expiry-time=2023-07-29T11:22:33Z\n
Also, the expiry-time value stays the same during the processing of the\nfeature file until another expiry-time directive is encountered.\nConsider the following file:
\n\n# +expiry-time=2012-07-28T11:22:33Z\nvendor.io/feature1=featureValue\n\n# +expiry-time=2080-07-28T11:22:33Z\nvendor.io/feature2=featureValue2\n\n# +expiry-time=2070-07-28T11:22:33Z\nvendor.io/feature3=featureValue3\n\n# +expiry-time=2002-07-28T11:22:33Z\nvendor.io/feature4=featureValue4\n
After processing the above file, only vendor.io/feature2
and\nvendor.io/feature3
would be included in the list of accepted features.
\n\n\nNOTE: The time format supported is RFC3339. Also, the
\nexpiry-time
\ntag is only evaluated in each re-discovery period, and the expiration of\nnode labels is not tracked.
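The way an expiry-time directive applies to the lines that follow it can be sketched in Python. This is an illustrative parser, not NFD's code: the function name is hypothetical, other directives such as `# +no-feature` are treated as plain comments, and the evaluation time is passed in explicitly so the result is reproducible.

```python
from datetime import datetime, timezone

EXPIRY_PREFIX = "# +expiry-time="


def parse_feature_file(text, now):
    """Return the features whose expiry time (if any) is still in the future."""
    features = {}
    expiry = None  # no expiry applies until the first directive is seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(EXPIRY_PREFIX):
            stamp = line[len(EXPIRY_PREFIX):].strip()
            # Accept the trailing "Z" of RFC3339 timestamps
            expiry = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
            continue
        if line.startswith("#"):
            continue  # ordinary comment (or a directive this sketch ignores)
        if expiry is not None and expiry <= now:
            continue  # the current expiry time has passed, skip this feature
        key, sep, value = line.partition("=")
        features[key] = value if sep else "true"
    return features


sample = """\
# +expiry-time=2012-07-28T11:22:33Z
vendor.io/feature1=featureValue

# +expiry-time=2080-07-28T11:22:33Z
vendor.io/feature2=featureValue2

# +expiry-time=2070-07-28T11:22:33Z
vendor.io/feature3=featureValue3

# +expiry-time=2002-07-28T11:22:33Z
vendor.io/feature4=featureValue4
"""

now = datetime(2024, 1, 1, tzinfo=timezone.utc)  # fixed time for the demo
print(parse_feature_file(sample, now))
```

Evaluated at the fixed demo time, only `vendor.io/feature2` and `vendor.io/feature3` remain, matching the result described for the example file.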
To exclude specific features from the local.feature
Feature, you can use the\n# +no-feature
directive. The # +no-label
directive causes the feature to\nbe excluded from the local.label
Feature and a node label not to be generated.
Consider the following file:
\n\n# +no-feature\nvendor.io/label-only=value\n\nvendor.io/my-feature=value\n\nvendor.io/foo=bar\n\n# +no-label\nfoo=baz\n
Processing the above file would result in the following Features:
\n\nlocal.features:\n foo: baz\n vendor.io/my-feature: value\nlocal.labels:\n vendor.io/label-only: value\n vendor.io/my-feature: value\n
and the following labels added to the Node:
\n\nvendor.io/label-only=value\nvendor.io/my-feature=value\n
\n\n\nNOTE: unprefixed label names (like
\nfoo=bar
) should not be used.\nIn NFD v0.15 unprefixed names will be automatically prefixed\nwith feature.node.kubernetes.io/
but this will change in a future version\n(see\nautoDefaultNs config option).\nUnprefixed names for plain Features (tagged with # +no-label
) can be used\nwithout restrictions, however.
The standard NFD deployments contain hostPath
mounts for\n/etc/kubernetes/node-feature-discovery/source.d/
and\n/etc/kubernetes/node-feature-discovery/features.d/
, making these directories\nfrom the host available inside the nfd-worker container.
One use case for the feature files and hooks is detecting features in other\nPods outside NFD, e.g. in Kubernetes device plugins. Using the same\nhostPath
mounts for /etc/kubernetes/node-feature-discovery/source.d/
and\n/etc/kubernetes/node-feature-discovery/features.d/
in the side-car (e.g.\ndevice plugin) creates a shared area for deploying feature files and hooks to\nNFD. NFD periodically scans the directories and reads any feature files and\nruns any hooks it finds.
The custom
feature source in nfd-worker provides a rule-based mechanism for\nlabel creation, similar to the\nNodeFeatureRule
objects. The difference is\nthat the rules are specified in the worker configuration instead of a\nKubernetes API object.
See worker configuration\nfor instructions on how to set up and manage the worker configuration.
\n\nConsider the following referential configuration for nfd-worker:
\n\ncore:\n labelSources: [\"custom\"]\nsources:\n custom:\n - name: \"my sample rule\"\n labels:\n \"feature.node.kubernetes.io/my-sample-feature\": \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n dummy: {op: Exists}\n - feature: kernel.config\n matchExpressions:\n X86: {op: In, value: [\"y\"]}\n
It specifies one rule which creates node label\nfeature.node.kubernetes.io/my-sample-feature=true
if both of the following\nconditions are true (matchFeatures
implements a logical AND over the\nmatchers):
dummy
network driver module has been loaded, and the X86 kernel config option is set to =y
In addition, the configuration only enables the custom
source, disabling all\nbuilt-in labels.
Now, on X86 platforms the feature label appears after doing modprobe dummy
on\na system and correspondingly the label is removed after rmmod dummy
. Note a\nre-labeling delay up to the sleep-interval of nfd-worker (1 minute by default).
In addition to the rules defined in the nfd-worker configuration file, the\ncustom
feature source can read more configuration files located in the\n/etc/kubernetes/node-feature-discovery/custom.d/
directory. This makes more\ndynamic and flexible configuration easier.
As an example, consider having file\n/etc/kubernetes/node-feature-discovery/custom.d/my-rule.yaml
with the\nfollowing content:
- name: \"my e1000 rule\"\n labels:\n \"feature.node.kubernetes.io/e1000.present\": \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n e1000: {op: Exists}\n
This simple rule will create feature.node.kubernetes.io/e1000.present=true
\nlabel if the e1000
kernel module has been loaded.
The\nsamples/custom-rules
\nkustomize overlay sample contains an example for deploying a custom rule from a\nConfigMap.
Feature labels have the following format:
\n\n<namespace>/<name> = <value>\n
The namespace part (i.e. prefix) of the labels is controlled by nfd:
\n\nfeature.node.kubernetes.io
.-deny-label-ns
\ncommand line flag of nfd-master\n -extra-label-ns
\ncommand line flag of nfd-master.\ne.g: nfd-master -deny-label-ns=\"*\" -extra-label-ns=example.com
This section describes the rule format used in\nNodeFeatureRule
objects and in the\nconfiguration of the custom
feature source.
It is based on a generic feature matcher that covers all features discovered by\nnfd-worker. The rules rely on a unified data model of the available features\nand a generic expression-based format. Features that can be used in the rules\nare described in detail in available features below.
\n\nTake this rule as a referential example:
\n\n - name: \"my feature rule\"\n labels:\n \"feature.node.kubernetes.io/my-special-feature\": \"my-value\"\n matchFeatures:\n - feature: cpu.cpuid\n matchExpressions:\n AVX512F: {op: Exists}\n - feature: kernel.version\n matchExpressions:\n major: {op: In, value: [\"5\"]}\n minor: {op: Gt, value: [\"1\"]}\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"8086\"]}\n class: {op: In, value: [\"0200\"]}\n
This will yield feature.node.kubernetes.io/my-special-feature=my-value
node\nlabel if all of these are true (matchFeatures
implements a logical AND over\nthe matchers):
The .name
field is required and used as an identifier of the rule.
The .labels
is a map of the node labels to create if the rule matches.
Take this rule as a referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: my-sample-rule-object\nspec:\n rules:\n - name: \"my dynamic label value rule\"\n labels:\n feature.node.kubernetes.io/linux-lsm-enabled: \"@kernel.config.LSM\"\n feature.node.kubernetes.io/custom-label: \"customlabel\"\n
Label linux-lsm-enabled
uses the @
notation for dynamic values.\nThe value of the label will be the value of the attribute LSM
\nof the feature kernel.config
.
The @<feature-name>.<element-name>
format can be used to inject values of\ndetected features to the label. See\navailable features for possible values to use.
This will yield the following node labels:
\n\n labels:\n ...\n feature.node.kubernetes.io/linux-lsm-enabled: apparmor\n feature.node.kubernetes.io/custom-label: \"customlabel\"\n
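How an `@<feature-name>.<element-name>` reference could be resolved is sketched below in Python. This is an illustration only: the function name and the dictionary layout of `features` are assumptions, and the sketch further assumes that the element name itself contains no dots.

```python
def resolve_label_value(value, features):
    """Resolve an '@<feature-name>.<element-name>' reference to its value.

    Values that do not start with '@' are returned unchanged.
    """
    if not value.startswith("@"):
        return value
    # Split on the last dot: "kernel.config.LSM" -> ("kernel.config", "LSM")
    feature_name, _, element_name = value[1:].rpartition(".")
    return features[feature_name][element_name]


# Hypothetical discovered features of a node
features = {"kernel.config": {"LSM": "apparmor"}}

labels = {
    "feature.node.kubernetes.io/linux-lsm-enabled": "@kernel.config.LSM",
    "feature.node.kubernetes.io/custom-label": "customlabel",
}

resolved = {name: resolve_label_value(v, features) for name, v in labels.items()}
print(resolved)
```

In this sketch a reference to an unknown feature or element raises `KeyError`. How NFD itself treats such a case is not covered here.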
The .labelsTemplate
field specifies a text template for dynamically creating\nlabels based on the matched features. See templating for\ndetails.
\n\n\nNOTE: The
\nlabels
field has priority overlabelsTemplate
, i.e.\nlabels specified in thelabels
field will override anything\noriginating fromlabelsTemplate
.
The .annotations
field is a list of features to be advertised as node\nannotations.
Consider this rule as a reference example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n  name: feature-annotations-example\nspec:\n  rules:\n    - name: \"annotation-example\"\n      annotations:\n        feature.node.kubernetes.io/default-ns-annotation: \"foo\"\n        custom.vendor.io/feature: \"baz\"\n      matchFeatures:\n        - feature: kernel.version\n          matchExpressions:\n            major: {op: Exists}\n
This yields the following node annotations:
\n\n  annotations:\n    ...\n    feature.node.kubernetes.io/default-ns-annotation: \"foo\"\n    custom.vendor.io/feature: \"baz\"\n    ...\n
NFD enforces some limitations on the namespace (i.e. prefix/) of the annotations:
\n\nkubernetes.io/ and its sub-namespaces (like sub.ns.kubernetes.io/) cannot\ngenerally be used\nthe only exception is feature.node.kubernetes.io/ and its sub-namespaces\n(like sub.ns.feature.node.kubernetes.io)\nunprefixed names (like my-annotation) should not be used. In NFD v0.15 unprefixed names will be automatically prefixed with\nfeature.node.kubernetes.io/ but this will change in a future version (see the\nautoDefaultNs config option).\n\n\nNOTE: The
\nannotations
field will only advertise features via node\nannotations; the features won’t be advertised as node labels unless they are\nspecified in the labels
field.
The taints field is a list of taint entries, and each entry can have key
, value
and effect
,\nwhere the value
is optional. Effect could be NoSchedule
, PreferNoSchedule
\nor NoExecute
. To learn more about the meaning of these effects, check out k8s documentation.
Example NodeFeatureRule with taints:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n  name: my-sample-rule-object\nspec:\n  rules:\n    - name: \"my sample taint rule\"\n      taints:\n        - effect: PreferNoSchedule\n          key: \"feature.node.kubernetes.io/special-node\"\n          value: \"true\"\n        - effect: NoExecute\n          key: \"feature.node.kubernetes.io/dedicated-node\"\n      matchFeatures:\n        - feature: kernel.loadedmodule\n          matchExpressions:\n            dummy: {op: Exists}\n        - feature: kernel.config\n          matchExpressions:\n            X86: {op: In, value: [\"y\"]}\n
In this example, if the rule named my sample taint rule matches, the\nfeature.node.kubernetes.io/special-node=true:PreferNoSchedule\nand feature.node.kubernetes.io/dedicated-node:NoExecute taints are set on the node.
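For context, a workload that should still run on such a node needs a matching toleration. The Pod below is a minimal sketch using the standard Kubernetes tolerations field; the Pod name, container name and image are placeholders, and only the taint key is taken from the example rule above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tolerating-pod                 # placeholder name
spec:
  containers:
    - name: app                        # placeholder container
      image: registry.k8s.io/pause:3.9
  tolerations:
    # Tolerates the NoExecute taint from the example rule, whatever its value.
    - key: "feature.node.kubernetes.io/dedicated-node"
      operator: "Exists"
      effect: "NoExecute"
```

Pods without such a toleration would be evicted from (and not scheduled to) a node carrying the NoExecute taint.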
There are some limitations to the namespace part (i.e. prefix/) of the taint\nkey:
\n\nkubernetes.io/ and its sub-namespaces (like sub.ns.kubernetes.io/) cannot\ngenerally be used\nthe only exception is feature.node.kubernetes.io/ and its sub-namespaces\n(like sub.ns.feature.node.kubernetes.io)\nunprefixed (like foo) keys are disallowed\n\n\nNOTE: The taints field is not available for the custom rules of nfd-worker;\nit can only be used in NodeFeatureRule objects.
\n
The .vars
field is a map of values (key-value pairs) to store for subsequent\nrules to use. In other words, these are variables that are not advertised as\nnode labels. See backreferences for more details on the\nusage of vars.
The .extendedResources
field is a list of extended resources to advertise.\nSee extended resources for more details.
Consider this rule as a reference example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n  name: my-extended-resource-rule\nspec:\n  rules:\n    - name: \"my extended resource rule\"\n      extendedResources:\n        vendor.io/dynamic: \"@kernel.version.major\"\n        vendor.io/static: \"123\"\n      matchFeatures:\n        - feature: kernel.version\n          matchExpressions:\n            major: {op: Exists}\n
The extended resource vendor.io/dynamic
is defined in the form @feature.attribute
.\nThe value of the extended resource will be the value of the attribute major
\nof the feature kernel.version
.
The @<feature-name>.<element-name>
format can be used to inject values of\ndetected features to the extended resource. See\navailable features for possible values to use. Note that\nthe value must be eligible as a\nKubernetes resource quantity.
This yields the following node status:
\n\n  allocatable:\n    ...\n    vendor.io/dynamic: \"5\"\n    vendor.io/static: \"123\"\n    ...\n  capacity:\n    ...\n    vendor.io/dynamic: \"5\"\n    vendor.io/static: \"123\"\n    ...\n
There are some limitations on the namespace part (i.e. prefix/) of the Extended\nResources names:
\n\nkubernetes.io/ and its sub-namespaces (like sub.ns.kubernetes.io/) cannot\ngenerally be used\nthe only exception is feature.node.kubernetes.io/ and its sub-namespaces\n(like sub.ns.feature.node.kubernetes.io)\nunprefixed names (like my-er) should not be used. In NFD v0.15 unprefixed names will be\nautomatically prefixed with feature.node.kubernetes.io/ but this will\nchange in a future version (see the\nautoDefaultNs config option).\n\n\nNOTE:
\n.extendedResources
is not supported by the\ncustom feature source – it can only be used in\nNodeFeatureRule objects.
The .varsTemplate
field specifies a text template for dynamically creating\nvars based on the matched features. See templating for details\non using templates and backreferences for more details on\nthe usage of vars.
\n\n\nNOTE: The
\nvars
field has priority over varsTemplate
, i.e.\nvars specified in the vars
field will override anything originating from\nvarsTemplate
.
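A sketch of how varsTemplate and a later backreference could fit together is shown below. The rule names, the nic- key prefix, the device ID 1234 and the vendor.io/ label are hypothetical; the template follows the same key=value expansion format used by labelsTemplate.

```yaml
# First rule: store one variable per matched network controller (no labels created).
- name: "collect nic vars"
  varsTemplate: |
    {{ range .pci.device }}nic-{{ .device }}=present
    {{ end }}
  matchFeatures:
    - feature: pci.device
      matchExpressions:
        class: {op: InRegexp, value: ["^02"]}

# Second rule: refer to a variable created above via the rule.matched feature.
- name: "label from var"
  labels:
    vendor.io/has-nic-1234: "true"
  matchFeatures:
    - feature: rule.matched
      matchExpressions:
        nic-1234: {op: Exists}
```

The intermediate variables stay internal to rule processing, so only the final label is advertised on the node.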
The .matchFeatures
field specifies a feature matcher, consisting of a list of\nfeature matcher terms. It implements a logical AND over the terms i.e. all\nof them must match for the rule to trigger.
matchFeatures:\n  - feature: <feature-name>\n    matchExpressions:\n      <key>:\n        op: <op>\n        value:\n          - <value-1>\n          - ...\n    matchName:\n      op: <op>\n      value:\n        - <value-1>\n        - ...\n
The .matchFeatures[].feature
field specifies the feature to evaluate.
\n\n\nNOTE: If both
\nmatchExpressions
and\nmatchName
are specified, they both must match.
The .matchFeatures[].matchExpressions
field is used to match against the\nvalue(s) of a feature. The matchExpressions
field consists of a set of\nexpressions, each of which is evaluated against all elements of the specified\nfeature.
matchExpressions:\n  <key>:\n    op: <op>\n    value:\n      - <value-1>\n      - ...\n
In each MatchExpression the key
specifies the name of the feature element\n(flag and attribute features) or the name of the attribute (instance\nfeatures) to look for. The behavior of MatchExpression depends on the\nfeature type:
<key>
The op
field specifies the operator to apply. Valid values are described\nbelow.
Operator | \nNumber of values | \nMatches when | \n
---|---|---|
In | \n 1 or greater | \nInput is equal to one of the values | \n
NotIn | \n 1 or greater | \nInput is not equal to any of the values | \n
InRegexp | \n 1 or greater | \nValues of the MatchExpression are treated as regexps and input matches one or more of them | \n
Exists | \n 0 | \nThe key exists | \n
DoesNotExist | \n 0 | \nThe key does not exist | \n
Gt | \n 1 | \nInput is greater than the value. Both the input and value must be integer numbers. | \n
Lt | \n 1 | \nInput is less than the value. Both the input and value must be integer numbers. | \n
GtLt | \n 2 | \nInput is between two values. Both the input and value must be integer numbers. | \n
IsTrue | \n 0 | \nInput is equal to “true” | \n
IsFalse | \n 0 | \nInput is equal to “false” | \n
The value
field of MatchExpression is a list of string arguments to the\noperator.
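The fragment below sketches how the number of values varies by operator, using features from the table of available features. The concrete bounds, regular expressions and vendor string are illustrative assumptions, not recommended settings.

```yaml
matchFeatures:
  - feature: kernel.version
    matchExpressions:
      # GtLt takes exactly two integer values (lower and upper bound).
      major: {op: GtLt, value: ["4", "7"]}
      # InRegexp takes one or more regexps; matching any one of them is enough.
      full: {op: InRegexp, value: ["-rt", "-realtime"]}
  - feature: cpu.model
    matchExpressions:
      # NotIn takes one or more values; "AMD" is an illustrative vendor string.
      vendor_id: {op: NotIn, value: ["AMD"]}
  - feature: kernel.selinux
    matchExpressions:
      # IsFalse takes no values.
      enabled: {op: IsFalse}
```

Note that all values are written as strings, including the integer bounds.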
The .matchFeatures[].matchName
field is used to match against the\nname(s) of a feature (whereas the matchExpressions
field\nmatches against the value(s)). The matchName
field consists of a single\nexpression which is evaluated against the name of each element of the specified\nfeature.
matchName:\n  op: <op>\n  value:\n    - <value-1>\n    - ...\n
The behavior of matchName
depends on the feature type:
The op
field specifies the operator to apply. Same operators as for\nmatchExpressions
above are available.
Operator | \nNumber of values | \nMatches | \n
---|---|---|
In | \n 1 or greater | \nAll names that are equal to one of the values | \n
NotIn | \n 1 or greater | \nAll names that are not equal to any of the values | \n
InRegexp | \n 1 or greater | \nAll names that match any of the values (treated as regexps) | \n
Exists | \n 0 | \nAll elements | \n
Other operators are not practical with matchName
(DoesNotExist
never\nmatches; Gt
,Lt
and GtLt
are only usable if feature names are integers;\nIsTrue
and IsFalse
are only usable if the feature name is true
or\nfalse
).
The value
field is a list of string arguments to the operator.
An example:
\n\n  matchFeatures:\n    - feature: cpu.cpuid\n      matchName: {op: InRegexp, value: [\"^AVX\"]}\n
The snippet above would match if any CPUID feature starting with AVX is present\n(e.g. AVX1 or AVX2 or AVX512F etc).
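A second sketch, this time with the In operator on a different flag feature: it is expected to match if at least one of the listed kernel modules is loaded. The module names are only examples.

```yaml
matchFeatures:
  - feature: kernel.loadedmodule
    # Selects elements whose name equals one of the listed values.
    matchName: {op: In, value: ["e1000", "e1000e"]}
```

Compared with writing one matchExpressions key per module, which would require all of them to be present, this form expresses "any of these names" in a single term.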
\n\nThe .matchAny
field is a list of matchFeatures
\nmatchers. A logical OR is applied over the matchers, i.e. at least one of them\nmust match for the rule to trigger.
Consider the following example:
\n\n  matchAny:\n    - matchFeatures:\n        - feature: kernel.loadedmodule\n          matchExpressions:\n            kmod-1: {op: Exists}\n        - feature: pci.device\n          matchExpressions:\n            vendor: {op: In, value: [\"0eee\"]}\n            class: {op: In, value: [\"0200\"]}\n    - matchFeatures:\n        - feature: kernel.loadedmodule\n          matchExpressions:\n            kmod-2: {op: Exists}\n        - feature: pci.device\n          matchExpressions:\n            vendor: {op: In, value: [\"0fff\"]}\n            class: {op: In, value: [\"0200\"]}\n
This matches if kernel module kmod-1 is loaded and a network controller from\nvendor 0eee is present, OR, if kernel module kmod-2 has been loaded and a\nnetwork controller from vendor 0fff is present (OR both of these conditions are\ntrue).
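The sketch below places matchAny next to a top-level matchFeatures in a complete NodeFeatureRule object. The object name, rule name and vendor.io/ label are hypothetical. It assumes the rule triggers only when the top-level matcher matches and at least one matchAny entry matches.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: match-any-example              # hypothetical object name
spec:
  rules:
    - name: "x86 node with either driver"
      labels:
        vendor.io/supported-nic: "true"
      # Must match in every case (logical AND over its terms).
      matchFeatures:
        - feature: kernel.config
          matchExpressions:
            X86: {op: In, value: ["y"]}
      # At least one of these alternatives must match (logical OR).
      matchAny:
        - matchFeatures:
            - feature: kernel.loadedmodule
              matchExpressions:
                kmod-1: {op: Exists}
        - matchFeatures:
            - feature: kernel.loadedmodule
              matchExpressions:
                kmod-2: {op: Exists}
```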
\n\nThe following features are available for matching:
\n\nFeature | \nFeature type | \nElements | \nValue type | \nDescription | \n
---|---|---|---|---|
cpu.cpuid | \n flag | \n\n | \n | Supported CPU capabilities | \n
\n | \n | <cpuid-flag> | \n \n | CPUID flag is present | \n
cpu.cstate | \n attribute | \n\n | \n | Status of cstates in the intel_idle cpuidle driver | \n
\n | \n | enabled | \n bool | \n‘true’ if cstates are set, otherwise ‘false’. Does not exist if intel_idle driver is not active. | \n
cpu.model | \n attribute | \n\n | \n | CPU model related attributes | \n
\n | \n | family | \n int | \nCPU family | \n
\n | \n | vendor_id | \n string | \nCPU vendor ID | \n
\n | \n | id | \n int | \nCPU model ID | \n
cpu.pstate | \n attribute | \n\n | \n | State of the Intel pstate driver. Does not exist if the driver is not enabled. | \n
\n | \n | status | \n string | \nStatus of the driver, possible values are ‘active’ and ‘passive’ | \n
\n | \n | turbo | \n bool | \n‘true’ if turbo frequencies are enabled, otherwise ‘false’ | \n
\n | \n | scaling | \n string | \nActive scaling_governor, possible values are ‘powersave’ or ‘performance’. | \n
cpu.rdt | \n attribute | \n\n | \n | Intel RDT capabilities supported by the system | \n
\n | \n | <rdt-flag> | \n \n | RDT capability is supported, see RDT flags for details | \n
\n | \n | RDTL3CA_NUM_CLOSID | \n int | \nThe number of available CLOSID (Class of service ID) for Intel L3 Cache Allocation Technology | \n
cpu.security | \n attribute | \n\n | \n | Features related to security and trusted execution environments | \n
\n | \n | sgx.enabled | \n bool | \ntrue if Intel SGX (Software Guard Extensions) has been enabled, otherwise does not exist | \n
\n | \n | sgx.epc | \n int | \nThe total amount of Intel SGX Encrypted Page Cache memory in bytes. It’s only present if sgx.enabled is true . | \n
\n | \n | se.enabled | \n bool | \ntrue if IBM Secure Execution for Linux is available and has been enabled, otherwise does not exist | \n
\n | \n | tdx.enabled | \n bool | \ntrue if Intel TDX (Trusted Domain Extensions) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | tdx.total_keys | \n int | \nThe total amount of keys an Intel TDX (Trusted Domain Extensions) host can provide. It’s only present if tdx.enabled is true . | \n
\n | \n | tdx.protected | \n bool | \ntrue if a guest VM was started using Intel TDX (Trusted Domain Extensions), otherwise does not exist. | \n
\n | \n | sev.enabled | \n bool | \ntrue if AMD SEV (Secure Encrypted Virtualization) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | sev.es.enabled | \n bool | \ntrue if AMD SEV-ES (Encrypted State supported) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | sev.snp.enabled | \n bool | \ntrue if AMD SEV-SNP (Secure Nested Paging supported) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | sev.asids | \n int | \nThe total amount of AMD SEV address-space identifiers (ASIDs), based on the /sys/fs/cgroup/misc.capacity information. | \n
\n | \n | sev.encrypted_state_ids | \n int | \nThe total amount of AMD SEV-ES and SEV-SNP supported, based on the /sys/fs/cgroup/misc.capacity information. | \n
cpu.sst | \n attribute | \n\n | \n | Intel SST (Speed Select Technology) capabilities | \n
\n | \n | bf.enabled | \n bool | \ntrue if Intel SST-BF (Intel Speed Select Technology - Base frequency) has been enabled, otherwise does not exist | \n
cpu.topology | \n attribute | \n\n | \n | CPU topology related features | \n
\n | \n | hardware_multithreading | \n bool | \nHardware multithreading, such as Intel HTT, is enabled | \n
\n | \n | socket_count | \n int | \nNumber of CPU Sockets | \n
cpu.coprocessor | \n attribute | \n\n | \n | CPU Coprocessor related features | \n
\n | \n | nx_gzip | \n bool | \nNest Accelerator GZIP support is enabled | \n
kernel.config | \n attribute | \n\n | \n | Kernel configuration options | \n
\n | \n | <config-flag> | \n string | \nValue of the kconfig option | \n
kernel.loadedmodule | \n flag | \n\n | \n | Kernel modules loaded on the node as reported by /proc/modules | \n
kernel.enabledmodule | \n flag | \n\n | \n | Kernel modules loaded on the node and available as built-ins as reported by modules.builtin | \n
\n | \n | mod-name | \n \n | Kernel module <mod-name> is loaded | \n
kernel.selinux | \n attribute | \n\n | \n | Kernel SELinux related features | \n
\n | \n | enabled | \n bool | \ntrue if SELinux has been enabled and is in enforcing mode, otherwise false | \n
kernel.version | \n attribute | \n\n | \n | Kernel version information | \n
\n | \n | full | \n string | \nFull kernel version (e.g. ‘4.5.6-7-g123abcde’) | \n
\n | \n | major | \n int | \nFirst component of the kernel version (e.g. ‘4’) | \n
\n | \n | minor | \n int | \nSecond component of the kernel version (e.g. ‘5’) | \n
\n | \n | revision | \n int | \nThird component of the kernel version (e.g. ‘6’) | \n
local.label | \n attribute | \n\n | \n | Labels from feature files and hooks, i.e. labels from the local feature source | \n
local.feature | \n attribute | \n\n | \n | Features from feature files and hooks, i.e. features from the local feature source | \n
\n | \n | <label-name> | \n string | \nLabel <label-name> created by the local feature source, value equals the value of the label | \n
memory.nv | \n instance | \n\n | \n | NVDIMM devices present in the system | \n
\n | \n | <sysfs-attribute> | \n string | \nValue of the sysfs device attribute, available attributes: devtype , mode | \n
memory.numa | \n attribute | \n\n | \n | NUMA nodes | \n
\n | \n | is_numa | \n bool | \ntrue if NUMA architecture, false otherwise | \n
\n | \n | node_count | \n int | \nNumber of NUMA nodes | \n
network.device | \n instance | \n\n | \n | Physical (non-virtual) network interfaces present in the system | \n
\n | \n | name | \n string | \nName of the network interface | \n
\n | \n | <sysfs-attribute> | \n string | \nSysfs network interface attribute, available attributes: operstate , speed , sriov_numvfs , sriov_totalvfs | \n
network.virtual | \n instance | \n\n | \n | Virtual network interfaces present in the system | \n
\n | \n | name | \n string | \nName of the network interface | \n
\n | \n | <sysfs-attribute> | \n string | \nSysfs network interface attribute, available attributes: operstate , speed | \n
pci.device | \n instance | \n\n | \n | PCI devices present in the system | \n
\n | \n | <sysfs-attribute> | \n string | \nValue of the sysfs device attribute, available attributes: class , vendor , device , subsystem_vendor , subsystem_device , sriov_totalvfs , iommu_group/type , iommu/intel-iommu/version | \n
storage.block | \n instance | \n\n | \n | Block storage devices present in the system | \n
\n | \n | name | \n string | \nName of the block device | \n
\n | \n | <sysfs-attribute> | \n string | \nSysfs block device attribute, available attributes: dax , rotational , nr_zones , zoned | \n
system.osrelease | \n attribute | \n\n | \n | System identification data from /etc/os-release | \n
\n | \n | <parameter> | \n string | \nOne parameter from /etc/os-release | \n
system.name | \n attribute | \n\n | \n | System name information | \n
\n | \n | nodename | \n string | \nName of the kubernetes node object | \n
usb.device | \n instance | \n\n | \n | USB devices present in the system | \n
\n | \n | <sysfs-attribute> | \n string | \nValue of the sysfs device attribute, available attributes: class , vendor , device , serial | \n
rule.matched | \n attribute | \n\n | \n | Previously matched rules | \n
\n | \n | <label-or-var> | \n string | \nLabel or var from a preceding rule that matched | \n
Flag | \nDescription | \n
---|---|
RDTMON | \nIntel RDT Monitoring Technology | \n
RDTCMT | \nIntel Cache Monitoring (CMT) | \n
RDTMBM | \nIntel Memory Bandwidth Monitoring (MBM) | \n
RDTL3CA | \nIntel L3 Cache Allocation Technology | \n
RDTL2CA | \nIntel L2 Cache Allocation Technology | \n
RDTMBA | \nIntel Memory Bandwidth Allocation (MBA) Technology | \n
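As an illustration of matching on these flags, the sketch below combines an RDT flag with the RDTL3CA_NUM_CLOSID attribute from the features table. The rule name, the vendor.io/ label and the threshold are arbitrary assumptions.

```yaml
- name: "rdt l3 cache allocation"
  labels:
    vendor.io/rdt-l3ca: "true"
  matchFeatures:
    - feature: cpu.rdt
      matchExpressions:
        # The capability flag must be present.
        RDTL3CA: {op: Exists}
        # Integer comparison; the threshold of 7 is arbitrary.
        RDTL3CA_NUM_CLOSID: {op: Gt, value: ["7"]}
```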
Rules support template-based creation of labels and vars with the\n.labelsTemplate
and .varsTemplate
fields. These make it possible to\ndynamically generate labels and vars based on the features that matched.
The template must expand into a simple format with <key>=<value>
pairs\nseparated by newline.
Consider the following example:\n
\n\n  labelsTemplate: |\n    {{ range .pci.device }}vendor-{{ .class }}-{{ .device }}.present=true\n    {{ end }}\n  matchFeatures:\n    - feature: pci.device\n      matchExpressions:\n        class: {op: InRegexp, value: [\"^02\"]}\n        vendor: {op: In, value: [\"0fff\"]}\n
The rule above will create individual labels\nfeature.node.kubernetes.io/vendor-<class-id>-<device-id>.present=true
for\neach network controller device (device class starting with 02) from vendor\n0fff.
All the matched features of each feature matcher term under matchFeatures
\nfields are available for the template engine. Matched features can be\nreferenced with {{ .<feature-name> }}
in the template, and\nthe available data could be described in yaml as follows:
.\n  <key-feature>:\n    - Name: <matched-key>\n    - ...\n\n  <value-feature>:\n    - Name: <matched-key>\n      Value: <matched-value>\n    - ...\n\n  <instance-feature>:\n    - <attribute-1-name>: <attribute-1-value>\n      <attribute-2-name>: <attribute-2-value>\n      ...\n    - ...\n
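That is, the per-feature data is a list of objects whose data fields depend on\nthe type of the feature: flag (key) features provide only Name, attribute (value) features provide Name and Value, and instance features provide all attributes of the matched instance.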
\n\nA simple example of a template utilizing name and value from an attribute\nfeature:\n
\n\n  labelsTemplate: |\n    {{ range .system.osrelease }}system-{{ .Name }}={{ .Value }}\n    {{ end }}\n  matchFeatures:\n    - feature: system.osrelease\n      matchExpressions:\n        ID: {op: Exists}\n        VERSION_ID.major: {op: Exists}\n
\n\n\nNOTE: If both
\nmatchExpressions
and matchName
for a feature matcher\nterm (see matchFeatures
) are specified, the list of\nmatched features (for the template engine) is the union of both.\n\nNOTE: If matchAny is specified, the template is executed\nseparately against each individual matchFeatures
field and the final set of\nlabels will be a superset of all these separate template expansions. E.g.\nconsider the following:
- name: <name>\n  labelsTemplate: <template>\n  matchFeatures: <matcher#1>\n  matchAny:\n    - matchFeatures: <matcher#2>\n    - matchFeatures: <matcher#3>\n
In the example above (assuming the overall result is a match) the template\nwould be executed on matcher#1 as well as on matcher#2 and/or matcher#3\n(depending on whether both or only one of them match). All the labels from\nthese separate expansions would be created, i.e. the end result would be a\nunion of all the individual expansions.
\n\nRule templates use the Golang text/template\npackage and all its built-in functionality (e.g. pipelines and functions) can\nbe used. An example template making use of the built-in len
function,\nadvertising the number of PCI network controllers from a specific vendor:\n
labelsTemplate: |\n  num-intel-network-controllers={{ .pci.device | len }}\nmatchFeatures:\n  - feature: pci.device\n    matchExpressions:\n      vendor: {op: In, value: [\"8086\"]}\n      class: {op: In, value: [\"0200\"]}\n\n
Imaginative template pipelines are possible, but care must be taken to\nproduce understandable and maintainable rule sets.
\n\nRules support referencing the output of preceding rules. This enables\nsophisticated scenarios where multiple rules are combined\nto form more complex heuristics than a single rule can provide. The labels and\nvars created by the execution of preceding rules are available as a special\nrule.matched
feature.
Consider the following configuration:
\n\n  - name: \"my kernel label rule\"\n    labels:\n      kernel-feature: \"true\"\n    matchFeatures:\n      - feature: kernel.version\n        matchExpressions:\n          major: {op: Gt, value: [\"4\"]}\n\n  - name: \"my var rule\"\n    vars:\n      nolabel-feature: \"true\"\n    matchFeatures:\n      - feature: cpu.cpuid\n        matchExpressions:\n          AVX512F: {op: Exists}\n      - feature: pci.device\n        matchExpressions:\n          vendor: {op: In, value: [\"0fff\"]}\n          device: {op: In, value: [\"1234\", \"1235\"]}\n\n  - name: \"my high level feature rule\"\n    labels:\n      high-level-feature: \"true\"\n    matchFeatures:\n      - feature: rule.matched\n        matchExpressions:\n          kernel-feature: {op: IsTrue}\n          nolabel-feature: {op: IsTrue}\n
The feature.node.kubernetes.io/high-level-feature = true
label depends on the\ntwo previous rules.
Note that when referencing rules across multiple\nNodeFeatureRule
objects attention must be\npaid to the ordering. NodeFeatureRule
objects are processed in alphabetical\norder (based on their .metadata.name
).
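One way to make this ordering explicit is to give the objects names that sort in the intended order. The sketch below uses hypothetical numeric name prefixes, rule names, a var and a vendor.io/ label; the second object refers to a var created by the first through rule.matched.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: 00-base-rules                  # sorts first, so it is processed first
spec:
  rules:
    - name: "base kernel rule"
      vars:
        recent-kernel: "true"
      matchFeatures:
        - feature: kernel.version
          matchExpressions:
            major: {op: Gt, value: ["4"]}
---
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: 10-derived-rules               # sorts after 00-base-rules
spec:
  rules:
    - name: "derived rule"
      labels:
        vendor.io/recent-kernel-node: "true"
      matchFeatures:
        - feature: rule.matched
          matchExpressions:
            recent-kernel: {op: IsTrue}
```

If the names were swapped, the derived rule would be evaluated before the var exists and would not be expected to match.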
Some more configuration examples below.
\n\nMatch certain CPUID features:
\n\n  - name: \"example cpuid rule\"\n    labels:\n      my-special-cpu-feature: \"true\"\n    matchFeatures:\n      - feature: cpu.cpuid\n        matchExpressions:\n          AESNI: {op: Exists}\n          AVX: {op: Exists}\n
Require a certain loaded kernel module and OS version:
\n\n  - name: \"my multi-feature rule\"\n    labels:\n      my-special-multi-feature: \"true\"\n    matchFeatures:\n      - feature: kernel.loadedmodule\n        matchExpressions:\n          e1000: {op: Exists}\n      - feature: system.osrelease\n        matchExpressions:\n          NAME: {op: InRegexp, value: [\"^openSUSE\"]}\n          VERSION_ID.major: {op: Gt, value: [\"14\"]}\n
Require a loaded kernel module and two specific PCI devices (both of which\nmust be present):
\n\n  - name: \"my multi-device rule\"\n    labels:\n      my-multi-device-feature: \"true\"\n    matchFeatures:\n      - feature: kernel.loadedmodule\n        matchExpressions:\n          my-driver-module: {op: Exists}\n      - feature: pci.device\n        matchExpressions:\n          vendor: {op: In, value: [\"0fff\"]}\n          device: {op: In, value: [\"1234\"]}\n      - feature: pci.device\n        matchExpressions:\n          vendor: {op: In, value: [\"0fff\"]}\n          device: {op: In, value: [\"abcd\"]}\n
Node Feature Discovery follows semantic versioning where\nthe version number consists of three components, i.e. MAJOR.MINOR.PATCH.
\n\nThe most recent two minor releases (or release branches) of Node Feature\nDiscovery are supported. That is, with X being the latest release, X and X-1\nare supported and X-1 reaches end-of-life when X+1 is released.
\n\nBuilt-in feature labels and\nfeatures are supported\nfor 2 releases after being deprecated, at minimum. That is, if a feature label\nis deprecated in version X, it will be supported in X+1 and X+2 and\nmay be dropped in X+3.
\n\nCommand-line flags and configuration file options are supported for 1 more\nrelease after being deprecated, at minimum. That is, if option/flag is\ndeprecated in version X, it will be supported in X+1 and may be removed\nin X+2.
\n\nThe same policy (support for 1 release after deprecation) also applies to Helm\nchart parameters.
\n","dir":"/reference/","name":"versions.md","path":"reference/versions.md","url":"/reference/versions.html"},{"title":"Examples and demos","layout":"default","sort":9,"content":"This page contains usage examples and demos.
\n\nA demo on the benefits of using node feature discovery can be found in the\nsource code repository under\ndemo/.
\n","dir":"/usage/","name":"examples-and-demos.md","path":"usage/examples-and-demos.md","url":"/usage/examples-and-demos.html"},{"title":"Kubectl plugin","layout":"default","sort":10,"content":"\n\n\nDeveloper Preview This feature is currently in developer preview and\nsubject to change. It is not recommended to use it in production\nenvironments.
\n
The kubectl
plugin kubectl nfd
can be used to validate/dryrun and test\nNodeFeatureRule objects. It can be installed with the following command:
git clone https://github.com/kubernetes-sigs/node-feature-discovery\ncd node-feature-discovery\nmake build-kubectl-nfd\nKUBECTL_PATH=/usr/local/bin/\nmv ./bin/kubectl-nfd ${KUBECTL_PATH}\n
The plugin can be used to validate a NodeFeatureRule object:
\n\nkubectl nfd validate -f <nodefeaturerule.yaml>\n
The plugin can be used to test a NodeFeatureRule object against a node:
\n\nkubectl nfd test -f <nodefeaturerule.yaml> -n <node-name>\n
The plugin can be used to DryRun a NodeFeatureRule object against a NodeFeature\nfile:
\n\nkubectl get -n node-feature-discovery nodefeature <nodename> -o yaml > <nodefeature.yaml>\nkubectl nfd dryrun -f <nodefeaturerule.yaml> -n <nodefeature.yaml>\n
Or you can use the example NodeFeature file (a minimal NodeFeature file):
\n\n$ kubectl nfd dryrun -f examples/nodefeaturerule.yaml -n examples/nodefeature.yaml\nEvaluating NodeFeatureRule \"examples/nodefeaturerule.yaml\" against NodeFeature \"examples/nodefeature.yaml\"\nProcessing rule: my sample rule\n*** Labels ***\nvendor.io/my-sample-feature=true\nNodeFeatureRule \"examples/nodefeaturerule.yaml\" is valid for NodeFeature \"examples/nodefeature.yaml\"\n
NFD offers two variants of the container image. Released container images are\navailable for x86_64 and Arm64 architectures.
\n\nThe default is a minimal image based on\nscratch\nand only supports running statically linked binaries.
\n\nFor backwards compatibility a container image tag with suffix -minimal
\n(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.7-minimal
) is provided.
This image is based on debian:bookworm-slim\nand contains a full Linux system for running shell-based nfd-worker hooks and\ndoing live debugging and diagnosis of the NFD images.
\n\nThe container image tag has suffix -full
\n(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.7-full
).
Welcome to Node Feature Discovery – a Kubernetes add-on for detecting hardware\nfeatures and system configuration!
\n\nContinue to:
\n\nIntroduction for more details on the\nproject.
\nQuick start for quick step-by-step\ninstructions on how to get NFD running on your cluster.
\n$ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7\n namespace/node-feature-discovery created\n serviceaccount/nfd-master created\n clusterrole.rbac.authorization.k8s.io/nfd-master created\n clusterrolebinding.rbac.authorization.k8s.io/nfd-master created\n configmap/nfd-worker-conf created\n service/nfd-master created\n deployment.apps/nfd-master created\n daemonset.apps/nfd-worker created\n\n$ kubectl -n node-feature-discovery get all\n NAME READY STATUS RESTARTS AGE\n pod/nfd-master-555458dbbc-sxg6w 1/1 Running 0 56s\n pod/nfd-worker-mjg9f 1/1 Running 0 17s\n...\n\n$ kubectl get nodes -o json | jq '.items[].metadata.labels'\n {\n \"kubernetes.io/arch\": \"amd64\",\n \"kubernetes.io/os\": \"linux\",\n \"feature.node.kubernetes.io/cpu-cpuid.ADX\": \"true\",\n \"feature.node.kubernetes.io/cpu-cpuid.AESNI\": \"true\",\n...\n\n
This software enables node feature discovery for Kubernetes. It detects\nhardware features available on each node in a Kubernetes cluster, and\nadvertises those features using node labels and optionally node extended\nresources, annotations and node taints. Node Feature Discovery is compatible\nwith any recent version of Kubernetes (v1.21+).
\n\nNFD consists of four software components:
\n\nNFD-Master is the daemon responsible for communication towards the Kubernetes\nAPI. That is, it receives labeling requests from the worker and modifies node\nobjects accordingly.
\n\nNFD-Worker is a daemon responsible for feature detection. It then communicates\nthe information to nfd-master which does the actual node labeling. One\ninstance of nfd-worker is supposed to be running on each node of the cluster.
\n\nNFD-Topology-Updater is a daemon responsible for examining allocated\nresources on a worker node to account for resources available to be allocated\nto new pods on a per-zone basis (where a zone can be a NUMA node). It then\ncreates or updates a\nNodeResourceTopology custom\nresource object specific to this node. One instance of nfd-topology-updater is\nsupposed to be running on each node of the cluster.
\n\nNFD-GC is a daemon responsible for cleaning obsolete\nNodeFeature and\nNodeResourceTopology objects.
\n\nOne instance of nfd-gc is supposed to be running in the cluster.
\n\nFeature discovery is divided into domain-specific feature sources:
\n\nEach feature source is responsible for detecting a set of features which, in\nturn, are turned into node feature labels. Feature labels are prefixed with\nfeature.node.kubernetes.io/
and also contain the name of the feature source.\nNon-standard user-specific feature labels can be created with the local and\ncustom feature sources.
An overview of the default feature labels:
\n\n{\n  \"feature.node.kubernetes.io/cpu-<feature-name>\": \"true\",\n  \"feature.node.kubernetes.io/custom-<feature-name>\": \"true\",\n  \"feature.node.kubernetes.io/kernel-<feature name>\": \"<feature value>\",\n  \"feature.node.kubernetes.io/memory-<feature-name>\": \"true\",\n  \"feature.node.kubernetes.io/network-<feature-name>\": \"true\",\n  \"feature.node.kubernetes.io/pci-<device label>.present\": \"true\",\n  \"feature.node.kubernetes.io/storage-<feature-name>\": \"true\",\n  \"feature.node.kubernetes.io/system-<feature name>\": \"<feature value>\",\n  \"feature.node.kubernetes.io/usb-<device label>.present\": \"<feature value>\",\n  \"feature.node.kubernetes.io/<file name>-<feature name>\": \"<feature value>\"\n}\n
NFD also annotates nodes it is running on:
\n\nAnnotation | \nDescription | \n
---|---|
[<instance>.]nfd.node.kubernetes.io/feature-labels | \nComma-separated list of node labels managed by NFD. NFD uses this internally, so it must not be edited by users. | \n
[<instance>.]nfd.node.kubernetes.io/feature-annotations | \nComma-separated list of node annotations managed by NFD. NFD uses this internally, so it must not be edited by users. | \n
[<instance>.]nfd.node.kubernetes.io/extended-resources | \nComma-separated list of node extended resources managed by NFD. NFD uses this internally, so it must not be edited by users. | \n
[<instance>.]nfd.node.kubernetes.io/taints | \nComma-separated list of node taints managed by NFD. NFD uses this internally, so it must not be edited by users. | \n
\n\n\nNOTE: the
\n-instance
\ncommand line flag affects the annotation names
Inapplicable annotations are not created; for example,\nnfd.node.kubernetes.io/extended-resources
is only placed if some extended\nresources were created by NFD.
NFD makes use of some Kubernetes Custom Resources.
\n\nNodeFeatures\nare used for representing node features and requesting node labels to be\ngenerated.
\n\nNFD-Master uses NodeFeatureRules\nfor custom labeling of nodes.
\n\nNFD-Topology-Updater creates\nNodeResourceTopology objects\nthat describe the hardware topology of node resources.
\n","dir":"/get-started/","name":"introduction.md","path":"get-started/introduction.md","url":"/get-started/introduction.html"},{"title":"Master cmdline reference","layout":"default","sort":1,"content":"To quickly view available command line flags execute nfd-master -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 nfd-master -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -prune
flag is a sub-command like option for cleaning up the cluster. It\ncauses nfd-master to remove all NFD related labels, annotations and extended\nresources from all Node objects of the cluster and exit.
The -port
flag specifies the TCP port on which nfd-master listens for incoming requests.
Default: 8080
\n\nExample:
\n\nnfd-master -port=443\n
The -metrics
flag specifies the port on which to expose\nPrometheus metrics. Setting this to 0 disables the\nmetrics server on nfd-master.
Default: 8081
\n\nExample:
\n\nnfd-master -metrics=12345\n
The -instance
flag makes it possible to run multiple NFD deployments in\nparallel. In practice, it separates the node annotations between deployments so\nthat each of them can store metadata independently. The instance name must\nstart and end with an alphanumeric character and may only contain alphanumeric\ncharacters, -
, _
or .
.
Default: empty
\n\nExample:
\n\nnfd-master -instance=network\n
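The naming rule for instance names can be checked before deploying. The sketch below is a hypothetical pre-flight helper (`valid_instance_name` is not an NFD command) that encodes the rule as stated above using a POSIX extended regular expression; nfd-master's own validation may differ in detail.

```shell
# Hypothetical pre-flight check (not part of NFD): the name must start and end
# with an alphanumeric character and may otherwise contain only alphanumerics,
# '-', '_' or '.'. An empty name is rejected here; leave the flag unset instead.
valid_instance_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$'
}

for name in network gpu-1 -bad bad.; do
    if valid_instance_name "$name"; then
        echo "$name: ok"
    else
        echo "$name: rejected"
    fi
done
```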
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -ca-file
is one of the three flags (together with -cert-file
and\n-key-file
) controlling master-worker mutual TLS authentication on the\nnfd-master side. This flag specifies the TLS root certificate that is used for\nauthenticating incoming connections. NFD-Worker side needs to have matching key\nand cert files configured for the incoming requests to be accepted.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and-key-file
Example:
\n\nnfd-master -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/master.crt -key-file=/opt/nfd/master.key\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -cert-file
is one of the three flags (together with -ca-file
and\n-key-file
) controlling master-worker mutual TLS authentication on the\nnfd-master side. This flag specifies the TLS certificate presented for\nauthenticating outgoing traffic towards nfd-worker.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-ca-file
and-key-file
Example:
\n\nnfd-master -cert-file=/opt/nfd/master.crt -key-file=/opt/nfd/master.key -ca-file=/opt/nfd/ca.crt\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -key-file
is one of the three flags (together with -ca-file
and\n-cert-file
) controlling master-worker mutual TLS authentication on the\nnfd-master side. This flag specifies the private key corresponding to the given\ncertificate file (-cert-file
) that is used for authenticating outgoing\ntraffic.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and-ca-file
Example:
\n\nnfd-master -key-file=/opt/nfd/master.key -cert-file=/opt/nfd/master.crt -ca-file=/opt/nfd/ca.crt\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -verify-node-name
flag controls the NodeName based authorization of\nincoming requests and only has effect when mTLS authentication has been enabled\n(with -ca-file
, -cert-file
and -key-file
). If enabled, the worker node\nname of the incoming request must match the CN or a SAN in its TLS certificate. Thus,\nworkers are only able to label the node they are running on (or the node whose\ncertificate they present).
Node Name based authorization is disabled by default.
\n\nDefault: false
\n\nExample:
\n\nnfd-master -verify-node-name -ca-file=/opt/nfd/ca.crt \\\n -cert-file=/opt/nfd/master.crt -key-file=/opt/nfd/master.key\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -enable-nodefeature-api
flag enables/disables the\nNodeFeature CRD API for receiving\nfeature requests. This will also automatically disable/enable the gRPC\ninterface.
Default: true
\n\nExample:
\n\nnfd-master -enable-nodefeature-api=false\n
The -enable-leader-election
flag enables leader election for NFD-Master.\nIt is advised to turn on this flag when running more than one instance of\nNFD-Master.
This flag takes effect only when combined with -enable-nodefeature-api
flag.
Default: false
\n\nnfd-master -enable-nodefeature-api -enable-leader-election\n
The -enable-taints
flag enables/disables node tainting feature of NFD.
Default: false
\n\nExample:
\n\nnfd-master -enable-taints=true\n
The -no-publish
flag disables updates to the Node objects in the Kubernetes\nAPI server, making it a “dry-run” flag for nfd-master. No Labels, Annotations or\nExtendedResources of nodes are updated.
Default: false
\n\nExample:
\n\nnfd-master -no-publish\n
The -crd-controller
flag specifies whether the NFD CRD API controller is\nenabled or not. The controller is responsible for processing\nNodeFeature and\nNodeFeatureRule objects.
Default: true
\n\nExample:
\n\nnfd-master -crd-controller=false\n
DEPRECATED: use -crd-controller
instead.
The -label-whitelist
specifies a regular expression for filtering feature\nlabels based on their name. Each label must match against the given regular\nexpression or it will not be published.
\n\n\nNOTE: The regular expression is only matched against the “basename” part\nof the label, i.e. the part of the name after ‘/’. The label namespace is\nomitted.
\n
Default: empty
\n\nExample:
\n\nnfd-master -label-whitelist='.*cpuid\\.'\n
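To make the "basename" rule concrete, here is a small shell sketch of the filtering described above. It is only an approximation: `label_passes_whitelist` is a hypothetical helper, it uses `grep -E` (POSIX extended regular expressions) rather than the regular expression engine used by nfd-master, and it assumes the expression may match anywhere in the basename.

```shell
# Hypothetical illustration (not NFD code): would a label survive the whitelist?
label_passes_whitelist() {
    # $1: full label name, $2: expression as it would be given to -label-whitelist
    base=${1##*/}     # drop the namespace, keep the part after '/'
    printf '%s\n' "$base" | grep -Eq -- "$2"
}

label_passes_whitelist feature.node.kubernetes.io/cpu-cpuid.AVX '.*cpuid\.' &&
    echo "cpu-cpuid.AVX: published"
label_passes_whitelist feature.node.kubernetes.io/kernel-version.major '.*cpuid\.' ||
    echo "kernel-version.major: filtered out"
```

Because the namespace is stripped first, an expression such as `^feature` would not match any of the built-in labels in this sketch.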
The -extra-label-ns
flag specifies a comma-separated list of allowed feature\nlabel namespaces. This option can be used to allow\nother vendor or application specific namespaces for custom labels from the\nlocal and custom feature sources, even though these labels were denied using\nthe deny-label-ns
flag.
The same namespace control and this flag apply to Extended Resources (created\nwith -resource-labels
), too.
Default: empty
\n\nExample:
\n\nnfd-master -extra-label-ns=vendor-1.com,vendor-2.io\n
The -deny-label-ns
flag specifies a comma-separated list of excluded\nlabel namespaces. By default, nfd-master allows creating labels in all\nnamespaces, excluding kubernetes.io
namespace and its sub-namespaces\n(i.e. *.kubernetes.io
). However, you should note that\nkubernetes.io
and its sub-namespaces are always denied.\nFor example, nfd-master -deny-label-ns=\"\"
would still disallow\nkubernetes.io
and *.kubernetes.io
.\nThis option can be used to exclude some vendor or application specific\nnamespaces.\nNote that the namespaces feature.node.kubernetes.io
and profile.node.kubernetes.io
\nand their sub-namespaces are always allowed and cannot be denied.
Default: empty
\n\nExample:
\n\nnfd-master -deny-label-ns=*.vendor.com,vendor-2.io\n
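The namespace rules above can be summarized in a short shell sketch. This is not NFD code: `label_ns_allowed` is a hypothetical helper, the effect of -extra-label-ns is not modelled, and it assumes that a wildcard entry such as `*.vendor.com` matches sub-namespaces only (not `vendor.com` itself).

```shell
# Hypothetical sketch (not NFD code) of the namespace rules described above.
# $1: label namespace, $2: value that would be passed to -deny-label-ns
label_ns_allowed() (
    ns=$1
    # These namespaces and their sub-namespaces are always allowed.
    case $ns in
        feature.node.kubernetes.io | *.feature.node.kubernetes.io) exit 0 ;;
        profile.node.kubernetes.io | *.profile.node.kubernetes.io) exit 0 ;;
    esac
    # kubernetes.io and its sub-namespaces are always denied.
    case $ns in
        kubernetes.io | *.kubernetes.io) exit 1 ;;
    esac
    set -f      # do not glob-expand "*.vendor.com" while splitting the list
    IFS=,
    for pattern in $2; do
        case $pattern in
            '*.'*)  # assumption: wildcard entries match sub-namespaces only
                case $ns in *"${pattern#?}") exit 1 ;; esac ;;
            *)
                [ "$ns" = "$pattern" ] && exit 1 ;;
        esac
    done
    exit 0
)

label_ns_allowed gpu.vendor.com '*.vendor.com,vendor-2.io' || echo "gpu.vendor.com: denied"
label_ns_allowed example.com '*.vendor.com,vendor-2.io' && echo "example.com: allowed"
```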
DEPRECATED: NodeFeatureRule\nshould be used for managing extended resources in NFD.
\n\nThe -resource-labels
flag specifies a comma-separated list of features to be\nadvertised as extended resources instead of labels. Features that have integer\nvalues can be published as Extended Resources by listing them in this flag.
Default: empty
\n\nExample:
\n\nnfd-master -resource-labels=vendor-1.com/feature-1,vendor-2.io/feature-2\n
The -config
flag specifies the path of the nfd-master configuration file to\nuse.
Default: /etc/kubernetes/node-feature-discovery/nfd-master.conf
\n\nExample:
\n\nnfd-master -config=/opt/nfd/master.conf\n
The -options
flag may be used to specify and override configuration file\noptions directly from the command line. The required format is the same as in\nthe config file, i.e. JSON or YAML. Configuration options specified via this\nflag will override those from the configuration file.
Default: empty
\n\nExample:
\n\nnfd-master -options='{\"noPublish\": true}'\n
The -nfd-api-parallelism
flag can be used to specify the maximum\nnumber of concurrent node updates.
It takes effect only when -enable-nodefeature-api
has been set.
Default: 10
\n\nExample:
\n\nnfd-master -nfd-api-parallelism=1\n
The following logging-related flags are inherited from the\nklog package.
\n\nIf true, adds the file directory to the header of the log messages.
\n\nDefault: false
\n\nLog to standard error as well as files.
\n\nDefault: false
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
\n\nLog to standard error instead of files
\n\nDefault: true
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
\n\nLogs at or above this threshold go to stderr.
\n\nDefault: 2
\n\nNumber for the log level verbosity.
\n\nDefault: 0
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n\nThe -resync-period
flag specifies the NFD API controller resync period.\nThe resync means nfd-master replaying all NodeFeature and NodeFeatureRule objects,\nthus effectively re-syncing all nodes in the cluster (i.e. ensuring labels, annotations,\nextended resources and taints are in place).\nOnly has effect when the NodeFeature\nCRD API has been enabled with -enable-nodefeature-api
.
Default: 1 hour.
\n\nExample:
\n\nnfd-master -resync-period=2h\n
Features are advertised as labels in the Kubernetes Node object.
\n\nLabel creation in nfd-worker is performed by a set of separate modules called\nlabel sources. The\ncore.labelSources
\nconfiguration option (or\n-label-sources
\nflag) of nfd-worker controls which sources to enable for label generation.
All built-in labels use the feature.node.kubernetes.io
label namespace and\nhave the following format.
feature.node.kubernetes.io/<feature> = <value>\n
\n\n\nNOTE: Consecutive runs of nfd-worker will update the labels on a given\nnode. If features are not discovered on a consecutive run, the corresponding\nlabel will be removed. This includes any restrictions placed on the\nconsecutive run, such as restricting discovered features with the\n
\n-label-whitelist
\nflag of nfd-master or\ncore.labelWhiteList
\noption of nfd-worker.
Feature name | \nValue | \nDescription | \n
---|---|---|
cpu-cpuid.<cpuid-flag> | \n true | \nCPU capability is supported. NOTE: the capability might be supported but not enabled. | \n
cpu-hardware_multithreading | \n true | \nHardware multithreading, such as Intel HTT, enabled (number of logical CPUs is greater than physical CPUs) | \n
cpu-coprocessor.nx_gzip | \n true | \nNest Accelerator for GZIP is supported (Power). | \n
cpu-power.sst_bf.enabled | \n true | \nIntel SST-BF (Intel Speed Select Technology - Base frequency) enabled | \n
cpu-pstate.status | \n string | \nThe status of the Intel pstate driver when in use and enabled, either ‘active’ or ‘passive’. | \n
cpu-pstate.turbo | \n bool | \nSet to ‘true’ if turbo frequencies are enabled in Intel pstate driver, set to ‘false’ if they have been disabled. | \n
cpu-pstate.scaling_governor | \n string | \nThe value of the Intel pstate scaling_governor when in use, either ‘powersave’ or ‘performance’. | \n
cpu-cstate.enabled | \n bool | \nSet to ‘true’ if cstates are set in the intel_idle driver, otherwise set to ‘false’. Unset if intel_idle cpuidle driver is not active. | \n
cpu-rdt.<rdt-flag> | \n true | \nDEPRECATED Intel RDT capability is supported. See RDT flags for details. | \n
cpu-security.sgx.enabled | \n true | \nSet to ‘true’ if Intel SGX is enabled in BIOS (based on a non-zero sum value of SGX EPC section sizes). | \n
cpu-security.se.enabled | \n true | \nSet to ‘true’ if IBM Secure Execution for Linux (IBM Z & LinuxONE) is available and enabled (requires /sys/firmware/uv/prot_virt_host facility) | \n
cpu-security.tdx.enabled | \n true | \nSet to ‘true’ if Intel TDX is available on the host and has been enabled (requires /sys/module/kvm_intel/parameters/tdx ). | \n
cpu-security.tdx.protected | \n true | \nSet to ‘true’ if Intel TDX was used to start the guest node, based on the existence of the “TDX_GUEST” information as part of cpuid features. | \n
cpu-security.sev.enabled | \n true | \nSet to ‘true’ if AMD SEV is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev ). | \n
cpu-security.sev.es.enabled | \n true | \nSet to ‘true’ if AMD SEV-ES is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_es ). | \n
cpu-security.sev.snp.enabled | \n true | \nSet to ‘true’ if AMD SEV-SNP is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_snp ). | \n
cpu-model.vendor_id | \n string | \nComparable CPU vendor ID. | \n
cpu-model.family | \n int | \nCPU family. | \n
cpu-model.id | \n int | \nCPU model number. | \n
\n\n\nNOTE: the
\ncpu-rdt.<rdt-flag>
labels are deprecated and will be removed\nin a future release. They will remain available as features\nfor NodeFeatureRule to consume.\nSee customization guide\nfor details on how to use NodeFeatureRule objects to create labels.
The CPU label source is configurable, see\nworker configuration and\nsources.cpu
\nconfiguration options for details.
Flag | \nDescription | \n
---|---|
ADX | \nMulti-Precision Add-Carry Instruction Extensions (ADX) | \n
AESNI | \nAdvanced Encryption Standard (AES) New Instructions (AES-NI) | \n
APX_F | \nIntel Advanced Performance Extensions (APX) | \n
AVX10 | \nIntel Advanced Vector Extensions 10 (AVX10) | \n
AVX10_256, AVX10_512 | \nIntel AVX10 256-bit and 512-bit vector support | \n
AVX | \nAdvanced Vector Extensions (AVX) | \n
AVX2 | \nAdvanced Vector Extensions 2 (AVX2) | \n
AVXIFMA | \nAVX-IFMA instructions | \n
AVXVNNI | \nAVX (VEX encoded) VNNI neural network instructions | \n
AMXBF16 | \nAdvanced Matrix Extension, tile multiplication operations on BFLOAT16 numbers | \n
AMXINT8 | \nAdvanced Matrix Extension, tile multiplication operations on 8-bit integers | \n
AMXFP16 | \nAdvanced Matrix Extension, tile multiplication operations on FP16 numbers | \n
AMXTILE | \nAdvanced Matrix Extension, base tile architecture support | \n
AVX512BF16 | \nAVX-512 BFLOAT16 instructions | \n
AVX512BITALG | \nAVX-512 bit Algorithms | \n
AVX512BW | \nAVX-512 byte and word Instructions | \n
AVX512CD | \nAVX-512 conflict detection instructions | \n
AVX512DQ | \nAVX-512 doubleword and quadword instructions | \n
AVX512ER | \nAVX-512 exponential and reciprocal instructions | \n
AVX512F | \nAVX-512 foundation | \n
AVX512FP16 | \nAVX-512 FP16 instructions | \n
AVX512IFMA | \nAVX-512 integer fused multiply-add instructions | \n
AVX512PF | \nAVX-512 prefetch instructions | \n
AVX512VBMI | \nAVX-512 vector bit manipulation instructions | \n
AVX512VBMI2 | \nAVX-512 vector bit manipulation instructions, version 2 | \n
AVX512VL | \nAVX-512 vector length extensions | \n
AVX512VNNI | \nAVX-512 vector neural network instructions | \n
AVX512VP2INTERSECT | \nAVX-512 intersect for D/Q | \n
AVX512VPOPCNTDQ | \nAVX-512 vector population count doubleword and quadword | \n
AVXNECONVERT | \nAVX-NE-CONVERT instructions | \n
AVXVNNIINT8 | \nAVX-VNNI-INT8 instructions | \n
CMPCCXADD | \nCMPCCXADD instructions | \n
ENQCMD | \nEnqueue Command | \n
GFNI | \nGalois Field New Instructions | \n
HYPERVISOR | \nRunning under hypervisor | \n
MSRLIST | \nRead/Write List of Model Specific Registers | \n
PREFETCHI | \nPREFETCHIT0/1 instructions | \n
VAES | \nAVX-512 vector AES instructions | \n
VPCLMULQDQ | \nCarry-less multiplication quadword | \n
WRMSRNS | \nNon-Serializing Write to Model Specific Register | \n
By default, the following CPUID flags have been blacklisted: BMI1, BMI2, CLMUL,\nCMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED,\nRDTSCP, SGX, SSE, SSE2, SSE3, SSE4, SSE42, SSSE3 and TDX_GUEST. See\nsources.cpu
\nconfiguration options to change the behavior.
See the full list in github.com/klauspost/cpuid.
\n\nFlag | \nDescription | \n
---|---|
IDIVA | \nInteger divide instructions available in ARM mode | \n
IDIVT | \nInteger divide instructions available in Thumb mode | \n
THUMB | \nThumb instructions | \n
FASTMUL | \nFast multiplication | \n
VFP | \nVector floating point instruction extension (VFP) | \n
VFPv3 | \nVector floating point extension v3 | \n
VFPv4 | \nVector floating point extension v4 | \n
VFPD32 | \nVFP with 32 D-registers | \n
HALF | \nHalf-word loads and stores | \n
EDSP | \nDSP extensions | \n
NEON | \nNEON SIMD instructions | \n
LPAE | \nLarge Physical Address Extensions | \n
Flag | \nDescription | \n
---|---|
AES | \nAnnouncing the Advanced Encryption Standard | \n
EVSTRM | \nEvent Stream Frequency Features | \n
FPHP | \nHalf Precision(16bit) Floating Point Data Processing Instructions | \n
ASIMDHP | \nHalf Precision(16bit) Asimd Data Processing Instructions | \n
ATOMICS | \nAtomic Instructions to the A64 | \n
ASIMRDM | \nSupport for Rounding Double Multiply Add/Subtract | \n
PMULL | \nOptional Cryptographic and CRC32 Instructions | \n
JSCVT | \nPerform Conversion to Match Javascript | \n
DCPOP | \nPersistent Memory Support | \n
Feature | \nValue | \nDescription | \n
---|---|---|
kernel-config.<option> | \n true | \nKernel config option is enabled (set ‘y’ or ‘m’). Default options are NO_HZ , NO_HZ_IDLE , NO_HZ_FULL and PREEMPT | \n
kernel-selinux.enabled | \n true | \nSELinux is enabled on the node | \n
kernel-version.full | \n string | \nFull kernel version as reported by /proc/sys/kernel/osrelease (e.g. ‘4.5.6-7-g123abcde’) | \n
kernel-version.major | \n string | \nFirst component of the kernel version (e.g. ‘4’) | \n
kernel-version.minor | \n string | \nSecond component of the kernel version (e.g. ‘5’) | \n
kernel-version.revision | \n string | \nThird component of the kernel version (e.g. ‘6’) | \n
The kernel label source is configurable, see\nworker configuration and\nsources.kernel
\nconfiguration options for details.
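The relationship between the kernel-version labels in the table above can be illustrated with a short shell sketch. It is only an approximation of the label values described there, not the parsing logic of nfd-worker; `kernel_version_labels` is a hypothetical helper that takes a release string such as the content of /proc/sys/kernel/osrelease.

```shell
# Hypothetical sketch (not NFD code): derive the kernel-version.* label values
# from a kernel release string.
kernel_version_labels() {
    full=$1
    numeric=${full%%[!0-9.]*}      # leading "4.5.6" part of "4.5.6-7-g123abcde"
    IFS=. read -r major minor revision _rest <<EOF
$numeric
EOF
    prefix=feature.node.kubernetes.io/kernel-version
    printf '%s.full=%s\n' "$prefix" "$full"
    printf '%s.major=%s\n' "$prefix" "$major"
    printf '%s.minor=%s\n' "$prefix" "$minor"
    printf '%s.revision=%s\n' "$prefix" "$revision"
}

kernel_version_labels 4.5.6-7-g123abcde
```

On a node, the same helper could be fed with `"$(cat /proc/sys/kernel/osrelease)"` to compare the result with the labels that NFD actually publishes.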
Feature | \nValue | \nDescription | \n
---|---|---|
memory-numa | \n true | \nMultiple memory nodes i.e. NUMA architecture detected | \n
memory-nv.present | \n true | \nNVDIMM device(s) are present | \n
memory-nv.dax | \n true | \nNVDIMM region(s) configured in DAX mode are present | \n
Feature | \nValue | \nDescription | \n
---|---|---|
network-sriov.capable | \n true | \nSingle Root Input/Output Virtualization (SR-IOV) enabled Network Interface Card(s) present | \n
network-sriov.configured | \n true | \nSR-IOV virtual functions have been configured | \n
Feature | \nValue | \nDescription | \n
---|---|---|
pci-<device label>.present | \n true | \nPCI device is detected | \n
pci-<device label>.sriov.capable | \n true | \nSingle Root Input/Output Virtualization (SR-IOV) enabled PCI device present | \n
\n | \n | \n |
<device label>
format is configurable and set to <class>_<vendor>
by\ndefault. For more details about configuration of the pci labels, see\nsources.pci
options\nand worker configuration\ninstructions.
Feature | \nValue | \nDescription | \n
---|---|---|
usb-<device label>.present | \n true | \nUSB device is detected | \n
<device label>
format is configurable and set to\n<class>_<vendor>_<device>
by default. For more details about\nconfiguration of the usb labels, see\nsources.usb
options\nand worker configuration\ninstructions.
Feature | \nValue | \nDescription | \n
---|---|---|
storage-nonrotationaldisk | \n true | \nNon-rotational disk, like SSD, is present in the node | \n
Feature | \nValue | \nDescription | \n
---|---|---|
system-os_release.ID | \n string | \nOperating system identifier | \n
system-os_release.VERSION_ID | \n string | \nOperating system version identifier (e.g. ‘6.7’) | \n
system-os_release.VERSION_ID.major | \n string | \nFirst component of the OS version id (e.g. ‘6’) | \n
system-os_release.VERSION_ID.minor | \n string | \nSecond component of the OS version id (e.g. ‘7’) | \n
The custom label source is designed for creating\nuser defined labels. However, it has a few statically\ndefined built-in labels:
\n\nFeature | \nValue | \nDescription | \n
---|---|---|
custom-rdma.capable | \n true | \nThe node has an RDMA capable Network adapter | \n
custom-rdma.enabled | \n true | \nThe node has the needed RDMA modules loaded to run RDMA traffic | \n
\n | \n | \n |
NFD has many extension points for creating vendor and application specific\nlabels. See the customization guide for\ndetailed documentation.
\n\nNFD is able to create extended resources, see the\nNodeFeatureRule CRD and its\nextendedResources field for more\ndetails.
\n\nNote that NFD is not a replacement for the usage of device plugins.
\n\nAn example use-case for extended resources could be based on a custom feature\n(created e.g. with feature files) that\nexposes the node SGX EPC memory section size. This value will then be turned\ninto an extended resource of the node, allowing Pods to request that resource\nand the Kubernetes scheduler to schedule such Pods only to those nodes which\nhave sufficient capacity of said resource left.
\n\n\n","dir":"/usage/","name":"features.md","path":"usage/features.md","url":"/usage/features.html"},{"title":"Deployment","layout":"default","sort":2,"content":"Node Feature Discovery can be deployed on any recent version of Kubernetes\n(v1.21+).
\n\nSee Image variants for description of the different NFD\ncontainer images available.
\n\nUsing Kustomize provides straightforward deployment with\nkubectl
integration and declarative customization.
Using Helm provides easy management of NFD deployments with nice\nconfiguration management and easy upgrades.
\n\nUsing Operator provides deployment and configuration management via\nCRDs.
\n","dir":"/deployment/","name":"index.md","path":"deployment/index.md","url":"/deployment/"},{"title":"Kustomize","layout":"default","sort":2,"content":"Kustomize can be used to\ndeploy NFD. Customization of the deployment is done by maintaining\ndeclarative overlays on top of the base overlays in NFD.
\n\nTo follow the deployment instructions here,\nkubectl v1.21 or\nlater is required.
\n\nThe kustomize overlays provided in the repo can be used directly:
\n\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7\n
This will create the required RBAC rules and deploy nfd-master (as a deployment) and\nnfd-worker (as a daemonset) in the node-feature-discovery
namespace.
\n\n\nNOTE: nfd-topology-updater is not deployed as part of the
\ndefault
\noverlay. Refer to the Master Worker Topologyupdater\nand Topologyupdater below.
Alternatively you can clone the repository and customize the deployment by\ncreating your own overlays. See kustomize for more information\nabout managing deployment configurations.
\n\nThe NFD repository hosts a set of overlays for different usages and deployment\nscenarios under\ndeployment/overlays
default
:\ndefault deployment of nfd-worker as a daemonset, described abovedefault-job
:\nsee Worker one-shot belowmaster-worker-topologyupdater
:\nsee Master Worker Topologyupdater belowtopologyupdater
:\nsee Topology Updater belowprometheus
:\nsee Metrics belowprune
:\nclean up the cluster after uninstallation, see\nRemoving feature labelssamples/cert-manager
:\nan example for supplementing the default deployment with cert-manager for TLS\nauthentication, see\nAutomated TLS certificate management using cert-manager\nfor detailssamples/custom-rules
:\nan example for spicing up the default deployment with a separately managed\nconfigmap of custom labeling rules, see\nCustom feature source for more information about\ncustom node labelsFeature discovery can alternatively be configured as a one-shot job.\nThe default-job
overlay may be used to achieve this:
NUM_NODES=$(kubectl get no -o jsonpath='{.items[*].metadata.name}' | wc -w)\nkubectl kustomize https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default-job?ref=v0.15.7 | \\\n sed s\"/NUM_NODES/$NUM_NODES/\" | \\\n kubectl apply -f -\n
The example above launches as many jobs as there are non-master nodes. Note that\nthis approach does not guarantee running once on every node. For example,\ntainted, non-ready nodes or some other reasons in Job scheduling may cause some\nnode(s) to run extra job instance(s) to satisfy the request.
\n\nNFD-Master, nfd-worker and nfd-topology-updater can be configured to be\ndeployed as separate pods. The master-worker-topologyupdater
overlay may be\nused to achieve this:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/master-worker-topologyupdater?ref=v0.15.7\n\n
To deploy just nfd-topology-updater (without nfd-master and nfd-worker)\nuse the topologyupdater
overlay:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.7\n\n
NFD-Topology-Updater can be configured along with the default
overlay\n(which deploys nfd-worker and nfd-master) where all the software components\nare deployed as separate pods:
\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.7\n\n
To allow prometheus operator\nto scrape metrics from node-feature-discovery,\nrun the following command:
\n\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prometheus?ref=v0.15.7\n
The simplest way is to invoke kubectl delete
on the overlay that was used for\ndeployment. Beware that this will also delete the namespace that NFD is\nrunning in. For example, in case the default overlay from the repo was used:
kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7\n
Alternatively, you can delete the created objects one by one, depending on the type\nof deployment, for example:
\n\nNFD_NS=node-feature-discovery\nkubectl -n $NFD_NS delete ds nfd-worker\nkubectl -n $NFD_NS delete deploy nfd-master\nkubectl -n $NFD_NS delete svc nfd-master\nkubectl -n $NFD_NS delete sa nfd-master\nkubectl delete clusterrole nfd-master\nkubectl delete clusterrolebinding nfd-master\n
Minimal steps to deploy the latest released version of NFD in your cluster.
\n\nDeploy with kustomize – creates a new namespace, service and required RBAC\nrules and deploys nfd-master and nfd-worker daemons.
\n\nkubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7\n
Wait until NFD master and NFD worker are running.
\n\n$ kubectl -n node-feature-discovery get ds,deploy\nNAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE\ndaemonset.apps/nfd-worker 2 2 2 2 2 <none> 10s\n\nNAME READY UP-TO-DATE AVAILABLE AGE\ndeployment.apps/nfd-master 1/1 1 1 17s\n\n
Check that NFD feature labels have been created
\n\n$ kubectl get no -o json | jq '.items[].metadata.labels'\n{\n \"kubernetes.io/arch\": \"amd64\",\n \"kubernetes.io/os\": \"linux\",\n \"feature.node.kubernetes.io/cpu-cpuid.ADX\": \"true\",\n \"feature.node.kubernetes.io/cpu-cpuid.AESNI\": \"true\",\n \"feature.node.kubernetes.io/cpu-cpuid.AVX\": \"true\",\n...\n
Create a pod targeting a distinguishing feature (select a valid feature from\nthe list printed on the previous step)
\n\n$ cat << EOF | kubectl apply -f -\napiVersion: v1\nkind: Pod\nmetadata:\n name: feature-dependent-pod\nspec:\n containers:\n - image: registry.k8s.io/pause\n name: pause\n nodeSelector:\n # Select a valid feature\n feature.node.kubernetes.io/cpu-cpuid.AESNI: 'true'\nEOF\npod/feature-dependent-pod created\n
See that the pod is running on a desired node
\n\n$ kubectl get po feature-dependent-pod -o wide\nNAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES\nfeature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 <none> <none>\n
To deploy nfd-topology-updater use the topologyupdater
kustomize\noverlay.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.7\n
Wait until nfd-topology-updater is running.
\n\n$ kubectl -n node-feature-discovery get ds\nNAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE\ndaemonset.apps/nfd-topology-updater 2 2 2 2 2 <none> 5s\n\n
Check that the NodeResourceTopology objects are created
\n\n$ kubectl get noderesourcetopologies.topology.node.k8s.io\nNAME AGE\nkind-control-plane 23s\nkind-worker 23s\n
To quickly view available command line flags execute nfd-worker -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 nfd-worker -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -config
flag specifies the path of the nfd-worker configuration file to\nuse.
Default: /etc/kubernetes/node-feature-discovery/nfd-worker.conf
\n\nExample:
\n\nnfd-worker -config=/opt/nfd/worker.conf\n
The -options
flag may be used to specify and override configuration file\noptions directly from the command line. The required format is the same as in\nthe config file, i.e. JSON or YAML. Configuration options specified via this\nflag will override those from the configuration file.
Default: empty
\n\nExample:
\n\nnfd-worker -options='{\"sources\":{\"cpu\":{\"cpuid\":{\"attributeWhitelist\":[\"AVX\",\"AVX2\"]}}}}'\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -server
flag specifies the address of the nfd-master endpoint to\nconnect to.
Default: localhost:8080
\n\nExample:
\n\nnfd-worker -server=nfd-master.nfd.svc.cluster.local:443\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -ca-file
is one of the three flags (together with -cert-file
and\n-key-file
) controlling the mutual TLS authentication on the worker side.\nThis flag specifies the TLS root certificate that is used for verifying the\nauthenticity of nfd-master.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and-key-file
Example:
\n\nnfd-worker -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/worker.crt -key-file=/opt/nfd/worker.key\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -cert-file
is one of the three flags (together with -ca-file
and\n-key-file
) controlling mutual TLS authentication on the worker side. This\nflag specifies the TLS certificate presented for authenticating outgoing\nrequests.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-ca-file
and-key-file
Example:
\n\nnfd-worker -cert-file=/opt/nfd/worker.crt -key-file=/opt/nfd/worker.key -ca-file=/opt/nfd/ca.crt\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -key-file
is one of the three flags (together with -ca-file
and\n-cert-file
) controlling the mutual TLS authentication on the worker side.\nThis flag specifies the private key corresponding to the given certificate file\n(-cert-file
) that is used for authenticating outgoing requests.
Default: empty
\n\n\n\n\nNOTE: Must be specified together with
\n-cert-file
and-ca-file
Example:
\n\nnfd-worker -key-file=/opt/nfd/worker.key -cert-file=/opt/nfd/worker.crt -ca-file=/opt/nfd/ca.crt\n
The -kubeconfig
flag specifies the kubeconfig to use for connecting to the\nKubernetes API server. It is only needed for manipulating\nNodeFeature objects, and thus the flag\nonly takes effect when\n-enable-nodefeature-api
is specified. An empty\nvalue (which is also the default) implies in-cluster kubeconfig.
Default: empty
\n\nExample:
\n\nnfd-worker -kubeconfig ${HOME}/.kube/config\n
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -server-name-override
flag specifies the common name (CN) to\nexpect from the nfd-master TLS certificate. This flag is mostly intended for\ndevelopment and debugging purposes.
Default: empty
\n\nExample:
\n\nnfd-worker -server-name-override=localhost\n
The -feature-sources
flag specifies a comma-separated list of enabled feature\nsources. A special value all
enables all sources. Prefixing a source name\nwith -
indicates that the source will be disabled instead - this is only\nmeaningful when used in conjunction with all
. This command line flag allows\ncompletely disabling the feature detection so that neither standard feature\nlabels are generated nor the raw feature data is available for custom rule\nprocessing. Consider using the core.featureSources
config file option,\ninstead, allowing dynamic configurability.
\n\n\nNOTE: This flag takes precedence over the
\ncore.featureSources
\nconfiguration file option.
Default: all
\n\nExample:
\n\nnfd-worker -feature-sources=all,-pci\n
The -label-sources
flag specifies a comma-separated list of enabled label\nsources. A special value all
enables all sources. Prefixing a source name\nwith -
indicates that the source will be disabled instead - this is only\nmeaningful when used in conjunction with all
. Consider using the\ncore.labelSources
config file option, instead, allowing dynamic\nconfigurability.
\n\n\nNOTE: This flag takes precedence over the
\ncore.labelSources
\nconfiguration file option.
Default: all
\n\nExample:
\n\nnfd-worker -label-sources=kernel,system,local\n
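The `all` and `-name` semantics shared by `-feature-sources` and `-label-sources` can be modelled with a short script. This is a rough sketch of the behavior described above, not NFD's implementation; the `resolve_sources` helper is hypothetical and the list of source names in `ALL_SOURCES` is an assumption made for illustration.

```shell
#!/bin/sh
# Illustrative only: the set of known source names is assumed for this sketch.
ALL_SOURCES="cpu custom kernel local memory network pci storage system usb"

# Hypothetical helper: expand a comma-separated source list into enabled names.
resolve_sources() {
    spec="$1"
    enabled=""
    # First pass: collect explicitly enabled names, expanding "all".
    for item in $(echo "$spec" | tr ',' ' '); do
        case "$item" in
            all) enabled="$ALL_SOURCES" ;;
            -*)  ;;  # exclusions are handled in the second pass
            *)   enabled="$enabled $item" ;;
        esac
    done
    # Second pass: drop every name that was prefixed with "-".
    for item in $(echo "$spec" | tr ',' ' '); do
        case "$item" in
            -*)
                name="${item#-}"
                kept=""
                for s in $enabled; do
                    [ "$s" = "$name" ] || kept="$kept $s"
                done
                enabled="$kept"
                ;;
        esac
    done
    # Unquoted on purpose: collapses the list to single spaces.
    echo $enabled
}
```

With this model, `resolve_sources all,-pci` yields every assumed source except `pci`, while `resolve_sources kernel,system,local` yields only the three listed names.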
\n\n\nNOTE: the gRPC API is deprecated and will be removed in a future release,\nand this flag will be removed as well.
\n
The -enable-nodefeature-api
flag enables/disables the\nNodeFeature CRD API\nfor communicating with nfd-master. When enabled, nfd-worker creates per-node\nNodeFeature objects that contain all discovered node features and the set of\nfeature labels to be created. Setting the flag to false will enable\ngRPC communication to nfd-master.
Default: true
\n\nExample:
\n\nnfd-worker -enable-nodefeature-api=false\n
The -metrics
flag specifies the port on which to expose\nPrometheus metrics. Setting this to 0 disables the\nmetrics server on nfd-worker.
Default: 8081
\n\nExample:
\n\nnfd-worker -metrics=12345\n
The -no-publish
flag disables all communication with the nfd-master and the\nKubernetes API server. It is effectively a “dry-run” flag for nfd-worker.\nNFD-Worker runs feature detection normally, but no labeling requests are sent\nto nfd-master and no NodeFeature objects are created or updated in the API\nserver.
\n\n\nNOTE: This flag takes precedence over the\n
\ncore.noPublish
\nconfiguration file option.
Default: false
\n\nExample:
\n\nnfd-worker -no-publish\n
The -oneshot
flag causes nfd-worker to exit after one pass of feature\ndetection.
Default: false
\n\nExample:
\n\nnfd-worker -oneshot -no-publish\n
The following logging-related flags are inherited from the\nklog package.
\n\n\n\n\nNOTE: The logger setup can also be specified via the
\ncore.klog
\nconfiguration file options. However, the command line flags take precedence\nover any corresponding config file options specified.
If true, adds the file directory to the header of the log messages.
\n\nDefault: false
\n\nLog to standard error as well as files.
\n\nDefault: false
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
\n\nLog to standard error instead of files
\n\nDefault: true
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
\n\nLogs at or above this threshold go to stderr.
\n\nDefault: 2
\n\nNumber for the log level verbosity.
\n\nDefault: 0
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n","dir":"/reference/","name":"worker-commandline-reference.md","path":"reference/worker-commandline-reference.md","url":"/reference/worker-commandline-reference.html"},{"title":"Using node labels","layout":"default","sort":2,"content":"Nodes with specific features can be targeted using the nodeSelector
field. The\nfollowing example shows how to target nodes with Intel TurboBoost enabled.
apiVersion: v1\nkind: Pod\nmetadata:\n labels:\n env: test\n name: golang-test\nspec:\n containers:\n - image: golang\n name: go1\n nodeSelector:\n feature.node.kubernetes.io/cpu-pstate.turbo: 'true'\n
For more details on targeting nodes, see\nnode selection.
\n","dir":"/usage/","name":"using-labels.md","path":"usage/using-labels.md","url":"/usage/using-labels.html"},{"title":"Helm","layout":"default","sort":3,"content":"Node Feature Discovery provides a Helm chart to manage its deployment.
\n\n\n\n\nNOTE: NFD is not ideal for other Helm charts to depend on as that may\nresult in multiple parallel NFD deployments in the same cluster which is not\nfully supported by the NFD Helm chart.
\n
The Helm package manager must be installed.
\n\nTo install the latest stable version:
\n\nexport NFD_NS=node-feature-discovery\nhelm repo add nfd https://kubernetes-sigs.github.io/node-feature-discovery/charts\nhelm repo update\nhelm install nfd/node-feature-discovery --namespace $NFD_NS --create-namespace --generate-name\n
To install the latest development version you need to clone the NFD Git\nrepository and install from there.
\n\ngit clone https://github.com/kubernetes-sigs/node-feature-discovery/\ncd node-feature-discovery/deployment/helm\nexport NFD_NS=node-feature-discovery\nhelm install node-feature-discovery ./node-feature-discovery/ --namespace $NFD_NS --create-namespace\n
See the configuration section below for instructions on how to\nalter the deployment parameters.
\n\nYou can override values from values.yaml
and provide a file with custom values:
export NFD_NS=node-feature-discovery\nhelm install nfd/node-feature-discovery -f <path/to/custom/values.yaml> --namespace $NFD_NS --create-namespace --generate-name\n
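A custom values file for the command above can be generated from a script. The sketch below is illustrative: the `write_nfd_values` helper and the chosen values are assumptions, while the keys (`master.replicaCount`, `worker.config`, `gc.interval`, and the worker `core.sleepInterval` option) are taken from the parameter tables and the worker configuration reference.

```shell
#!/bin/sh
# Hypothetical helper: write a small custom values file to the given path.
# The values themselves are examples, not recommendations.
write_nfd_values() {
    out="$1"
    cat > "$out" <<'EOF'
master:
  replicaCount: 2
worker:
  config:
    core:
      sleepInterval: 60s
gc:
  interval: 1h
EOF
}
```

After calling `write_nfd_values custom-values.yaml`, the resulting file can be passed to `helm install` with the `-f` option.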
To specify each parameter separately, you can pass them to the helm install command:
\n\nexport NFD_NS=node-feature-discovery\nhelm install nfd/node-feature-discovery --set nameOverride=NFDinstance --set master.replicaCount=2 --namespace $NFD_NS --create-namespace --generate-name\n
To uninstall the node-feature-discovery
deployment:
export NFD_NS=node-feature-discovery\nhelm uninstall node-feature-discovery --namespace $NFD_NS\n
The command removes all the Kubernetes components associated with the chart and\ndeletes the release.
\n\nTo tailor the deployment of Node Feature Discovery to your needs, the following\nchart parameters are available.
\n\nName | \nType | \nDefault | \nDescription | \n
---|---|---|---|
image.repository | \n string | \nregistry.k8s.io/nfd/node-feature-discovery | \n NFD image repository | \n
image.tag | \n string | \nv0.15.7 | \n NFD image tag | \n
image.pullPolicy | \n string | \nAlways | \n Image pull policy | \n
imagePullSecrets | \n list | \n[] | \nImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored. More info | \n
nameOverride | \n string | \n\n | Override the name of the chart | \n
fullnameOverride | \n string | \n\n | Override a default fully qualified app name | \n
tls.enable | \n bool | \nfalse | \nSpecifies whether to use TLS for communications between components. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
tls.certManager | \n bool | \nfalse | \nIf enabled, requires cert-manager to be installed and will automatically create the required TLS certificates. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
enableNodeFeatureApi | \n bool | \ntrue | \nEnable the NodeFeature CRD API for communicating node features. This will automatically disable the gRPC communication. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
prometheus.enable | \n bool | \nfalse | \nSpecifies whether to expose metrics using prometheus operator | \n
prometheus.labels | \n dict | \n{} | \nSpecifies labels for use with the prometheus operator to control how it is selected | \n
Metrics are configured to be exposed using the Prometheus operator APIs by\ndefault. To expose metrics this way, the Prometheus operator\nmust be installed in your cluster.
\n\nName | \nType | \nDefault | \ndescription | \n
---|---|---|---|
master.* | \n dict | \n\n | NFD master deployment configuration | \n
master.enable | \n bool | \ntrue | \nSpecifies whether nfd-master should be deployed | \n
master.port | \n integer | \n\n | Specifies the TCP port that nfd-master listens for incoming requests. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
master.metricsPort | \n integer | \n8081 | \nPort on which to expose metrics from components to prometheus operator | \n
master.instance | \n string | \n\n | Instance name. Used to separate annotation namespaces for multiple parallel deployments | \n
master.resyncPeriod | \n string | \n\n | NFD API controller resync period. | \n
master.extraLabelNs | \n array | \n[] | \nList of allowed extra label namespaces | \n
master.resourceLabels | \n array | \n[] | \nList of labels to be registered as extended resources | \n
master.enableTaints | \n bool | \nfalse | \nSpecifies whether to enable or disable node tainting | \n
master.crdController | \n bool | \nnull | \nSpecifies whether the NFD CRD API controller is enabled. If not set, controller will be enabled if master.instance is empty. | \n
master.featureRulesController | \n bool | \nnull | \nDEPRECATED: use master.crdController instead | \n
master.replicaCount | \n integer | \n1 | \nNumber of desired pods. This is a pointer to distinguish between explicit zero and not specified | \n
master.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
master.securityContext | \n dict | \n{} | \nContainer security settings | \n
master.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether a service account should be created | \n
master.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account | \n
master.serviceAccount.name | \n string | \n\n | The name of the service account to use. If not set and create is true, a name is generated using the fullname template | \n
master.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for nfd-master | \n
master.service.type | \n string | \nClusterIP | \nNFD master service type. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
master.service.port | \n integer | \n8080 | \nNFD master service port. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release | \n
master.resources | \n dict | \n{} | \nNFD master pod resources management | \n
master.nodeSelector | \n dict | \n{} | \nNFD master pod node selector | \n
master.tolerations | \n dict | \nScheduling to master node is disabled | \nNFD master pod tolerations | \n
master.annotations | \n dict | \n{} | \nNFD master pod annotations | \n
master.affinity | \n dict | \n\n | NFD master pod required node affinity | \n
master.deploymentAnnotations | \n dict | \n{} | \nNFD master deployment annotations | \n
master.nfdApiParallelism | \n integer | \n10 | \nSpecifies the maximum number of concurrent node updates. | \n
master.config | \n dict | \n\n | NFD master configuration | \n
Name | \nType | \nDefault | \ndescription | \n
---|---|---|---|
worker.* | \n dict | \n\n | NFD worker daemonset configuration | \n
worker.enable | \n bool | \ntrue | \nSpecifies whether nfd-worker should be deployed | \n
worker.metricsPort | \n int | \n8081 | \nPort on which to expose metrics from components to prometheus operator | \n
worker.config | \n dict | \n\n | NFD worker configuration | \n
worker.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
worker.securityContext | \n dict | \n{} | \nContainer security settings | \n
worker.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether a service account for nfd-worker should be created | \n
worker.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account for nfd-worker | \n
worker.serviceAccount.name | \n string | \n\n | The name of the service account to use for nfd-worker. If not set and create is true, a name is generated using the fullname template (suffixed with -worker ) | \n
worker.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for nfd-worker | \n
worker.mountUsrSrc | \n bool | \nfalse | \nSpecifies whether to allow users to mount the hostpath /usr/src. Does not work on systems without /usr/src AND a read-only /usr | \n
worker.resources | \n dict | \n{} | \nNFD worker pod resources management | \n
worker.nodeSelector | \n dict | \n{} | \nNFD worker pod node selector | \n
worker.tolerations | \n dict | \n{} | \nNFD worker pod node tolerations | \n
worker.priorityClassName | \n string | \n\n | NFD worker pod priority class | \n
worker.annotations | \n dict | \n{} | \nNFD worker pod annotations | \n
worker.daemonsetAnnotations | \n dict | \n{} | \nNFD worker daemonset annotations | \n
Name | \nType | \nDefault | \ndescription | \n
---|---|---|---|
topologyUpdater.* | \n dict | \n\n | NFD Topology Updater configuration | \n
topologyUpdater.enable | \n bool | \nfalse | \nSpecifies whether the NFD Topology Updater should be created | \n
topologyUpdater.createCRDs | \n bool | \nfalse | \nSpecifies whether the NFD Topology Updater CRDs should be created | \n
topologyUpdater.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether the service account for topology updater should be created | \n
topologyUpdater.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account for topology updater | \n
topologyUpdater.serviceAccount.name | \n string | \n\n | The name of the service account for topology updater to use. If not set and create is true, a name is generated using the fullname template and -topology-updater suffix | \n
topologyUpdater.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for topology updater | \n
topologyUpdater.metricsPort | \n integer | \n8081 | \nPort on which to expose prometheus metrics | \n
topologyUpdater.kubeletConfigPath | \n string | \n”” | \nSpecifies the kubelet config host path | \n
topologyUpdater.kubeletPodResourcesSockPath | \n string | \n”” | \nSpecifies the kubelet sock path to read pod resources | \n
topologyUpdater.updateInterval | \n string | \n60s | \nTime to sleep between CR updates. Non-positive value implies no CR update. | \n
topologyUpdater.watchNamespace | \n string | \n* | \n Namespace to watch pods, * for all namespaces | \n
topologyUpdater.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
topologyUpdater.securityContext | \n dict | \n{} | \nContainer security settings | \n
topologyUpdater.resources | \n dict | \n{} | \nTopology updater pod resources management | \n
topologyUpdater.nodeSelector | \n dict | \n{} | \nTopology updater pod node selector | \n
topologyUpdater.tolerations | \n dict | \n{} | \nTopology updater pod node tolerations | \n
topologyUpdater.annotations | \n dict | \n{} | \nTopology updater pod annotations | \n
topologyUpdater.daemonsetAnnotations | \n dict | \n{} | \nTopology updater daemonset annotations | \n
topologyUpdater.affinity | \n dict | \n{} | \nTopology updater pod affinity | \n
topologyUpdater.config | \n dict | \n\n | configuration | \n
topologyUpdater.podSetFingerprint | \n bool | \nfalse | \nEnables computing and reporting of the pod fingerprint in NRT objects. | \n
topologyUpdater.kubeletStateDir | \n string | \n/var/lib/kubelet | \nSpecifies kubelet state directory path for watching state and checkpoint files. Empty value disables kubelet state tracking. | \n
Name | \nType | \nDefault | \ndescription | \n
---|---|---|---|
gc.* | \n dict | \n\n | NFD Garbage Collector configuration | \n
gc.enable | \n bool | \ntrue | \nSpecifies whether the NFD Garbage Collector should be created | \n
gc.serviceAccount.create | \n bool | \ntrue | \nSpecifies whether the service account for garbage collector should be created | \n
gc.serviceAccount.annotations | \n dict | \n{} | \nAnnotations to add to the service account for garbage collector | \n
gc.serviceAccount.name | \n string | \n\n | The name of the service account for garbage collector to use. If not set and create is true, a name is generated using the fullname template and -gc suffix | \n
gc.rbac.create | \n bool | \ntrue | \nSpecifies whether to create RBAC configuration for garbage collector | \n
gc.interval | \n string | \n1h | \nTime between periodic garbage collector runs | \n
gc.podSecurityContext | \n dict | \n{} | \nPodSecurityContext holds pod-level security attributes and common container settings | \n
gc.resources | \n dict | \n{} | \nGarbage collector pod resources management | \n
gc.metricsPort | \n integer | \n8081 | \nPort on which to serve Prometheus metrics | \n
gc.nodeSelector | \n dict | \n{} | \nGarbage collector pod node selector | \n
gc.tolerations | \n dict | \n{} | \nGarbage collector pod node tolerations | \n
gc.annotations | \n dict | \n{} | \nGarbage collector pod annotations | \n
gc.deploymentAnnotations | \n dict | \n{} | \nGarbage collector deployment annotations | \n
gc.affinity | \n dict | \n{} | \nGarbage collector pod affinity | \n
See the\nsample configuration file\nfor a full example configuration.
\n\nnoPublish
option disables updates to the Node objects in the Kubernetes\nAPI server, making it effectively a “dry-run” option for nfd-master. No labels, annotations, taints\nor extended resources of nodes are updated.
Default: false
Example:
\n\nnoPublish: true\n
extraLabelNs
specifies a list of allowed feature\nlabel namespaces. This option can be used to allow\nother vendor or application specific namespaces for custom labels from the\nlocal and custom feature sources, even though these labels were denied using\nthe denyLabelNs
parameter.
The same namespace control, including this option, applies to extended resources (created\nwith resourceLabels
), too.
Default: empty
\n\nExample:
\n\nextraLabelNs: [\"added.ns.io\",\"added.kubernets.io\"]\n
denyLabelNs
specifies a list of excluded\nlabel namespaces. By default, nfd-master allows creating labels in all\nnamespaces, excluding kubernetes.io
namespace and its sub-namespaces\n(i.e. *.kubernetes.io
). However, you should note that\nkubernetes.io
and its sub-namespaces are always denied.\nThis option can be used to exclude some vendor- or application-specific\nnamespaces.
Default: empty
\n\nExample:
\n\ndenyLabelNs: [\"denied.ns.io\",\"denied.kubernetes.io\"]\n
The autoDefaultNs
option controls the automatic prefixing of names. When set\nto true (the default in NFD version v0.15) nfd-master\nautomatically adds the default feature.node.kubernetes.io/
prefix to\nunprefixed labels, annotations and extended resources - this is also the\ndefault behavior in NFD v0.15 and earlier. When the option is set to false
,\nno prefix will be prepended to unprefixed names, effectively causing them to be\nfiltered out (as NFD does not allow unprefixed names of labels, annotations or\nextended resources). The default will be changed to false
in a future\nrelease.
For example, with the autoDefaultNs
set to true
, a NodeFeatureRule with
labels:\n foo: bar\n
Will turn into feature.node.kubernetes.io/foo=bar
node label. With\nautoDefaultNs
set to false
, no prefix is added and the label will be\nfiltered out.
Note that taint keys are not affected by this option.
\n\nDefault: true
Example:
\n\nautoDefaultNs: false\n
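The effect of `autoDefaultNs` on a single name can be sketched as follows. This is a simplified model of the behavior described above, not nfd-master's code; the `apply_default_ns` helper is hypothetical, and already prefixed names are passed through here without the further namespace filtering that nfd-master applies.

```shell
#!/bin/sh
DEFAULT_NS="feature.node.kubernetes.io"

# Hypothetical helper: print the resulting name, or nothing if it is filtered out.
apply_default_ns() {
    name="$1"; auto="$2"
    case "$name" in
        */*)
            # Already prefixed: left untouched by this option.
            echo "$name"
            ;;
        *)
            # Unprefixed: gets the default prefix only when the option is true;
            # otherwise nothing is printed, i.e. the name is filtered out.
            if [ "$auto" = "true" ]; then
                echo "$DEFAULT_NS/$name"
            fi
            ;;
    esac
}
```

For the `foo: bar` example, `apply_default_ns foo true` prints the prefixed label name, whereas `apply_default_ns foo false` prints nothing.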
DEPRECATED: NodeFeatureRule\nshould be used for managing extended resources in NFD.
\n\nThe resourceLabels
option specifies a list of features to be\nadvertised as extended resources instead of labels. Features that have integer\nvalues can be published as Extended Resources by listing them in this option.
Default: empty
\n\nExample:
\n\nresourceLabels: [\"vendor-1.com/feature-1\",\"vendor-2.io/feature-2\"]\n
enableTaints
enables/disables the node tainting feature of NFD.
Default: false
\n\nExample:
\n\nenableTaints: true\n
labelWhiteList
specifies a regular expression for filtering feature\nlabels based on their name. Each label must match against the given regular\nexpression or it will not be published.
\n\n\nNOTE: The regular expression is only matched against the “basename” part\nof the label, i.e. the part of the name after ‘/’. The label namespace is\nomitted.
\n
Default: empty
\n\nExample:
\n\nlabelWhiteList: \"foo\"\n
The resyncPeriod
option specifies the NFD API controller resync period.\nThe resync means nfd-master replaying all NodeFeature and NodeFeatureRule objects,\nthus effectively re-syncing all nodes in the cluster (i.e. ensuring labels, annotations,\nextended resources and taints are in place).\nOnly has effect when the NodeFeature\nCRD API has been enabled with -enable-nodefeature-api
.
Default: 1 hour.
\n\nExample:
\n\nresyncPeriod: 2h\n
The leaderElection
section exposes configuration to tweak leader election.
leaderElection.leaseDuration
is the duration that non-leader candidates will\nwait to force acquire leadership. This is measured against time of\nlast observed ack.
A client needs to wait a full LeaseDuration without observing a change to\nthe record before it can attempt to take over. When all clients are\nshut down and a new set of clients is started with different names against\nthe same leader record, they must wait the full LeaseDuration before\nattempting to acquire the lease. Thus LeaseDuration should be as short as\npossible (within your tolerance for clock skew rate) to avoid possible\nlong waits in this scenario.
\n\nDefault: 15 seconds.
\n\nExample:
\n\nleaderElection:\n leaseDuration: 15s\n
leaderElection.renewDeadline
is the duration that the acting master will retry\nrefreshing leadership before giving up.
This value has to be lower than leaseDuration and greater than retryPeriod*1.2.
\n\nDefault: 10 seconds.
\n\nExample:
\n\nleaderElection:\n renewDeadline: 10s\n
leaderElection.retryPeriod
is the duration the LeaderElector clients should wait\nbetween tries of actions.
It has to be greater than 0.
\n\nDefault: 2 seconds.
\n\nExample:
\n\nleaderElection:\n retryPeriod: 2s\n
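The constraints between the three leader election durations can be checked before applying a configuration. The sketch below assumes values given in whole seconds; the `check_leader_election` helper is hypothetical and simply encodes the rules stated above (retryPeriod greater than 0, renewDeadline greater than retryPeriod*1.2 and lower than leaseDuration).

```shell
#!/bin/sh
# Hypothetical helper: arguments are leaseDuration, renewDeadline and
# retryPeriod in whole seconds.
check_leader_election() {
    lease="$1"; renew="$2"; retry="$3"
    [ "$retry" -gt 0 ] || {
        echo "retryPeriod must be greater than 0" >&2; return 1; }
    # renew > retry * 1.2, done in integer math by scaling both sides by 10.
    [ $((renew * 10)) -gt $((retry * 12)) ] || {
        echo "renewDeadline must be greater than retryPeriod*1.2" >&2; return 1; }
    [ "$renew" -lt "$lease" ] || {
        echo "renewDeadline must be lower than leaseDuration" >&2; return 1; }
    echo "ok"
}
```

The documented defaults (15, 10 and 2 seconds) satisfy all three checks, since 10 is greater than 2.4 and lower than 15.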
The nfdApiParallelism
option can be used to specify the maximum\nnumber of concurrent node updates.
It takes effect only when -enable-nodefeature-api
has been set.
Default: 10
\n\nExample:
\n\nnfdApiParallelism: 1\n
The following options specify the logger configuration. Most of them can be\ndynamically adjusted at run-time.
\n\n\n\n\nNOTE: The logger options can also be specified via command line flags\nwhich take precedence over any corresponding config file options.
\n
If true, adds the file directory to the header of the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nLog to standard error as well as files.
\n\nDefault: false
Run-time configurable: yes
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nRun-time configurable: yes
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
Run-time configurable: no
\n\nLog to standard error instead of files
\n\nDefault: true
Run-time configurable: yes
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
Run-time configurable: no
\n\nLogs at or above this threshold go to stderr.\n\nDefault: 2
\n\nRun-time configurable: yes
\n\nNumber for the log level verbosity.
\n\nDefault: 0
Run-time configurable: yes
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n\nRun-time configurable: yes
\n","dir":"/reference/","name":"master-configuration-reference.md","path":"reference/master-configuration-reference.md","url":"/reference/master-configuration-reference.html"},{"title":"Usage","layout":"default","sort":3,"content":"Usage instructions.
\n","dir":"/usage/","name":"index.md","path":"usage/index.md","url":"/usage/"},{"title":"NFD-Master","layout":"default","sort":3,"content":"NFD-Master is responsible for connecting to the Kubernetes API server and\nupdating node objects. More specifically, it modifies node labels, taints and\nextended resources based on requests from nfd-workers and 3rd party extensions.
\n\nThe NodeFeature Controller uses NodeFeature objects as\nthe input for the NodeFeatureRule\nprocessing pipeline. In addition, any labels listed in the NodeFeature object\nare created on the node (note that the allowed\nlabel namespaces are controlled).
\n\nNFD-Master acts as the controller for\nNodeFeatureRule objects.\nIt applies the rules specified in NodeFeatureRule objects on raw feature data\nand creates node labels accordingly. The feature data used as the input is\nreceived from nfd-worker instances through\nNodeFeature objects.
\n\n\n\n\nNOTE: when gRPC (DEPRECATED) is used for communicating\nthe features (by setting the flag
\n-enable-nodefeature-api=false
on both\nnfd-master and nfd-worker, or via Helm values.enableNodeFeatureApi=false),\n(re-)labelling only happens when a request is received from nfd-worker.\nThat is, in practice rules are evaluated and labels for each node are created\non intervals specified by the\ncore.sleepInterval
\nconfiguration option of nfd-worker instances. This means that modification or\ncreation of NodeFeatureRule objects does not instantly cause the node\nlabels to be updated. Instead, the changes only become visible in node labels\nas nfd-worker instances send their labelling requests. This limitation is not\npresent when the gRPC interface is disabled\nand the NodeFeature API is used.
NFD-Master supports dynamic configuration through a configuration file. The\ndefault location is /etc/kubernetes/node-feature-discovery/nfd-master.conf
,\nbut this can be changed by specifying the -config
command line flag.\nThe configuration file is re-read whenever it is modified, which makes run-time\nre-configuration of nfd-master straightforward.
The master configuration file is read inside the container, and thus Volumes and\nVolumeMounts are needed to make your configuration available for NFD. The\npreferred method is to use a ConfigMap which provides easy deployment and\nre-configurability.
\n\nThe provided nfd-master deployment templates create an empty configmap and\nmount it inside the nfd-master containers. In kustomize deployments,\nconfiguration can be edited with:
\n\nkubectl -n ${NFD_NS} edit configmap nfd-master-conf\n
In Helm deployments,\nMaster pod parameter\nmaster.config
can be used to edit the respective configuration.
See\nnfd-master configuration file reference\nfor more details.\nThe (empty-by-default)\nexample config\ncontains all available configuration options and can be used as a reference\nfor creating a configuration.
\n\nNFD-Master runs as a deployment. By default,\nit prefers running on the cluster’s master nodes but will run on worker\nnodes if no master nodes are found.
\n\nFor High Availability, you should increase the replica count of\nthe deployment object. You should also look into adding\ninter-pod\naffinity to prevent masters from running on the same node.\nHowever, note that inter-pod affinity is costly and is not recommended\nin bigger clusters.
\n\n\n\n\nNote: When NFD-Master is intended to run with more than one replica,\nit is advised to use
\n-enable-leader-election
flag. This flag turns on\nleader election for NFD-Master and lets only one replica act on changes\nin NodeFeature and NodeFeatureRule objects.
If you have RBAC authorization enabled (as is the default e.g. with clusters\ninitialized with kubeadm) you need to configure the appropriate ClusterRoles,\nClusterRoleBindings and a ServiceAccount for NFD to create node\nlabels. The provided template will configure these for you.
\n","dir":"/usage/","name":"nfd-master.md","path":"usage/nfd-master.md","url":"/usage/nfd-master.html"},{"title":"NFD Operator","layout":"default","sort":4,"content":"The Node Feature Discovery Operator automates installation,\nconfiguration and updates of NFD using a specific NodeFeatureDiscovery custom\nresource. This also provides good support for managing NFD as a dependency of\nother operators.
\n\nDeployment using the\nNode Feature Discovery Operator\nis recommended to be done via\noperatorhub.io.
\n\nInstall the operator:
\n\nkubectl create -f https://operatorhub.io/install/nfd-operator.yaml\n
Create NodeFeatureDiscovery
object (in nfd
namespace here):
cat << EOF | kubectl apply -f -\napiVersion: v1\nkind: Namespace\nmetadata:\n name: nfd\n---\napiVersion: nfd.kubernetes.io/v1\nkind: NodeFeatureDiscovery\nmetadata:\n name: my-nfd-deployment\n namespace: nfd\nspec:\n operand:\n image: registry.k8s.io/nfd/node-feature-discovery:v0.15.7\n imagePullPolicy: IfNotPresent\nEOF\n
If you followed the deployment instructions above you can uninstall NFD with:
\n\nkubectl -n nfd delete NodeFeatureDiscovery my-nfd-deployment\n
Optionally, you can also remove the namespace:
\n\nkubectl delete ns nfd\n
See the node-feature-discovery-operator and OLM project\ndocumentation for instructions for uninstalling the operator and operator\nlifecycle manager, respectively.
\n\n\n","dir":"/deployment/","name":"operator.md","path":"deployment/operator.md","url":"/deployment/operator.html"},{"title":"Reference","layout":"default","sort":4,"content":"Command line and configuration reference.
\n","dir":"/reference/","name":"index.md","path":"reference/index.md","url":"/reference/"},{"title":"Worker config reference","layout":"default","sort":4,"content":"See the\nsample configuration file\nfor a full example configuration.
\n\nThe core
section contains common configuration settings that are not specific\nto any particular feature source.
core.sleepInterval
specifies the interval between consecutive passes of\nfeature (re-)detection, and thus also the interval between node re-labeling. A\nnon-positive value implies infinite sleep interval, i.e. no re-detection or\nre-labeling is done.
Default: 60s
Example:
\n\ncore:\n sleepInterval: 60s\n
core.featureSources
specifies the list of enabled feature sources. A special\nvalue all
enables all sources. Prefixing a source name with -
indicates\nthat the source will be disabled instead - this is only meaningful when used in\nconjunction with all
. This option allows completely disabling the feature\ndetection so that neither standard feature labels are generated nor the raw\nfeature data is available for custom rule processing.
Default: [all]
Example:
\n\ncore:\n # Enable all but cpu and local sources\n featureSources:\n - \"all\"\n - \"-cpu\"\n - \"-local\"\n
core:\n # Enable only cpu and local sources\n featureSources:\n - \"cpu\"\n - \"local\"\n
core.labelSources
specifies the list of enabled label sources. A special\nvalue all
enables all sources. Prefixing a source name with -
indicates\nthat the source will be disabled instead - this is only meaningful when used in\nconjunction with all
. This configuration option affects the generation of\nnode labels but not the actual discovery of the underlying feature data that is\nused e.g. in custom/NodeFeatureRule
rules.
\n\n\nNOTE: Overridden by the
\n-label-sources
command line flag and the\ncore.sources
configuration option (if either of them is specified).
Default: [all]
Example:
\n\ncore:\n # Enable all but cpu and system sources\n labelSources:\n - \"all\"\n - \"-cpu\"\n - \"-system\"\n
core:\n # Enable only cpu and system sources\n labelSources:\n - \"cpu\"\n - \"system\"\n
DEPRECATED: use core.labelSources
instead.
\n\n\nNOTE:
\ncore.sources
takes precedence over the core.labelSources
\nconfiguration file option.
core.labelWhiteList
specifies a regular expression for filtering feature\nlabels based on the label name. Non-matching labels are not published.
\n\n\nNOTE: The regular expression is only matched against the “basename” part\nof the label, i.e. the part of the name after ‘/’. The label prefix (or\nnamespace) is omitted.
\n
Default: null
Example:
\n\ncore:\n labelWhiteList: '^cpu-cpuid'\n
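The basename matching described in the note can be tried out with a short shell sketch. It only imitates the filtering: the function name filter_labels and the sample label names are assumptions, and grep -E uses POSIX extended regular expressions, which may differ in details from the regular expression syntax NFD accepts.

```shell
#!/bin/sh
# Print only the labels whose basename matches the given regular expression.
# The basename is the part of the label name after the last '/'.
filter_labels() {
    pattern=$1
    while IFS= read -r label; do
        basename=${label##*/}
        if printf '%s\n' "$basename" | grep -Eq -- "$pattern"; then
            printf '%s\n' "$label"
        fi
    done
}

# ASSUMPTION: example label names chosen for illustration.
filter_labels '^cpu-cpuid' <<'EOF'
feature.node.kubernetes.io/cpu-cpuid.ADX
feature.node.kubernetes.io/kernel-version.major
feature.node.kubernetes.io/cpu-cpuid.AESNI
EOF
```

Because the prefix is stripped before matching, a pattern such as '^feature' would select nothing in this sketch.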
Setting core.noPublish
to true
disables all communication with the\nnfd-master and the Kubernetes API server. It is effectively a “dry-run” option.\nNFD-Worker runs feature detection normally, but no labeling requests are sent\nto nfd-master and no NodeFeature\nobjects are created or updated in the API server.
\n\n\nNOTE: Overridden by the\n
\n-no-publish
\ncommand line flag (if specified).
Default: false
Example:
\n\ncore:\n noPublish: true\n
The following options specify the logger configuration. Most of them can be\ndynamically adjusted at run-time.
\n\n\n\n\nNOTE: The logger options can also be specified via command line flags\nwhich take precedence over any corresponding config file options.
\n
If true, adds the file directory to the header of the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nLog to standard error as well as files.
\n\nDefault: false
Run-time configurable: yes
\n\nWhen logging hits line file:N, emit a stack trace.
\n\nDefault: empty
\n\nRun-time configurable: yes
\n\nIf non-empty, write log files in this directory.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nIf non-empty, use this log file.
\n\nDefault: empty
\n\nRun-time configurable: no
\n\nDefines the maximum size a log file can grow to. Unit is megabytes. If the\nvalue is 0, the maximum file size is unlimited.
\n\nDefault: 1800
Run-time configurable: no
\n\nLog to standard error instead of files
\n\nDefault: true
Run-time configurable: yes
\n\nIf true, avoid header prefixes in the log messages.
\n\nDefault: false
Run-time configurable: yes
\n\nIf true, avoid headers when opening log files.
\n\nDefault: false
Run-time configurable: no
\n\nLogs at or above this threshold go to stderr (default 2)
\n\nRun-time configurable: yes
\n\nNumber for the log level verbosity.
\n\nDefault: 0
Run-time configurable: yes
\n\nComma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
\n\nRun-time configurable: yes
\n\nThe sources
section contains feature source specific configuration parameters.
Prevent publishing cpuid features listed in this option.
\n\n\n\n\nNOTE: overridden by
\nsources.cpu.cpuid.attributeWhitelist
(if specified)
Default: [BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT,\nNX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SGXLC, SSE, SSE2, SSE3, SSE4.1,\nSSE4.2, SSSE3, TDX_GUEST]
Example:
\n\nsources:\n cpu:\n cpuid:\n attributeBlacklist: [MMX, MMXEXT]\n
Only publish the cpuid features listed in this option.
\n\n\n\n\nNOTE: takes precedence over
\nsources.cpu.cpuid.attributeBlacklist
Default: empty
\n\nExample:
\n\nsources:\n cpu:\n cpuid:\n attributeWhitelist: [AVX512BW, AVX512CD, AVX512DQ, AVX512F, AVX512VL]\n
Path of the kernel config file. If empty, NFD runs a search in the well-known\nstandard locations.
\n\nDefault: empty
\n\nExample:
\n\nsources:\n kernel:\n kconfigFile: \"/path/to/kconfig\"\n
Kernel configuration options to publish as feature labels.
\n\nDefault: [NO_HZ, NO_HZ_IDLE, NO_HZ_FULL, PREEMPT]
Example:
\n\nsources:\n kernel:\n configOpts: [NO_HZ, X86, DMI]\n
Configuration option to disable/enable hooks execution. Disabled by default.\nHooks are DEPRECATED since the v0.12.0 release and support will be removed in a\nfuture release. Use\nfeature files instead.
\n\n\n\n\nNOTE: The default NFD container image only supports statically linked\nbinaries. Use the full image variant\nfor a slightly more extensive environment that additionally supports bash and\nperl runtimes.
\n
Related tracking issues:
\n\nDefault: false
\n\nExample:
\n\nsources:\n local:\n hooksEnabled: true\n
List of PCI device class IDs for which to\npublish a label. Can be specified as a main class only (e.g. 03
) or full\nclass-subclass combination (e.g. 0300
) - the former implies that all\nsubclasses are accepted. The format of the labels can be further configured\nwith deviceLabelFields.
Default: [\"03\", \"0b40\", \"12\"]
Example:
\n\nsources:\n pci:\n deviceClassWhitelist: [\"0200\", \"03\"]\n
The set of PCI ID fields to use when constructing the name of the feature\nlabel. Valid fields are class
, vendor
, device
, subsystem_vendor
and\nsubsystem_device
.
Default: [class, vendor]
Example:
\n\nsources:\n pci:\n deviceLabelFields: [class, vendor, device]\n
With the example config above NFD would publish labels like:\nfeature.node.kubernetes.io/pci-<class-id>_<vendor-id>_<device-id>.present=true
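As a rough illustration of the naming scheme, the sketch below joins the selected field values with underscores in the order given by deviceLabelFields. The function name pci_label and the ID values are assumptions used only for this example.

```shell
#!/bin/sh
# Build a PCI feature label from the values of the selected ID fields,
# given in the same order as in deviceLabelFields.
pci_label() {
    # "$*" joins the arguments with the first character of IFS.
    joined=$(IFS=_; printf '%s' "$*")
    printf 'feature.node.kubernetes.io/pci-%s.present=true\n' "$joined"
}

# ASSUMPTION: 0300, 8086 and 1912 are illustrative class, vendor and device IDs.
pci_label 0300 8086 1912
```

With only class and vendor selected (the default), the same function would be called with two arguments.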
List of USB device class IDs for\nwhich to publish a feature label. The format of the labels can be further\nconfigured with deviceLabelFields.
\n\nDefault: [\"0e\", \"ef\", \"fe\", \"ff\"]
Example:
\n\nsources:\n usb:\n deviceClassWhitelist: [\"ef\", \"ff\"]\n
The set of USB ID fields from which to compose the name of the feature label.\nValid fields are class
, vendor
, device
and serial
.
Default: [class, vendor, device]
Example:
\n\nsources:\n usb:\n deviceLabelFields: [class, vendor]\n
With the example config above NFD would publish labels like:\nfeature.node.kubernetes.io/usb-<class-id>_<vendor-id>.present=true
List of rules to process in the custom feature source to create user-specific\nlabels. Refer to the documentation of the\ncustom feature source for\ndetails of the available rules and their configuration.
\n\nDefault: empty
\n\nExample:
\n\nsources:\n custom:\n - name: \"my custom rule\"\n labels:\n my-custom-feature: \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n e1000e: {op: Exists}\n - feature: pci.device\n matchExpressions:\n class: {op: In, value: [\"0200\"]}\n vendor: {op: In, value: [\"8086\"]}\n
NFD-Worker is preferably run as a Kubernetes DaemonSet. This assures\nre-labeling on regular intervals capturing changes in the system configuration\nand makes sure that new nodes are labeled as they are added to the cluster.\nWorker connects to the nfd-master service to advertise hardware features.
\n\nWhen run as a daemonset, nodes are re-labeled at a default interval of 60s.\nThis can be changed by using the\ncore.sleepInterval
\nconfig option.
The worker configuration file is watched and re-read on every change which\nprovides a mechanism of dynamic run-time reconfiguration. See\nworker configuration for more details.
\n\nNFD-Worker supports dynamic configuration through a configuration file. The\ndefault location is /etc/kubernetes/node-feature-discovery/nfd-worker.conf
,\nbut this can be changed by specifying the -config
command line flag.\nThe configuration file is re-read whenever it is modified, which makes run-time\nre-configuration of nfd-worker straightforward.
Worker configuration file is read inside the container, and thus, Volumes and\nVolumeMounts are needed to make your configuration available for NFD. The\npreferred method is to use a ConfigMap which provides easy deployment and\nre-configurability.
\n\nThe provided nfd-worker deployment templates create an empty configmap and\nmount it inside the nfd-worker containers. In kustomize deployments,\nconfiguration can be edited with:
\n\nkubectl -n ${NFD_NS} edit configmap nfd-worker-conf\n
In Helm deployments,\nWorker pod parameter\nworker.config
can be used to edit the respective configuration.
See\nnfd-worker configuration file reference\nfor more details.\nThe (empty-by-default)\nexample config\ncontains all available configuration options and can be used as a reference\nfor creating a configuration.
\n\nConfiguration options can also be specified via the -options
command line\nflag, in which case no mounts need to be used. The same format as in the config\nfile must be used, i.e. JSON (or YAML). For example:
-options='{\"sources\": { \"pci\": { \"deviceClassWhitelist\": [\"12\"] } } }'\n
Configuration options specified from the command line will override those read\nfrom the config file.
\n","dir":"/usage/","name":"nfd-worker.md","path":"usage/nfd-worker.md","url":"/usage/nfd-worker.html"},{"title":"TLS authentication","layout":"default","sort":5,"content":"\n\n\nDEPRECATED: this section only applies when the gRPC API is used, i.e.\nwhen the NodeFeature API is disabled via the
\n-enable-nodefeature-api=false
\nflag on both nfd-master and nfd-worker. The gRPC API is deprecated and will\nbe removed in a future release.
NFD supports mutual TLS authentication between the nfd-master and nfd-worker\ninstances. That is, nfd-worker and nfd-master both verify that the other end\npresents a valid certificate.
\n\nTLS authentication is enabled by specifying -ca-file
, -key-file
and\n-cert-file
args, on both the nfd-master and nfd-worker instances. The\ntemplate specs provided with NFD contain (commented out) example configuration\nfor enabling TLS authentication.
The Common Name (CN) of the nfd-master certificate must match the DNS name of\nthe nfd-master Service of the cluster. By default, nfd-master only checks that\nthe nfd-worker certificate has been signed by the specified root certificate (-ca-file).
\n\nAdditional hardening can be enabled by specifying -verify-node-name
in\nnfd-master args, in which case nfd-master verifies that the NodeName presented\nby nfd-worker matches the Common Name (CN) or a Subject Alternative Name (SAN)\nof its certificate. Note that -verify-node-name
complicates certificate\nmanagement and is not yet supported in the helm or kustomize deployment\nmethods.
cert-manager can be used to automate certificate\nmanagement between nfd-master and the nfd-worker pods.
\n\nThe NFD source code repository contains an example kustomize overlay and helm\nchart that can be used to deploy NFD with cert-manager supplied certificates\nenabled.
\n\nTo install cert-manager
itself, you can run:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml\n
Alternatively, you can refer to cert-manager documentation for other\ninstallation methods such as the Helm chart they provide.
\n\nTo use the kustomize overlay to install node-feature-discovery with TLS enabled,\nyou may use the following:
\n\nkubectl apply -k deployment/overlays/samples/cert-manager\n
To make use of the helm chart, override values.yaml
to enable both the\ntls.enabled
and tls.certManager
options. Note that if you do not enable\ntls.certManager
, helm will successfully install the application, but\ndeployment will wait until certificates are manually created, as demonstrated\nbelow.
See the sample installation commands in the Helm Deployment\nand Configuration sections above for how to either override\nindividual values, or provide a yaml file with which to override default\nvalues.
\n\nIf you do not wish to make use of cert-manager, the certificates can be\nmanually created and stored as secrets within the NFD namespace.
\n\nCreate a CA certificate
\n\nopenssl req -x509 -newkey rsa:4096 -keyout ca.key -nodes \\\n -subj \"/CN=nfd-ca\" -days 10000 -out ca.crt\n
Create a common openssl config file.
\n\ncat <<EOF > nfd-common.conf\n[ req ]\ndefault_bits = 4096\nprompt = no\ndefault_md = sha256\nreq_extensions = req_ext\ndistinguished_name = dn\n\n[ dn ]\nC = XX\nST = some-state\nL = some-city\nO = some-company\nOU = node-feature-discovery\n\n[ req_ext ]\nsubjectAltName = @alt_names\n\n[ v3_ext ]\nauthorityKeyIdentifier=keyid,issuer:always\nbasicConstraints=CA:FALSE\nkeyUsage=keyEncipherment,dataEncipherment\nextendedKeyUsage=serverAuth,clientAuth\nsubjectAltName=@alt_names\nEOF\n
Now, create the nfd-master certificate.
\n\ncat <<EOF > nfd-master.conf\n.include nfd-common.conf\n\n[ dn ]\nCN = nfd-master\n\n[ alt_names ]\nDNS.1 = nfd-master\nDNS.2 = nfd-master.node-feature-discovery.svc.cluster.local\nDNS.3 = localhost\nEOF\n\nopenssl req -new -newkey rsa:4096 -keyout nfd-master.key -nodes -out nfd-master.csr -config nfd-master.conf\n
Create certificates for nfd-worker and nfd-topology-updater
\n\ncat <<EOF > nfd-worker.conf\n.include nfd-common.conf\n\n[ dn ]\nCN = nfd-worker\n\n[ alt_names ]\nDNS.1 = nfd-worker\nDNS.2 = nfd-worker.node-feature-discovery.svc.cluster.local\nEOF\n\n# Config for topology updater is identical except for the DN and alt_names\nsed -e 's/worker/topology-updater/g' < nfd-worker.conf > nfd-topology-updater.conf\n\nopenssl req -new -newkey rsa:4096 -keyout nfd-worker.key -nodes -out nfd-worker.csr -config nfd-worker.conf\nopenssl req -new -newkey rsa:4096 -keyout nfd-topology-updater.key -nodes -out nfd-topology-updater.csr -config nfd-topology-updater.conf\n
Now, sign the certificates with the CA created earlier.
\n\nfor cert in nfd-master nfd-worker nfd-topology-updater; do\n echo signing $cert\n openssl x509 -req -in $cert.csr -CA ca.crt -CAkey ca.key \\\n -CAcreateserial -out $cert.crt -days 10000 \\\n -extensions v3_ext -extfile $cert.conf\ndone\n
Finally, turn these certificates into secrets.
\n\nfor cert in nfd-master nfd-worker nfd-topology-updater; do\n echo creating secret for $cert in node-feature-discovery namespace\n cat <<EOF | kubectl create -n node-feature-discovery -f -\n---\napiVersion: v1\nkind: Secret\ntype: kubernetes.io/tls\nmetadata:\n name: ${cert}-cert\ndata:\n ca.crt: $( cat ca.crt | base64 -w 0 )\n tls.crt: $( cat $cert.crt | base64 -w 0 )\n tls.key: $( cat $cert.key | base64 -w 0 )\nEOF\n\ndone\n
git clone https://github.com/kubernetes-sigs/node-feature-discovery\ncd node-feature-discovery\n
See customizing the build below for altering the\ncontainer image registry, for example.
\n\nmake\n
Optional; this example uses Docker.
\n\ndocker push <IMAGE_TAG>\n
The default architectures enabled for multi-arch builds are linux/amd64
\nand linux/arm64
. If more architectures are needed one can override the\nIMAGE_ALL_PLATFORMS
variable with a comma separated list of OS/ARCH
tuples.
make image-all\n
Currently docker
does not support loading of manifest-lists meaning the images\nare not shown when executing docker images
, see:\nbuildx issue #59.
make push-all\n
The resulting container image can be used in the same way on each arch by pulling\ne.g. node-feature-discovery:v0.15.7
without specifying the\narchitecture. The manifest-list will take care of providing the right\narchitecture image.
To use your published image from the step above instead of the\nregistry.k8s.io/nfd/node-feature-discovery
image, edit image
\nattribute in the spec template(s) to the new location\n(<registry-name>/<image-name>[:<version>]
).
The yamls
makefile generates a kustomization.yaml
matching your locally\nbuilt image and using the deployment/overlays/default
deployment. See\nbuild customization below for configurability, e.g.\nchanging the deployment namespace.
K8S_NAMESPACE=my-ns make yamls\nkubectl apply -k .\n
You can use alternative deployment methods by modifying the auto-generated\nkustomization file.
\n\nYou can also build the binaries locally
\n\nmake build\n
This will compile binaries under bin/
There are several Makefile variables that control the build process and the\nname of the resulting container image. The following are targeted for\nbuild customization and they can be specified via environment variables or\nmakefile overrides.
\n\nVariable | \nDescription | \nDefault value | \n
---|---|---|
HOSTMOUNT_PREFIX | \nPrefix of system directories for feature discovery (local builds) | \n/ (local builds) /host- (container builds) | \n
IMAGE_BUILD_CMD | \nCommand to build the image | \ndocker build | \n
IMAGE_BUILD_EXTRA_OPTS | \nExtra options to pass to build command | \nempty | \n
IMAGE_BUILDX_CMD | \nCommand to build and push multi-arch images with buildx | \nDOCKER_CLI_EXPERIMENTAL=enabled docker buildx build --platform=${IMAGE_ALL_PLATFORMS} --progress=auto --pull | \n
IMAGE_ALL_PLATFORMS | \nComma separated list of OS/ARCH tuples for multi-arch builds | \nlinux/amd64,linux/arm64 | \n
IMAGE_PUSH_CMD | \nCommand to push the image to remote registry | \ndocker push | \n
IMAGE_REGISTRY | \nContainer image registry to use | \nregistry.k8s.io/nfd | \n
IMAGE_TAG_NAME | \nContainer image tag name | \n<nfd version> | \n
IMAGE_EXTRA_TAG_NAMES | \nAdditional container image tag(s) to create when building image | \nempty | \n
K8S_NAMESPACE | \nnfd-master and nfd-worker namespace | \nnode-feature-discovery | \n
For example, to use a custom registry:
\n\nmake IMAGE_REGISTRY=<my custom registry uri>\n
A build tool other than Docker can be specified in two ways:
\n\nvia environment
\n\n IMAGE_BUILD_CMD=\"buildah bud\" make\n
by overriding the variable value
\n\n make IMAGE_BUILD_CMD=\"buildah bud\"\n
Unit tests are automatically run as part of the container image build. You can\nalso run them manually in the source code tree by running:
\n\nmake test\n
End-to-end tests are built on top of the e2e test framework of Kubernetes, and\nthey require a cluster to run them on. For running the tests on your test\ncluster you need to specify the kubeconfig to be used:
\n\nmake e2e-test KUBECONFIG=$HOME/.kube/config\n
There are several environment variables that can be used to customize the\ne2e-tests:
\n\nVariable | \nDescription | \nDefault value | \n
---|---|---|
KUBECONFIG | \nKubeconfig for running e2e-tests | \nempty | \n
E2E_TEST_CONFIG | \nParameterization file of e2e-tests (see example) | \nempty | \n
E2E_PULL_IF_NOT_PRESENT | \nTrue-ish value makes the image pull policy IfNotPresent (to be used only in e2e tests) | \nfalse | \n
E2E_TEST_FULL_IMAGE | \nRun e2e-test also against the Full Image tag | \nfalse | \n
E2E_GINKGO_LABEL_FILTER | \nGinkgo label filter to use for running e2e tests | \nempty | \n
OPENSHIFT | \nNon-empty value enables OpenShift specific support (only affects e2e tests) | \nempty | \n
\n\n\nDEPRECATED: Running NFD locally is deprecated and will be removed in a\nfuture release. It depends on the gRPC API which is deprecated and will be\nremoved in a future release. To run NFD locally, use the\n
\n-enable-nodefeature-api=false
flag.
You can run NFD locally, either directly on your host OS or in containers for\ntesting and development purposes. This may be useful e.g. for checking\nfeatures-detection.
\n\nWhen running as a standalone container labeling is expected to fail because\nKubernetes API is not available. Thus, it is recommended to use -no-publish
.\nAlso specify -crd-controller=false
and -enable-nodefeature-api=false
\ncommand line flags to disable the CRD controller and enable the gRPC API. For example:
$ export NFD_CONTAINER_IMAGE=registry.k8s.io/nfd/node-feature-discovery:v0.15.7\n$ docker run --rm --name=nfd-test ${NFD_CONTAINER_IMAGE} nfd-master -no-publish -crd-controller=false -enable-nodefeature-api=false\n2019/02/01 14:48:21 Node Feature Discovery Master <NFD_VERSION>\n2019/02/01 14:48:21 gRPC server serving on port: 8080\n
To run nfd-worker as a “stand-alone” container you need to run it in the same\nnetwork namespace as the nfd-master container:
\n\n$ docker run --rm --network=container:nfd-test ${NFD_CONTAINER_IMAGE} nfd-worker -enable-nodefeature-api=false\n2019/02/01 14:48:56 Node Feature Discovery Worker <NFD_VERSION>\n...\n
If you just want to try out feature discovery without connecting to nfd-master,\npass the -no-publish
flag to nfd-worker.
\n\n\nNOTE: Some feature sources need certain directories and/or files from the\nhost mounted inside the NFD container. Thus, you need to provide Docker with\nthe correct
\n--volume
options for them to work correctly when run\nstand-alone directly with docker run
. See\nthe default deployment\nfor up-to-date information about the required volume mounts.
To run nfd-topology-updater as a “stand-alone” container\nyou need to run it with the -no-publish
flag to disable communication to\nthe Kubernetes apiserver.
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-topology-updater -no-publish\n2019/02/01 14:48:56 Node Feature Discovery Topology Updater <NFD_VERSION>\n...\n
If you just want to try out resource topology discovery without connecting to\nthe Kubernetes API, pass the -no-publish
flag to nfd-topology-updater.
\n\n\nNOTE: NFD topology updater needs certain directories and/or files from\nthe host mounted inside the NFD container. Thus, you need to provide Docker\nwith the correct
\n\n--volume
options for them to work correctly when\nrun stand-alone directly with docker run
. See\nthe template spec\nfor up-to-date information about the required volume mounts. PodResource API is a prerequisite for\nnfd-topology-updater. Prior to Kubernetes v1.23, the
\nkubelet
must be\nstarted with the following flag:\n--feature-gates=KubeletPodResourcesGetAllocatable=true
. Starting from\nKubernetes v1.23, the GetAllocatableResources
is enabled by default through\nKubeletPodResourcesGetAllocatable
feature gate.
Another option for building NFD locally is the Tilt tool, which can build container\nimages, push them to a local registry and reload your Kubernetes pods automatically.\nWhen using Tilt, you don’t have to build container images and re-deploy your pods\nmanually; Tilt takes care of it. The Tiltfile is the configuration file\nfor Tilt and is located in the root directory. To develop NFD with Tilt, follow\nthe steps below.
\n\nTo start up your Tilt development environment, run
\n\ntilt up\n
at the root of your local NFD codebase. Tilt will start a web interface on\nlocalhost, port 10350. From the web interface, you can see how NFD worker\nand master are progressing and watch their build and runtime logs. Once your code changes\nare saved locally, Tilt will notice them, re-build the container image from the\ncurrent code, push the image to the registry and re-deploy NFD pods with the latest\ncontainer image.
\n\nTo override environment variables used in the Tiltfile during image build,\nexport them in your current terminal before starting Tilt.
\n\nexport IMAGE_TAG_NAME=\"v1\"\ntilt up\n
This will override the default value (master
) of IMAGE_TAG_NAME
variable defined\nin the Tiltfile.
All documentation resides under the\ndocs\ndirectory in the source tree. It is designed to be served as a html site by\nGitHub Pages.
\n\nBuilding the documentation is containerized to fix the build\nenvironment. The recommended way for developing documentation is to run:
\n\nmake site-serve\n
This will build the documentation in a container and serve it under\nlocalhost:4000/ making it easy to verify the results.\nAny changes made to the docs/
will automatically re-trigger a rebuild and are\nreflected in the served content and can be inspected with a browser refresh.
To just build the html documentation run:
\n\nmake site-build\n
This will generate html documentation under docs/_site/
.
To quickly view available command line flags execute nfd-topology-updater -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 \\\nnfd-topology-updater -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -config
flag specifies the path of the nfd-topology-updater\nconfiguration file to use.
Default: /etc/kubernetes/node-feature-discovery/nfd-topology-updater.conf
\n\nExample:
\n\nnfd-topology-updater -config=/opt/nfd/nfd-topology-updater.conf\n
The -no-publish
flag disables all communication with the nfd-master, making\nit a “dry-run” flag for nfd-topology-updater. NFD-Topology-Updater runs\nresource hardware topology detection normally, but no CR requests are sent to\nnfd-master.
Default: false
\n\nExample:
\n\nnfd-topology-updater -no-publish\n
The -oneshot
flag causes nfd-topology-updater to exit after one pass of\nresource hardware topology detection.
Default: false
\n\nExample:
\n\nnfd-topology-updater -oneshot -no-publish\n
The -metrics
flag specifies the port on which to expose\nPrometheus metrics. Setting this to 0 disables the\nmetrics server on nfd-topology-updater.
Default: 8081
\n\nExample:
\n\nnfd-topology-updater -metrics=12345\n
The -sleep-interval
specifies the interval between resource hardware\ntopology re-examination (and CR updates). Zero means no CR updates on an interval basis.
Default: 60s
\n\nExample:
\n\nnfd-topology-updater -sleep-interval=1h\n
The -watch-namespace
specifies the namespace to ensure that resource\nhardware topology examination only happens for the pods running in the\nspecified namespace. Pods that are not running in the specified namespace\nare not considered during resource accounting. This is particularly useful\nfor testing/debugging purposes. A “*” value means that all pods are\nconsidered during the accounting process.
Default: “*”
\n\nExample:
\n\nnfd-topology-updater -watch-namespace=rte\n
The -kubelet-config-uri
specifies the path to the Kubelet’s configuration.\nNote that the URI could either be a local host file or an HTTP endpoint.
Default: https://${NODE_ADDRESS}:10250/configz
Example:
\n\nnfd-topology-updater -kubelet-config-uri=file:///var/lib/kubelet/config.yaml\n
The -api-auth-token-file
specifies the path to the api auth token file\nwhich is used to retrieve Kubelet’s configuration from Kubelet secure port,\nonly taking effect when -kubelet-config-uri
is https.\nNote that this token file must bind to a role that has the get
capability to\nnodes/proxy
resources.
Default: /var/run/secrets/kubernetes.io/serviceaccount/token
Example:
\n\nnfd-topology-updater -api-auth-token-file=/var/run/secrets/kubernetes.io/serviceaccount/token\n
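The interplay between -kubelet-config-uri and the token file can be summarized with a small shell sketch. It only restates the rules given above (the token is consulted for https URIs only) and does not reproduce how nfd-topology-updater parses the flag; the function name describe_kubelet_config_uri and the output wording are assumptions.

```shell
#!/bin/sh
# Classify a -kubelet-config-uri value by scheme and report whether the
# API auth token file would be consulted for it.
describe_kubelet_config_uri() {
    case $1 in
        file://*)  printf 'local file %s, token not used\n' "${1#file://}" ;;
        https://*) printf 'HTTPS endpoint %s, token used\n' "$1" ;;
        http://*)  printf 'HTTP endpoint %s, token not used\n' "$1" ;;
        *)         printf 'unrecognized URI %s\n' "$1"; return 1 ;;
    esac
}

describe_kubelet_config_uri file:///var/lib/kubelet/config.yaml
```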
The -podresources-socket
specifies the path to the Unix socket where kubelet\nexports a gRPC service to enable discovery of in-use CPUs and devices, and to\nprovide metadata for them.
Default: /host-var/lib/kubelet/pod-resources/kubelet.sock
\n\nExample:
\n\nnfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock\n
Enables computing and reporting the pod set fingerprint in the NRT.\nA pod fingerprint is a compact representation of the “node state” regarding resources.
\n\nDefault: false
Example:
\n\nnfd-topology-updater -pods-fingerprint\n
The -kubelet-state-dir
specifies the path to the Kubelet state directory,\nwhere state and checkpoint files are stored.\nThe files are mounted as read-only and cannot be changed by the updater.\nEnabled by default.\nPassing an empty string will disable the watching.
Default: /host-var/lib/kubelet
\n\nExample:
\n\nnfd-topology-updater -kubelet-state-dir=/var/lib/kubelet\n
NFD-Topology-Updater is preferably run as a Kubernetes DaemonSet.\nThis assures re-examination at regular intervals\nand/or per pod life-cycle events, capturing changes in the allocated\nresources and hence the allocatable resources on a per-zone basis by updating\nNodeResourceTopology custom resources.\nIt makes sure that new NodeResourceTopology instances are created for each new\nnode that gets added to the cluster.
\n\nBecause of the design and implementation of Kubernetes, only resources exclusively\nallocated to Guaranteed Quality of Service\npods will be accounted.\nThis includes\nCPU cores,\nmemory\nand\ndevices.
\n\nWhen run as a daemonset, nodes are re-examined for the allocated resources\n(to determine the information of the allocatable resources on a per-zone basis\nwhere a zone can be a NUMA node) at an interval specified using the\n-sleep-interval
\noption. The default sleep interval is set to 60s\nwhich is the value when no -sleep-interval is specified.\nThe re-examination can be disabled by setting the sleep-interval to 0.
Another option is to configure the updater to update\nthe allocated resources per pod life-cycle events.\nThe updater will monitor the checkpoint file stated in\n-kubelet-state-dir
\nand trigger an update for every change that occurs in the files.
In addition, it can avoid examining specific allocated resources\ngiven a configuration of resources to exclude via -excludeList
Kubelet PodResource API with the\nGetAllocatableResources functionality enabled is a\nprerequisite for nfd-topology-updater to be able to run (i.e. Kubernetes v1.21\nor later is required).
\n\nPreceding Kubernetes v1.23, the kubelet
must be started with\n--feature-gates=KubeletPodResourcesGetAllocatable=true
.
Starting from Kubernetes v1.23, the KubeletPodResourcesGetAllocatable
\nfeature gate is enabled by default.
NFD-Topology-Updater supports configuration through a configuration file. The\ndefault location is /etc/kubernetes/node-feature-discovery/nfd-topology-updater.conf
,\nbut this can be changed by specifying the -config
command line flag.
\n\n\nNOTE: unlike nfd-worker, dynamic configuration updates are not supported.
\n
Topology-Updater configuration file is read inside the container,\nand thus, Volumes and VolumeMounts are needed\nto make your configuration available for NFD.\nThe preferred method is to use a ConfigMap\nwhich provides easy deployment and re-configurability.
\n\nThe provided nfd-topology-updater deployment templates\ncreate an empty configmap\nand mount it inside the nfd-topology-updater containers.\nIn kustomize deployments, configuration can be edited with:
\n\nkubectl -n ${NFD_NS} edit configmap nfd-topology-updater-conf\n
In Helm deployments,\nTopology Updater parameters\ntopologyUpdater.config
can be used to edit the respective configuration.
See\nnfd-topology-updater configuration file reference\nfor more details.\nThe (empty-by-default)\nexample config\ncontains all available configuration options and can be used as a reference\nfor creating a configuration.
\n\n\n","dir":"/usage/","name":"nfd-topology-updater.md","path":"usage/nfd-topology-updater.md","url":"/usage/nfd-topology-updater.html"},{"title":"Contributing","layout":"default","sort":6,"content":"You can reach us via the following channels:
\n\nThis is a\nSIG-node\nsubproject, hosted under the\nKubernetes SIGs organization in Github.\nThe project was established in 2016 and was migrated to Kubernetes SIGs in 2018.
\n\nThis is open source software released under the Apache 2.0 License.
\n","dir":"/contributing/","name":"index.md","path":"contributing/index.md","url":"/contributing/"},{"title":"Uninstallation","layout":"default","sort":6,"content":"Follow the uninstallation instructions of the deployment method used\n(kustomize,\nhelm or\noperator).
\n\nNFD-Master has a special -prune
command line flag for removing all\nnfd-related node labels, annotations, extended resources and taints from the\ncluster.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.7\nkubectl -n node-feature-discovery wait job.batch/nfd-master --for=condition=complete && \\\n kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.7\n
\n\n","dir":"/deployment/","name":"uninstallation.md","path":"deployment/uninstallation.md","url":"/deployment/uninstallation.html"},{"title":"Topology-Updater config reference","layout":"default","sort":6,"content":"NOTE: You must run prune before removing the RBAC rules (serviceaccount,\nclusterrole and clusterrolebinding).
\n
See the\nsample configuration file\nfor a full example configuration.
\n\nThe excludeList
specifies a key-value map of allocated resources\nthat should not be examined by the topology-updater\nagent per node.\nEach key is a node name with a value as a list of resources\nthat should not be examined by the agent for that specific node.
Default: empty
\n\nExample:
\n\nexcludeList:\n nodeA: [hugepages-2Mi]\n nodeB: [memory]\n nodeC: [cpu, hugepages-2Mi]\n
excludeList.*
is a special value that is used to specify all nodes.\nA resource listed under this key is excluded from all nodes.
Default: empty
\n\nExample:
\n\nexcludeList:\n '*': [hugepages-2Mi]\n
NFD-GC (NFD Garbage-Collector) is preferably run as a Kubernetes deployment\nwith one replica. It makes sure that all\nNodeFeature and\nNodeResourceTopology objects\nhave corresponding nodes and removes stale objects for non-existent nodes.
\n\nThe daemon watches for Node deletion events and removes NodeFeature and\nNodeResourceTopology objects upon them. It also runs periodically to make sure\nno node delete event was missed and to remove any NodeFeature or\nNodeResourceTopology objects that were created without corresponding node. The\ndefault garbage collector interval is set to 1h which is the value when no\n-gc-interval is specified.
\n\nIn Helm deployments (see\ngarbage collector parameters)\nNFD-GC will only be deployed when enableNodeFeatureApi
or\ntopologyUpdater.enable
is set to true.
Metrics are configured to be exposed using the Prometheus operator\nAPIs by default. To expose metrics this way, you need to install the Prometheus operator in your cluster.\nBy default NFD Master and Worker expose metrics on port 8081.
\n\nThe exposed metrics are
\n\nMetric | \nType | \nDescription | \n
---|---|---|
nfd_master_build_info | \n Gauge | \nVersion from which nfd-master was built | \n
nfd_worker_build_info | \n Gauge | \nVersion from which nfd-worker was built | \n
nfd_gc_build_info | \n Gauge | \nVersion from which nfd-gc was built | \n
nfd_topology_updater_build_info | \n Gauge | \nVersion from which nfd-topology-updater was built | \n
nfd_node_update_requests_total | \n Counter | \nNumber of node update requests received by the master over gRPC | \n
nfd_node_updates_total | \n Counter | \nNumber of nodes updated | \n
nfd_node_update_failures_total | \n Counter | \nNumber of node update failures | \n
nfd_node_labels_rejected_total | \n Counter | \nNumber of node labels rejected by nfd-master | \n
nfd_node_extendedresources_rejected_total | \n Counter | \nNumber of node extended resources rejected by nfd-master | \n
nfd_node_taints_rejected_total | \n Counter | \nNumber of node taints rejected by nfd-master | \n
nfd_nodefeaturerule_processing_duration_seconds | \n Histogram | \nTime taken to process NodeFeatureRule objects | \n
nfd_nodefeaturerule_processing_errors_total | \n Counter | \nNumber of errors encountered while processing NodeFeatureRule objects | \n
nfd_feature_discovery_duration_seconds | \n Histogram | \nTime taken to discover features on a node | \n
nfd_topology_updater_scan_errors_total | \n Counter | \nNumber of errors in scanning resource allocation of pods. | \n
nfd_gc_objects_deleted_total | \n Counter | \nNumber of NodeFeature and NodeResourceTopology objects garbage collected. | \n
nfd_gc_object_delete_failures_total | \n Counter | \nNumber of errors in deleting NodeFeature and NodeResourceTopology objects. | \n
To deploy NFD with metrics enabled using kustomize, you can use the\nprometheus overlay.
\n\nBy default metrics are enabled when deploying NFD via Helm. To enable Prometheus\nto scrape metrics from NFD, you need to pass the following values to Helm:
\n\n--set prometheus.enable=true\n
For more info on Helm deployment, see Helm.
\n\nIt is recommended to specify\n--set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
\nwhen deploying prometheus-operator via Helm to enable the prometheus-operator\nto scrape metrics from any PodMonitor.
Alternatively, set labels on the PodMonitor via the helm parameter prometheus.labels
\nto control which Prometheus instances will scrape this PodMonitor.
NFD contains an example Grafana dashboard. You can import\nexamples/grafana-dashboard.json
\nto your Grafana instance to visualize the NFD metrics.
To quickly view available command line flags execute nfd-gc -help
.\nIn a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 \\\nnfd-gc -help\n
Print usage and exit.
\n\nPrint version and exit.
\n\nThe -gc-interval
specifies the interval between periodic garbage collector runs.
Default: 1h
\n\nExample:
\n\nnfd-gc -gc-interval=1h\n
NFD uses some Kubernetes custom resources.
\n\nNodeFeature is an NFD-specific custom resource for communicating node\nfeatures and node labeling requests. The nfd-master pod watches for NodeFeature\nobjects, labels nodes as specified and uses the listed features as input when\nevaluating NodeFeatureRules. NodeFeature objects can be\nused for implementing 3rd party extensions (see\ncustomization guide for more\ndetails).
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeature\nmetadata:\n labels:\n nfd.node.kubernetes.io/node-name: node-1\n name: node-1-vendor-features\nspec:\n features:\n instances:\n vendor.device:\n elements:\n - attributes:\n model: \"xpu-1\"\n memory: \"4000\"\n type: \"fast\"\n - attributes:\n model: \"xpu-2\"\n memory: \"16000\"\n type: \"slow\"\n labels:\n vendor-xpu-present: \"true\"\n
NodeFeatureRule is an NFD-specific custom resource that is designed for\nrule-based custom labeling of nodes. NFD-Master watches for NodeFeatureRule\nobjects in the cluster and labels nodes according to the rules within. Example use\ncases include application-specific labeling in specific environments and\nrules distributed by hardware vendors to create specific labels for their\ndevices.
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: example-rule\nspec:\n rules:\n - name: \"example rule\"\n labels:\n \"example-custom-feature\": \"true\"\n # Label is created if all of the rules below match\n matchFeatures:\n # Match if \"veth\" kernel module is loaded\n - feature: kernel.loadedmodule\n matchExpressions:\n veth: {op: Exists}\n # Match if any PCI device with vendor 8086 exists in the system\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"8086\"]}\n
See the\nCustomization guide\nfor full documentation of the NodeFeatureRule resource and its usage.
\n\nThe\ndeployment/nodefeaturerule/samples/
\ndirectory contains sample NodeFeatureRule objects that replicate the built-in\ndefault feature labels generated by NFD. The sample rules can be used as a base\nto customize NFD feature labels. To use them in place of the NFD built-in\nlabels, the corresponding feature source(s) of nfd-worker should be disabled\nwith the\ncore.labelSources
\nconfiguration option.
When run with NFD-Topology-Updater, NFD creates NodeResourceTopology objects\ncorresponding to node resource hardware topology such as:
\n\napiVersion: topology.node.k8s.io/v1alpha1\nkind: NodeResourceTopology\nmetadata:\n name: node1\ntopologyPolicies: [\"SingleNUMANodeContainerLevel\"]\nzones:\n - name: node-0\n type: Node\n resources:\n - name: cpu\n capacity: 20\n allocatable: 16\n available: 10\n - name: vendor/nic1\n capacity: 3\n allocatable: 3\n available: 3\n - name: node-1\n type: Node\n resources:\n - name: cpu\n capacity: 30\n allocatable: 30\n available: 15\n - name: vendor/nic2\n capacity: 6\n allocatable: 6\n available: 6\n - name: node-2\n type: Node\n resources:\n - name: cpu\n capacity: 30\n allocatable: 30\n available: 15\n - name: vendor/nic1\n capacity: 3\n allocatable: 3\n available: 3\n
The NodeResourceTopology objects created by NFD can be used to gain insight\ninto the allocatable resources along with the granularity of those resources at\na per-zone level (represented by node-0, node-1 and node-2 in the above example) or can\nbe used by an external entity (e.g. topology-aware scheduler plugin) to take an\naction based on the gathered information.
\n\n\n","dir":"/usage/","name":"custom-resources.md","path":"usage/custom-resources.md","url":"/usage/custom-resources.html"},{"title":"Kubectl plugin cmdline reference","layout":"default","sort":8,"content":"To quickly view available command line flags execute kubectl nfd -help
.
Print usage and exit.
\n\nValidate a NodeFeatureRule file.
\n\nThe --nodefeature-file
flag specifies the path to the NodeFeatureRule file\nto validate.
Test a NodeFeatureRule file against a node without applying it.
\n\nThe --kubeconfig
flag specifies the path to the kubeconfig file to use for\nCLI requests.
The --namespace
flag specifies the namespace to use for CLI requests.\nDefault: default
.
The --nodename
flag specifies the name of the node to test the\nNodeFeatureRule against.
The --nodefeaturerule-file
flag specifies the path to the NodeFeatureRule file\nto test.
Process a NodeFeatureRule file against a NodeFeature file.
\n\nThe --nodefeaturerule-file
flag specifies the path to the NodeFeatureRule file\nto test.
The --nodefeature-file
flag specifies the path to the NodeFeature file to test.
NFD provides multiple extension points for vendor and application specific\nlabeling:
\n\nNodeFeature
objects can be\nused to communicate “raw” node features and node labeling requests to\nnfd-master.NodeFeatureRule
objects provide a way to\ndeploy custom labeling rules via the Kubernetes API.local
feature source of nfd-worker creates\nlabels by reading text files and executing hooks.custom
feature source of nfd-worker creates\nlabels based on user-specified rules.NodeFeature objects provide a way for 3rd party extensions to advertise custom\nfeatures, both as “raw” features that serve as input to\nNodeFeatureRule objects and as feature\nlabels directly.
\n\nNote that RBAC rules must be created for each extension for them to be able to\ncreate and manipulate NodeFeature objects in their namespace.
\n\nThe NodeFeature CRD API can be disabled with the\n-enable-nodefeature-api=false
command line flag. This flag must be specified\nfor both nfd-master and nfd-worker as it will enable the gRPC communication\nbetween them. Note that the gRPC API is DEPRECATED and will be removed in a\nfuture release, at which point the NodeFeature API cannot be disabled.
Consider the following referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeature\nmetadata:\n labels:\n nfd.node.kubernetes.io/node-name: node-1\n name: vendor-features-for-node-1\nspec:\n # Features for NodeFeatureRule matching\n features:\n flags:\n vendor.flags:\n elements:\n feature-x: {}\n feature-y: {}\n attributes:\n vendor.config:\n elements:\n setting-a: \"auto\"\n knob-b: \"123\"\n instances:\n vendor.devices:\n elements:\n - attributes:\n model: \"dev-1000\"\n vendor: \"acme\"\n - attributes:\n model: \"dev-2000\"\n vendor: \"acme\"\n # Labels to be created\n labels:\n vendor.io/feature.enabled: \"true\"\n
The object targets node named node-1
. It lists two “flag type” features under\nthe vendor.flags
domain, two “attribute type” features under the\nvendor.config
domain and two “instance type” features under the\nvendor.devices
domain. These features will not be directly affecting the node\nlabels but they will be used as input when the\nNodeFeatureRule
objects are evaluated.
In addition, the example directly requests the\nvendor.io/feature.enabled=true
node label to be created.
The nfd.node.kubernetes.io/node-name=<node-name>
label must be in place for each\nNodeFeature object, as NFD uses it to determine which node the object targets.
Features are divided into three different types: flag, attribute and instance features.
\n\nNodeFeatureRule
objects provide an easy way to create vendor or application\nspecific labels and taints. It uses a flexible rule-based mechanism for creating\nlabels and optionally taints based on node features.
Consider the following referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: my-sample-rule-object\nspec:\n rules:\n - name: \"my sample rule\"\n labels:\n \"feature.node.kubernetes.io/my-sample-feature\": \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n dummy: {op: Exists}\n - feature: kernel.config\n matchExpressions:\n X86: {op: In, value: [\"y\"]}\n
It specifies one rule which creates node label\nfeature.node.kubernetes.io/my-sample-feature=true
if both of the following\nconditions are true (matchFeatures
implements a logical AND over the\nmatchers):
the dummy network driver module has been loaded
and the X86 option in the kernel config is set to =y.
Create a NodeFeatureRule
with a yaml file:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/v0.15.7/examples/nodefeaturerule.yaml\n
Now, on X86 platforms the feature label appears after doing modprobe dummy
on\na system and correspondingly the label is removed after rmmod dummy
. Note a\nre-labeling delay up to the sleep-interval of nfd-worker (1 minute by default).
See Feature rule format for detailed description of\navailable fields and how to write labeling rules.
\n\nThis feature is experimental.
\n\nIn some circumstances, it is desirable to keep nodes with specialized hardware\naway from running general workload and instead leave them for workloads that\nneed the specialized hardware. One way to achieve it is to taint the nodes with\nthe specialized hardware and add corresponding toleration to pods that require\nthe special hardware. NFD offers node tainting functionality which is disabled\nby default. User can define one or more custom taints via the taints
field of\nthe NodeFeatureRule CR. The same rule-based mechanism is applied here, and\nNFD only taints nodes that match the rule.
To enable the tainting feature, --enable-taints
flag needs to be set to true
.\nIf the flag --enable-taints
is set to false
(i.e. disabled), taints defined in\nthe NodeFeatureRule CR have no effect and will be ignored by the NFD master.
See documentation of the taints field for a detailed description of how\nto specify taints in the NodeFeatureRule object.
\n\n\n\n\nNOTE: Before enabling any taints, make sure to edit nfd-worker daemonset\nto tolerate the taints to be created. Otherwise, already running pods that do\nnot tolerate the taint are evicted immediately from the node including the\nnfd-worker pod.
\n
NFD-Worker has a special feature source named local
which is an integration\npoint for external feature detectors. It provides a mechanism for pluggable\nextensions, allowing the creation of new user-specific features and even\noverriding built-in labels.
The local
feature source has two methods for detecting features, feature\nfiles and hooks (deprecated). The features discovered by the local
source can\nfurther be used in label rules specified in\nNodeFeatureRule
objects and the\ncustom
feature source.
\n\n\nNOTE: Be careful when creating and/or updating hook or feature files\nwhile NFD is running. To avoid race conditions you should write\ninto a temporary file, and atomically create/update the original file by\ndoing a file rename operation. NFD ignores dot files,\nso the temporary file can be written to the same directory and renamed\n(
\n.my.feature
->my.feature
) once file is complete. Both file names should\n(obviously) be unique for the given application.
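The rename approach described in the note can be sketched in Python as follows. This is a minimal illustration and not part of NFD: the helper name `write_feature_file` is hypothetical, and the sketch assumes that the temporary dot file and the final file live in the same directory (and therefore on the same filesystem), so that the rename is atomic.

```python
import os


def write_feature_file(directory, name, lines):
    """Write a feature file via a hidden temporary file and an atomic rename.

    `write_feature_file` is a hypothetical helper, not an NFD API.
    """
    tmp_path = os.path.join(directory, "." + name)  # dot file, ignored by NFD
    final_path = os.path.join(directory, name)
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before renaming
    # os.replace() renames atomically when both paths are on one filesystem.
    os.replace(tmp_path, final_path)
    return final_path
```

A caller would pass the features directory and the complete list of feature lines, so that a reader of the directory sees either the old file or the fully written new one.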
Consider a plaintext file\n/etc/kubernetes/node-feature-discovery/features.d/my-features
\nhaving the following contents (or alternatively a shell script\n/etc/kubernetes/node-feature-discovery/source.d/my-hook.sh
having the\nfollowing stdout output):
feature.node.kubernetes.io/my-feature.1\nfeature.node.kubernetes.io/my-feature.2=myvalue\nvendor.io/my-feature.3=456\n
This will translate into the following node labels:
\n\nfeature.node.kubernetes.io/my-feature.1: \"true\"\nfeature.node.kubernetes.io/my-feature.2: \"myvalue\"\nvendor.io/my-feature.3: \"456\"\n
The local
source reads files found in\n/etc/kubernetes/node-feature-discovery/features.d/
. File content is parsed\nand translated into node labels, see the input format below.
DEPRECATED The local
source executes hooks found in\n/etc/kubernetes/node-feature-discovery/source.d/
. The hook files must be\nexecutable and they are supposed to print all discovered features in stdout
.\nSince NFD v0.13 the default container image only supports statically linked ELF\nbinaries.
stderr
output of hooks is propagated to NFD log so it can be used for\ndebugging and logging.
NFD tries to execute any regular files found from the hooks directory.\nAny additional data files the hook might need (e.g. a configuration file)\nshould be placed in a separate directory to avoid NFD unnecessarily\ntrying to execute them. A subdirectory under the hooks directory can be used,\nfor example /etc/kubernetes/node-feature-discovery/source.d/conf/
.
\n\n\nNOTE: Hooks are being DEPRECATED and will be removed in a future release.\nStarting from release v0.14 hooks are disabled by default and can be enabled\nvia
\nsources.local.hooksEnabled
field in the worker configuration.
sources:\n local:\n hooksEnabled: true # hooks are disabled (false) by default since v0.14\n
\n\n\nNOTE: NFD will blindly run any executables placed/mounted in the hooks\ndirectory. It is the user’s responsibility to review the hooks for e.g.\npossible security implications.
\n\nNOTE: The full image variant\nprovides backwards-compatibility with older NFD versions by including a more\nexpanded environment, supporting bash and perl runtimes.
\n
The hook stdout and feature files are expected to contain features in simple\nkey-value pairs, separated by newlines:
\n\n# This is a comment\n<key>[=<value>]\n
The label value defaults to true
, if not specified.
Label namespace must be specified with <namespace>/<name>[=<value>]
.
\n\n\nNOTE: The feature file size limit is 64kB. The feature file will be\nignored if the size limit is exceeded.
\n
Comment lines (starting with #
) are ignored.
Adding the following line anywhere in a feature file defines the date when\nits content expires and is ignored:
\n\n# +expiry-time=2023-07-29T11:22:33Z\n
The expiry-time value stays the same during the processing of the\nfeature file until another expiry-time directive is encountered.\nConsider the following file:
\n\n# +expiry-time=2012-07-28T11:22:33Z\nvendor.io/feature1=featureValue\n\n# +expiry-time=2080-07-28T11:22:33Z\nvendor.io/feature2=featureValue2\n\n# +expiry-time=2070-07-28T11:22:33Z\nvendor.io/feature3=featureValue3\n\n# +expiry-time=2002-07-28T11:22:33Z\nvendor.io/feature4=featureValue4\n
After processing the above file, only vendor.io/feature2
and\nvendor.io/feature3
would be included in the list of accepted features.
\n\n\nNOTE: The time format supported is RFC3339. Also, the
\nexpiry-time
\ntag is only evaluated in each re-discovery period, and the expiration of\nnode labels is not tracked.
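The expiry handling described above can be illustrated with a small Python sketch. It is not NFD's implementation: the function name `parse_feature_file` is hypothetical, and treating a feature as expired when its expiry time is equal to or earlier than the current time is an assumption of this sketch. It follows the input format from this section: comment lines are skipped, a value defaults to `true`, and an expiry-time directive applies until the next one is encountered.

```python
from datetime import datetime, timezone

EXPIRY_DIRECTIVE = "+expiry-time="


def parse_feature_file(text, now=None):
    """Return the non-expired features of a feature file as a dict.

    Hypothetical helper for illustration; not an NFD API.
    """
    now = now or datetime.now(timezone.utc)
    expiry = None  # no expiry until a directive is seen
    features = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#"):
            directive = line[1:].strip()
            if directive.startswith(EXPIRY_DIRECTIVE):
                stamp = directive[len(EXPIRY_DIRECTIVE):]
                # RFC3339 "Z" suffix means UTC; spell it out for fromisoformat
                expiry = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
            continue  # all other comment lines are ignored
        if expiry is not None and expiry <= now:
            continue  # assumption: expired when expiry is not in the future
        key, sep, value = line.partition("=")
        features[key] = value if sep else "true"
    return features
```

Applied to the four-feature file above with any current time between 2012 and 2070, the function returns only `vendor.io/feature2` and `vendor.io/feature3`.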
To exclude specific features from the local.feature
Feature, you can use the\n# +no-feature
directive. The # +no-label
directive causes the feature to\nbe excluded from the local.label
Feature and a node label not to be generated.
Consider the following file:
\n\n# +no-feature\nvendor.io/label-only=value\n\nvendor.io/my-feature=value\n\nvendor.io/foo=bar\n\n# +no-label\nfoo=baz\n
Processing the above file would result in the following Features:
\n\nlocal.features:\n foo: baz\n vendor.io/my-feature: value\nlocal.labels:\n vendor.io/label-only: value\n vendor.io/my-feature: value\n
and the following labels added to the Node:
\n\nvendor.io/label-only=value\nvendor.io/my-feature=value\n
\n\n\nNOTE: Unprefixed label names (like
\nfoo=bar
) should not be used.\nIn NFD v0.15 unprefixed names will be automatically prefixed\nwith feature.node.kubernetes.io/
but this will change in a future version\n(see\nautoDefaultNs config option).\nUnprefixed names for plain Features (tagged with # +no-label
) can be used\nwithout restrictions, however.
The standard NFD deployments contain hostPath
mounts for\n/etc/kubernetes/node-feature-discovery/source.d/
and\n/etc/kubernetes/node-feature-discovery/features.d/
, making these directories\nfrom the host available inside the nfd-worker container.
One use case for the feature files and hooks is detecting features in other\nPods outside NFD, e.g. in Kubernetes device plugins. Using the same\nhostPath
mounts for /etc/kubernetes/node-feature-discovery/source.d/
and\n/etc/kubernetes/node-feature-discovery/features.d/
in the side-car (e.g.\ndevice plugin) creates a shared area for deploying feature files and hooks to\nNFD. NFD periodically scans the directories and reads any feature files and\nruns any hooks it finds.
The custom
feature source in nfd-worker provides a rule-based mechanism for\nlabel creation, similar to the\nNodeFeatureRule
objects. The difference is\nthat the rules are specified in the worker configuration instead of a\nKubernetes API object.
See worker configuration\nfor instructions on how to set up and manage the worker configuration.
\n\nConsider the following referential configuration for nfd-worker:
\n\ncore:\n labelSources: [\"custom\"]\nsources:\n custom:\n - name: \"my sample rule\"\n labels:\n \"feature.node.kubernetes.io/my-sample-feature\": \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n dummy: {op: Exists}\n - feature: kernel.config\n matchExpressions:\n X86: {op: In, value: [\"y\"]}\n
It specifies one rule which creates node label\nfeature.node.kubernetes.io/my-sample-feature=true
if both of the following\nconditions are true (matchFeatures
implements a logical AND over the\nmatchers):
the dummy network driver module has been loaded
and the X86 option in the kernel config is set to =y.
In addition, the configuration only enables the custom
source, disabling all\nbuilt-in labels.
Now, on X86 platforms the feature label appears after doing modprobe dummy
on\na system and correspondingly the label is removed after rmmod dummy
. Note a\nre-labeling delay up to the sleep-interval of nfd-worker (1 minute by default).
In addition to the rules defined in the nfd-worker configuration file, the\ncustom
feature source can read more configuration files located in the\n/etc/kubernetes/node-feature-discovery/custom.d/
directory. This makes more\ndynamic and flexible configuration easier.
As an example, consider having file\n/etc/kubernetes/node-feature-discovery/custom.d/my-rule.yaml
with the\nfollowing content:
- name: \"my e1000 rule\"\n labels:\n \"feature.node.kubernetes.io/e1000.present\": \"true\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n e1000: {op: Exists}\n
This simple rule will create feature.node.kubernetes.io/e1000.present=true
\nlabel if the e1000
kernel module has been loaded.
The\nsamples/custom-rules
\nkustomize overlay sample contains an example for deploying a custom rule from a\nConfigMap.
Feature labels have the following format:
\n\n<namespace>/<name> = <value>\n
The namespace part (i.e. prefix) of the labels is controlled by nfd:
\n\nfeature.node.kubernetes.io
.-deny-label-ns
\ncommand line flag of nfd-master\n -extra-label-ns
\ncommand line flag of nfd-master.\ne.g: nfd-master -deny-label-ns=\"*\" -extra-label-ns=example.com
This section describes the rule format used in\nNodeFeatureRule
objects and in the\nconfiguration of the custom
feature source.
It is based on a generic feature matcher that covers all features discovered by\nnfd-worker. The rules rely on a unified data model of the available features\nand a generic expression-based format. Features that can be used in the rules\nare described in detail in available features below.
\n\nTake this rule as a referential example:
\n\n - name: \"my feature rule\"\n labels:\n \"feature.node.kubernetes.io/my-special-feature\": \"my-value\"\n matchFeatures:\n - feature: cpu.cpuid\n matchExpressions:\n AVX512F: {op: Exists}\n - feature: kernel.version\n matchExpressions:\n major: {op: In, value: [\"5\"]}\n minor: {op: Gt, value: [\"1\"]}\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"8086\"]}\n class: {op: In, value: [\"0200\"]}\n
This will yield feature.node.kubernetes.io/my-special-feature=my-value
node\nlabel if all of these are true (matchFeatures
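implements a logical AND over\nthe matchers): the CPU supports AVX512F, the kernel major version is 5 and its minor version is greater than 1, and a PCI device with vendor ID 8086 and class 0200 is present.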
The .name
field is required and used as an identifier of the rule.
The .labels
is a map of the node labels to create if the rule matches.
Take this rule as a referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: my-sample-rule-object\nspec:\n rules:\n - name: \"my dynamic label value rule\"\n labels:\n feature.node.kubernetes.io/linux-lsm-enabled: \"@kernel.config.LSM\"\n feature.node.kubernetes.io/custom-label: \"customlabel\"\n
Label linux-lsm-enabled
uses the @
notation for dynamic values.\nThe value of the label will be the value of the attribute LSM
\nof the feature kernel.config
.
The @<feature-name>.<element-name>
format can be used to inject values of\ndetected features to the label. See\navailable features for possible values to use.
This will yield the following node labels:
\n\n labels:\n ...\n feature.node.kubernetes.io/linux-lsm-enabled: apparmor\n feature.node.kubernetes.io/custom-label: \"customlabel\"\n
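The `@` substitution described above can be sketched in Python. This is only an illustration of the notation, not NFD code: the function name `resolve_label_values` and the dictionary-based feature model are hypothetical, and the sketch assumes that a feature name always consists of two dot-separated parts (such as `kernel.config`), with the rest of the reference being the element name. Skipping labels whose reference cannot be resolved is also a choice made for this sketch.

```python
def resolve_label_values(labels, features):
    """Replace "@<feature-name>.<element-name>" values with detected values.

    `features` maps a feature name (e.g. "kernel.config") to a dict of its
    elements. Hypothetical helper for illustration; not an NFD API.
    """
    resolved = {}
    for name, value in labels.items():
        if not value.startswith("@"):
            resolved[name] = value  # static value, used as-is
            continue
        parts = value[1:].split(".", 2)
        if len(parts) != 3:
            continue  # malformed reference, skipped in this sketch
        feature_name = parts[0] + "." + parts[1]
        element_name = parts[2]
        elements = features.get(feature_name, {})
        if element_name in elements:
            resolved[name] = elements[element_name]
    return resolved
```

With `features = {"kernel.config": {"LSM": "apparmor"}}` and the two labels of the example rule, the result contains `apparmor` for `linux-lsm-enabled` and the static `customlabel` value unchanged.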
The .labelsTemplate
field specifies a text template for dynamically creating\nlabels based on the matched features. See templating for\ndetails.
\n\n\nNOTE: The
\nlabels
field has priority overlabelsTemplate
, i.e.\nlabels specified in thelabels
field will override anything\noriginating fromlabelsTemplate
.
The .annotations
field is a list of features to be advertised as node\nannotations.
Take this rule as a referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: feature-annotations-example\nspec:\n rules:\n - name: \"annotation-example\"\n annotations:\n feature.node.kubernetes.io/default-ns-annotation: \"foo\"\n custom.vendor.io/feature: \"baz\"\n matchFeatures:\n - feature: kernel.version\n matchExpressions:\n major: {op: Exists}\n
This will yield the following node annotations:
\n\n annotations:\n ...\n feature.node.kubernetes.io/default-ns-annotation: \"foo\"\n custom.vendor.io/feature: \"baz\"\n ...\n
NFD enforces some limitations on the namespace (or prefix/) of the annotations:
\n\nkubernetes.io/
and its sub-namespaces (like sub.ns.kubernetes.io/
) cannot\ngenerally be usedfeature.node.kubernetes.io/
and its sub-namespaces\n(like sub.ns.feature.node.kubernetes.io
)my-annotation
) should not be used. In NFD v0.15 unprefixed names will be automatically prefixed with\nfeature.node.kubernetes.io/
but this will change in a future version (see\nautoDefaultNs config option).\n\n\nNOTE: The
\nannotations
field will only advertise features via node\nannotations; the features won’t be advertised as node labels unless they are\nspecified in the labels
field.
taints is a list of taint entries and each entry can have key
, value
and effect
,\nwhere the value
is optional. The effect can be NoSchedule
, PreferNoSchedule
\nor NoExecute
. To learn more about the meaning of these effects, check out k8s documentation.
Example NodeFeatureRule with taints:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: my-sample-rule-object\nspec:\n rules:\n - name: \"my sample taint rule\"\n taints:\n - effect: PreferNoSchedule\n key: \"feature.node.kubernetes.io/special-node\"\n value: \"true\"\n - effect: NoExecute\n key: \"feature.node.kubernetes.io/dedicated-node\"\n matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n dummy: {op: Exists}\n - feature: kernel.config\n matchExpressions:\n X86: {op: In, value: [\"y\"]}\n
In this example, if the my sample taint rule
rule is matched,\nfeature.node.kubernetes.io/special-node=true:PreferNoSchedule
\nand feature.node.kubernetes.io/dedicated-node:NoExecute
taints are set on the node.
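The `key[=value]:effect` notation used for taints can be derived from the `taints` entries of a rule. The Python sketch below is only an illustration of that notation: `format_taint` is a hypothetical helper, not part of NFD, and the effect validation simply mirrors the three effects listed in this section.

```python
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


def format_taint(taint):
    """Render a taint entry (key, optional value, effect) as key[=value]:effect.

    Hypothetical helper for illustration; not an NFD API.
    """
    effect = taint["effect"]
    if effect not in VALID_EFFECTS:
        raise ValueError("unsupported taint effect: " + effect)
    text = taint["key"]
    if taint.get("value"):  # the value is optional
        text += "=" + taint["value"]
    return text + ":" + effect


# The two taint entries of the example rule above
example_taints = [
    {"effect": "PreferNoSchedule",
     "key": "feature.node.kubernetes.io/special-node", "value": "true"},
    {"effect": "NoExecute",
     "key": "feature.node.kubernetes.io/dedicated-node"},
]
```

Calling `format_taint` on each entry of `example_taints` gives the string form of the taints that the example rule requests.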
There are some limitations to the namespace part (i.e. prefix/) of the taint\nkey:
\n\nkubernetes.io/
and its sub-namespaces (like sub.ns.kubernetes.io/
) cannot\ngenerally be usedfeature.node.kubernetes.io/
and its sub-namespaces\n(like sub.ns.feature.node.kubernetes.io
)foo
) keys are disallowed\n\n\nNOTE: The taints field is not available for the custom rules of nfd-worker;\nit can only be used in NodeFeatureRule objects.
\n
The .vars
field is a map of values (key-value pairs) to store for subsequent\nrules to use. In other words, these are variables that are not advertised as\nnode labels. See backreferences for more details on the\nusage of vars.
The .extendedResources
field is a list of extended resources to advertise.\nSee extended resources for more details.
Take this rule as a referential example:
\n\napiVersion: nfd.k8s-sigs.io/v1alpha1\nkind: NodeFeatureRule\nmetadata:\n name: my-extended-resource-rule\nspec:\n rules:\n - name: \"my extended resource rule\"\n extendedResources:\n vendor.io/dynamic: \"@kernel.version.major\"\n vendor.io/static: \"123\"\n matchFeatures:\n - feature: kernel.version\n matchExpressions:\n major: {op: Exists}\n
The extended resource vendor.io/dynamic
is defined in the form @feature.attribute
.\nThe value of the extended resource will be the value of the attribute major
\nof the feature kernel.version
.
The @<feature-name>.<element-name>
format can be used to inject values of\ndetected features to the extended resource. See\navailable features for possible values to use. Note that\nthe value must be eligible as a\nKubernetes resource quantity.
This will yield the following node status:
\n\n allocatable:\n ...\n vendor.io/dynamic: \"5\"\n vendor.io/static: \"123\"\n ...\n capacity:\n ...\n vendor.io/dynamic: \"5\"\n vendor.io/static: \"123\"\n ...\n
There are some limitations to the namespace part (i.e. prefix/) of the Extended\nResources names:
\n\nkubernetes.io/
and its sub-namespaces (like sub.ns.kubernetes.io/
) cannot\ngenerally be usedfeature.node.kubernetes.io/
and its sub-namespaces\n(like sub.ns.feature.node.kubernetes.io
)my-er
) should not be used. In NFD v0.15 unprefixed names will be\nautomatically prefixed with feature.node.kubernetes.io/
but this will\nchange in a future version (see\nautoDefaultNs config option).\n\n\nNOTE:
\n.extendedResources
is not supported by the\ncustom feature source – it can only be used in\nNodeFeatureRule objects.
The .varsTemplate
field specifies a text template for dynamically creating\nvars based on the matched features. See templating for details\non using templates and backreferences for more details on\nthe usage of vars.
\n\n\nNOTE: The
\nvars
field has priority overvarsTemplate
, i.e.\nvars specified in thevars
field will override anything originating from\nvarsTemplate
.
The .matchFeatures
field specifies a feature matcher, consisting of a list of\nfeature matcher terms. It implements a logical AND over the terms i.e. all\nof them must match for the rule to trigger.
matchFeatures:\n - feature: <feature-name>\n matchExpressions:\n <key>:\n op: <op>\n value:\n - <value-1>\n - ...\n matchName:\n op: <op>\n value:\n - <value-1>\n - ...\n
The .matchFeatures[].feature
field specifies the feature to evaluate.
\n\n\nNOTE: If both
\nmatchExpressions
and\nmatchName
are specified, they both must match.
The .matchFeatures[].matchExpressions
field is used to match against the\nvalue(s) of a feature. The matchExpressions
field consists of a set of\nexpressions, each of which is evaluated against all elements of the specified\nfeature.
matchExpressions:\n <key>:\n op: <op>\n value:\n - <value-1>\n - ...\n
In each MatchExpression the key
specifies the name of the feature element\n(flag and attribute features) or name of the attribute (instance\nfeatures) to look for. The behavior of MatchExpression depends on the\nfeature type:
<key>
The op
field specifies the operator to apply. Valid values are described\nbelow.
Operator | \nNumber of values | \nMatches when | \n
---|---|---|
In | \n 1 or greater | \nInput is equal to one of the values | \n
NotIn | \n 1 or greater | \nInput is not equal to any of the values | \n
InRegexp | \n 1 or greater | \nValues of the MatchExpression are treated as regexps and input matches one or more of them | \n
Exists | \n 0 | \nThe key exists | \n
DoesNotExist | \n 0 | \nThe key does not exist | \n
Gt | \n 1 | \nInput is greater than the value. Both the input and value must be integer numbers. | \n
Lt | \n 1 | \nInput is less than the value. Both the input and value must be integer numbers. | \n
GtLt | \n 2 | \nInput is between two values. Both the input and the values must be integer numbers. | \n
IsTrue | \n 0 | \nInput is equal to “true” | \n
IsFalse | \n 0 | \nInput is equal to “false” | \n
The value
field of MatchExpression is a list of string arguments to the\noperator.
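The following sketch shows how the number of values lines up with a few of the operators in the table above. The feature and attribute names come from the feature table in this document; the threshold values are hypothetical.

```yaml
matchFeatures:
  - feature: kernel.version
    matchExpressions:
      # Gt takes exactly one value; input and value must be integers
      major: {op: Gt, value: ["4"]}
  - feature: cpu.model
    matchExpressions:
      # GtLt takes exactly two values (hypothetical bounds)
      family: {op: GtLt, value: ["5", "20"]}
      # In takes one or more values
      vendor_id: {op: In, value: ["Intel", "AMD"]}
  - feature: cpu.pstate
    matchExpressions:
      # IsTrue takes no values
      turbo: {op: IsTrue}
```

Note that all values are written as strings, even for the integer operators.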
The .matchFeatures[].matchName
field is used to match against the\nname(s) of a feature (whereas the matchExpressions
field\nmatches against the value(s)). The matchName
field consists of a single\nexpression which is evaluated against the name of each element of the specified\nfeature.
matchName:\n op: <op>\n value:\n - <value-1>\n - ...\n
The behavior of matchName
depends on the feature type:
The op
field specifies the operator to apply. Same operators as for\nmatchExpressions
above are available.
Operator | \nNumber of values | \nMatches | \n
---|---|---|
In | \n 1 or greater | \nAll names that are equal to one of the values | \n
NotIn | \n 1 or greater | \nAll names that are not equal to any of the values | \n
InRegexp | \n 1 or greater | \nAll names that match any of the values (treated as regexps) | \n
Exists | \n 0 | \nAll elements | \n
Other operators are not practical with matchName
(DoesNotExist
never\nmatches; Gt
, Lt
and GtLt
are only usable if feature names are integers;\nIsTrue
and IsFalse
are only usable if the feature name is true
or\nfalse
).
The value
field is a list of string arguments to the operator.
An example:
\n\n matchFeatures:\n - feature: cpu.cpuid\n matchName: {op: InRegexp, value: [\"^AVX\"]}\n
The snippet above would match if any CPUID feature starting with AVX is present\n(e.g. AVX1 or AVX2 or AVX512F etc).
\n\nThe .matchAny
field is a list of matchFeatures
\nmatchers. A logical OR is applied over the matchers, i.e. at least one of them\nmust match for the rule to trigger.
Consider the following example:
\n\n matchAny:\n - matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n kmod-1: {op: Exists}\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"0eee\"]}\n class: {op: In, value: [\"0200\"]}\n - matchFeatures:\n - feature: kernel.loadedmodule\n matchExpressions:\n kmod-2: {op: Exists}\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"0fff\"]}\n class: {op: In, value: [\"0200\"]}\n
This matches if kernel module kmod-1 is loaded and a network controller from\nvendor 0eee is present, OR, if kernel module kmod-2 has been loaded and a\nnetwork controller from vendor 0fff is present (OR both of these conditions are\ntrue).
\n\nThe following features are available for matching:
\n\nFeature | \nFeature type | \nElements | \nValue type | \nDescription | \n
---|---|---|---|---|
cpu.cpuid | \n flag | \n\n | \n | Supported CPU capabilities | \n
\n | \n | <cpuid-flag> | \n \n | CPUID flag is present | \n
cpu.cstate | \n attribute | \n\n | \n | Status of cstates in the intel_idle cpuidle driver | \n
\n | \n | enabled | \n bool | \n‘true’ if cstates are set, otherwise ‘false’. Does not exist if the intel_idle driver is not active. | \n
cpu.model | \n attribute | \n\n | \n | CPU model related attributes | \n
\n | \n | family | \n int | \nCPU family | \n
\n | \n | vendor_id | \n string | \nCPU vendor ID | \n
\n | \n | id | \n int | \nCPU model ID | \n
cpu.pstate | \n attribute | \n\n | \n | State of the Intel pstate driver. Does not exist if the driver is not enabled. | \n
\n | \n | status | \n string | \nStatus of the driver, possible values are ‘active’ and ‘passive’ | \n
\n | \n | turbo | \n bool | \n‘true’ if turbo frequencies are enabled, otherwise ‘false’ | \n
\n | \n | scaling | \n string | \nActive scaling_governor, possible values are ‘powersave’ or ‘performance’. | \n
cpu.rdt | \n attribute | \n\n | \n | Intel RDT capabilities supported by the system | \n
\n | \n | <rdt-flag> | \n \n | RDT capability is supported, see RDT flags for details | \n
\n | \n | RDTL3CA_NUM_CLOSID | \n int | \nThe number of available CLOSIDs (Class of Service IDs) for Intel L3 Cache Allocation Technology | \n
cpu.security | \n attribute | \n\n | \n | Features related to security and trusted execution environments | \n
\n | \n | sgx.enabled | \n bool | \ntrue if Intel SGX (Software Guard Extensions) has been enabled, otherwise does not exist | \n
\n | \n | sgx.epc | \n int | \nThe total amount of Intel SGX Enclave Page Cache memory in bytes. It’s only present if sgx.enabled is true. | \n
\n | \n | se.enabled | \n bool | \ntrue if IBM Secure Execution for Linux is available and has been enabled, otherwise does not exist | \n
\n | \n | tdx.enabled | \n bool | \ntrue if Intel TDX (Trust Domain Extensions) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | tdx.total_keys | \n int | \nThe total number of keys an Intel TDX (Trust Domain Extensions) host can provide. It’s only present if tdx.enabled is true. | \n
\n | \n | tdx.protected | \n bool | \ntrue if a guest VM was started using Intel TDX (Trust Domain Extensions), otherwise does not exist. | \n
\n | \n | sev.enabled | \n bool | \ntrue if AMD SEV (Secure Encrypted Virtualization) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | sev.es.enabled | \n bool | \ntrue if AMD SEV-ES (Encrypted State supported) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | sev.snp.enabled | \n bool | \ntrue if AMD SEV-SNP (Secure Nested Paging supported) is available on the host and has been enabled, otherwise does not exist | \n
\n | \n | sev.asids | \n int | \nThe total number of AMD SEV address-space identifiers (ASIDs), based on the /sys/fs/cgroup/misc.capacity information. | \n
\n | \n | sev.encrypted_state_ids | \n int | \nThe total number of AMD SEV-ES and SEV-SNP encrypted state IDs supported, based on the /sys/fs/cgroup/misc.capacity information. | \n
cpu.sst | \n attribute | \n\n | \n | Intel SST (Speed Select Technology) capabilities | \n
\n | \n | bf.enabled | \n bool | \ntrue if Intel SST-BF (Intel Speed Select Technology - Base frequency) has been enabled, otherwise does not exist | \n
cpu.topology | \n attribute | \n\n | \n | CPU topology related features | \n
\n | \n | hardware_multithreading | \n bool | \nHardware multithreading, such as Intel HTT, is enabled | \n
\n | \n | socket_count | \n int | \nNumber of CPU Sockets | \n
cpu.coprocessor | \n attribute | \n\n | \n | CPU Coprocessor related features | \n
\n | \n | nx_gzip | \n bool | \nNest Accelerator GZIP support is enabled | \n
kernel.config | \n attribute | \n\n | \n | Kernel configuration options | \n
\n | \n | <config-flag> | \n string | \nValue of the kconfig option | \n
kernel.loadedmodule | \n flag | \n\n | \n | Kernel modules loaded on the node as reported by /proc/modules | \n
kernel.enabledmodule | \n flag | \n\n | \n | Kernel modules loaded on the node and available as built-ins as reported by modules.builtin | \n
\n | \n | mod-name | \n \n | Kernel module <mod-name> is loaded | \n
kernel.selinux | \n attribute | \n\n | \n | Kernel SELinux related features | \n
\n | \n | enabled | \n bool | \ntrue if SELinux has been enabled and is in enforcing mode, otherwise false | \n
kernel.version | \n attribute | \n\n | \n | Kernel version information | \n
\n | \n | full | \n string | \nFull kernel version (e.g. ‘4.5.6-7-g123abcde’) | \n
\n | \n | major | \n int | \nFirst component of the kernel version (e.g. ‘4’) | \n
\n | \n | minor | \n int | \nSecond component of the kernel version (e.g. ‘5’) | \n
\n | \n | revision | \n int | \nThird component of the kernel version (e.g. ‘6’) | \n
local.label | \n attribute | \n\n | \n | Labels from feature files and hooks, i.e. labels from the local feature source | \n
local.feature | \n attribute | \n\n | \n | Features from feature files and hooks, i.e. features from the local feature source | \n
\n | \n | <label-name> | \n string | \nLabel <label-name> created by the local feature source, value equals the value of the label | \n
memory.nv | \n instance | \n\n | \n | NVDIMM devices present in the system | \n
\n | \n | <sysfs-attribute> | \n string | \nValue of the sysfs device attribute, available attributes: devtype , mode | \n
memory.numa | \n attribute | \n\n | \n | NUMA nodes | \n
\n | \n | is_numa | \n bool | \ntrue if NUMA architecture, false otherwise | \n
\n | \n | node_count | \n int | \nNumber of NUMA nodes | \n
network.device | \n instance | \n\n | \n | Physical (non-virtual) network interfaces present in the system | \n
\n | \n | name | \n string | \nName of the network interface | \n
\n | \n | <sysfs-attribute> | \n string | \nSysfs network interface attribute, available attributes: operstate , speed , sriov_numvfs , sriov_totalvfs | \n
network.virtual | \n instance | \n\n | \n | Virtual network interfaces present in the system | \n
\n | \n | name | \n string | \nName of the network interface | \n
\n | \n | <sysfs-attribute> | \n string | \nSysfs network interface attribute, available attributes: operstate , speed | \n
pci.device | \n instance | \n\n | \n | PCI devices present in the system | \n
\n | \n | <sysfs-attribute> | \n string | \nValue of the sysfs device attribute, available attributes: class , vendor , device , subsystem_vendor , subsystem_device , sriov_totalvfs , iommu_group/type , iommu/intel-iommu/version | \n
storage.block | \n instance | \n\n | \n | Block storage devices present in the system | \n
\n | \n | name | \n string | \nName of the block device | \n
\n | \n | <sysfs-attribute> | \n string | \nSysfs block device attribute, available attributes: dax , rotational , nr_zones , zoned | \n
system.osrelease | \n attribute | \n\n | \n | System identification data from /etc/os-release | \n
\n | \n | <parameter> | \n string | \nOne parameter from /etc/os-release | \n
system.name | \n attribute | \n\n | \n | System name information | \n
\n | \n | nodename | \n string | \nName of the kubernetes node object | \n
usb.device | \n instance | \n\n | \n | USB devices present in the system | \n
\n | \n | <sysfs-attribute> | \n string | \nValue of the sysfs device attribute, available attributes: class , vendor , device , serial | \n
rule.matched | \n attribute | \n\n | \n | Previously matched rules | \n
\n | \n | <label-or-var> | \n string | \nLabel or var from a preceding rule that matched | \n
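As an illustration of matching on an instance feature from the table above, the sketch below labels nodes that have a physical network interface which is up and advertises SR-IOV virtual functions. The rule and label names are hypothetical; the attribute names are taken from the `network.device` row.

```yaml
- name: "example sriov nic rule"        # hypothetical rule name
  labels:
    my-sriov-capable-nic: "true"        # hypothetical label
  matchFeatures:
    - feature: network.device
      matchExpressions:
        # sysfs attributes are strings; Gt requires them to parse as integers
        operstate: {op: In, value: ["up"]}
        sriov_totalvfs: {op: Gt, value: ["0"]}
```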
Flag | \nDescription | \n
---|---|
RDTMON | \nIntel RDT Monitoring Technology | \n
RDTCMT | \nIntel Cache Monitoring (CMT) | \n
RDTMBM | \nIntel Memory Bandwidth Monitoring (MBM) | \n
RDTL3CA | \nIntel L3 Cache Allocation Technology | \n
RDTL2CA | \nIntel L2 Cache Allocation Technology | \n
RDTMBA | \nIntel Memory Bandwidth Allocation (MBA) Technology | \n
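A short sketch of a rule using these flags together with the `RDTL3CA_NUM_CLOSID` attribute of the `cpu.rdt` feature. The rule name, label name and the threshold of 8 are hypothetical.

```yaml
- name: "example rdt rule"              # hypothetical rule name
  labels:
    my-l3-cache-allocation: "true"      # hypothetical label
  matchFeatures:
    - feature: cpu.rdt
      matchExpressions:
        # the RDT flag must be present
        RDTL3CA: {op: Exists}
        # and more than 8 classes of service must be available (illustrative threshold)
        RDTL3CA_NUM_CLOSID: {op: Gt, value: ["8"]}
```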
Rules support template-based creation of labels and vars with the\n.labelsTemplate
and .varsTemplate
fields. These make it possible to\ndynamically generate labels and vars based on the features that matched.
The template must expand into a simple format with <key>=<value>
pairs\nseparated by newline.
Consider the following example:\n
\n\n labelsTemplate: |\n {{ range .pci.device }}vendor-{{ .class }}-{{ .device }}.present=true\n {{ end }}\n matchFeatures:\n - feature: pci.device\n matchExpressions:\n class: {op: InRegexp, value: [\"^02\"]}\n vendor: [\"0fff\"]\n
The rule above will create individual labels\nfeature.node.kubernetes.io/vendor-<class-id>-<device-id>.present=true
for\neach network controller device (device class starting with 02) from vendor\n0fff.
All the matched features of each feature matcher term under matchFeatures
\nfields are available for the template engine. Matched features can be\nreferenced with {{ .<feature-name> }}
in the template, and\nthe available data could be described in yaml as follows:
.\n <key-feature>:\n - Name: <matched-key>\n - ...\n\n <value-feature>:\n - Name: <matched-key>\n Value: <matched-value>\n - ...\n\n <instance-feature>:\n - <attribute-1-name>: <attribute-1-value>\n <attribute-2-name>: <attribute-2-value>\n ...\n - ...\n
That is, the per-feature data is a list of objects whose data fields depend on\nthe type of the feature:
\n\nA simple example of a template utilizing name and value from an attribute\nfeature:\n
\n\n  labelsTemplate: |\n    {{ range .system.osrelease }}system-{{ .Name }}={{ .Value }}\n    {{ end }}\n  matchFeatures:\n    - feature: system.osrelease\n      matchExpressions:\n        ID: {op: Exists}\n        VERSION_ID.major: {op: Exists}\n
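The same approach should work for flag features, whose matched elements expose only `.Name` according to the data layout described above. A minimal sketch, with a hypothetical label prefix:

```yaml
labelsTemplate: |
  {{ range .cpu.cpuid }}my-cpu-{{ .Name }}=true
  {{ end }}
matchFeatures:
  - feature: cpu.cpuid
    # names matched by matchName are available to the template
    matchName: {op: InRegexp, value: ["^AVX"]}
```

On a node with, say, the AVX2 and AVX512F flags present, this would be expected to expand into one `my-cpu-<flag>=true` line per matched flag.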
\n\n\nNOTE: If both
\nmatchExpressions
and matchName
for a feature matcher\nterm (see matchFeatures
) are specified, the list of\nmatched features (for the template engine) is the union of both.\n\nNOTE: If matchAny is specified, the template is executed\nseparately against each individual matchFeatures
field and the final set of\nlabels will be the superset of all these separate template expansions. E.g.\nconsider the following:
- name: <name>\n labelsTemplate: <template>\n matchFeatures: <matcher#1>\n matchAny:\n - matchFeatures: <matcher#2>\n - matchFeatures: <matcher#3>\n
In the example above (assuming the overall result is a match) the template\nwould be executed on matcher#1 as well as on matcher#2 and/or matcher#3\n(depending on whether both or only one of them match). All the labels from\nthese separate expansions would be created, i.e. the end result would be a\nunion of all the individual expansions.
\n\nRule templates use the Golang text/template\npackage and all its built-in functionality (e.g. pipelines and functions) can\nbe used. An example template making use of the built-in len
function,\nadvertising the number of PCI network controllers from a specific vendor:\n
labelsTemplate: |\n num-intel-network-controllers={{ .pci.device | len }}\n matchFeatures:\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"8086\"]}\n class: {op: In, value: [\"0200\"]}\n\n
Imaginative template pipelines are possible, but care must be taken to\nproduce understandable and maintainable rule sets.
\n\nRules support referencing the output of preceding rules. This enables\nsophisticated scenarios where multiple rules are combined\nto form more complex heuristics than a single rule can provide. The labels and\nvars created by the execution of preceding rules are available as a special\nrule.matched
feature.
Consider the following configuration:
\n\n - name: \"my kernel label rule\"\n labels:\n kernel-feature: \"true\"\n matchFeatures:\n - feature: kernel.version\n matchExpressions:\n major: {op: Gt, value: [\"4\"]}\n\n - name: \"my var rule\"\n vars:\n nolabel-feature: \"true\"\n matchFeatures:\n - feature: cpu.cpuid\n matchExpressions:\n AVX512F: {op: Exists}\n - feature: pci.device\n matchExpressions:\n vendor: {op: In, value: [\"0fff\"]}\n device: {op: In, value: [\"1234\", \"1235\"]}\n\n - name: \"my high level feature rule\"\n labels:\n high-level-feature: \"true\"\n matchFeatures:\n - feature: rule.matched\n matchExpressions:\n kernel-feature: {op: IsTrue}\n nolabel-feature: {op: IsTrue}\n
The feature.node.kubernetes.io/high-level-feature = true
label depends on the\ntwo previous rules.
Note that when referencing rules across multiple\nNodeFeatureRule
objects attention must be\npaid to the ordering. NodeFeatureRule
objects are processed in alphabetical\norder (based on their .metadata.name
).
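One way to make the ordering explicit is to give the objects names that sort in the intended order. The sketch below assumes the `nfd.k8s-sigs.io/v1alpha1` API version; the object names with numeric prefixes are a hypothetical naming convention, not something NFD requires.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: 10-base-rules            # sorts first, so it is processed first
spec:
  rules:
    - name: "my kernel label rule"
      labels:
        kernel-feature: "true"
      matchFeatures:
        - feature: kernel.version
          matchExpressions:
            major: {op: Gt, value: ["4"]}
---
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: 20-derived-rules         # sorts after 10-base-rules
spec:
  rules:
    - name: "my high level feature rule"
      labels:
        high-level-feature: "true"
      matchFeatures:
        # refers to the label created by the rule in 10-base-rules
        - feature: rule.matched
          matchExpressions:
            kernel-feature: {op: IsTrue}
```

If the names were swapped, the derived rule would be evaluated before `kernel-feature` exists and would not be expected to match.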
Some more configuration examples below.
\n\nMatch certain CPUID features:
\n\n - name: \"example cpuid rule\"\n labels:\n my-special-cpu-feature: \"true\"\n matchFeatures:\n - feature: cpu.cpuid\n matchExpressions:\n AESNI: {op: Exists}\n AVX: {op: Exists}\n
Require a certain loaded kernel module and OS version:
\n\n  - name: \"my multi-feature rule\"\n    labels:\n      my-special-multi-feature: \"true\"\n    matchFeatures:\n      - feature: kernel.loadedmodule\n        matchExpressions:\n          e1000: {op: Exists}\n      - feature: system.osrelease\n        matchExpressions:\n          NAME: {op: InRegexp, value: [\"^openSUSE\"]}\n          VERSION_ID.major: {op: Gt, value: [\"14\"]}\n
Require a loaded kernel module and two specific PCI devices (both of which\nmust be present):
\n\n  - name: \"my multi-device rule\"\n    labels:\n      my-multi-device-feature: \"true\"\n    matchFeatures:\n      - feature: kernel.loadedmodule\n        matchExpressions:\n          my-driver-module: {op: Exists}\n      - feature: pci.device\n        matchExpressions:\n          vendor: {op: In, value: [\"0fff\"]}\n          device: {op: In, value: [\"1234\"]}\n      - feature: pci.device\n        matchExpressions:\n          vendor: {op: In, value: [\"0fff\"]}\n          device: {op: In, value: [\"abcd\"]}\n
Node Feature Discovery follows semantic versioning where\nthe version number consists of three components, i.e. MAJOR.MINOR.PATCH.
\n\nThe most recent two minor releases (or release branches) of Node Feature\nDiscovery are supported. That is, with X being the latest release, X and X-1\nare supported and X-1 reaches end-of-life when X+1 is released.
\n\nBuilt-in feature labels and\nfeatures are supported\nfor 2 releases after being deprecated, at minimum. That is, if a feature label\nis deprecated in version X, it will be supported in X+1 and X+2 and\nmay be dropped in X+3.
\n\nCommand-line flags and configuration file options are supported for 1 more\nrelease after being deprecated, at minimum. That is, if an option or flag is\ndeprecated in version X, it will be supported in X+1 and may be removed\nin X+2.
\n\nThe same policy (support for 1 release after deprecation) also applies to Helm\nchart parameters.
\n","dir":"/reference/","name":"versions.md","path":"reference/versions.md","url":"/reference/versions.html"},{"title":"Examples and demos","layout":"default","sort":9,"content":"This page contains usage examples and demos.
\n\nA demo on the benefits of using node feature discovery can be found in the\nsource code repository under\ndemo/.
\n","dir":"/usage/","name":"examples-and-demos.md","path":"usage/examples-and-demos.md","url":"/usage/examples-and-demos.html"},{"title":"Kubectl plugin","layout":"default","sort":10,"content":"\n\n\nDeveloper Preview This feature is currently in developer preview and\nsubject to change. It is not recommended to use it in production\nenvironments.
\n
The kubectl
plugin kubectl nfd
can be used to validate/dryrun and test\nNodeFeatureRule objects. It can be installed with the following command:
git clone https://github.com/kubernetes-sigs/node-feature-discovery\ncd node-feature-discovery\nmake build-kubectl-nfd\nKUBECTL_PATH=/usr/local/bin/\nmv ./bin/kubectl-nfd ${KUBECTL_PATH}\n
The plugin can be used to validate a NodeFeatureRule object:
\n\nkubectl nfd validate -f <nodefeaturerule.yaml>\n
The plugin can be used to test a NodeFeatureRule object against a node:
\n\nkubectl nfd test -f <nodefeaturerule.yaml> -n <node-name>\n
The plugin can be used to DryRun a NodeFeatureRule object against a NodeFeature\nfile:
\n\nkubectl get -n node-feature-discovery nodefeature <nodename> -o yaml > <nodefeature.yaml>\nkubectl nfd dryrun -f <nodefeaturerule.yaml> -n <nodefeature.yaml>\n
Or you can use the example NodeFeature file (a minimal NodeFeature file):
\n\n$ kubectl nfd dryrun -f examples/nodefeaturerule.yaml -n examples/nodefeature.yaml\nEvaluating NodeFeatureRule \"examples/nodefeaturerule.yaml\" against NodeFeature \"examples/nodefeature.yaml\"\nProcessing rule: my sample rule\n*** Labels ***\nvendor.io/my-sample-feature=true\nNodeFeatureRule \"examples/nodefeaturerule.yaml\" is valid for NodeFeature \"examples/nodefeature.yaml\"\n
Node Feature Discovery provides a Helm chart to manage its deployment.
NOTE: NFD is not ideal for other Helm charts to depend on as that may result in multiple parallel NFD deployments in the same cluster which is not fully supported by the NFD Helm chart.
Helm package manager should be installed.
To install the latest stable version:
export NFD_NS=node-feature-discovery
+ Helm · Node Feature Discovery
Deployment with Helm
Table of contents
Node Feature Discovery provides a Helm chart to manage its deployment.
NOTE: NFD is not ideal for other Helm charts to depend on as that may result in multiple parallel NFD deployments in the same cluster which is not fully supported by the NFD Helm chart.
Prerequisites
Helm package manager should be installed.
Deployment
To install the latest stable version:
export NFD_NS=node-feature-discovery
helm repo add nfd https://kubernetes-sigs.github.io/node-feature-discovery/charts
helm repo update
helm install nfd/node-feature-discovery --namespace $NFD_NS --create-namespace --generate-name
@@ -12,4 +12,4 @@
helm install nfd/node-feature-discovery --set nameOverride=NFDinstance --set master.replicaCount=2 --namespace $NFD_NS --create-namespace
Uninstalling the chart
To uninstall the node-feature-discovery
deployment:
export NFD_NS=node-feature-discovery
helm uninstall node-feature-discovery --namespace $NFD_NS
-
The command removes all the Kubernetes components associated with the chart and deletes the release.
Chart parameters
To tailor the deployment of Node Feature Discovery to your needs, the following chart parameters are available.
General parameters
Name Type Default Description image.repository
string registry.k8s.io/nfd/node-feature-discovery
NFD image repository image.tag
string v0.15.6
NFD image tag image.pullPolicy
string Always
Image pull policy imagePullSecrets
list [] ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored. More info nameOverride
string Override the name of the chart fullnameOverride
string Override a default fully qualified app name tls.enable
bool false Specifies whether to use TLS for communications between components. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release tls.certManager
bool false If enabled, requires cert-manager to be installed and will automatically create the required TLS certificates. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release enableNodeFeatureApi
bool true Enable the NodeFeature CRD API for communicating node features. This will automatically disable the gRPC communication. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release prometheus.enable
bool false Specifies whether to expose metrics using prometheus operator prometheus.labels
dict {} Specifies labels for use with the prometheus operator to control how it is selected
Metrics are configured to be exposed using prometheus operator API's by default. If you want to expose metrics using the prometheus operator API's you need to install the prometheus operator in your cluster.
Master pod parameters
Name Type Default description master.*
dict NFD master deployment configuration master.enable
bool true Specifies whether nfd-master should be deployed master.port
integer Specifies the TCP port that nfd-master listens for incoming requests. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release master.metricsPort
integer 8081 Port on which to expose metrics from components to prometheus operator master.instance
string Instance name. Used to separate annotation namespaces for multiple parallel deployments master.resyncPeriod
string NFD API controller resync period. master.extraLabelNs
array [] List of allowed extra label namespaces master.resourceLabels
array [] List of labels to be registered as extended resources master.enableTaints
bool false Specifies whether to enable or disable node tainting master.crdController
bool null Specifies whether the NFD CRD API controller is enabled. If not set, controller will be enabled if master.instance
is empty. master.featureRulesController
bool null DEPRECATED: use master.crdController
instead master.replicaCount
integer 1 Number of desired pods. This is a pointer to distinguish between explicit zero and not specified master.podSecurityContext
dict {} PodSecurityContext holds pod-level security attributes and common container settings master.securityContext
dict {} Container security settings master.serviceAccount.create
bool true Specifies whether a service account should be created master.serviceAccount.annotations
dict {} Annotations to add to the service account master.serviceAccount.name
string The name of the service account to use. If not set and create is true, a name is generated using the fullname template master.rbac.create
bool true Specifies whether to create RBAC configuration for nfd-master master.service.type
string ClusterIP NFD master service type. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release master.service.port
integer 8080 NFD master service port. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release master.resources
dict {} NFD master pod resources management master.nodeSelector
dict {} NFD master pod node selector master.tolerations
dict Scheduling to master node is disabled NFD master pod tolerations master.annotations
dict {} NFD master pod annotations master.affinity
dict NFD master pod required node affinity master.deploymentAnnotations
dict {} NFD master deployment annotations master.nfdApiParallelism
integer 10 Specifies the maximum number of concurrent node updates. master.config
dict NFD master configuration
Worker pod parameters
Name Type Default description worker.*
dict NFD worker daemonset configuration worker.enable
bool true Specifies whether nfd-worker should be deployed worker.metricsPort*
int 8081 Port on which to expose metrics from components to prometheus operator worker.config
dict NFD worker configuration worker.podSecurityContext
dict {} PodSecurityContext holds pod-level security attributes and common container settings worker.securityContext
dict {} Container security settings worker.serviceAccount.create
bool true Specifies whether a service account for nfd-worker should be created worker.serviceAccount.annotations
dict {} Annotations to add to the service account for nfd-worker worker.serviceAccount.name
string The name of the service account to use for nfd-worker. If not set and create is true, a name is generated using the fullname template (suffixed with -worker
) worker.rbac.create
bool true Specifies whether to create RBAC configuration for nfd-worker worker.mountUsrSrc
bool false Specifies whether to allow users to mount the hostpath /user/src. Does not work on systems without /usr/src AND a read-only /usr worker.resources
dict {} NFD worker pod resources management worker.nodeSelector
dict {} NFD worker pod node selector worker.tolerations
dict {} NFD worker pod node tolerations worker.priorityClassName
string NFD worker pod priority class worker.annotations
dict {} NFD worker pod annotations worker.daemonsetAnnotations
dict {} NFD worker daemonset annotations
Topology updater parameters
Name Type Default description topologyUpdater.*
dict NFD Topology Updater configuration topologyUpdater.enable
bool false Specifies whether the NFD Topology Updater should be created topologyUpdater.createCRDs
bool false Specifies whether the NFD Topology Updater CRDs should be created topologyUpdater.serviceAccount.create
bool true Specifies whether the service account for topology updater should be created topologyUpdater.serviceAccount.annotations
dict {} Annotations to add to the service account for topology updater topologyUpdater.serviceAccount.name
string The name of the service account for topology updater to use. If not set and create is true, a name is generated using the fullname template and -topology-updater
suffix topologyUpdater.rbac.create
bool true Specifies whether to create RBAC configuration for topology updater topologyUpdater.metricsPort
integer 8081 Port on which to expose prometheus metrics topologyUpdater.kubeletConfigPath
string "" Specifies the kubelet config host path topologyUpdater.kubeletPodResourcesSockPath
string "" Specifies the kubelet sock path to read pod resources topologyUpdater.updateInterval
string 60s Time to sleep between CR updates. Non-positive value implies no CR update. topologyUpdater.watchNamespace
string *
Namespace to watch pods, *
for all namespaces topologyUpdater.podSecurityContext
dict {} PodSecurityContext holds pod-level security attributes and common container settings topologyUpdater.securityContext
dict {} Container security settings topologyUpdater.resources
dict {} Topology updater pod resources management topologyUpdater.nodeSelector
dict {} Topology updater pod node selector topologyUpdater.tolerations
dict {} Topology updater pod node tolerations topologyUpdater.annotations
dict {} Topology updater pod annotations topologyUpdater.daemonsetAnnotations
dict {} Topology updater daemonset annotations topologyUpdater.affinity
dict {} Topology updater pod affinity topologyUpdater.config
dict configuration topologyUpdater.podSetFingerprint
bool false Enables compute and report of pod fingerprint in NRT objects. topologyUpdater.kubeletStateDir
string /var/lib/kubelet Specifies kubelet state directory path for watching state and checkpoint files. Empty value disables kubelet state tracking.
Garbage collector parameters
Name Type Default description gc.*
dict NFD Garbage Collector configuration gc.enable
bool true Specifies whether the NFD Garbage Collector should be created gc.serviceAccount.create
bool true Specifies whether the service account for garbage collector should be created gc.serviceAccount.annotations
dict {} Annotations to add to the service account for garbage collector gc.serviceAccount.name
string The name of the service account for garbage collector to use. If not set and create is true, a name is generated using the fullname template and -gc
suffix gc.rbac.create
bool true Specifies whether to create RBAC configuration for garbage collector gc.interval
string 1h Time between periodic garbage collector runs gc.podSecurityContext
dict {} PodSecurityContext holds pod-level security attributes and common container settings gc.resources
dict {} Garbage collector pod resources management gc.metricsPort
integer 8081 Port on which to serve Prometheus metrics gc.nodeSelector
dict {} Garbage collector pod node selector gc.tolerations
dict {} Garbage collector pod node tolerations gc.annotations
dict {} Garbage collector pod annotations gc.deploymentAnnotations
dict {} Garbage collector deployment annotations gc.affinity
dict {} Garbage collector pod affinity
Node Feature Discovery v0.15
\ No newline at end of file
+
The command removes all the Kubernetes components associated with the chart and deletes the release.
To tailor the deployment of Node Feature Discovery to your needs, the following chart parameters are available.
Name | Type | Default | Description |
---|---|---|---|
image.repository | string | registry.k8s.io/nfd/node-feature-discovery | NFD image repository |
image.tag | string | v0.15.7 | NFD image tag |
image.pullPolicy | string | Always | Image pull policy |
imagePullSecrets | list | [] | ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored. More info |
nameOverride | string | Override the name of the chart | |
fullnameOverride | string | Override a default fully qualified app name | |
tls.enable | bool | false | Specifies whether to use TLS for communications between components. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release |
tls.certManager | bool | false | If enabled, requires cert-manager to be installed and will automatically create the required TLS certificates. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release |
enableNodeFeatureApi | bool | true | Enable the NodeFeature CRD API for communicating node features. This will automatically disable the gRPC communication. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release |
prometheus.enable | bool | false | Specifies whether to expose metrics using prometheus operator |
prometheus.labels | dict | {} | Specifies labels for use with the prometheus operator to control how it is selected |
By default, metrics are configured to be exposed through the Prometheus operator APIs. To expose metrics this way, the Prometheus operator must be installed in your cluster.
Name | Type | Default | Description |
---|---|---|---|
master.* | dict | | NFD master deployment configuration |
master.enable | bool | true | Specifies whether nfd-master should be deployed |
master.port | integer | | Specifies the TCP port that nfd-master listens on for incoming requests. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release |
master.metricsPort | integer | 8081 | Port on which to expose metrics from components to prometheus operator |
master.instance | string | | Instance name. Used to separate annotation namespaces for multiple parallel deployments |
master.resyncPeriod | string | | NFD API controller resync period. |
master.extraLabelNs | array | [] | List of allowed extra label namespaces |
master.resourceLabels | array | [] | List of labels to be registered as extended resources |
master.enableTaints | bool | false | Specifies whether to enable or disable node tainting |
master.crdController | bool | null | Specifies whether the NFD CRD API controller is enabled. If not set, controller will be enabled if master.instance is empty. |
master.featureRulesController | bool | null | DEPRECATED: use master.crdController instead |
master.replicaCount | integer | 1 | Number of desired pods. This is a pointer to distinguish between explicit zero and not specified |
master.podSecurityContext | dict | {} | PodSecurityContext holds pod-level security attributes and common container settings |
master.securityContext | dict | {} | Container security settings |
master.serviceAccount.create | bool | true | Specifies whether a service account should be created |
master.serviceAccount.annotations | dict | {} | Annotations to add to the service account |
master.serviceAccount.name | string | | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
master.rbac.create | bool | true | Specifies whether to create RBAC configuration for nfd-master |
master.service.type | string | ClusterIP | NFD master service type. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release |
master.service.port | integer | 8080 | NFD master service port. NOTE: this parameter is related to the deprecated gRPC API and will be removed with it in a future release |
master.resources | dict | {} | NFD master pod resources management |
master.nodeSelector | dict | {} | NFD master pod node selector |
master.tolerations | dict | Scheduling to master node is disabled | NFD master pod tolerations |
master.annotations | dict | {} | NFD master pod annotations |
master.affinity | dict | | NFD master pod required node affinity |
master.deploymentAnnotations | dict | {} | NFD master deployment annotations |
master.nfdApiParallelism | integer | 10 | Specifies the maximum number of concurrent node updates. |
master.config | dict | | NFD master configuration |
Name | Type | Default | Description |
---|---|---|---|
worker.* | dict | | NFD worker daemonset configuration |
worker.enable | bool | true | Specifies whether nfd-worker should be deployed |
worker.metricsPort | int | 8081 | Port on which to expose metrics from components to prometheus operator |
worker.config | dict | | NFD worker configuration |
worker.podSecurityContext | dict | {} | PodSecurityContext holds pod-level security attributes and common container settings |
worker.securityContext | dict | {} | Container security settings |
worker.serviceAccount.create | bool | true | Specifies whether a service account for nfd-worker should be created |
worker.serviceAccount.annotations | dict | {} | Annotations to add to the service account for nfd-worker |
worker.serviceAccount.name | string | | The name of the service account to use for nfd-worker. If not set and create is true, a name is generated using the fullname template (suffixed with -worker) |
worker.rbac.create | bool | true | Specifies whether to create RBAC configuration for nfd-worker |
worker.mountUsrSrc | bool | false | Specifies whether to allow users to mount the hostpath /usr/src. Does not work on systems without /usr/src AND a read-only /usr |
worker.resources | dict | {} | NFD worker pod resources management |
worker.nodeSelector | dict | {} | NFD worker pod node selector |
worker.tolerations | dict | {} | NFD worker pod node tolerations |
worker.priorityClassName | string | | NFD worker pod priority class |
worker.annotations | dict | {} | NFD worker pod annotations |
worker.daemonsetAnnotations | dict | {} | NFD worker daemonset annotations |
Name | Type | Default | Description |
---|---|---|---|
topologyUpdater.* | dict | | NFD Topology Updater configuration |
topologyUpdater.enable | bool | false | Specifies whether the NFD Topology Updater should be created |
topologyUpdater.createCRDs | bool | false | Specifies whether the NFD Topology Updater CRDs should be created |
topologyUpdater.serviceAccount.create | bool | true | Specifies whether the service account for topology updater should be created |
topologyUpdater.serviceAccount.annotations | dict | {} | Annotations to add to the service account for topology updater |
topologyUpdater.serviceAccount.name | string | | The name of the service account for topology updater to use. If not set and create is true, a name is generated using the fullname template and -topology-updater suffix |
topologyUpdater.rbac.create | bool | true | Specifies whether to create RBAC configuration for topology updater |
topologyUpdater.metricsPort | integer | 8081 | Port on which to expose prometheus metrics |
topologyUpdater.kubeletConfigPath | string | "" | Specifies the kubelet config host path |
topologyUpdater.kubeletPodResourcesSockPath | string | "" | Specifies the kubelet sock path to read pod resources |
topologyUpdater.updateInterval | string | 60s | Time to sleep between CR updates. Non-positive value implies no CR update. |
topologyUpdater.watchNamespace | string | * | Namespace to watch pods, * for all namespaces |
topologyUpdater.podSecurityContext | dict | {} | PodSecurityContext holds pod-level security attributes and common container settings |
topologyUpdater.securityContext | dict | {} | Container security settings |
topologyUpdater.resources | dict | {} | Topology updater pod resources management |
topologyUpdater.nodeSelector | dict | {} | Topology updater pod node selector |
topologyUpdater.tolerations | dict | {} | Topology updater pod node tolerations |
topologyUpdater.annotations | dict | {} | Topology updater pod annotations |
topologyUpdater.daemonsetAnnotations | dict | {} | Topology updater daemonset annotations |
topologyUpdater.affinity | dict | {} | Topology updater pod affinity |
topologyUpdater.config | dict | | configuration |
topologyUpdater.podSetFingerprint | bool | false | Enables compute and report of pod fingerprint in NRT objects. |
topologyUpdater.kubeletStateDir | string | /var/lib/kubelet | Specifies kubelet state directory path for watching state and checkpoint files. Empty value disables kubelet state tracking. |
Name | Type | Default | Description |
---|---|---|---|
gc.* | dict | | NFD Garbage Collector configuration |
gc.enable | bool | true | Specifies whether the NFD Garbage Collector should be created |
gc.serviceAccount.create | bool | true | Specifies whether the service account for garbage collector should be created |
gc.serviceAccount.annotations | dict | {} | Annotations to add to the service account for garbage collector |
gc.serviceAccount.name | string | | The name of the service account for garbage collector to use. If not set and create is true, a name is generated using the fullname template and -gc suffix |
gc.rbac.create | bool | true | Specifies whether to create RBAC configuration for garbage collector |
gc.interval | string | 1h | Time between periodic garbage collector runs |
gc.podSecurityContext | dict | {} | PodSecurityContext holds pod-level security attributes and common container settings |
gc.resources | dict | {} | Garbage collector pod resources management |
gc.metricsPort | integer | 8081 | Port on which to serve Prometheus metrics |
gc.nodeSelector | dict | {} | Garbage collector pod node selector |
gc.tolerations | dict | {} | Garbage collector pod node tolerations |
gc.annotations | dict | {} | Garbage collector pod annotations |
gc.deploymentAnnotations | dict | {} | Garbage collector deployment annotations |
gc.affinity | dict | {} | Garbage collector pod affinity |
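As a rough illustration of how the parameters in the tables above might be combined, the sketch below writes a small values override file and prints a matching Helm command. Every key comes from the tables; the values, the file name `nfd-values.yaml`, the release name `nfd` and the chart reference `nfd/node-feature-discovery` are assumptions chosen for the example and should be adapted to your setup.

```shell
#!/bin/sh
# Sketch: collect chart overrides in a values file instead of many --set flags.
# Keys are taken from the parameter tables; the values are illustrative only.
cat > nfd-values.yaml <<'EOF'
topologyUpdater:
  enable: true
  createCRDs: true
gc:
  interval: 30m
EOF

# Assumptions: the release is called "nfd" and the chart is reachable as
# "nfd/node-feature-discovery" (a Helm repo added under the alias "nfd").
# The command is only printed, not executed.
echo "helm upgrade --install nfd nfd/node-feature-discovery" \
     "--namespace node-feature-discovery --create-namespace -f nfd-values.yaml"
```

Keeping overrides in a file makes them easier to review and reuse across upgrades than a long list of `--set` flags.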
NFD offers two variants of the container image. Released container images are available for x86_64 and Arm64 architectures.
The default is a minimal image based on scratch and only supports running statically linked binaries.
For backwards compatibility a container image tag with suffix -minimal
(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.6-minimal
) is provided.
This image is based on debian:bookworm-slim and contains a full Linux system for running shell-based nfd-worker hooks and doing live debugging and diagnosis of the NFD images.
The container image tag has suffix -full
(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.6-full
).
NFD offers two variants of the container image. Released container images are available for x86_64 and Arm64 architectures.
The default is a minimal image based on scratch and only supports running statically linked binaries.
For backwards compatibility a container image tag with suffix -minimal
(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.7-minimal
) is provided.
This image is based on debian:bookworm-slim and contains a full Linux system for running shell-based nfd-worker hooks and doing live debugging and diagnosis of the NFD images.
The container image tag has suffix -full
(e.g. registry.k8s.io/nfd/node-feature-discovery:v0.15.7-full
).
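The tag naming scheme described above can be sketched as a small shell helper. The function name `nfd_image` is hypothetical and not part of NFD; it only assembles the image reference from a release tag and one of the two variant suffixes named in the text.

```shell
#!/bin/sh
# Hypothetical helper (not part of NFD): print the image reference for a
# release tag and a variant, which must be "minimal" or "full".
nfd_image() {
    tag="$1"
    variant="${2:-minimal}"   # the minimal image is the default variant
    case "$variant" in
        minimal|full) ;;
        *) echo "unknown variant: $variant" >&2; return 1 ;;
    esac
    echo "registry.k8s.io/nfd/node-feature-discovery:${tag}-${variant}"
}

nfd_image v0.15.7 full
# prints registry.k8s.io/nfd/node-feature-discovery:v0.15.7-full
```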
Node Feature Discovery can be deployed on any recent version of Kubernetes (v1.21+).
See Image variants for description of the different NFD container images available.
Using Kustomize provides straightforward deployment with kubectl
integration and declarative customization.
Using Helm provides easy management of NFD deployments with nice configuration management and easy upgrades.
Using Operator provides deployment and configuration management via CRDs.
Node Feature Discovery can be deployed on any recent version of Kubernetes (v1.21+).
See Image variants for description of the different NFD container images available.
Using Kustomize provides straightforward deployment with kubectl
integration and declarative customization.
Using Helm provides easy management of NFD deployments with nice configuration management and easy upgrades.
Using Operator provides deployment and configuration management via CRDs.
Kustomize can be used to deploy NFD. Customization of the deployment is done by maintaining declarative overlays on top of the base overlays in NFD.
To follow the deployment instructions here, kubectl v1.21 or later is required.
The kustomize overlays provided in the repo can be used directly:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6
-
This will create the required RBAC rules and deploy nfd-master (as a deployment) and nfd-worker (as a daemonset) in the node-feature-discovery
namespace.
NOTE: nfd-topology-updater is not deployed as part of the
default
overlay. Refer to the Master Worker Topologyupdater and Topologyupdater below.
Alternatively you can clone the repository and customize the deployment by creating your own overlays. See kustomize for more information about managing deployment configurations.
The NFD repository hosts a set of overlays for different usages and deployment scenarios under deployment/overlays:
default: default deployment of nfd-worker as a daemonset, described above
default-job: see Worker one-shot below
master-worker-topologyupdater: see Master Worker Topologyupdater below
topologyupdater: see Topology Updater below
prometheus: see Metrics below
prune: clean up the cluster after uninstallation, see Removing feature labels
samples/cert-manager: an example for supplementing the default deployment with cert-manager for TLS authentication, see Automated TLS certificate management using cert-manager for details
samples/custom-rules: an example for spicing up the default deployment with a separately managed configmap of custom labeling rules, see Custom feature source for more information about custom node labels
Feature discovery can alternatively be configured as a one-shot job. The default-job
overlay may be used to achieve this:
NUM_NODES=$(kubectl get no -o jsonpath='{.items[*].metadata.name}' | wc -w)
-kubectl kustomize https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default-job?ref=v0.15.6 | \
+ Kustomize · Node Feature Discovery
Deployment with Kustomize
Kustomize can be used to deploy NFD. Customization of the deployment is done by maintaining declarative overlays on top of the base overlays in NFD.
To follow the deployment instructions here, kubectl v1.21 or later is required.
The kustomize overlays provided in the repo can be used directly:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7
+
This will create the required RBAC rules and deploy nfd-master (as a deployment) and nfd-worker (as a daemonset) in the node-feature-discovery
namespace.
NOTE: nfd-topology-updater is not deployed as part of the default
overlay. Refer to the Master Worker Topologyupdater and Topologyupdater below.
Alternatively you can clone the repository and customize the deployment by creating your own overlays. See kustomize for more information about managing deployment configurations.
Overlays
The NFD repository hosts a set of overlays for different usages and deployment scenarios under deployment/overlays:
default: default deployment of nfd-worker as a daemonset, described above
default-job: see Worker one-shot below
master-worker-topologyupdater: see Master Worker Topologyupdater below
topologyupdater: see Topology Updater below
prometheus: see Metrics below
prune: clean up the cluster after uninstallation, see Removing feature labels
samples/cert-manager: an example for supplementing the default deployment with cert-manager for TLS authentication, see Automated TLS certificate management using cert-manager for details
samples/custom-rules: an example for spicing up the default deployment with a separately managed configmap of custom labeling rules, see Custom feature source for more information about custom node labels
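All overlays listed above share one URL pattern, so the remote target can be composed from the overlay name and a release ref. The helper below is a sketch only: `nfd_overlay_url` is a hypothetical name, and the default ref `v0.15.7` is just the release used elsewhere on this page.

```shell
#!/bin/sh
# Hypothetical helper: build the remote kustomize target for an overlay
# under deployment/overlays at a given git ref.
nfd_overlay_url() {
    overlay="$1"
    ref="${2:-v0.15.7}"   # assumed default; pick the release you deploy
    echo "https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/${overlay}?ref=${ref}"
}

# Print the commands for two overlays instead of running them.
for overlay in default prometheus; do
    echo "kubectl apply -k $(nfd_overlay_url "$overlay")"
done
```

Pinning the ref in one place helps keep the deploy, prune and delete commands pointed at the same release.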
Worker one-shot
Feature discovery can alternatively be configured as a one-shot job. The default-job
overlay may be used to achieve this:
NUM_NODES=$(kubectl get no -o jsonpath='{.items[*].metadata.name}' | wc -w)
+kubectl kustomize https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default-job?ref=v0.15.7 | \
sed s"/NUM_NODES/$NUM_NODES/" | \
kubectl apply -f -
-
The example above launches as many jobs as there are non-master nodes. Note that this approach does not guarantee running once on every node. For example, tainted or non-ready nodes, or other Job scheduling factors, may cause some node(s) to run extra job instance(s) to satisfy the request.
Master Worker Topologyupdater
NFD-Master, nfd-worker and nfd-topology-updater can be configured to be deployed as separate pods. The master-worker-topologyupdater
overlay may be used to achieve this:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/master-worker-topologyupdater?ref=v0.15.6
+
The example above launches as many jobs as there are non-master nodes. Note that this approach does not guarantee running once on every node. For example, tainted or non-ready nodes, or other Job scheduling factors, may cause some node(s) to run extra job instance(s) to satisfy the request.
Master Worker Topologyupdater
NFD-Master, nfd-worker and nfd-topology-updater can be configured to be deployed as separate pods. The master-worker-topologyupdater
overlay may be used to achieve this:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/master-worker-topologyupdater?ref=v0.15.7
-
Topologyupdater
To deploy just nfd-topology-updater (without nfd-master and nfd-worker) use the topologyupdater
overlay:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.6
+
Topologyupdater
To deploy just nfd-topology-updater (without nfd-master and nfd-worker) use the topologyupdater
overlay:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.7
NFD-Topology-Updater can be configured along with the default
overlay (which deploys nfd-worker and nfd-master) where all the software components are deployed as separate pods:
-kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6
-kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.6
+kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7
+kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.7
-
Metrics
To allow prometheus operator to scrape metrics from node-feature-discovery, run the following command:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6
-kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prometheus?ref=v0.15.6
-
Uninstallation
The simplest way is to invoke kubectl delete
on the overlay that was used for deployment. Beware that this will also delete the namespace that NFD is running in. For example, in case the default overlay from the repo was used:
kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6
+
Metrics
To allow prometheus operator to scrape metrics from node-feature-discovery, run the following command:
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7
+kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prometheus?ref=v0.15.7
+
Uninstallation
The simplest way is to invoke kubectl delete
on the overlay that was used for deployment. Beware that this will also delete the namespace that NFD is running in. For example, in case the default overlay from the repo was used:
kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7
Alternatively, you can delete the created objects one-by-one, depending on the type of deployment, for example:
NFD_NS=node-feature-discovery
kubectl -n $NFD_NS delete ds nfd-worker
kubectl -n $NFD_NS delete deploy nfd-master
@@ -21,4 +21,4 @@
kubectl -n $NFD_NS delete sa nfd-master
kubectl delete clusterrole nfd-master
kubectl delete clusterrolebinding nfd-master
-
Node Feature Discovery v0.15
\ No newline at end of file
+
By default, metrics are configured to be exposed through the Prometheus operator APIs. To expose metrics this way, the Prometheus operator must be installed in your cluster. By default, NFD Master and Worker expose metrics on port 8081.
The exposed metrics are:
Metric | Type | Description |
---|---|---|
nfd_master_build_info | Gauge | Version from which nfd-master was built |
nfd_worker_build_info | Gauge | Version from which nfd-worker was built |
nfd_gc_build_info | Gauge | Version from which nfd-gc was built |
nfd_topology_updater_build_info | Gauge | Version from which nfd-topology-updater was built |
nfd_node_update_requests_total | Counter | Number of node update requests received by the master over gRPC |
nfd_node_updates_total | Counter | Number of nodes updated |
nfd_node_update_failures_total | Counter | Number of node update failures |
nfd_node_labels_rejected_total | Counter | Number of node labels rejected by nfd-master |
nfd_node_extendedresources_rejected_total | Counter | Number of node extended resources rejected by nfd-master |
nfd_node_taints_rejected_total | Counter | Number of node taints rejected by nfd-master |
nfd_nodefeaturerule_processing_duration_seconds | Histogram | Time taken to process NodeFeatureRule objects |
nfd_nodefeaturerule_processing_errors_total | Counter | Number of errors encountered while processing NodeFeatureRule objects |
nfd_feature_discovery_duration_seconds | Histogram | Time taken to discover features on a node |
nfd_topology_updater_scan_errors_total | Counter | Number of errors in scanning resource allocation of pods. |
nfd_gc_objects_deleted_total | Counter | Number of NodeFeature and NodeResourceTopology objects garbage collected. |
nfd_gc_object_delete_failures_total | Counter | Number of errors in deleting NodeFeature and NodeResourceTopology objects. |
To deploy NFD with metrics enabled using kustomize, you can use the prometheus overlay.
By default metrics are enabled when deploying NFD via Helm. To enable Prometheus to scrape metrics from NFD, you need to pass the following values to Helm:
--set prometheus.enable=true
-
For more info on Helm deployment, see Helm.
It is recommended to specify --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
when deploying prometheus-operator via Helm to enable the prometheus-operator to scrape metrics from any PodMonitor,
or to set labels on the PodMonitor via the helm parameter prometheus.labels
to control which Prometheus instances will scrape this PodMonitor.
NFD contains an example Grafana dashboard. You can import examples/grafana-dashboard.json
to your Grafana instance to visualize the NFD metrics.
By default, metrics are configured to be exposed through the Prometheus operator APIs. To expose metrics this way, the Prometheus operator must be installed in your cluster. By default, NFD Master and Worker expose metrics on port 8081.
The exposed metrics are:
Metric | Type | Description |
---|---|---|
nfd_master_build_info | Gauge | Version from which nfd-master was built |
nfd_worker_build_info | Gauge | Version from which nfd-worker was built |
nfd_gc_build_info | Gauge | Version from which nfd-gc was built |
nfd_topology_updater_build_info | Gauge | Version from which nfd-topology-updater was built |
nfd_node_update_requests_total | Counter | Number of node update requests received by the master over gRPC |
nfd_node_updates_total | Counter | Number of nodes updated |
nfd_node_update_failures_total | Counter | Number of node update failures |
nfd_node_labels_rejected_total | Counter | Number of node labels rejected by nfd-master |
nfd_node_extendedresources_rejected_total | Counter | Number of node extended resources rejected by nfd-master |
nfd_node_taints_rejected_total | Counter | Number of node taints rejected by nfd-master |
nfd_nodefeaturerule_processing_duration_seconds | Histogram | Time taken to process NodeFeatureRule objects |
nfd_nodefeaturerule_processing_errors_total | Counter | Number of errors encountered while processing NodeFeatureRule objects |
nfd_feature_discovery_duration_seconds | Histogram | Time taken to discover features on a node |
nfd_topology_updater_scan_errors_total | Counter | Number of errors in scanning resource allocation of pods. |
nfd_gc_objects_deleted_total | Counter | Number of NodeFeature and NodeResourceTopology objects garbage collected. |
nfd_gc_object_delete_failures_total | Counter | Number of errors in deleting NodeFeature and NodeResourceTopology objects. |
To deploy NFD with metrics enabled using kustomize, you can use the prometheus overlay.
By default metrics are enabled when deploying NFD via Helm. To enable Prometheus to scrape metrics from NFD, you need to pass the following values to Helm:
--set prometheus.enable=true
+
For more info on Helm deployment, see Helm.
It is recommended to specify --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
when deploying prometheus-operator via Helm to enable the prometheus-operator to scrape metrics from any PodMonitor,
or to set labels on the PodMonitor via the helm parameter prometheus.labels
to control which Prometheus instances will scrape this PodMonitor.
NFD contains an example Grafana dashboard. You can import examples/grafana-dashboard.json
to your Grafana instance to visualize the NFD metrics.
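To spot-check the metrics listed above without a full Prometheus setup, one option is to filter a scrape for the `nfd_` prefix. The sketch below assumes the conventional `/metrics` path on the metrics port and a deployment named `nfd-master`; the helper name `nfd_metrics` is hypothetical, and the sample input is made up for illustration rather than real NFD output.

```shell
#!/bin/sh
# Hypothetical helper: keep only NFD's own samples (metric names starting
# with "nfd_") from Prometheus text exposition read on stdin.
nfd_metrics() {
    grep '^nfd_'
}

# Made-up sample input, not real NFD output.
printf '%s\n' \
    '# HELP nfd_node_updates_total Number of nodes updated' \
    'go_goroutines 42' \
    'nfd_node_updates_total 3' | nfd_metrics
# prints: nfd_node_updates_total 3

# Against a live cluster (assumed names and path), something like:
#   kubectl -n node-feature-discovery port-forward deploy/nfd-master 8081:8081 &
#   curl -s http://localhost:8081/metrics | nfd_metrics
```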
The Node Feature Discovery Operator automates installation, configuration and updates of NFD using a specific NodeFeatureDiscovery custom resource. This also provides good support for managing NFD as a dependency of other operators.
The recommended way to deploy with the Node Feature Discovery Operator is via operatorhub.io.
Install the operator:
kubectl create -f https://operatorhub.io/install/nfd-operator.yaml
+ NFD Operator · Node Feature Discovery
Deployment with NFD Operator
The Node Feature Discovery Operator automates installation, configuration and updates of NFD using a specific NodeFeatureDiscovery custom resource. This also provides good support for managing NFD as a dependency of other operators.
Deployment
The recommended way to deploy with the Node Feature Discovery Operator is via operatorhub.io.
- You need to have OLM installed. If you don't, take a look at the latest release for detailed instructions.
-
Install the operator:
kubectl create -f https://operatorhub.io/install/nfd-operator.yaml
-
Create NodeFeatureDiscovery
object (in nfd
namespace here):
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
@@ -12,9 +12,9 @@
namespace: nfd
spec:
operand:
- image: registry.k8s.io/nfd/node-feature-discovery:v0.15.6
+ image: registry.k8s.io/nfd/node-feature-discovery:v0.15.7
imagePullPolicy: IfNotPresent
EOF
Uninstallation
If you followed the deployment instructions above you can uninstall NFD with:
kubectl -n nfd delete NodeFeatureDiscovery my-nfd-deployment
Optionally, you can also remove the namespace:
kubectl delete ns nfd
-
See the node-feature-discovery-operator and OLM project documentation for instructions for uninstalling the operator and operator lifecycle manager, respectively.
Node Feature Discovery v0.15
\ No newline at end of file
+
See the node-feature-discovery-operator and OLM project documentation for instructions for uninstalling the operator and operator lifecycle manager, respectively.
DEPRECATED: this section only applies when the gRPC API is used, i.e. when the NodeFeature API is disabled via the
-enable-nodefeature-api=false
flag on both nfd-master and nfd-worker. The gRPC API is deprecated and will be removed in a future release.
NFD supports mutual TLS authentication between the nfd-master and nfd-worker instances. That is, nfd-worker and nfd-master both verify that the other end presents a valid certificate.
TLS authentication is enabled by specifying -ca-file
, -key-file
and -cert-file
args, on both the nfd-master and nfd-worker instances. The template specs provided with NFD contain (commented out) example configuration for enabling TLS authentication.
The Common Name (CN) of the nfd-master certificate must match the DNS name of the nfd-master Service of the cluster. By default, nfd-master only checks that the nfd-worker certificate has been signed by the specified root certificate (-ca-file).
Additional hardening can be enabled by specifying -verify-node-name
in nfd-master args, in which case nfd-master verifies that the NodeName presented by nfd-worker matches the Common Name (CN) or a Subject Alternative Name (SAN) of its certificate. Note that -verify-node-name
complicates certificate management and is not yet supported in the helm or kustomize deployment methods.
cert-manager can be used to automate certificate management between nfd-master and the nfd-worker pods.
The NFD source code repository contains an example kustomize overlay and helm chart that can be used to deploy NFD with cert-manager supplied certificates enabled.
To install cert-manager
itself, you can run:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
+ TLS authentication · Node Feature Discovery
Communication security with TLS
DEPRECATED: this section only applies when the gRPC API is used, i.e. when the NodeFeature API is disabled via the -enable-nodefeature-api=false
flag on both nfd-master and nfd-worker. The gRPC API is deprecated and will be removed in a future release.
NFD supports mutual TLS authentication between the nfd-master and nfd-worker instances. That is, nfd-worker and nfd-master both verify that the other end presents a valid certificate.
TLS authentication is enabled by specifying -ca-file
, -key-file
and -cert-file
args, on both the nfd-master and nfd-worker instances. The template specs provided with NFD contain (commented out) example configuration for enabling TLS authentication.
The Common Name (CN) of the nfd-master certificate must match the DNS name of the nfd-master Service of the cluster. By default, nfd-master only checks that the nfd-worker certificate has been signed by the specified root certificate (-ca-file).
Additional hardening can be enabled by specifying -verify-node-name
in nfd-master args, in which case nfd-master verifies that the NodeName presented by nfd-worker matches the Common Name (CN) or a Subject Alternative Name (SAN) of its certificate. Note that -verify-node-name
complicates certificate management and is not yet supported in the helm or kustomize deployment methods.
Automated TLS certificate management using cert-manager
cert-manager can be used to automate certificate management between nfd-master and the nfd-worker pods.
The NFD source code repository contains an example kustomize overlay and helm chart that can be used to deploy NFD with cert-manager supplied certificates enabled.
To install cert-manager
itself, you can run:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
Alternatively, you can refer to cert-manager documentation for other installation methods such as the Helm chart they provide.
To use the kustomize overlay to install node-feature-discovery with TLS enabled, you may use the following:
kubectl apply -k deployment/overlays/samples/cert-manager
To make use of the helm chart, override values.yaml
to enable both the tls.enable
and tls.certManager
options. Note that if you do not enable tls.certManager
, helm will successfully install the application, but deployment will wait until certificates are manually created, as demonstrated below.
See the sample installation commands in the Helm Deployment and Configuration sections above for how to either override individual values, or provide a yaml file with which to override default values.
Manual TLS certificate management
If you do not wish to make use of cert-manager, the certificates can be manually created and stored as secrets within the NFD namespace.
Create a CA certificate
openssl req -x509 -newkey rsa:4096 -keyout ca.key -nodes \
-subj "/CN=nfd-ca" -days 10000 -out ca.crt
@@ -78,4 +78,4 @@
EOF
done
-
Node Feature Discovery v0.15
\ No newline at end of file
+
Follow the uninstallation instructions of the deployment method used (kustomize, helm or operator).
NFD-Master has a special -prune
command line flag for removing all nfd-related node labels, annotations, extended resources and taints from the cluster.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.6
+ Uninstallation · Node Feature Discovery
Uninstallation
Follow the uninstallation instructions of the deployment method used (kustomize, helm or operator).
Removing feature labels
NFD-Master has a special -prune
command line flag for removing all nfd-related node labels, annotations, extended resources and taints from the cluster.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.7
kubectl -n node-feature-discovery wait job.batch/nfd-master --for=condition=complete && \
- kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.6
-
NOTE: You must run prune before removing the RBAC rules (serviceaccount, clusterrole and clusterrolebinding).
Node Feature Discovery v0.15
\ No newline at end of file
+ kubectl delete -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/prune?ref=v0.15.7
+
NOTE: You must run prune before removing the RBAC rules (serviceaccount, clusterrole and clusterrolebinding).
git clone https://github.com/kubernetes-sigs/node-feature-discovery
+ Developer guide · Node Feature Discovery
Developer guide
Table of contents
Building from source
Download the source code
git clone https://github.com/kubernetes-sigs/node-feature-discovery
cd node-feature-discovery
Docker build
Build the container image
See customizing the build below for altering the container image registry, for example.
make
Push the container image
Optional; this example uses Docker.
docker push <IMAGE_TAG>
Docker multi-arch builds with buildx
The default set of architectures enabled for multi-arch builds is linux/amd64
and linux/arm64
. If more architectures are needed one can override the IMAGE_ALL_PLATFORMS
variable with a comma separated list of OS/ARCH
tuples.
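As a rough sketch of such an override, the snippet below joins a few OS/ARCH tuples into the comma separated form and prints the resulting make invocation instead of running it. The helper name join_platforms is hypothetical, and linux/ppc64le is only an illustrative extra platform, not one the project is known to support.

```shell
# Hypothetical helper: join OS/ARCH tuples with commas, the format
# expected by the IMAGE_ALL_PLATFORMS variable.
join_platforms() {
    local IFS=,
    echo "$*"
}

# linux/ppc64le is an assumed example of an additional platform.
platforms="$(join_platforms linux/amd64 linux/arm64 linux/ppc64le)"

# Print the command for review rather than executing it.
echo "make image-all IMAGE_ALL_PLATFORMS=${platforms}"
```

Printing the command first makes it easy to check the platform list before starting a potentially long multi-arch build.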
Build the manifest-list with a container image per arch
make image-all
Currently docker
does not support loading of manifest-lists meaning the images are not shown when executing docker images
, see: buildx issue #59.
Push the manifest-list with container image per arch
make push-all
-
The resulting container image can be used in the same way on each arch by pulling e.g. node-feature-discovery:v0.15.6
without specifying the architecture. The manifest-list will take care of providing the right architecture image.
Change the job spec to use your custom image (optional)
To use your published image from the step above instead of the registry.k8s.io/nfd/node-feature-discovery
image, edit the image
attribute in the spec template(s) to the new location (<registry-name>/<image-name>[:<version>]
).
Deployment
The yamls
makefile generates a kustomization.yaml
matching your locally built image and using the deploy/overlays/default
deployment. See build customization below for configurability, e.g. changing the deployment namespace.
K8S_NAMESPACE=my-ns make yamls
+
The resulting container image can be used in the same way on each arch by pulling e.g. node-feature-discovery:v0.15.7
without specifying the architecture. The manifest-list will take care of providing the right architecture image.
Change the job spec to use your custom image (optional)
To use your published image from the step above instead of the registry.k8s.io/nfd/node-feature-discovery
image, edit the image
attribute in the spec template(s) to the new location (<registry-name>/<image-name>[:<version>]
).
Deployment
The yamls
makefile generates a kustomization.yaml
matching your locally built image and using the deploy/overlays/default
deployment. See build customization below for configurability, e.g. changing the deployment namespace.
K8S_NAMESPACE=my-ns make yamls
kubectl apply -k .
You can use alternative deployment methods by modifying the auto-generated kustomization file.
Building locally
You can also build the binaries locally
make build
This will compile binaries under bin/
Customizing the build
There are several Makefile variables that control the build process and the name of the resulting container image. The following are targeted for build customization and they can be specified via environment variables or makefile overrides.
Variable | Description | Default value |
---|---|---|
HOSTMOUNT_PREFIX | Prefix of system directories for feature discovery (local builds) | / (local builds) /host- (container builds) |
IMAGE_BUILD_CMD | Command to build the image | docker build |
IMAGE_BUILD_EXTRA_OPTS | Extra options to pass to build command | empty |
IMAGE_BUILDX_CMD | Command to build and push multi-arch images with buildx | DOCKER_CLI_EXPERIMENTAL=enabled docker buildx build --platform=${IMAGE_ALL_PLATFORMS} --progress=auto --pull |
IMAGE_ALL_PLATFORMS | Comma separated list of OS/ARCH tuples for multi-arch builds | linux/amd64,linux/arm64 |
IMAGE_PUSH_CMD | Command to push the image to remote registry | docker push |
IMAGE_REGISTRY | Container image registry to use | registry.k8s.io/nfd |
IMAGE_TAG_NAME | Container image tag name | <nfd version> |
IMAGE_EXTRA_TAG_NAMES | Additional container image tag(s) to create when building image | empty |
K8S_NAMESPACE | nfd-master and nfd-worker namespace | node-feature-discovery |
For example, to use a custom registry:
make IMAGE_REGISTRY=<my custom registry uri>
@@ -12,19 +12,19 @@
by overriding the variable value
make IMAGE_BUILD_CMD="buildah bud"
Testing
Unit tests are automatically run as part of the container image build. You can also run them manually in the source code tree by running:
make test
End-to-end tests are built on top of the e2e test framework of Kubernetes, and they require a cluster to run on. To run the tests on your test cluster, you need to specify the kubeconfig to be used:
make e2e-test KUBECONFIG=$HOME/.kube/config
-
There are several environment variables that can be used to customize the e2e-tests:
Variable | Description | Default value |
---|---|---|
KUBECONFIG | Kubeconfig for running e2e-tests | empty |
E2E_TEST_CONFIG | Parameterization file of e2e-tests (see example) | empty |
E2E_PULL_IF_NOT_PRESENT | True-ish value makes the image pull policy IfNotPresent (to be used only in e2e tests) | false |
E2E_TEST_FULL_IMAGE | Run e2e-test also against the Full Image tag | false |
E2E_GINKGO_LABEL_FILTER | Ginkgo label filter to use for running e2e tests | empty |
OPENSHIFT | Non-empty value enables OpenShift specific support (only affects e2e tests) | empty |
Running locally
DEPRECATED: Running NFD locally is deprecated and will be removed in a future release. It depends on the gRPC API, which is also deprecated and will be removed in a future release. To run NFD locally, use the -enable-nodefeature-api=false
flag.
You can run NFD locally, either directly on your host OS or in containers for testing and development purposes. This may be useful e.g. for checking features-detection.
NFD-Master
When running as a standalone container, labeling is expected to fail because the Kubernetes API is not available. Thus, it is recommended to use the -no-publish
flag. Also specify the -crd-controller=false
and -enable-nodefeature-api=false
command line flags to disable the CRD controller and enable gRPC. E.g.
$ export NFD_CONTAINER_IMAGE=registry.k8s.io/nfd/node-feature-discovery:v0.15.6
+
There are several environment variables that can be used to customize the e2e-tests:
Variable | Description | Default value |
---|---|---|
KUBECONFIG | Kubeconfig for running e2e-tests | empty |
E2E_TEST_CONFIG | Parameterization file of e2e-tests (see example) | empty |
E2E_PULL_IF_NOT_PRESENT | True-ish value makes the image pull policy IfNotPresent (to be used only in e2e tests) | false |
E2E_TEST_FULL_IMAGE | Run e2e-test also against the Full Image tag | false |
E2E_GINKGO_LABEL_FILTER | Ginkgo label filter to use for running e2e tests | empty |
OPENSHIFT | Non-empty value enables OpenShift specific support (only affects e2e tests) | empty |
Running locally
DEPRECATED: Running NFD locally is deprecated and will be removed in a future release. It depends on the gRPC API, which is also deprecated and will be removed in a future release. To run NFD locally, use the -enable-nodefeature-api=false
flag.
You can run NFD locally, either directly on your host OS or in containers for testing and development purposes. This may be useful e.g. for checking features-detection.
NFD-Master
When running as a standalone container, labeling is expected to fail because the Kubernetes API is not available. Thus, it is recommended to use the -no-publish
flag. Also specify the -crd-controller=false
and -enable-nodefeature-api=false
command line flags to disable the CRD controller and enable gRPC. E.g.
$ export NFD_CONTAINER_IMAGE=registry.k8s.io/nfd/node-feature-discovery:v0.15.7
$ docker run --rm --name=nfd-test ${NFD_CONTAINER_IMAGE} nfd-master -no-publish -crd-controller=false -enable-nodefeature-api=false
2019/02/01 14:48:21 Node Feature Discovery Master <NFD_VERSION>
2019/02/01 14:48:21 gRPC server serving on port: 8080
NFD-Worker
To run nfd-worker as a "stand-alone" container you need to run it in the same network namespace as the nfd-master container:
$ docker run --rm --network=container:nfd-test ${NFD_CONTAINER_IMAGE} nfd-worker -enable-nodefeature-api=false
2019/02/01 14:48:56 Node Feature Discovery Worker <NFD_VERSION>
...
-
If you just want to try out feature discovery without connecting to nfd-master, pass the -no-publish
flag to nfd-worker.
NOTE: Some feature sources need certain directories and/or files from the host mounted inside the NFD container. Thus, you need to provide Docker with the correct --volume
options for them to work correctly when run stand-alone directly with docker run
. See the default deployment for up-to-date information about the required volume mounts.
NFD-Topology-Updater
To run nfd-topology-updater as a "stand-alone" container you need to run it with the -no-publish
flag to disable communication to the Kubernetes apiserver.
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-topology-updater -no-publish
+
If you just want to try out feature discovery without connecting to nfd-master, pass the -no-publish
flag to nfd-worker.
NOTE: Some feature sources need certain directories and/or files from the host mounted inside the NFD container. Thus, you need to provide Docker with the correct --volume
options for them to work correctly when run stand-alone directly with docker run
. See the default deployment for up-to-date information about the required volume mounts.
NFD-Topology-Updater
To run nfd-topology-updater as a "stand-alone" container you need to run it with the -no-publish
flag to disable communication to the Kubernetes apiserver.
$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-topology-updater -no-publish
2019/02/01 14:48:56 Node Feature Discovery Topology Updater <NFD_VERSION>
...
-
If you just want to try out resource topology discovery without connecting to the Kubernetes API, pass the -no-publish
flag to nfd-topology-updater.
NOTE: NFD topology updater needs certain directories and/or files from the host mounted inside the NFD container. Thus, you need to provide Docker with the correct --volume
options for them to work correctly when run stand-alone directly with docker run
. See the template spec for up-to-date information about the required volume mounts.
The PodResource API is a prerequisite for nfd-topology-updater. Prior to Kubernetes v1.23, the kubelet
must be started with the following flag: --feature-gates=KubeletPodResourcesGetAllocatable=true
. Starting with Kubernetes v1.23, GetAllocatableResources
is enabled by default through the KubeletPodResourcesGetAllocatable
feature gate.
Running with Tilt
Another option for building NFD locally is the Tilt tool, which can build container images, push them to a local registry and reload your Kubernetes pods automatically. When using Tilt, you don't have to build container images and re-deploy your pods manually; Tilt takes care of it. The Tiltfile is the configuration file for Tilt and is located at the root directory. To develop NFD with Tilt, follow the steps below.
Prerequisites
- Install Docker
- Setup Docker as a non-root user.
- Install kubectl
- Install kustomize
- Install tilt
- Create a local Kubernetes cluster
To start up your Tilt development environment, run
tilt up
+
If you just want to try out resource topology discovery without connecting to the Kubernetes API, pass the -no-publish
flag to nfd-topology-updater.
NOTE: NFD topology updater needs certain directories and/or files from the host mounted inside the NFD container. Thus, you need to provide Docker with the correct --volume
options for them to work correctly when run stand-alone directly with docker run
. See the template spec for up-to-date information about the required volume mounts.
The PodResource API is a prerequisite for nfd-topology-updater. Prior to Kubernetes v1.23, the kubelet
must be started with the following flag: --feature-gates=KubeletPodResourcesGetAllocatable=true
. Starting with Kubernetes v1.23, GetAllocatableResources
is enabled by default through the KubeletPodResourcesGetAllocatable
feature gate.
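The version rule above can be sketched as a small shell check. The function name needs_allocatable_feature_gate is hypothetical, and the version string passed in is assumed to look like v1.22.4 (how it is obtained from your cluster is left out on purpose).

```shell
# Hypothetical helper: succeeds (exit status 0) when the kubelet has to be
# started with --feature-gates=KubeletPodResourcesGetAllocatable=true,
# i.e. for Kubernetes versions before v1.23.
needs_allocatable_feature_gate() {
    version="${1#v}"            # strip a leading "v"
    major="${version%%.*}"      # text before the first dot
    rest="${version#*.}"        # text after the first dot
    minor="${rest%%.*}"         # second version component
    minor="${minor%%[!0-9]*}"   # drop non-numeric suffixes such as "+"
    [ "$major" -eq 1 ] && [ "$minor" -lt 23 ]
}

# Example version strings only; substitute the version of your own cluster.
for v in v1.22.4 v1.23.0; do
    if needs_allocatable_feature_gate "$v"; then
        echo "$v: start the kubelet with --feature-gates=KubeletPodResourcesGetAllocatable=true"
    else
        echo "$v: GetAllocatableResources is enabled by default"
    fi
done
```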
Running with Tilt
Another option for building NFD locally is the Tilt tool, which can build container images, push them to a local registry and reload your Kubernetes pods automatically. When using Tilt, you don't have to build container images and re-deploy your pods manually; Tilt takes care of it. The Tiltfile is the configuration file for Tilt and is located at the root directory. To develop NFD with Tilt, follow the steps below.
Prerequisites
- Install Docker
- Setup Docker as a non-root user.
- Install kubectl
- Install kustomize
- Install tilt
- Create a local Kubernetes cluster
To start up your Tilt development environment, run
tilt up
at the root of your local NFD codebase. Tilt will start a web interface on localhost, port 10350. From the web interface, you can see how NFD worker and master are progressing and watch their build and runtime logs. Once your code changes are saved locally, Tilt will notice them, re-build the container image from the current code, push the image to the registry and re-deploy NFD pods with the latest container image.
Environment variables
To override environment variables used in the Tiltfile during image build, export them in your current terminal before starting Tilt.
export IMAGE_TAG_NAME="v1"
tilt up
-
This will override the default value (master
) of the IMAGE_TAG_NAME
variable defined in the Tiltfile.
Documentation
All documentation resides under the docs directory in the source tree. It is designed to be served as a html site by GitHub Pages.
Building the documentation is containerized to fix the build environment. The recommended way for developing documentation is to run:
make site-serve
+
This will override the default value (master
) of the IMAGE_TAG_NAME
variable defined in the Tiltfile.
Documentation
All documentation resides under the docs directory in the source tree. It is designed to be served as a html site by GitHub Pages.
Building the documentation is containerized to fix the build environment. The recommended way for developing documentation is to run:
make site-serve
This will build the documentation in a container and serve it under localhost:4000/ making it easy to verify the results. Any changes made to the docs/
directory automatically trigger a rebuild and are reflected in the served content, which can be inspected with a browser refresh.
To just build the html documentation run:
make site-build
-
This will generate html documentation under docs/_site/
.
Node Feature Discovery v0.15
\ No newline at end of file
+
This will generate html documentation under docs/_site/
.
Welcome to Node Feature Discovery – a Kubernetes add-on for detecting hardware features and system configuration!
Continue to:
Introduction for more details on the project.
Quick start for quick step-by-step instructions on how to get NFD running on your cluster.
$ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6
+ Get started · Node Feature Discovery
Node Feature Discovery
Welcome to Node Feature Discovery – a Kubernetes add-on for detecting hardware features and system configuration!
Continue to:
-
Introduction for more details on the project.
-
Quick start for quick step-by-step instructions on how to get NFD running on your cluster.
Quick-start – the short-short version
$ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7
namespace/node-feature-discovery created
serviceaccount/nfd-master created
clusterrole.rbac.authorization.k8s.io/nfd-master created
@@ -22,4 +22,4 @@
"feature.node.kubernetes.io/cpu-cpuid.AESNI": "true",
...
-
Node Feature Discovery v0.15
\ No newline at end of file
+
This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, and advertises those features using node labels and optionally node extended resources, annotations and node taints. Node Feature Discovery is compatible with any recent version of Kubernetes (v1.21+).
NFD consists of four software components:
NFD-Master is the daemon responsible for communication towards the Kubernetes API. That is, it receives labeling requests from the worker and modifies node objects accordingly.
NFD-Worker is a daemon responsible for feature detection. It then communicates the information to nfd-master which does the actual node labeling. One instance of nfd-worker is supposed to be running on each node of the cluster.
NFD-Topology-Updater is a daemon responsible for examining allocated resources on a worker node to account for resources available to be allocated to new pods on a per-zone basis (where a zone can be a NUMA node). It then creates or updates a NodeResourceTopology custom resource object specific to this node. One instance of nfd-topology-updater is supposed to be running on each node of the cluster.
NFD-GC is a daemon responsible for cleaning obsolete NodeFeature and NodeResourceTopology objects.
One instance of nfd-gc is supposed to be running in the cluster.
Feature discovery is divided into domain-specific feature sources:
Each feature source is responsible for detecting a set of features which, in turn, are turned into node feature labels. Feature labels are prefixed with feature.node.kubernetes.io/
and also contain the name of the feature source. Non-standard user-specific feature labels can be created with the local and custom feature sources.
An overview of the default feature labels:
{
+ Introduction · Node Feature Discovery
Introduction
Table of contents
- NFD-Master
- NFD-Worker
- NFD-Topology-Updater
- NFD-GC
- Feature Discovery
- Node annotations
- Custom resources
This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, and advertises those features using node labels and optionally node extended resources, annotations and node taints. Node Feature Discovery is compatible with any recent version of Kubernetes (v1.21+).
NFD consists of four software components:
- nfd-master
- nfd-worker
- nfd-topology-updater
- nfd-gc
NFD-Master
NFD-Master is the daemon responsible for communication towards the Kubernetes API. That is, it receives labeling requests from the worker and modifies node objects accordingly.
NFD-Worker
NFD-Worker is a daemon responsible for feature detection. It then communicates the information to nfd-master which does the actual node labeling. One instance of nfd-worker is supposed to be running on each node of the cluster.
NFD-Topology-Updater
NFD-Topology-Updater is a daemon responsible for examining allocated resources on a worker node to account for resources available to be allocated to new pods on a per-zone basis (where a zone can be a NUMA node). It then creates or updates a NodeResourceTopology custom resource object specific to this node. One instance of nfd-topology-updater is supposed to be running on each node of the cluster.
NFD-GC
NFD-GC is a daemon responsible for cleaning obsolete NodeFeature and NodeResourceTopology objects.
One instance of nfd-gc is supposed to be running in the cluster.
Feature Discovery
Feature discovery is divided into domain-specific feature sources:
- CPU
- Kernel
- Memory
- Network
- PCI
- Storage
- System
- USB
- Custom (rule-based custom features)
- Local (hooks for user-specific features)
Each feature source is responsible for detecting a set of features which, in turn, are turned into node feature labels. Feature labels are prefixed with feature.node.kubernetes.io/
and also contain the name of the feature source. Non-standard user-specific feature labels can be created with the local and custom feature sources.
An overview of the default feature labels:
{
"feature.node.kubernetes.io/cpu-<feature-name>": "true",
"feature.node.kubernetes.io/custom-<feature-name>": "true",
"feature.node.kubernetes.io/kernel-<feature name>": "<feature value>",
@@ -10,4 +10,4 @@
"feature.node.kubernetes.io/usb-<device label>.present": "<feature value>",
"feature.node.kubernetes.io/<file name>-<feature name>": "<feature value>"
}
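To illustrate the naming scheme in the overview, the sketch below pulls the leading name component out of a feature label key. The function name feature_source_of is hypothetical. For labels from the local source the leading component is a file name rather than a source name, as the last line of the overview shows, so treat the result as a naming hint only.

```shell
# Hypothetical helper: print the part of a feature label name that sits
# between the feature.node.kubernetes.io/ prefix and the first "-".
# Fails (status 1) for labels outside that prefix.
feature_source_of() {
    label="$1"
    case "$label" in
        feature.node.kubernetes.io/*)
            rest="${label#feature.node.kubernetes.io/}"
            echo "${rest%%-*}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Label taken from the quick-start output of this documentation.
feature_source_of "feature.node.kubernetes.io/cpu-cpuid.ADX"
```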
-
Node annotations
NFD also annotates nodes it is running on:
Annotation | Description |
---|---|
[<instance>.]nfd.node.kubernetes.io/feature-labels | Comma-separated list of node labels managed by NFD. NFD uses this internally so must not be edited by users. |
[<instance>.]nfd.node.kubernetes.io/feature-annotations | Comma-separated list of node annotations managed by NFD. NFD uses this internally so must not be edited by users. |
[<instance>.]nfd.node.kubernetes.io/extended-resources | Comma-separated list of node extended resources managed by NFD. NFD uses this internally so must not be edited by users. |
[<instance>.]nfd.node.kubernetes.io/taints | Comma-separated list of node taints managed by NFD. NFD uses this internally so must not be edited by users. |
NOTE: the -instance
command line flag affects the annotation names
Inapplicable annotations are not created. For example, nfd.node.kubernetes.io/extended-resources
is only placed if some extended resources were created by NFD.
Custom resources
NFD makes use of some Kubernetes Custom Resources.
NodeFeature objects are used for representing node features and requesting node labels to be generated.
NFD-Master uses NodeFeatureRules for custom labeling of nodes.
NFD-Topology-Updater creates NodeResourceTopology objects that describe the hardware topology of node resources.
Node Feature Discovery v0.15
\ No newline at end of file
+
NFD also annotates nodes it is running on:
Annotation | Description |
---|---|
[<instance>.]nfd.node.kubernetes.io/feature-labels | Comma-separated list of node labels managed by NFD. NFD uses this internally so must not be edited by users. |
[<instance>.]nfd.node.kubernetes.io/feature-annotations | Comma-separated list of node annotations managed by NFD. NFD uses this internally so must not be edited by users. |
[<instance>.]nfd.node.kubernetes.io/extended-resources | Comma-separated list of node extended resources managed by NFD. NFD uses this internally so must not be edited by users. |
[<instance>.]nfd.node.kubernetes.io/taints | Comma-separated list of node taints managed by NFD. NFD uses this internally so must not be edited by users. |
NOTE: the
-instance
command line flag affects the annotation names
Inapplicable annotations are not created. For example, nfd.node.kubernetes.io/extended-resources
is only placed if some extended resources were created by NFD.
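The effect of the -instance flag on annotation names can be sketched as follows. The helper name nfd_annotation_name is hypothetical; the naming pattern itself is the [<instance>.]nfd.node.kubernetes.io/<name> form from the table above.

```shell
# Hypothetical helper: build an NFD annotation name. An empty instance
# name (the default) yields the plain nfd.node.kubernetes.io/ form.
nfd_annotation_name() {
    instance="$1"
    suffix="$2"
    if [ -n "$instance" ]; then
        echo "${instance}.nfd.node.kubernetes.io/${suffix}"
    else
        echo "nfd.node.kubernetes.io/${suffix}"
    fi
}

nfd_annotation_name "" feature-labels    # default deployment
nfd_annotation_name network taints       # deployment run with -instance=network
```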
NFD makes use of some Kubernetes Custom Resources.
NodeFeature objects are used for representing node features and requesting node labels to be generated.
NFD-Master uses NodeFeatureRules for custom labeling of nodes.
NFD-Topology-Updater creates NodeResourceTopology objects that describe the hardware topology of node resources.
Minimal steps to deploy latest released version of NFD in your cluster.
Deploy with kustomize – creates a new namespace, service and required RBAC rules and deploys nfd-master and nfd-worker daemons.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.6
+ Quick start · Node Feature Discovery
Quick start
Minimal steps to deploy latest released version of NFD in your cluster.
Installation
Deploy with kustomize – creates a new namespace, service and required RBAC rules and deploys nfd-master and nfd-worker daemons.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/default?ref=v0.15.7
Verify
Wait until NFD master and NFD worker are running.
$ kubectl -n node-feature-discovery get ds,deploy
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/nfd-worker 2 2 2 2 2 <none> 10s
@@ -31,7 +31,7 @@
See that the pod is running on a desired node
$ kubectl get po feature-dependent-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
feature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 <none> <none>
-
Additional Optional Installation Steps
Deploy nfd-topology-updater
To deploy nfd-topology-updater use the topologyupdater
kustomize overlay.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.6
+
Additional Optional Installation Steps
Deploy nfd-topology-updater
To deploy nfd-topology-updater use the topologyupdater
kustomize overlay.
kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref=v0.15.7
Verify nfd-topology-updater
Wait until nfd-topology-updater is running.
$ kubectl -n node-feature-discovery get ds
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/nfd-topology-updater 2 2 2 2 2 <none> 5s
@@ -40,4 +40,4 @@
NAME AGE
kind-control-plane 23s
kind-worker 23s
-
Node Feature Discovery v0.15
\ No newline at end of file
+
To quickly view available command line flags execute nfd-gc -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 \
+ Garbage Collector Cmdline Reference · Node Feature Discovery
NFD-GC Commandline Flags
Table of Contents
To quickly view available command line flags execute nfd-gc -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 \
nfd-gc -help
-h, -help
Print usage and exit.
-version
Print version and exit.
-gc-interval
The -gc-interval
specifies the interval between periodic garbage collector runs.
Default: 1h
Example:
nfd-gc -gc-interval=1h
-
Node Feature Discovery v0.15
\ No newline at end of file
+
Command line and configuration reference.
Command line and configuration reference.
To quickly view available command line flags execute nfd-master -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 nfd-master -help
+ Master cmdline reference · Node Feature Discovery
Commandline flags of nfd-master
Table of contents
- -h, -help
- -version
- -prune
- -port
- -metrics
- -instance
- -ca-file
- -cert-file
- -key-file
- -verify-node-name
- -enable-nodefeature-api
- -enable-leader-election
- -enable-taints
- -no-publish
- -crd-controller
- -featurerules-controller
- -label-whitelist
- -extra-label-ns
- -deny-label-ns
- -resource-labels
- -config
- -options
- -nfd-api-parallelism
- Logging
- -resync-period
To quickly view available command line flags execute nfd-master -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 nfd-master -help
-h, -help
Print usage and exit.
-version
Print version and exit.
-prune
The -prune
flag is a sub-command like option for cleaning up the cluster. It causes nfd-master to remove all NFD related labels, annotations and extended resources from all Node objects of the cluster and exit.
-port
The -port
flag specifies the TCP port on which nfd-master listens for incoming requests.
Default: 8080
Example:
nfd-master -port=443
-metrics
The -metrics
flag specifies the port on which to expose Prometheus metrics. Setting this to 0 disables the metrics server on nfd-master.
Default: 8081
Example:
nfd-master -metrics=12345
-instance
The -instance
flag makes it possible to run multiple NFD deployments in parallel. In practice, it separates the node annotations between deployments so that each of them can store metadata independently. The instance name must start and end with an alphanumeric character and may only contain alphanumeric characters, -
, _
or .
.
Default: empty
Example:
nfd-master -instance=network
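The naming rule for instance names can be checked up front with a small shell function. This is only a sketch of the rule as stated above, not the validation code nfd-master itself uses, and the function name valid_instance_name is hypothetical. An empty name is rejected here because it simply means that no instance is set.

```shell
# Hypothetical helper: succeeds when the name starts and ends with an
# alphanumeric character and contains only alphanumerics, "-", "_" or ".".
valid_instance_name() {
    name="$1"
    case "$name" in
        "") return 1 ;;                      # empty: no instance set
        *[!A-Za-z0-9._-]*) return 1 ;;       # contains a disallowed character
        [!A-Za-z0-9]*) return 1 ;;           # does not start alphanumeric
        *[!A-Za-z0-9]) return 1 ;;           # does not end alphanumeric
    esac
    return 0
}

for candidate in network gpu-1.a_b -net net.; do
    if valid_instance_name "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: not a valid instance name"
    fi
done
```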
@@ -20,4 +20,4 @@
-options
The -options
flag may be used to specify and override configuration file options directly from the command line. The required format is the same as in the config file i.e. JSON or YAML. Configuration options specified via this flag will override those from the configuration file:
Default: empty
Example:
nfd-master -options='{"noPublish": true}'
-nfd-api-parallelism
The -nfd-api-parallelism
flag can be used to specify the maximum number of concurrent node updates.
It takes effect only when -enable-nodefeature-api
has been set.
Default: 10
Example:
nfd-master -nfd-api-parallelism=1
Logging
The following logging-related flags are inherited from the klog package.
-add_dir_header
If true, adds the file directory to the header of the log messages.
Default: false
-alsologtostderr
Log to standard error as well as files.
Default: false
-log_backtrace_at
When logging hits line file:N, emit a stack trace.
Default: empty
-log_dir
If non-empty, write log files in this directory.
Default: empty
-log_file
If non-empty, use this log file.
Default: empty
-log_file_max_size
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited.
Default: 1800
-logtostderr
Log to standard error instead of files
Default: true
-skip_headers
If true, avoid header prefixes in the log messages.
Default: false
-skip_log_headers
If true, avoid headers when opening log files.
Default: false
-stderrthreshold
Logs at or above this threshold go to stderr.
Default: 2
-v
Number for the log level verbosity.
Default: 0
-vmodule
Comma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
-resync-period
The -resync-period
flag specifies the NFD API controller resync period. The resync means nfd-master replaying all NodeFeature and NodeFeatureRule objects, thus effectively re-syncing all nodes in the cluster (i.e. ensuring labels, annotations, extended resources and taints are in place). Only has effect when the NodeFeature CRD API has been enabled with -enable-nodefeature-api
.
Default: 1 hour.
Example:
nfd-master -resync-period=2h
-
Node Feature Discovery v0.15
\ No newline at end of file
+
See the sample configuration file for a full example configuration.
noPublish
option disables updates to the Node objects in the Kubernetes API server, making a "dry-run" flag for nfd-master. No Labels, Annotations, Taints or ExtendedResources of nodes are updated.
Default: false
Example:
noPublish: true
+ Master config reference · Node Feature Discovery
Configuration file reference of nfd-master
Table of contents
- noPublish
- extraLabelNs
- denyLabelNs
- autoDefaultNs
- resourceLabels
- enableTaints
- labelWhiteList
- resyncPeriod
- leaderElection
- nfdApiParallelism
- klog
See the sample configuration file for a full example configuration.
noPublish
noPublish
option disables updates to the Node objects in the Kubernetes API server, making a "dry-run" flag for nfd-master. No Labels, Annotations, Taints or ExtendedResources of nodes are updated.
Default: false
Example:
noPublish: true
extraLabelNs
extraLabelNs
specifies a list of allowed feature label namespaces. This option can be used to allow other vendor or application specific namespaces for custom labels from the local and custom feature sources, even if these namespaces were denied using the denyLabelNs
parameter.
The same namespace control and this option also apply to Extended Resources (created with resourceLabels
), too.
Default: empty
Example:
extraLabelNs: ["added.ns.io","added.kubernets.io"]
denyLabelNs
denyLabelNs
specifies a list of excluded label namespaces. By default, nfd-master allows creating labels in all namespaces, excluding kubernetes.io
namespace and its sub-namespaces (i.e. *.kubernetes.io
). Note that kubernetes.io
and its sub-namespaces are always denied. This option can be used to exclude specific vendor or application namespaces.
Default: empty
Example:
denyLabelNs: ["denied.ns.io","denied.kubernetes.io"]
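The two namespace options can be combined. The sketch below denies every namespace and then re-allows a single one. It assumes the configuration file accepts the same "*" wildcard as the -deny-label-ns command line flag, and example.com is a placeholder namespace:

```yaml
# nfd-master configuration file (sketch).
# Assumption: "*" is accepted here just as with the -deny-label-ns flag.
denyLabelNs: ["*"]
# example.com is a placeholder for a vendor namespace to re-allow.
extraLabelNs: ["example.com"]
```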
autoDefaultNs
The autoDefaultNs
option controls the automatic prefixing of names. When set to true (the default in NFD version v0.15) nfd-master automatically adds the default feature.node.kubernetes.io/
prefix to unprefixed labels, annotations and extended resources - this is also the default behavior in NFD v0.15 and earlier. When the option is set to false
, no prefix will be prepended to unprefixed names, effectively causing them to be filtered out (as NFD does not allow unprefixed names of labels, annotations or extended resources). The default will be changed to false
in a future release.
For example, with the autoDefaultNs
set to true
, a NodeFeatureRule with
labels:
@@ -15,4 +15,4 @@
leaderElection.retryPeriod
leaderElection.retryPeriod
is the duration the LeaderElector clients should wait between tries of actions.
It has to be greater than 0.
Default: 2 seconds.
Example:
leaderElection:
retryPeriod: 2s
nfdApiParallelism
The nfdApiParallelism
option can be used to specify the maximum number of concurrent node updates.
It takes effect only when -enable-nodefeature-api
has been set.
Default: 10
Example:
nfdApiParallelism: 1
-
klog
The following options specify the logger configuration. Most of them can be dynamically adjusted at run-time.
NOTE: The logger options can also be specified via command line flags which take precedence over any corresponding config file options.
klog.addDirHeader
If true, adds the file directory to the header of the log messages.
Default: false
Run-time configurable: yes
klog.alsologtostderr
Log to standard error as well as files.
Default: false
Run-time configurable: yes
klog.logBacktraceAt
When logging hits line file:N, emit a stack trace.
Default: empty
Run-time configurable: yes
klog.logDir
If non-empty, write log files in this directory.
Default: empty
Run-time configurable: no
klog.logFile
If non-empty, use this log file.
Default: empty
Run-time configurable: no
klog.logFileMaxSize
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited.
Default: 1800
Run-time configurable: no
klog.logtostderr
Log to standard error instead of files
Default: true
Run-time configurable: yes
klog.skipHeaders
If true, avoid header prefixes in the log messages.
Default: false
Run-time configurable: yes
klog.skipLogHeaders
If true, avoid headers when opening log files.
Default: false
Run-time configurable: no
klog.stderrthreshold
Logs at or above this threshold go to stderr (default 2)
Run-time configurable: yes
klog.v
Number for the log level verbosity.
Default: 0
Run-time configurable: yes
klog.vmodule
Comma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
Run-time configurable: yes
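As an illustration, the logger options above might be set in the nfd-master configuration file as follows. This is a sketch: the option names are taken from the list above, while the chosen values are arbitrary examples.

```yaml
# nfd-master configuration file (sketch); values are illustrative only.
klog:
  v: 2                # raise log verbosity (run-time configurable)
  addDirHeader: true  # include the file directory in log headers
  logtostderr: true   # log to standard error instead of files (the default)
```

Command line flags take precedence over these settings, as noted above.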
Node Feature Discovery v0.15
\ No newline at end of file
+
The following options specify the logger configuration. Most of them can be dynamically adjusted at run-time.
NOTE: The logger options can also be specified via command line flags which take precedence over any corresponding config file options.
If true, adds the file directory to the header of the log messages.
Default: false
Run-time configurable: yes
Log to standard error as well as files.
Default: false
Run-time configurable: yes
When logging hits line file:N, emit a stack trace.
Default: empty
Run-time configurable: yes
If non-empty, write log files in this directory.
Default: empty
Run-time configurable: no
If non-empty, use this log file.
Default: empty
Run-time configurable: no
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited.
Default: 1800
Run-time configurable: no
Log to standard error instead of files
Default: true
Run-time configurable: yes
If true, avoid header prefixes in the log messages.
Default: false
Run-time configurable: yes
If true, avoid headers when opening log files.
Default: false
Run-time configurable: no
Logs at or above this threshold go to stderr (default 2)
Run-time configurable: yes
Number for the log level verbosity.
Default: 0
Run-time configurable: yes
Comma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
Run-time configurable: yes
To quickly view available command line flags execute kubectl nfd -help
.
Print usage and exit.
Validate a NodeFeatureRule file.
The --nodefeature-file
flag specifies the path to the NodeFeatureRule file to validate.
Test a NodeFeatureRule file against a node without applying it.
The --kubeconfig
flag specifies the path to the kubeconfig file to use for CLI requests.
The --namespace
flag specifies the namespace to use for CLI requests. Default: default
.
The --nodename
flag specifies the name of the node to test the NodeFeatureRule against.
The --nodefeaturerule-file
flag specifies the path to the NodeFeatureRule file to test.
Process a NodeFeatureRule file against a NodeFeature file.
The --nodefeaturerule-file
flag specifies the path to the NodeFeatureRule file to test.
The --nodefeature-file
flag specifies the path to the NodeFeature file to test.
To quickly view available command line flags execute nfd-topology-updater -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 \
+ Topology Updater Cmdline Reference · Node Feature Discovery
NFD-Topology-Updater Commandline Flags
Table of Contents
- -h, -help
- -version
- -config
- -no-publish
- -oneshot
- -metrics
- -sleep-interval
- -watch-namespace
- -kubelet-config-uri
- -api-auth-token-file
- -podresources-socket
- -pods-fingerprint
- -kubelet-state-dir
To quickly view available command line flags execute nfd-topology-updater -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 \
nfd-topology-updater -help
-h, -help
Print usage and exit.
-version
Print version and exit.
-config
The -config
flag specifies the path of the nfd-topology-updater configuration file to use.
Default: /etc/kubernetes/node-feature-discovery/nfd-topology-updater.conf
Example:
nfd-topology-updater -config=/opt/nfd/nfd-topology-updater.conf
-no-publish
The -no-publish
flag disables all communication with the nfd-master, making it a "dry-run" flag for nfd-topology-updater. NFD-Topology-Updater runs resource hardware topology detection normally, but no CR requests are sent to nfd-master.
Default: false
Example:
nfd-topology-updater -no-publish
@@ -11,4 +11,4 @@
-podresources-socket
The -podresources-socket
specifies the path to the Unix socket where kubelet exports a gRPC service to enable discovery of in-use CPUs and devices, and to provide metadata for them.
Default: /host-var/lib/kubelet/pod-resources/kubelet.sock
Example:
nfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock
-pods-fingerprint
Enables computing and reporting the pod set fingerprint in the NRT. A pod fingerprint is a compact representation of the "node state" regarding resources.
Default: false
Example:
nfd-topology-updater -pods-fingerprint
-kubelet-state-dir
The -kubelet-state-dir
specifies the path to the Kubelet state directory, where state and checkpoint files are stored. The files are mounted read-only and cannot be changed by the updater. Enabled by default. Passing an empty string disables the watching.
Default: /host-var/lib/kubelet
Example:
nfd-topology-updater -kubelet-state-dir=/var/lib/kubelet
-
Node Feature Discovery v0.15
\ No newline at end of file
+
See the sample configuration file for a full example configuration.
The excludeList
specifies a key-value map of allocated resources that should not be examined by the topology-updater agent per node. Each key is a node name and its value is a list of resources that should not be examined by the agent on that specific node.
Default: empty
Example:
excludeList:
+ Topology-Updater config reference · Node Feature Discovery
Configuration file reference of nfd-topology-updater
Table of contents
See the sample configuration file for a full example configuration.
excludeList
The excludeList
specifies a key-value map of allocated resources that should not be examined by the topology-updater agent per node. Each key is a node name and its value is a list of resources that should not be examined by the agent on that specific node.
Default: empty
Example:
excludeList:
nodeA: [hugepages-2Mi]
nodeB: [memory]
nodeC: [cpu, hugepages-2Mi]
excludeList.*
excludeList.*
is a special value used to specify all nodes. A resource listed under this key is excluded on all nodes.
Default: empty
Example:
excludeList:
'*': [hugepages-2Mi]
-
Node Feature Discovery v0.15
\ No newline at end of file
+
Node Feature Discovery follows semantic versioning where the version number consists of three components, i.e. MAJOR.MINOR.PATCH.
The most recent two minor releases (or release branches) of Node Feature Discovery are supported. That is, with X being the latest release, X and X-1 are supported and X-1 reaches end-of-life when X+1 is released.
Built-in feature labels and features are supported for 2 releases after being deprecated, at minimum. That is, if a feature label is deprecated in version X, it will be supported in X+1 and X+2 and may be dropped in X+3.
Command-line flags and configuration file options are supported for 1 more release after being deprecated, at minimum. That is, if option/flag is deprecated in version X, it will be supported in X+1 and may be removed in X+2.
The same policy (support for 1 release after deprecation) also applies to Helm chart parameters.
To quickly view available command line flags execute nfd-worker -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.6 nfd-worker -help
+ Worker cmdline reference · Node Feature Discovery
Commandline flags of nfd-worker
Table of contents
- -h, -help
- -version
- -config
- -options
- -server
- -ca-file
- -cert-file
- -key-file
- -kubeconfig
- -server-name-override
- -feature-sources
- -label-sources
- -enable-nodefeature-api
- -metrics
- -no-publish
- -oneshot
- Logging
To quickly view available command line flags execute nfd-worker -help
. In a docker container:
docker run registry.k8s.io/nfd/node-feature-discovery:v0.15.7 nfd-worker -help
-h, -help
Print usage and exit.
-version
Print version and exit.
-config
The -config
flag specifies the path of the nfd-worker configuration file to use.
Default: /etc/kubernetes/node-feature-discovery/nfd-worker.conf
Example:
nfd-worker -config=/opt/nfd/worker.conf
-options
The -options
flag may be used to specify and override configuration file options directly from the command line. The required format is the same as in the config file, i.e. JSON or YAML. Configuration options specified via this flag override those from the configuration file.
Default: empty
Example:
nfd-worker -options='{"sources":{"cpu":{"cpuid":{"attributeWhitelist":["AVX","AVX2"]}}}}'
-server
NOTE: The gRPC API is deprecated and will be removed in a future release, and this flag will be removed along with it.
The -server
flag specifies the address of the nfd-master endpoint to connect to.
Default: localhost:8080
Example:
nfd-worker -server=nfd-master.nfd.svc.cluster.local:443
@@ -13,4 +13,4 @@
-metrics
The -metrics
flag specifies the port on which to expose Prometheus metrics. Setting this to 0 disables the metrics server on nfd-worker.
Default: 8081
Example:
nfd-worker -metrics=12345
-no-publish
The -no-publish
flag disables all communication with the nfd-master and the Kubernetes API server. It is effectively a "dry-run" flag for nfd-worker. NFD-Worker runs feature detection normally, but no labeling requests are sent to nfd-master and no NodeFeature objects are created or updated in the API server.
NOTE: This flag takes precedence over the core.noPublish
configuration file option.
Default: false
Example:
nfd-worker -no-publish
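The same dry-run behavior can be requested through the core.noPublish configuration file option mentioned in the note above. A minimal sketch of the corresponding nfd-worker configuration file entry:

```yaml
# nfd-worker configuration file (sketch).
core:
  # Run feature detection normally but publish nothing.
  # The -no-publish command line flag overrides this setting.
  noPublish: true
```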
-oneshot
The -oneshot
flag causes nfd-worker to exit after one pass of feature detection.
Default: false
Example:
nfd-worker -oneshot -no-publish
-
Logging
The following logging-related flags are inherited from the klog package.
NOTE: The logger setup can also be specified via the core.klog
configuration file options. However, the command line flags take precedence over any corresponding config file options specified.
-add_dir_header
If true, adds the file directory to the header of the log messages.
Default: false
-alsologtostderr
Log to standard error as well as files.
Default: false
-log_backtrace_at
When logging hits line file:N, emit a stack trace.
Default: empty
-log_dir
If non-empty, write log files in this directory.
Default: empty
-log_file
If non-empty, use this log file.
Default: empty
-log_file_max_size
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited.
Default: 1800
-logtostderr
Log to standard error instead of files
Default: true
-skip_headers
If true, avoid header prefixes in the log messages.
Default: false
-skip_log_headers
If true, avoid headers when opening log files.
Default: false
-stderrthreshold
Logs at or above this threshold go to stderr.
Default: 2
-v
Number for the log level verbosity.
Default: 0
-vmodule
Comma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
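For reference, a sketch of how logger settings might look under the core.klog section of the nfd-worker configuration file. The key names are an assumption: they follow the camelCase naming that the nfd-master configuration reference uses for its klog options, and the values are arbitrary examples.

```yaml
# nfd-worker configuration file (sketch).
# Assumption: key names mirror the camelCase klog options of nfd-master.
core:
  klog:
    v: 2               # counterpart of the -v flag
    skipHeaders: true  # counterpart of the -skip_headers flag
```

Any logging flag given on the command line takes precedence over these entries.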
Node Feature Discovery v0.15
\ No newline at end of file
+
The following logging-related flags are inherited from the klog package.
NOTE: The logger setup can also be specified via the
core.klog
configuration file options. However, the command line flags take precedence over any corresponding config file options specified.
If true, adds the file directory to the header of the log messages.
Default: false
Log to standard error as well as files.
Default: false
When logging hits line file:N, emit a stack trace.
Default: empty
If non-empty, write log files in this directory.
Default: empty
If non-empty, use this log file.
Default: empty
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited.
Default: 1800
Log to standard error instead of files
Default: true
If true, avoid header prefixes in the log messages.
Default: false
If true, avoid headers when opening log files.
Default: false
Logs at or above this threshold go to stderr.
Default: 2
Number for the log level verbosity.
Default: 0
Comma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
See the sample configuration file for a full example configuration.
The core
section contains common configuration settings that are not specific to any particular feature source.
core.sleepInterval
specifies the interval between consecutive passes of feature (re-)detection, and thus also the interval between node re-labeling. A non-positive value implies infinite sleep interval, i.e. no re-detection or re-labeling is done.
Default: 60s
Example:
core:
+ Worker config reference · Node Feature Discovery
Configuration file reference of nfd-worker
Table of contents
See the sample configuration file for a full example configuration.
core
The core
section contains common configuration settings that are not specific to any particular feature source.
core.sleepInterval
core.sleepInterval
specifies the interval between consecutive passes of feature (re-)detection, and thus also the interval between node re-labeling. A non-positive value implies infinite sleep interval, i.e. no re-detection or re-labeling is done.
Default: 60s
Example:
core:
sleepInterval: 60s
core.featureSources
core.featureSources
specifies the list of enabled feature sources. A special value all
enables all sources. Prefixing a source name with -
indicates that the source will be disabled instead - this is only meaningful when used in conjunction with all
. This option allows completely disabling the feature detection so that neither standard feature labels are generated nor the raw feature data is available for custom rule processing.
Default: [all]
Example:
core:
# Enable all but cpu and local sources
@@ -68,4 +68,4 @@
matchExpressions:
class: {op: In, value: ["0200"]}
vendor: {op: In, value: ["8086"]}
-
Node Feature Discovery v0.15
\ No newline at end of file
+
NFD uses some Kubernetes custom resources.
NodeFeature is an NFD-specific custom resource for communicating node features and node labeling requests. The nfd-master pod watches for NodeFeature objects, labels nodes as specified and uses the listed features as input when evaluating NodeFeatureRules. NodeFeature objects can be used for implementing 3rd party extensions (see customization guide for more details).
apiVersion: nfd.k8s-sigs.io/v1alpha1
+ CRDs · Node Feature Discovery
Custom Resources
Table of contents
NFD uses some Kubernetes custom resources.
NodeFeature
NodeFeature is an NFD-specific custom resource for communicating node features and node labeling requests. The nfd-master pod watches for NodeFeature objects, labels nodes as specified and uses the listed features as input when evaluating NodeFeatureRules. NodeFeature objects can be used for implementing 3rd party extensions (see customization guide for more details).
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
labels:
@@ -38,7 +38,7 @@
- feature: pci.device
matchExpressions:
vendor: {op: In, value: ["8086"]}
-
See the Customization guide for full documentation of the NodeFeatureRule resource and its usage.
The deployment/nodefeaturerule/samples/
directory contains sample NodeFeatureRule objects that replicate the built-in default feature labels generated by NFD. The sample rules can be used as a base to customize NFD feature labels. To use them in place of the NFD built-in labels, the corresponding feature source(s) of nfd-worker should be disabled with the core.labelSources
configuration option.
NodeResourceTopology
When run with NFD-Topology-Updater, NFD creates NodeResourceTopology objects corresponding to node resource hardware topology such as:
apiVersion: topology.node.k8s.io/v1alpha1
+
See the Customization guide for full documentation of the NodeFeatureRule resource and its usage.
The deployment/nodefeaturerule/samples/
directory contains sample NodeFeatureRule objects that replicate the built-in default feature labels generated by NFD. The sample rules can be used as a base to customize NFD feature labels. To use them in place of the NFD built-in labels, the corresponding feature source(s) of nfd-worker should be disabled with the core.labelSources
configuration option.
NodeResourceTopology
When run with NFD-Topology-Updater, NFD creates NodeResourceTopology objects corresponding to node resource hardware topology such as:
apiVersion: topology.node.k8s.io/v1alpha1
kind: NodeResourceTopology
metadata:
name: node1
@@ -77,4 +77,4 @@
capacity: 3
allocatable: 3
available: 3
-
The NodeResourceTopology objects created by NFD can be used to gain insight into the allocatable resources along with the granularity of those resources at a per-zone level (represented by node-0 and node-1 in the above example) or can be used by an external entity (e.g. topology-aware scheduler plugin) to take an action based on the gathered information.
Node Feature Discovery v0.15
\ No newline at end of file
+
The NodeResourceTopology objects created by NFD can be used to gain insight into the allocatable resources along with the granularity of those resources at a per-zone level (represented by node-0 and node-1 in the above example) or can be used by an external entity (e.g. topology-aware scheduler plugin) to take an action based on the gathered information.
NFD provides multiple extension points for vendor and application specific labeling:
- NodeFeature objects can be used to communicate "raw" node features and node labeling requests to nfd-master.
- NodeFeatureRule objects provide a way to deploy custom labeling rules via the Kubernetes API.
- local feature source of nfd-worker creates labels by reading text files and executing hooks.
- custom feature source of nfd-worker creates labels based on user-specified rules.
NodeFeature objects provide a way for 3rd party extensions to advertise custom features, both as "raw" features that serve as input to NodeFeatureRule objects and as feature labels directly.
Note that RBAC rules must be created for each extension for them to be able to create and manipulate NodeFeature objects in their namespace.
The NodeFeature CRD API can be disabled with the -enable-nodefeature-api=false
command line flag. This flag must be specified for both nfd-master and nfd-worker as it will enable the gRPC communication between them. Note that the gRPC API is DEPRECATED and will be removed in a future release, at which point the NodeFeature API cannot be disabled.
Consider the following referential example:
apiVersion: nfd.k8s-sigs.io/v1alpha1
+ Customization guide · Node Feature Discovery
Customization guide
Table of contents
- Overview
- NodeFeature custom resource
- NodeFeatureRule custom resource
- Local feature source
- Custom feature source
- Node labels
- Feature rule format
Overview
NFD provides multiple extension points for vendor and application specific labeling:
- NodeFeature objects can be used to communicate "raw" node features and node labeling requests to nfd-master.
- NodeFeatureRule objects provide a way to deploy custom labeling rules via the Kubernetes API.
- local feature source of nfd-worker creates labels by reading text files and executing hooks.
- custom feature source of nfd-worker creates labels based on user-specified rules.
NodeFeature custom resource
NodeFeature objects provide a way for 3rd party extensions to advertise custom features, both as "raw" features that serve as input to NodeFeatureRule objects and as feature labels directly.
Note that RBAC rules must be created for each extension for them to be able to create and manipulate NodeFeature objects in their namespace.
The NodeFeature CRD API can be disabled with the -enable-nodefeature-api=false
command line flag. This flag must be specified for both nfd-master and nfd-worker as it will enable the gRPC communication between them. Note that the gRPC API is DEPRECATED and will be removed in a future release, at which point the NodeFeature API cannot be disabled.
A NodeFeature example
Consider the following referential example:
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
labels:
@@ -45,7 +45,7 @@
- feature: kernel.config
matchExpressions:
X86: {op: In, value: ["y"]}
-
It specifies one rule which creates node label feature.node.kubernetes.io/my-sample-feature=true
if both of the following conditions are true (matchFeatures
implements a logical AND over the matchers):
- The dummy network driver module has been loaded
- X86 option in kernel config is set to =y
Create a NodeFeatureRule
with a yaml file:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/v0.15.6/examples/nodefeaturerule.yaml
+
It specifies one rule which creates node label feature.node.kubernetes.io/my-sample-feature=true
if both of the following conditions are true (matchFeatures
implements a logical AND over the matchers):
- The dummy network driver module has been loaded
- X86 option in kernel config is set to =y
Create a NodeFeatureRule
with a yaml file:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/v0.15.7/examples/nodefeaturerule.yaml
Now, on X86 platforms the feature label appears after doing modprobe dummy
on a system and correspondingly the label is removed after rmmod dummy
. Note that re-labeling may be delayed by up to the sleep interval of nfd-worker (1 minute by default).
See Feature rule format for detailed description of available fields and how to write labeling rules.
Node tainting
This feature is experimental.
In some circumstances, it is desirable to keep nodes with specialized hardware away from running general workloads and instead reserve them for workloads that need the specialized hardware. One way to achieve this is to taint the nodes with the specialized hardware and add a corresponding toleration to pods that require it. NFD offers node tainting functionality, which is disabled by default. Users can define one or more custom taints via the taints
field of the NodeFeatureRule CR. The same rule-based mechanism is applied here, and NFD taints only the nodes that match the rule.
To enable the tainting feature, --enable-taints
flag needs to be set to true
. If the flag --enable-taints
is set to false
(i.e. disabled), taints defined in the NodeFeatureRule CR have no effect and will be ignored by the NFD master.
See documentation of the taints field for detailed description how to specify taints in the NodeFeatureRule object.
NOTE: Before enabling any taints, make sure to edit the nfd-worker daemonset to tolerate the taints to be created. Otherwise, already running pods that do not tolerate the taint, including the nfd-worker pod, are evicted immediately from the node.
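To illustrate the taints field, here is a minimal sketch of a NodeFeatureRule that taints matching nodes. The object name, rule name and taint key are hypothetical, and the matcher reuses the dummy kernel module from the labeling example:

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: my-sample-taint-rule        # hypothetical object name
spec:
  rules:
    - name: "my sample taint rule"  # hypothetical rule name
      taints:
        # Hypothetical taint key; only nodes matching the rule are tainted.
        - effect: PreferNoSchedule
          key: "feature.node.kubernetes.io/special-node"
          value: "true"
      matchFeatures:
        - feature: kernel.loadedmodule
          matchExpressions:
            dummy: {op: Exists}
```

The taint is only applied when tainting has been enabled in nfd-master as described above.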
Local feature source
NFD-Worker has a special feature source named local
which is an integration point for external feature detectors. It provides a mechanism for pluggable extensions, allowing the creation of new user-specific features and even overriding built-in labels.
The local
feature source has two methods for detecting features, feature files and hooks (deprecated). The features discovered by the local
source can further be used in label rules specified in NodeFeatureRule
objects and the custom
feature source.
NOTE: Be careful when creating and/or updating hook or feature files while NFD is running. To avoid race conditions you should write into a temporary file, and atomically create/update the original file by doing a file rename operation. NFD ignores dot files, so a temporary file can be written to the same directory and renamed (.my.feature
-> my.feature
) once the file is complete. Both file names should be unique for the given application.
An example
Consider a plaintext file /etc/kubernetes/node-feature-discovery/features.d/my-features
having the following contents (or alternatively a shell script /etc/kubernetes/node-feature-discovery/source.d/my-hook.sh
having the following stdout output):
feature.node.kubernetes.io/my-feature.1
feature.node.kubernetes.io/my-feature.2=myvalue
vendor.io/my-feature.3=456
@@ -107,7 +107,7 @@
- feature: kernel.loadedmodule
matchExpressions:
e1000: {op: Exists}
-
This simple rule will create feature.node.kubernetes.io/e1000.present=true
label if the e1000
kernel module has been loaded.
The samples/custom-rules
kustomize overlay sample contains an example for deploying a custom rule from a ConfigMap.
Node labels
Feature labels have the following format:
<namespace>/<name> = <value>
+
This simple rule will create feature.node.kubernetes.io/e1000.present=true
label if the e1000
kernel module has been loaded.
The samples/custom-rules
kustomize overlay sample contains an example for deploying a custom rule from a ConfigMap.
Node labels
Feature labels have the following format:
<namespace>/<name> = <value>
The namespace part (i.e. prefix) of the labels is controlled by nfd:
- All built-in labels use feature.node.kubernetes.io.
- Namespaces may be excluded with the -deny-label-ns command line flag of nfd-master.
- To allow specific namespaces that were denied, you can use the -extra-label-ns command line flag of nfd-master, e.g. nfd-master -deny-label-ns="*" -extra-label-ns=example.com
Feature rule format
This section describes the rule format used in NodeFeatureRule
objects and in the configuration of the custom
feature source.
It is based on a generic feature matcher that covers all features discovered by nfd-worker. The rules rely on a unified data model of the available features and a generic expression-based format. Features that can be used in the rules are described in detail in available features below.
Take this rule as a referential example:
- name: "my feature rule"
labels:
"feature.node.kubernetes.io/my-special-feature": "my-value"
@@ -349,4 +349,4 @@
- pci.device:
vendor: "0fff"
device: "abcd"
-
Node Feature Discovery v0.15
\ No newline at end of file
+
This page contains usage examples and demos.
A demo on the benefits of using node feature discovery can be found in the source code repository under demo/.
Features are advertised as labels in the Kubernetes Node object.
Label creation in nfd-worker is performed by a set of separate modules called label sources. The core.labelSources
configuration option (or -label-sources
flag) of nfd-worker controls which sources to enable for label generation.
All built-in labels use the feature.node.kubernetes.io
label namespace and have the following format.
feature.node.kubernetes.io/<feature> = <value>
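Since the features are plain node labels, workloads can select on them with standard Kubernetes scheduling constraints. A minimal sketch, assuming the node advertises the cpu-cpuid.AVX2 label; the pod name and container image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: avx2-workload               # placeholder name
spec:
  # Schedule only onto nodes where NFD has set this feature label.
  nodeSelector:
    feature.node.kubernetes.io/cpu-cpuid.AVX2: "true"
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
```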
-
NOTE: Consecutive runs of nfd-worker will update the labels on a given node. If features are not discovered on a consecutive run, the corresponding label will be removed. This includes any restrictions placed on the consecutive run, such as restricting discovered features with the
-label-whitelist
flag of nfd-master or core.labelWhiteList
option of nfd-worker.
Feature name | Value | Description |
---|---|---|
cpu-cpuid.<cpuid-flag> | true | CPU capability is supported. NOTE: the capability might be supported but not enabled. |
cpu-hardware_multithreading | true | Hardware multithreading, such as Intel HTT, enabled (number of logical CPUs is greater than physical CPUs) |
cpu-coprocessor.nx_gzip | true | Nest Accelerator for GZIP is supported (Power). |
cpu-power.sst_bf.enabled | true | Intel SST-BF (Intel Speed Select Technology - Base frequency) enabled |
cpu-pstate.status | string | The status of the Intel pstate driver when in use and enabled, either ‘active' or ‘passive'. |
cpu-pstate.turbo | bool | Set to ‘true' if turbo frequencies are enabled in Intel pstate driver, set to ‘false' if they have been disabled. |
cpu-pstate.scaling_governor | string | The value of the Intel pstate scaling_governor when in use, either ‘powersave' or ‘performance'. |
cpu-cstate.enabled | bool | Set to ‘true' if cstates are set in the intel_idle driver, otherwise set to ‘false'. Unset if intel_idle cpuidle driver is not active. |
cpu-rdt.<rdt-flag> | true | DEPRECATED Intel RDT capability is supported. See RDT flags for details. |
cpu-security.sgx.enabled | true | Set to ‘true' if Intel SGX is enabled in BIOS (based on a non-zero sum value of SGX EPC section sizes). |
cpu-security.se.enabled | true | Set to ‘true' if IBM Secure Execution for Linux (IBM Z & LinuxONE) is available and enabled (requires /sys/firmware/uv/prot_virt_host facility) |
cpu-security.tdx.enabled | true | Set to ‘true' if Intel TDX is available on the host and has been enabled (requires /sys/module/kvm_intel/parameters/tdx ). |
cpu-security.tdx.protected | true | Set to ‘true' if Intel TDX was used to start the guest node, based on the existence of the "TDX_GUEST" information as part of cpuid features. |
cpu-security.sev.enabled | true | Set to ‘true' if AMD SEV is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev ). |
cpu-security.sev.es.enabled | true | Set to ‘true' if AMD SEV-ES is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_es ). |
cpu-security.sev.snp.enabled | true | Set to ‘true' if AMD SEV-SNP is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_snp ). |
cpu-model.vendor_id | string | Comparable CPU vendor ID. |
cpu-model.family | int | CPU family. |
cpu-model.id | int | CPU model number. |
NOTE: the
cpu-rdt.<rdt-flag>
labels are deprecated and will be removed in a future release. They will remain available as features for NodeFeatureRule to consume. See customization guide for details on how to use NodeFeatureRule objects to create labels.
The CPU label source is configurable, see worker configuration and sources.cpu
configuration options for details.
Flag | Description |
---|---|
ADX | Multi-Precision Add-Carry Instruction Extensions (ADX) |
AESNI | Advanced Encryption Standard (AES) New Instructions (AES-NI) |
APX_F | Intel Advanced Performance Extensions (APX) |
AVX10 | Intel Advanced Vector Extensions 10 (AVX10) |
AVX10_256, AVX10_512 | Intel AVX10 256-bit and 512-bit vector support |
AVX | Advanced Vector Extensions (AVX) |
AVX2 | Advanced Vector Extensions 2 (AVX2) |
AVXIFMA | AVX-IFMA instructions |
AVXVNNI | AVX (VEX encoded) VNNI neural network instructions |
AMXBF16 | Advanced Matrix Extension, tile multiplication operations on BFLOAT16 numbers |
AMXINT8 | Advanced Matrix Extension, tile multiplication operations on 8-bit integers |
AMXFP16 | Advanced Matrix Extension, tile multiplication operations on FP16 numbers |
AMXTILE | Advanced Matrix Extension, base tile architecture support |
AVX512BF16 | AVX-512 BFLOAT16 instructions |
AVX512BITALG | AVX-512 bit Algorithms |
AVX512BW | AVX-512 byte and word Instructions |
AVX512CD | AVX-512 conflict detection instructions |
AVX512DQ | AVX-512 doubleword and quadword instructions |
AVX512ER | AVX-512 exponential and reciprocal instructions |
AVX512F | AVX-512 foundation |
AVX512FP16 | AVX-512 FP16 instructions |
AVX512IFMA | AVX-512 integer fused multiply-add instructions |
AVX512PF | AVX-512 prefetch instructions |
AVX512VBMI | AVX-512 vector bit manipulation instructions |
AVX512VBMI2 | AVX-512 vector bit manipulation instructions, version 2 |
AVX512VL | AVX-512 vector length extensions |
AVX512VNNI | AVX-512 vector neural network instructions |
AVX512VP2INTERSECT | AVX-512 intersect for D/Q |
AVX512VPOPCNTDQ | AVX-512 vector population count doubleword and quadword |
AVXNECONVERT | AVX-NE-CONVERT instructions |
AVXVNNIINT8 | AVX-VNNI-INT8 instructions |
CMPCCXADD | CMPCCXADD instructions |
ENQCMD | Enqueue Command |
GFNI | Galois Field New Instructions |
HYPERVISOR | Running under hypervisor |
MSRLIST | Read/Write List of Model Specific Registers |
PREFETCHI | PREFETCHIT0/1 instructions |
VAES | AVX-512 vector AES instructions |
VPCLMULQDQ | Carry-less multiplication quadword |
WRMSRNS | Non-Serializing Write to Model Specific Register |
By default, the following CPUID flags have been blacklisted: BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SSE, SSE2, SSE3, SSE4, SSE42, SSSE3 and TDX_GUEST. See sources.cpu
configuration options to change the behavior.
See the full list in github.com/klauspost/cpuid.
Flag | Description |
---|---|
IDIVA | Integer divide instructions available in ARM mode |
IDIVT | Integer divide instructions available in Thumb mode |
THUMB | Thumb instructions |
FASTMUL | Fast multiplication |
VFP | Vector floating point instruction extension (VFP) |
VFPv3 | Vector floating point extension v3 |
VFPv4 | Vector floating point extension v4 |
VFPD32 | VFP with 32 D-registers |
HALF | Half-word loads and stores |
EDSP | DSP extensions |
NEON | NEON SIMD instructions |
LPAE | Large Physical Address Extensions |
Flag | Description |
---|---|
AES | Announcing the Advanced Encryption Standard |
EVSTRM | Event Stream Frequency Features |
FPHP | Half Precision(16bit) Floating Point Data Processing Instructions |
ASIMDHP | Half Precision(16bit) Asimd Data Processing Instructions |
ATOMICS | Atomic Instructions to the A64 |
ASIMRDM | Support for Rounding Double Multiply Add/Subtract |
PMULL | Optional Cryptographic and CRC32 Instructions |
JSCVT | Perform Conversion to Match Javascript |
DCPOP | Persistent Memory Support |
Feature | Value | Description |
---|---|---|
kernel-config.<option> | true | Kernel config option is enabled (set ‘y' or ‘m'). Default options are NO_HZ , NO_HZ_IDLE , NO_HZ_FULL and PREEMPT |
kernel-selinux.enabled | true | Selinux is enabled on the node |
kernel-version.full | string | Full kernel version as reported by /proc/sys/kernel/osrelease (e.g. ‘4.5.6-7-g123abcde') |
kernel-version.major | string | First component of the kernel version (e.g. ‘4') |
kernel-version.minor | string | Second component of the kernel version (e.g. ‘5') |
kernel-version.revision | string | Third component of the kernel version (e.g. ‘6') |
The kernel label source is configurable, see worker configuration and sources.kernel
configuration options for details.
Feature | Value | Description |
---|---|---|
memory-numa | true | Multiple memory nodes i.e. NUMA architecture detected |
memory-nv.present | true | NVDIMM device(s) are present |
memory-nv.dax | true | NVDIMM region(s) configured in DAX mode are present |
Feature | Value | Description |
---|---|---|
network-sriov.capable | true | Single Root Input/Output Virtualization (SR-IOV) enabled Network Interface Card(s) present |
network-sriov.configured | true | SR-IOV virtual functions have been configured |
Feature | Value | Description |
---|---|---|
pci-<device label>.present | true | PCI device is detected |
pci-<device label>.sriov.capable | true | Single Root Input/Output Virtualization (SR-IOV) enabled PCI device present |
<device label>
format is configurable and set to <class>_<vendor>
by default. For more details about configuration of the pci labels, see sources.pci
options and worker configuration instructions.
Feature | Value | Description |
---|---|---|
usb-<device label>.present | true | USB device is detected |
<device label>
format is configurable and set to <class>_<vendor>_<device>
by default. For more details about configuration of the usb labels, see sources.usb
options and worker configuration instructions.
Feature | Value | Description |
---|---|---|
storage-nonrotationaldisk | true | Non-rotational disk, like SSD, is present in the node |
Feature | Value | Description |
---|---|---|
system-os_release.ID | string | Operating system identifier |
system-os_release.VERSION_ID | string | Operating system version identifier (e.g. ‘6.7') |
system-os_release.VERSION_ID.major | string | First component of the OS version id (e.g. ‘6') |
system-os_release.VERSION_ID.minor | string | Second component of the OS version id (e.g. ‘7') |
The custom label source is designed for creating user defined labels. However, it has a few statically defined built-in labels:
Feature | Value | Description |
---|---|---|
custom-rdma.capable | true | The node has an RDMA capable Network adapter |
custom-rdma.enabled | true | The node has the needed RDMA modules loaded to run RDMA traffic |
NFD has many extension points for creating vendor and application specific labels. See the customization guide for detailed documentation.
NFD is able to create extended resources, see the NodeFeatureRule CRD and its extendedResources field for more details.
Note that NFD is not a replacement for the usage of device plugins.
An example use-case for extended resources could be based on a custom feature (created e.g. with feature files) that exposes the node SGX EPC memory section size. This value will then be turned into an extended resource of the node, allowing Pods to request that resource and the Kubernetes scheduler to schedule such Pods only to those nodes which have sufficient capacity of said resource left.
Features are advertised as labels in the Kubernetes Node object.
Label creation in nfd-worker is performed by a set of separate modules called label sources. The core.labelSources
configuration option (or -label-sources
flag) of nfd-worker controls which sources to enable for label generation.
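As a minimal sketch of the option above, the following nfd-worker configuration fragment enables only a subset of label sources. The particular selection (cpu, kernel, pci) is an arbitrary example and not a recommendation.

```yaml
# nfd-worker configuration fragment (illustrative)
core:
  # Only the listed label sources generate labels.
  # The selection below is an example, not a default.
  labelSources:
    - cpu
    - kernel
    - pci
```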
All built-in labels use the feature.node.kubernetes.io
label namespace and have the following format.
feature.node.kubernetes.io/<feature> = <value>
+
NOTE: Consecutive runs of nfd-worker will update the labels on a given node. If features are not discovered on a consecutive run, the corresponding label will be removed. This includes any restrictions placed on the consecutive run, such as restricting discovered features with the
-label-whitelist
flag of nfd-master or core.labelWhiteList
option of nfd-worker.
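To illustrate how labels in this format are typically consumed, here is a minimal sketch of a Pod that is scheduled only to nodes advertising the AVX512F cpuid label. The Pod name and container image are placeholders, not part of NFD.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: avx512-example                        # hypothetical name
spec:
  # Schedule only on nodes where NFD has set this label.
  nodeSelector:
    feature.node.kubernetes.io/cpu-cpuid.AVX512F: "true"
  containers:
    - name: app
      image: registry.example.com/app:latest  # placeholder image
```

Because labels are removed when a feature is no longer discovered, a selector like this only affects scheduling decisions; Pods that are already running are not evicted when a label disappears.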
Feature name | Value | Description |
---|---|---|
cpu-cpuid.<cpuid-flag> | true | CPU capability is supported. NOTE: the capability might be supported but not enabled. |
cpu-hardware_multithreading | true | Hardware multithreading, such as Intel HTT, enabled (number of logical CPUs is greater than physical CPUs) |
cpu-coprocessor.nx_gzip | true | Nest Accelerator for GZIP is supported(Power). |
cpu-power.sst_bf.enabled | true | Intel SST-BF (Intel Speed Select Technology - Base frequency) enabled |
cpu-pstate.status | string | The status of the Intel pstate driver when in use and enabled, either ‘active' or ‘passive'. |
cpu-pstate.turbo | bool | Set to ‘true' if turbo frequencies are enabled in Intel pstate driver, set to ‘false' if they have been disabled. |
cpu-pstate.scaling_governor | string | The value of the Intel pstate scaling_governor when in use, either ‘powersave' or ‘performance'. |
cpu-cstate.enabled | bool | Set to ‘true' if cstates are set in the intel_idle driver, otherwise set to ‘false'. Unset if intel_idle cpuidle driver is not active. |
cpu-rdt.<rdt-flag> | true | DEPRECATED Intel RDT capability is supported. See RDT flags for details. |
cpu-security.sgx.enabled | true | Set to ‘true' if Intel SGX is enabled in BIOS (based on a non-zero sum value of SGX EPC section sizes). |
cpu-security.se.enabled | true | Set to ‘true' if IBM Secure Execution for Linux (IBM Z & LinuxONE) is available and enabled (requires /sys/firmware/uv/prot_virt_host facility) |
cpu-security.tdx.enabled | true | Set to ‘true' if Intel TDX is available on the host and has been enabled (requires /sys/module/kvm_intel/parameters/tdx ). |
cpu-security.tdx.protected | true | Set to ‘true' if Intel TDX was used to start the guest node, based on the existence of the "TDX_GUEST" information as part of cpuid features. |
cpu-security.sev.enabled | true | Set to ‘true' if AMD SEV is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev ). |
cpu-security.sev.es.enabled | true | Set to ‘true' if AMD SEV-ES is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_es ). |
cpu-security.sev.snp.enabled | true | Set to ‘true' if AMD SEV-SNP is available on the host and has been enabled (requires /sys/module/kvm_amd/parameters/sev_snp ). |
cpu-model.vendor_id | string | Comparable CPU vendor ID. |
cpu-model.family | int | CPU family. |
cpu-model.id | int | CPU model number. |
NOTE: the
cpu-rdt.<rdt-flag>
labels are deprecated and will be removed in a future release. They will remain available as features for NodeFeatureRule to consume. See customization guide for details on how to use NodeFeatureRule objects to create labels.
The CPU label source is configurable, see worker configuration and sources.cpu
configuration options for details.
Flag | Description |
---|---|
ADX | Multi-Precision Add-Carry Instruction Extensions (ADX) |
AESNI | Advanced Encryption Standard (AES) New Instructions (AES-NI) |
APX_F | Intel Advanced Performance Extensions (APX) |
AVX10 | Intel Advanced Vector Extensions 10 (AVX10) |
AVX10_256, AVX10_512 | Intel AVX10 256-bit and 512-bit vector support |
AVX | Advanced Vector Extensions (AVX) |
AVX2 | Advanced Vector Extensions 2 (AVX2) |
AVXIFMA | AVX-IFMA instructions |
AVXVNNI | AVX (VEX encoded) VNNI neural network instructions |
AMXBF16 | Advanced Matrix Extension, tile multiplication operations on BFLOAT16 numbers |
AMXINT8 | Advanced Matrix Extension, tile multiplication operations on 8-bit integers |
AMXFP16 | Advanced Matrix Extension, tile multiplication operations on FP16 numbers |
AMXTILE | Advanced Matrix Extension, base tile architecture support |
AVX512BF16 | AVX-512 BFLOAT16 instructions |
AVX512BITALG | AVX-512 bit Algorithms |
AVX512BW | AVX-512 byte and word Instructions |
AVX512CD | AVX-512 conflict detection instructions |
AVX512DQ | AVX-512 doubleword and quadword instructions |
AVX512ER | AVX-512 exponential and reciprocal instructions |
AVX512F | AVX-512 foundation |
AVX512FP16 | AVX-512 FP16 instructions |
AVX512IFMA | AVX-512 integer fused multiply-add instructions |
AVX512PF | AVX-512 prefetch instructions |
AVX512VBMI | AVX-512 vector bit manipulation instructions |
AVX512VBMI2 | AVX-512 vector bit manipulation instructions, version 2 |
AVX512VL | AVX-512 vector length extensions |
AVX512VNNI | AVX-512 vector neural network instructions |
AVX512VP2INTERSECT | AVX-512 intersect for D/Q |
AVX512VPOPCNTDQ | AVX-512 vector population count doubleword and quadword |
AVXNECONVERT | AVX-NE-CONVERT instructions |
AVXVNNIINT8 | AVX-VNNI-INT8 instructions |
CMPCCXADD | CMPCCXADD instructions |
ENQCMD | Enqueue Command |
GFNI | Galois Field New Instructions |
HYPERVISOR | Running under hypervisor |
MSRLIST | Read/Write List of Model Specific Registers |
PREFETCHI | PREFETCHIT0/1 instructions |
VAES | AVX-512 vector AES instructions |
VPCLMULQDQ | Carry-less multiplication quadword |
WRMSRNS | Non-Serializing Write to Model Specific Register |
By default, the following CPUID flags have been blacklisted: BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SSE, SSE2, SSE3, SSE4, SSE42, SSSE3 and TDX_GUEST. See sources.cpu
configuration options to change the behavior.
See the full list in github.com/klauspost/cpuid.
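As a rough sketch of changing the default cpuid filtering, the fragment below assumes the cpuid.attributeBlacklist and cpuid.attributeWhitelist keys under sources.cpu, as described in the worker configuration reference. The flag selections are examples only.

```yaml
# nfd-worker configuration fragment (illustrative)
sources:
  cpu:
    cpuid:
      # Replace the default blacklist; flags listed here are not published.
      attributeBlacklist:
        - "SSE"
        - "SSE2"
      # Alternatively, publish only the listed flags. Per the configuration
      # reference, a whitelist takes priority over the blacklist.
      # attributeWhitelist:
      #   - "AVX512F"
      #   - "AVX512VNNI"
```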
Flag | Description |
---|---|
IDIVA | Integer divide instructions available in ARM mode |
IDIVT | Integer divide instructions available in Thumb mode |
THUMB | Thumb instructions |
FASTMUL | Fast multiplication |
VFP | Vector floating point instruction extension (VFP) |
VFPv3 | Vector floating point extension v3 |
VFPv4 | Vector floating point extension v4 |
VFPD32 | VFP with 32 D-registers |
HALF | Half-word loads and stores |
EDSP | DSP extensions |
NEON | NEON SIMD instructions |
LPAE | Large Physical Address Extensions |
Flag | Description |
---|---|
AES | Announcing the Advanced Encryption Standard |
EVSTRM | Event Stream Frequency Features |
FPHP | Half Precision(16bit) Floating Point Data Processing Instructions |
ASIMDHP | Half Precision(16bit) Asimd Data Processing Instructions |
ATOMICS | Atomic Instructions to the A64 |
ASIMRDM | Support for Rounding Double Multiply Add/Subtract |
PMULL | Optional Cryptographic and CRC32 Instructions |
JSCVT | Perform Conversion to Match Javascript |
DCPOP | Persistent Memory Support |
Feature | Value | Description |
---|---|---|
kernel-config.<option> | true | Kernel config option is enabled (set ‘y' or ‘m'). Default options are NO_HZ , NO_HZ_IDLE , NO_HZ_FULL and PREEMPT |
kernel-selinux.enabled | true | Selinux is enabled on the node |
kernel-version.full | string | Full kernel version as reported by /proc/sys/kernel/osrelease (e.g. ‘4.5.6-7-g123abcde') |
kernel-version.major | string | First component of the kernel version (e.g. ‘4') |
kernel-version.minor | string | Second component of the kernel version (e.g. ‘5') |
kernel-version.revision | string | Third component of the kernel version (e.g. ‘6') |
The kernel label source is configurable, see worker configuration and sources.kernel
configuration options for details.
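A minimal sketch of customizing the kernel source, assuming the configOpts key under sources.kernel from the worker configuration reference. The options listed are examples; each one that is set to ‘y' or ‘m' in the running kernel would yield a kernel-config.<option> label.

```yaml
# nfd-worker configuration fragment (illustrative)
sources:
  kernel:
    # Kernel config options to publish as kernel-config.<option> labels.
    configOpts:
      - NO_HZ
      - PREEMPT
      - X86       # example addition, not a default
```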
Feature | Value | Description |
---|---|---|
memory-numa | true | Multiple memory nodes i.e. NUMA architecture detected |
memory-nv.present | true | NVDIMM device(s) are present |
memory-nv.dax | true | NVDIMM region(s) configured in DAX mode are present |
Feature | Value | Description |
---|---|---|
network-sriov.capable | true | Single Root Input/Output Virtualization (SR-IOV) enabled Network Interface Card(s) present |
network-sriov.configured | true | SR-IOV virtual functions have been configured |
Feature | Value | Description |
---|---|---|
pci-<device label>.present | true | PCI device is detected |
pci-<device label>.sriov.capable | true | Single Root Input/Output Virtualization (SR-IOV) enabled PCI device present |
<device label>
format is configurable and set to <class>_<vendor>
by default. For more details about configuration of the pci labels, see sources.pci
options and worker configuration instructions.
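The following fragment sketches how the device label format might be changed, assuming the deviceClassWhitelist and deviceLabelFields keys under sources.pci from the worker configuration reference. The class codes are illustrative values.

```yaml
# nfd-worker configuration fragment (illustrative)
sources:
  pci:
    # PCI class codes for which labels are created (example values).
    deviceClassWhitelist:
      - "0200"
      - "03"
    # Fields that make up <device label>; with this setting the label
    # takes the form pci-<class>_<vendor>_<device>.present
    deviceLabelFields:
      - class
      - vendor
      - device
```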
Feature | Value | Description |
---|---|---|
usb-<device label>.present | true | USB device is detected |
<device label>
format is configurable and set to <class>_<vendor>_<device>
by default. For more details about configuration of the usb labels, see sources.usb
options and worker configuration instructions.
Feature | Value | Description |
---|---|---|
storage-nonrotationaldisk | true | Non-rotational disk, like SSD, is present in the node |
Feature | Value | Description |
---|---|---|
system-os_release.ID | string | Operating system identifier |
system-os_release.VERSION_ID | string | Operating system version identifier (e.g. ‘6.7') |
system-os_release.VERSION_ID.major | string | First component of the OS version id (e.g. ‘6') |
system-os_release.VERSION_ID.minor | string | Second component of the OS version id (e.g. ‘7') |
The custom label source is designed for creating user defined labels. However, it has a few statically defined built-in labels:
Feature | Value | Description |
---|---|---|
custom-rdma.capable | true | The node has an RDMA capable Network adapter |
custom-rdma.enabled | true | The node has the needed RDMA modules loaded to run RDMA traffic |
NFD has many extension points for creating vendor and application specific labels. See the customization guide for detailed documentation.
NFD is able to create extended resources, see the NodeFeatureRule CRD and its extendedResources field for more details.
Note that NFD is not a replacement for the usage of device plugins.
An example use-case for extended resources could be based on a custom feature (created e.g. with feature files) that exposes the node SGX EPC memory section size. This value will then be turned into an extended resource of the node, allowing Pods to request that resource and the Kubernetes scheduler to schedule such Pods only to those nodes which have sufficient capacity of said resource left.
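As a hedged sketch of the extendedResources field, the rule below creates one static and one dynamic extended resource. The resource names under vendor.io are hypothetical, and the use of kernel.version.major as the dynamic source is only for illustration; it is not the SGX EPC case described above.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: my-sample-extended-resource   # hypothetical name
spec:
  rules:
    - name: "my sample extended resource rule"
      extendedResources:
        # Static quantity (hypothetical resource name).
        vendor.io/static: "123"
        # Dynamic quantity taken from a feature attribute
        # (assumes the "@<feature>.<attribute>" reference syntax).
        vendor.io/dynamic: "@kernel.version.major"
      matchFeatures:
        - feature: kernel.version
          matchExpressions:
            major: {op: Exists}
```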
Usage instructions.
Usage instructions.
Developer Preview This feature is currently in developer preview and subject to change. It is not recommended to use it in production environments.
The kubectl
plugin kubectl nfd
can be used to validate/dryrun and test NodeFeatureRule objects. It can be installed with the following command:
git clone https://github.com/kubernetes-sigs/node-feature-discovery
+ Kubectl plugin · Node Feature Discovery
Kubectl plugin
Table of contents
Developer Preview This feature is currently in developer preview and subject to change. It is not recommended to use it in production environments.
Overview
The kubectl
plugin kubectl nfd
can be used to validate/dryrun and test NodeFeatureRule objects. It can be installed with the following command:
git clone https://github.com/kubernetes-sigs/node-feature-discovery
cd node-feature-discovery
make build-kubectl-nfd
KUBECTL_PATH=/usr/local/bin/
@@ -13,4 +13,4 @@
*** Labels ***
vendor.io/my-sample-feature=true
NodeFeatureRule "examples/nodefeaturerule.yaml" is valid for NodeFeature "examples/nodefeature.yaml"
-
Node Feature Discovery v0.15
\ No newline at end of file
+
NFD-GC (NFD Garbage-Collector) is preferably run as a Kubernetes deployment with one replica. It makes sure that all NodeFeature and NodeResourceTopology objects have corresponding nodes and removes stale objects for non-existent nodes.
The daemon watches for Node deletion events and removes the corresponding NodeFeature and NodeResourceTopology objects. It also runs periodically to make sure no node deletion event was missed and to remove any NodeFeature or NodeResourceTopology objects that were created without a corresponding node. The garbage collection interval defaults to 1h when -gc-interval is not specified.
In Helm deployments (see garbage collector parameters) NFD-GC will only be deployed when enableNodeFeatureApi
or topologyUpdater.enable
is set to true.
NFD-GC (NFD Garbage-Collector) is preferably run as a Kubernetes deployment with one replica. It makes sure that all NodeFeature and NodeResourceTopology objects have corresponding nodes and removes stale objects for non-existent nodes.
The daemon watches for Node deletion events and removes the corresponding NodeFeature and NodeResourceTopology objects. It also runs periodically to make sure no node deletion event was missed and to remove any NodeFeature or NodeResourceTopology objects that were created without a corresponding node. The garbage collection interval defaults to 1h when -gc-interval is not specified.
In Helm deployments (see garbage collector parameters) NFD-GC will only be deployed when enableNodeFeatureApi
or topologyUpdater.enable
is set to true.
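A minimal sketch of overriding the garbage collection interval in a Deployment manifest. Only the -gc-interval flag comes from the text above; the container name and the 30m value are assumptions for illustration.

```yaml
# Fragment of an nfd-gc Deployment (illustrative)
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: nfd-gc              # assumed container name
          args:
            # Run the periodic cleanup every 30 minutes instead of 1h.
            - "-gc-interval=30m"
```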
NFD-Master is responsible for connecting to the Kubernetes API server and updating node objects. More specifically, it modifies node labels, taints and extended resources based on requests from nfd-workers and 3rd party extensions.
The NodeFeature Controller uses NodeFeature objects as the input for the NodeFeatureRule processing pipeline. In addition, any labels listed in the NodeFeature object are created on the node (note the allowed label namespaces are controlled).
NFD-Master acts as the controller for NodeFeatureRule objects. It applies the rules specified in NodeFeatureRule objects on raw feature data and creates node labels accordingly. The feature data used as the input is received from nfd-worker instances through NodeFeature objects.
NOTE: when gRPC (DEPRECATED) is used for communicating the features (by setting the flag
-enable-nodefeature-api=false
on both nfd-master and nfd-worker, or via Helm values.enableNodeFeatureApi=false), (re-)labelling only happens when a request is received from nfd-worker. That is, in practice rules are evaluated and labels for each node are created at intervals specified by the core.sleepInterval
configuration option of nfd-worker instances. This means that modification or creation of NodeFeatureRule objects does not instantly cause the node labels to be updated. Instead, the changes only become visible in node labels as nfd-worker instances send their labelling requests. This limitation is not present when the gRPC interface is disabled and the NodeFeature API is used.
NFD-Master supports dynamic configuration through a configuration file. The default location is /etc/kubernetes/node-feature-discovery/nfd-master.conf
, but this can be changed by specifying the -config
command line flag. The configuration file is re-read whenever it is modified, which makes run-time re-configuration of nfd-master straightforward.
Master configuration file is read inside the container, and thus, Volumes and VolumeMounts are needed to make your configuration available for NFD. The preferred method is to use a ConfigMap which provides easy deployment and re-configurability.
The provided nfd-master deployment templates create an empty configmap and mount it inside the nfd-master containers. In kustomize deployments, configuration can be edited with:
kubectl -n ${NFD_NS} edit configmap nfd-master-conf
-
In Helm deployments, Master pod parameter master.config
can be used to edit the respective configuration.
See nfd-master configuration file reference for more details. The (empty-by-default) example config contains all available configuration options and can be used as a reference for creating a configuration.
NFD-Master runs as a deployment. By default it prefers running on the cluster's master nodes but will run on worker nodes if no master nodes are found.
For High Availability, you should increase the replica count of the deployment object. You should also look into adding inter-pod affinity to prevent masters from running on the same node. However note that inter-pod affinity is costly and is not recommended in bigger clusters.
Note: When NFD-Master is intended to run with more than one replica, it is advised to use
-enable-leader-election
flag. This flag turns on leader election for NFD-Master and lets only one replica act on changes in NodeFeature and NodeFeatureRule objects.
If you have RBAC authorization enabled (as is the default e.g. with clusters initialized with kubeadm) you need to configure the appropriate ClusterRoles, ClusterRoleBindings and a ServiceAccount for NFD to create node labels. The provided template will configure these for you.
NFD-Master is responsible for connecting to the Kubernetes API server and updating node objects. More specifically, it modifies node labels, taints and extended resources based on requests from nfd-workers and 3rd party extensions.
The NodeFeature Controller uses NodeFeature objects as the input for the NodeFeatureRule processing pipeline. In addition, any labels listed in the NodeFeature object are created on the node (note the allowed label namespaces are controlled).
NFD-Master acts as the controller for NodeFeatureRule objects. It applies the rules specified in NodeFeatureRule objects on raw feature data and creates node labels accordingly. The feature data used as the input is received from nfd-worker instances through NodeFeature objects.
NOTE: when gRPC (DEPRECATED) is used for communicating the features (by setting the flag
-enable-nodefeature-api=false
on both nfd-master and nfd-worker, or via Helm values.enableNodeFeatureApi=false), (re-)labelling only happens when a request is received from nfd-worker. That is, in practice rules are evaluated and labels for each node are created at intervals specified by the core.sleepInterval
configuration option of nfd-worker instances. This means that modification or creation of NodeFeatureRule objects does not instantly cause the node labels to be updated. Instead, the changes only become visible in node labels as nfd-worker instances send their labelling requests. This limitation is not present when the gRPC interface is disabled and the NodeFeature API is used.
NFD-Master supports dynamic configuration through a configuration file. The default location is /etc/kubernetes/node-feature-discovery/nfd-master.conf
, but this can be changed by specifying the -config
command line flag. The configuration file is re-read whenever it is modified, which makes run-time re-configuration of nfd-master straightforward.
Master configuration file is read inside the container, and thus, Volumes and VolumeMounts are needed to make your configuration available for NFD. The preferred method is to use a ConfigMap which provides easy deployment and re-configurability.
The provided nfd-master deployment templates create an empty configmap and mount it inside the nfd-master containers. In kustomize deployments, configuration can be edited with:
kubectl -n ${NFD_NS} edit configmap nfd-master-conf
+
In Helm deployments, Master pod parameter master.config
can be used to edit the respective configuration.
See nfd-master configuration file reference for more details. The (empty-by-default) example config contains all available configuration options and can be used as a reference for creating a configuration.
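For orientation, a small sketch of what the contents of nfd-master.conf might look like. The option names are assumed from the nfd-master configuration file reference, and all values are examples rather than defaults.

```yaml
# nfd-master.conf (illustrative)
# Additional label namespace that nfd-master is allowed to manage.
extraLabelNs: ["vendor.io"]
# Regular expression limiting which labels get published.
labelWhiteList: "^cpu-"
# Whether NodeFeatureRule taints are applied.
enableTaints: false
```

In a Helm deployment the same keys would go under the master.config value instead of being edited in the ConfigMap directly.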
NFD-Master runs as a deployment. By default it prefers running on the cluster's master nodes but will run on worker nodes if no master nodes are found.
For High Availability, you should increase the replica count of the deployment object. You should also look into adding inter-pod affinity to prevent masters from running on the same node. However note that inter-pod affinity is costly and is not recommended in bigger clusters.
Note: When NFD-Master is intended to run with more than one replica, it is advised to use
-enable-leader-election
flag. This flag turns on leader election for NFD-Master and lets only one replica act on changes in NodeFeature and NodeFeatureRule objects.
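The high availability setup described above could be sketched as the following Deployment fragment. The replica count, container name, and the app: nfd-master pod label used in the anti-affinity term are assumptions; only the -enable-leader-election flag is taken from the text.

```yaml
# Fragment of an nfd-master Deployment (illustrative)
spec:
  replicas: 2
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Prefer spreading replicas across nodes; a "preferred" rule is
          # used here as a softer alternative to a hard requirement.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    app: nfd-master     # assumed pod label
      containers:
        - name: nfd-master              # assumed container name
          args:
            - "-enable-leader-election"
```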
If you have RBAC authorization enabled (as is the default e.g. with clusters initialized with kubeadm) you need to configure the appropriate ClusterRoles, ClusterRoleBindings and a ServiceAccount for NFD to create node labels. The provided template will configure these for you.
NFD-Topology-Updater is preferably run as a Kubernetes DaemonSet. This ensures re-examination at regular intervals and/or on pod life-cycle events, capturing changes in the allocated resources and hence the allocatable resources on a per-zone basis by updating NodeResourceTopology custom resources. It makes sure that a new NodeResourceTopology instance is created for each new node that gets added to the cluster.
Because of the design and implementation of Kubernetes, only resources exclusively allocated to Guaranteed Quality of Service pods will be accounted. This includes CPU cores, memory and devices.
When run as a daemonset, nodes are re-examined for the allocated resources (to determine the information of the allocatable resources on a per-zone basis where a zone can be a NUMA node) at an interval specified using the -sleep-interval
option. The default sleep interval is set to 60s which is the value when no -sleep-interval is specified. The re-examination can be disabled by setting the sleep-interval to 0.
Another option is to configure the updater to update the allocated resources per pod life-cycle events. The updater will monitor the checkpoint file stated in -kubelet-state-dir
and trigger an update for every change that occurs in the files.
In addition, it can avoid examining specific allocated resources given a configuration of resources to exclude via -excludeList
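A minimal sketch of how the two update mechanisms might be set on the DaemonSet container. The flag names come from the text above; the container name, the 30s interval, and the state directory path are assumptions that depend on how the kubelet directory is mounted into the pod.

```yaml
# Fragment of an nfd-topology-updater DaemonSet (illustrative)
spec:
  template:
    spec:
      containers:
        - name: nfd-topology-updater    # assumed container name
          args:
            # Periodic re-examination; 0 would disable it.
            - "-sleep-interval=30s"
            # Directory containing the kubelet checkpoint files to watch
            # (assumed mount path inside the container).
            - "-kubelet-state-dir=/host-var/lib/kubelet"
```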
Kubelet PodResource API with the GetAllocatableResources functionality enabled is a prerequisite for nfd-topology-updater to be able to run (i.e. Kubernetes v1.21 or later is required).
Prior to Kubernetes v1.23, the kubelet
must be started with --feature-gates=KubeletPodResourcesGetAllocatable=true
.
Starting from Kubernetes v1.23, the KubeletPodResourcesGetAllocatable
feature gate. is enabled by default
NFD-Topology-Updater supports configuration through a configuration file. The default location is /etc/kubernetes/node-feature-discovery/topology-updater.conf
, but this can be changed by specifying the -config
command line flag.
NOTE: unlike nfd-worker, dynamic configuration updates are not supported.
The Topology-Updater configuration file is read inside the container, so Volumes and VolumeMounts are needed to make your configuration available for NFD. The preferred method is to use a ConfigMap, which provides easy deployment and re-configurability.
The provided nfd-topology-updater deployment templates create an empty configmap and mount it inside the nfd-topology-updater containers. In kustomize deployments, configuration can be edited with:
kubectl -n ${NFD_NS} edit configmap nfd-topology-updater-conf
-
In Helm deployments, the Topology Updater parameter topologyUpdater.config
can be used to edit the respective configuration.
See nfd-topology-updater configuration file reference for more details. The (empty-by-default) example config contains all available configuration options and can be used as a reference for creating a configuration.
NFD-Topology-Updater is preferably run as a Kubernetes DaemonSet. This ensures re-examination at regular intervals and/or on pod life-cycle events, capturing changes in the allocated resources and hence the allocatable resources on a per-zone basis by updating NodeResourceTopology custom resources. It also makes sure that a new NodeResourceTopology instance is created for each new node that gets added to the cluster.
Because of the design and implementation of Kubernetes, only resources exclusively allocated to Guaranteed Quality of Service pods are accounted for. This includes CPU cores, memory and devices.
When run as a daemonset, nodes are re-examined for the allocated resources (to determine the information of the allocatable resources on a per-zone basis where a zone can be a NUMA node) at an interval specified using the -sleep-interval
option. The default sleep interval is 60s, which applies when -sleep-interval is not specified. Re-examination can be disabled by setting the sleep interval to 0.
Another option is to configure the updater to update the allocated resources on pod life-cycle events. The updater monitors the checkpoint files under the directory specified with -kubelet-state-dir
and triggers an update for every change that occurs in the files.
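As a rough sketch, the interval could be set through the container arguments of the DaemonSet. The container name and image tag below are assumptions for illustration and may differ from the shipped manifests; only the -sleep-interval flag is taken from the text above.

```yaml
# Fragment of a DaemonSet pod template (illustrative, not the shipped manifest)
containers:
  - name: nfd-topology-updater          # assumed container name
    image: registry.k8s.io/nfd/node-feature-discovery:v0.15.6  # assumed tag
    command: ["nfd-topology-updater"]
    args:
      # Re-examine allocated resources every 30s instead of the default 60s.
      # A value of 0 disables periodic re-examination.
      - "-sleep-interval=30s"
```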
In addition, it can avoid examining specific allocated resources given a configuration of resources to exclude via -excludeList
Kubelet PodResource API with the GetAllocatableResources functionality enabled is a prerequisite for nfd-topology-updater to be able to run (i.e. Kubernetes v1.21 or later is required).
Prior to Kubernetes v1.23, the kubelet
must be started with --feature-gates=KubeletPodResourcesGetAllocatable=true.
Starting from Kubernetes v1.23, the KubeletPodResourcesGetAllocatable
feature gate is enabled by default.
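On clusters older than v1.23 where the kubelet is configured through a configuration file rather than command line flags, the same gate can typically be set in the featureGates field of a KubeletConfiguration. The fragment below is a minimal sketch; how the file reaches the kubelet depends on the cluster tooling and is not covered here.

```yaml
# Kubelet configuration file fragment (sketch, for Kubernetes older than v1.23)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Equivalent to --feature-gates=KubeletPodResourcesGetAllocatable=true
  KubeletPodResourcesGetAllocatable: true
```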
NFD-Topology-Updater supports configuration through a configuration file. The default location is /etc/kubernetes/node-feature-discovery/topology-updater.conf
, but this can be changed by specifying the -config
command line flag.
NOTE: unlike nfd-worker, dynamic configuration updates are not supported.
The Topology-Updater configuration file is read inside the container, so Volumes and VolumeMounts are needed to make your configuration available for NFD. The preferred method is to use a ConfigMap, which provides easy deployment and re-configurability.
The provided nfd-topology-updater deployment templates create an empty configmap and mount it inside the nfd-topology-updater containers. In kustomize deployments, configuration can be edited with:
kubectl -n ${NFD_NS} edit configmap nfd-topology-updater-conf
+
In Helm deployments, the Topology Updater parameter topologyUpdater.config
can be used to edit the respective configuration.
See nfd-topology-updater configuration file reference for more details. The (empty-by-default) example config contains all available configuration options and can be used as a reference for creating a configuration.
NFD-Worker is preferably run as a Kubernetes DaemonSet. This ensures re-labeling at regular intervals, capturing changes in the system configuration, and makes sure that new nodes are labeled as they are added to the cluster. The worker connects to the nfd-master service to advertise hardware features.
When run as a daemonset, nodes are re-labeled at a default interval of 60s. This can be changed by using the core.sleepInterval
config option.
The worker configuration file is watched and re-read on every change which provides a mechanism of dynamic run-time reconfiguration. See worker configuration for more details.
NFD-Worker supports dynamic configuration through a configuration file. The default location is /etc/kubernetes/node-feature-discovery/nfd-worker.conf
, but this can be changed by specifying the -config
command line flag. The configuration file is re-read whenever it is modified, which makes run-time re-configuration of nfd-worker straightforward.
The worker configuration file is read inside the container, so Volumes and VolumeMounts are needed to make your configuration available for NFD. The preferred method is to use a ConfigMap, which provides easy deployment and re-configurability.
The provided nfd-worker deployment templates create an empty configmap and mount it inside the nfd-worker containers. In kustomize deployments, configuration can be edited with:
kubectl -n ${NFD_NS} edit configmap nfd-worker-conf
-
In Helm deployments, the Worker pod parameter worker.config
can be used to edit the respective configuration.
See nfd-worker configuration file reference for more details. The (empty-by-default) example config contains all available configuration options and can be used as a reference for creating a configuration.
Configuration options can also be specified via the -options
command line flag, in which case no mounts need to be used. The same format as in the config file must be used, i.e. JSON (or YAML). For example:
-options='{"sources": { "pci": { "deviceClassWhitelist": ["12"] } } }'
-
Configuration options specified from the command line will override those read from the config file.
NFD-Worker is preferably run as a Kubernetes DaemonSet. This ensures re-labeling at regular intervals, capturing changes in the system configuration, and makes sure that new nodes are labeled as they are added to the cluster. The worker connects to the nfd-master service to advertise hardware features.
When run as a daemonset, nodes are re-labeled at a default interval of 60s. This can be changed by using the core.sleepInterval
config option.
The worker configuration file is watched and re-read on every change which provides a mechanism of dynamic run-time reconfiguration. See worker configuration for more details.
NFD-Worker supports dynamic configuration through a configuration file. The default location is /etc/kubernetes/node-feature-discovery/nfd-worker.conf
, but this can be changed by specifying the -config
command line flag. The configuration file is re-read whenever it is modified, which makes run-time re-configuration of nfd-worker straightforward.
The worker configuration file is read inside the container, so Volumes and VolumeMounts are needed to make your configuration available for NFD. The preferred method is to use a ConfigMap, which provides easy deployment and re-configurability.
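A minimal sketch of such a mount is shown here. It assumes the ConfigMap is named nfd-worker-conf (as in the kubectl command in this section) and that it holds a key named nfd-worker.conf; the container and volume names are illustrative and the shipped templates may differ.

```yaml
# Fragment of the nfd-worker pod spec (illustrative sketch)
containers:
  - name: nfd-worker                    # assumed container name
    volumeMounts:
      - name: nfd-worker-conf
        # Directory of the default config file location
        mountPath: /etc/kubernetes/node-feature-discovery
        readOnly: true
volumes:
  - name: nfd-worker-conf               # assumed volume name
    configMap:
      name: nfd-worker-conf             # assumed to contain the key nfd-worker.conf
```

The whole directory is mounted rather than a single file via subPath, because ConfigMap updates are not propagated to subPath mounts, and the worker relies on seeing file changes to re-read its configuration.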
The provided nfd-worker deployment templates create an empty configmap and mount it inside the nfd-worker containers. In kustomize deployments, configuration can be edited with:
kubectl -n ${NFD_NS} edit configmap nfd-worker-conf
+
In Helm deployments, the Worker pod parameter worker.config
can be used to edit the respective configuration.
See nfd-worker configuration file reference for more details. The (empty-by-default) example config contains all available configuration options and can be used as a reference for creating a configuration.
Configuration options can also be specified via the -options
command line flag, in which case no mounts need to be used. The same format as in the config file must be used, i.e. JSON (or YAML). For example:
-options='{"sources": { "pci": { "deviceClassWhitelist": ["12"] } } }'
+
Configuration options specified from the command line will override those read from the config file.
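For comparison, the same PCI setting could be written in the configuration file instead of on the command line. The fragment below is a sketch that combines it with the core.sleepInterval option mentioned earlier; the values are examples only.

```yaml
# nfd-worker.conf (sketch): config file equivalent of the -options example
core:
  sleepInterval: 60s          # re-labeling interval; 60s is the stated default
sources:
  pci:
    deviceClassWhitelist: ["12"]
```

If both this file and -options set the same key, the command line value takes precedence, as noted above.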
Nodes with specific features can be targeted using the nodeSelector
field. The following example shows how to target nodes with Intel TurboBoost enabled.
apiVersion: v1
+ Using node labels · Node Feature Discovery
Using node labels
Nodes with specific features can be targeted using the nodeSelector
field. The following example shows how to target nodes with Intel TurboBoost enabled.
apiVersion: v1
kind: Pod
metadata:
labels:
@@ -10,4 +10,4 @@
name: go1
nodeSelector:
feature.node.kubernetes.io/cpu-pstate.turbo: 'true'
-
For more details on targeting nodes, see node selection.
Node Feature Discovery v0.15
\ No newline at end of file
+
For more details on targeting nodes, see node selection.