From a024b026ad20024113df3da0ede097191bb815e4 Mon Sep 17 00:00:00 2001
From: Arne Wolf
Date: Tue, 29 Aug 2023 22:08:07 +0200
Subject: [PATCH 1/8] fd

---
 DC-SBP-SLES4SAP-HANAonKVM-SLES15SP4       |   19 +
 adoc/SLES4SAP-HANAonKVM-15SP4-docinfo.xml |   82 +
 adoc/SLES4SAP-HANAonKVM-15SP4.adoc        | 1654 +++++++++++++++++++++
 3 files changed, 1755 insertions(+)
 create mode 100644 DC-SBP-SLES4SAP-HANAonKVM-SLES15SP4
 create mode 100644 adoc/SLES4SAP-HANAonKVM-15SP4-docinfo.xml
 create mode 100644 adoc/SLES4SAP-HANAonKVM-15SP4.adoc

diff --git a/DC-SBP-SLES4SAP-HANAonKVM-SLES15SP4 b/DC-SBP-SLES4SAP-HANAonKVM-SLES15SP4
new file mode 100644
index 000000000..98ecbec7d
--- /dev/null
+++ b/DC-SBP-SLES4SAP-HANAonKVM-SLES15SP4
@@ -0,0 +1,19 @@
MAIN="SLES4SAP-HANAonKVM-15SP4.adoc"

ADOC_TYPE="article"

ADOC_POST="yes"

ADOC_ATTRIBUTES="--attribute docdate=2023-08-28"

# stylesheets
STYLEROOT=/usr/share/xml/docbook/stylesheet/sbp
FALLBACK_STYLEROOT=/usr/share/xml/docbook/stylesheet/suse2022-ns

XSLTPARAM="--stringparam publishing.series=sbp"

#DRAFT=yes
ROLE="sbp"
#PROFROLE="sbp"

DOCBOOK5_RNG_URI="http://docbook.org/xml/5.2/rng/docbookxi.rnc"

diff --git a/adoc/SLES4SAP-HANAonKVM-15SP4-docinfo.xml b/adoc/SLES4SAP-HANAonKVM-15SP4-docinfo.xml
new file mode 100644
index 000000000..260c849eb
--- /dev/null
+++ b/adoc/SLES4SAP-HANAonKVM-15SP4-docinfo.xml
@@ -0,0 +1,82 @@
<dm:docmanager xmlns:dm="urn:x-suse:ns:docmanager">
  <dm:bugtracker>
    <dm:url>https://github.com/SUSE/suse-best-practices/issues/new</dm:url>
    <dm:component>SUSE Best Practices for SAP HANA on KVM</dm:component>
  </dm:bugtracker>
</dm:docmanager>

<productname>SUSE Linux Enterprise Server for SAP Applications</productname>
<productnumber>15 SP4</productnumber>

<meta name="series" content="SUSE Best Practices"/>
<meta name="category" content="SAP"/>
<meta name="platform" content="SUSE Linux Enterprise Server for SAP Applications 15 SP4"/>

<authorgroup>
  <author>
    <personname>
      <firstname>Arne</firstname>
      <surname>Wolf</surname>
    </personname>
    <affiliation>
      <jobtitle>SAP Solution Architect</jobtitle>
      <orgname>SUSE</orgname>
    </affiliation>
  </author>
  <author>
    <personname>
      <firstname>Dario</firstname>
      <surname>Faggioli</surname>
    </personname>
    <affiliation>
      <jobtitle>Software Engineer Virtualization Specialist</jobtitle>
      <orgname>SUSE</orgname>
    </affiliation>
  </author>
</authorgroup>

<abstract>
  <para>
    SUSE® Linux Enterprise Server for SAP Applications is
    optimized in various ways for SAP* applications.
    This best practice document describes how SUSE Linux Enterprise Server for SAP Applications 15 SP4
    with KVM should be configured to run SAP HANA for use in production environments.
    The setup of the SAP HANA system and of other components like HA clusters is beyond the scope of this document.
  </para>
  <para>
    Disclaimer:
    Documents published as part of the SUSE Best Practices series have been contributed voluntarily
    by SUSE employees and third parties. They are meant to serve as examples of how particular
    actions can be performed. They have been compiled with utmost attention to detail.
    However, this does not guarantee complete accuracy. SUSE cannot verify that actions described
    in these documents do what is claimed or whether actions described have unintended consequences.
    SUSE LLC, its affiliates, the authors, and the translators may not be held liable for possible errors
    or the consequences thereof.
  </para>
</abstract>

diff --git a/adoc/SLES4SAP-HANAonKVM-15SP4.adoc b/adoc/SLES4SAP-HANAonKVM-15SP4.adoc
new file mode 100644
index 000000000..7481d0dc9
--- /dev/null
+++ b/adoc/SLES4SAP-HANAonKVM-15SP4.adoc
@@ -0,0 +1,1654 @@
:docinfo:

:localdate:

// Document Variables
:DocumentName: SUSE Best Practices for SAP HANA on KVM
:slesProdVersion: 15 SP2
:suse: SUSE
:SUSEReg: SUSE(R)
:sleAbbr: SLE
:sle: SUSE Linux Enterprise
:sleReg: {SUSEReg} Linux Enterprise
:slesAbbr: SLES
:sles: {sle} Server
:slesReg: {sleReg} Server
:sles4sapAbbr: {slesAbbr} for SAP
:sles4sap: {sles} for SAP Applications
:sles4sapReg: {slesReg} for SAP Applications
:haswell: Intel Xeon Processor E7 v3 (Haswell)
:skylake: 1st Generation Intel Xeon Scalable Processor (Skylake)
:cascadelake: 2nd Generation Intel Xeon Scalable Processor (Cascade Lake)
:launchPadNotes: https://launchpad.support.sap.com/#/notes/


//TODO: Add a support checklist, e.g. for support folks (a shortened version of the guide to help support know what to check)
//TODO: add picture to describe CPU core mappings phys/virt
//TODO: add picture to explain VM Scenarios

= {DocumentName}

{sles4sap} {slesProdVersion}

[[_sec_introduction]]
== Introduction

This best practice document describes how {sles4sap} {slesProdVersion} with KVM should be configured to run SAP HANA for use in production environments.
The setup of the SAP HANA system and of other components like HA clusters is beyond the scope of this document.

The following sections describe how to set up and configure the three KVM components required to run SAP HANA on KVM:

* *<<_sec_hypervisor>>* - The host operating system running the hypervisor directly on the server hardware
* *<<_sec_guest_vm_xml_configuration>>* - The libvirt domain XML description of the guest VM
* *<<_sec_guest_operating_system>>* - The operating system inside the VM where SAP HANA is running

Follow *<<_sec_supported_scenarios_prerequisites>>* and the respective SAP Notes to ensure a supported configuration.
Most of the configuration options are specific to the libvirt package and therefore require modifying the VM guest's domain XML file.

[[_sec_definitions]]
=== Definitions

Virtual Machine:: An emulation of a computer.
Hypervisor:: The software running directly on the physical server to create and run VMs (Virtual Machines).
Guest OS:: The operating system running inside the VM (Virtual Machine).
This is the OS running SAP HANA and therefore the one that should be checked for SAP HANA support as per {launchPadNotes}2235581[SAP Note 2235581 "SAP HANA: Supported Operating Systems"] and the https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/appliances.html["`SAP HANA Hardware Directory`"].
Paravirtualization:: Allows direct communication between the hypervisor and the VM guest, resulting in a lower overhead and better performance.
libvirt:: A management interface for KVM.
qemu:: The virtual machine emulator, also seen as a process on the hypervisor running the VM.
SI units:: Some commands and configurations use the decimal prefix (for example GB), while others use the binary prefix (for example GiB). In this document we use the binary prefix where possible.

For a general overview of the technical components of the KVM architecture, refer to section https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-kvm-intro.html["Introduction to KVM Virtualization"] of the Virtualization Guide.
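The `libvirt` and `qemu` definitions above can be observed directly on a hypervisor with a running VM: libvirt lists the VM as a domain, and the QEMU emulator shows up as a regular host process. The snippet below is only a sketch, using a hypothetical domain name:

----
virsh list
 Id   Name      State
------------------------
 1    hanavm1   running

ps -C qemu-system-x86_64 -o pid,args
----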
[[_sec_sap_hana_virtualization_scenarios]]
=== SAP HANA virtualization scenarios

SAP supports virtualization technologies for SAP HANA usage on a per scenario basis:

Single-VM:: One VM per hypervisor/physical server for SAP HANA Scale-Up. No other VM or workload is allowed on the same server.
Multi-VM:: Multiple VMs per hypervisor/physical server for SAP HANA Scale-Up.
Scale-Out:: For an SAP HANA Scale-Out deployment, distributed over multiple VMs on multiple hosts.


[[_sec_supported_scenarios_prerequisites]]
== Supported scenarios and prerequisites

Follow the *{DocumentName} - {sles4sap} {slesProdVersion}* document at hand, which describes the steps necessary to create a supported SAP HANA on KVM configuration.
{sles4sap} must be used for both hypervisor and guest.

Inquiries about scenarios not listed here should be directed to mailto:saphana@suse.com[saphana@suse.com].

[[_sec_supported_scenarios]]
=== Supported scenarios

At the time of this publication, the following configurations are supported for production use:

[[_supported_combinations]]
.Supported Combinations
[cols="1,1,1,1", options="header"]
|===
| CPU Architecture
| SAP HANA scale-up (single VM)
| SAP HANA scale-up (multi VM)
| SAP HANA Scale-out

// |
// {haswell}
// |
// _Hypervisor:_ {sles4sapAbbr} 12 SP2
//
// _Guest:_ {sles4sapAbbr} 12 SP1 onwards
//
// _Size:_ max. 4 sockets footnote:max4sockets[Maximum 4 sockets using Intel standard chipsets on a single system board, for example Lenovo* x3850, Fujitsu* rx4770 etc.], 2 TiB RAM
// |
// no
// |
// no
|
{skylake}
|
_Hypervisor:_ {sles4sapAbbr} 15 SP2

_Guest:_ {sles4sapAbbr} 15 SP2 onwards

_Size:_ max. 4 sockets footnote:max4sockets[Maximum 4 sockets using Intel standard chipsets on a single system board, for example Lenovo* x3850, Fujitsu* rx4770 etc.], 3 TiB RAM
|
no
|
no
|===


Check the following SAP Notes for the latest details of supported SAP HANA on KVM scenarios:

* {launchPadNotes}2284516[SAP Note 2284516 - "SAP HANA virtualized on SUSE Linux Enterprise Hypervisors"]
* {launchPadNotes}3120786[SAP Note 3120786 - "SAP HANA on SUSE KVM Virtualization with SLES 15 SP2"]

[[_sec_sizing]]
=== Sizing

When sizing a virtualized SAP HANA system, some additional factors need to be taken into account.

[[_sec_resources_hypervisor]]
==== Resources for the hypervisor

It is recommended to reserve a minimum of about 8% of the host's main memory for the hypervisor.

The hypervisor will also consume CPU capacity, approximately 5% to 10% of the SAPS capacity, depending on the workload characteristics:

* 5% of the SAPS capacity for mainly analytical workloads
* 10% of the SAPS capacity for mainly transactional workloads

It is however *not* required to dedicate CPUs to the hypervisor.

[[_sec_memory_sizing]]
==== Memory sizing

Since SAP HANA runs inside the VM, it is the RAM size of the VM which needs to satisfy the memory requirements from the SAP HANA memory sizing.

The memory used by the VM must be smaller than the physical memory of the machine.
It is recommended to reserve at least 8% of the total memory reported by "`/proc/meminfo`" (in the "`MemTotal`" field) for the hypervisor.
This leaves approximately 92% for the VM.

See <<_sec_memory_backing>> for more details.

[[_sec_cpu_sizing]]
==== CPU sizing

//TODO: Check CPU Overhead
Some artificial workload tests on {skylake} CPUs have shown an overhead of up to approximately 20% when running SAP HANA on KVM.
Therefore a thorough test of the configuration with the required workload is highly recommended before "`go live`".

There are two main ways to deal with CPU sizing:

1. Follow the fixed memory-to-core ratios for SAP HANA as defined by SAP
2. Follow the SAP HANA TDI "`Phase 5`" rules as defined by SAP

Both ways are described in the following sections.

===== Following the fixed memory-to-core ratios for SAP HANA

The certification of the SAP HANA Appliance hardware to be used for KVM prescribes a fixed maximum amount of memory (RAM) which is allowed for each CPU core, also known as the *memory-to-core ratio*. The specific ratio also depends on what workload the system will be used for, that is the Appliance Type: OLTP (Scale-up: SoH/S4H) or OLAP (Scale-up: BWoH/BW4H/DM/SoH/S4H).

The relevant memory-to-core ratio required to size a VM can be calculated as follows:

* Go to the https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/appliances.html["SAP HANA Certified Hardware Directory"].
* Select the required SAP HANA Appliance and Appliance Type (for example CPU Architecture "Intel Skylake SP" for Appliance Type "Scale-up: BWoH").
* Look for the largest certified RAM size for the number of CPU sockets on the server (for example 3 TiB/3072 GiB on 4 sockets).
* Look up the number of cores per CPU of this CPU architecture used in SAP HANA Appliances. The CPU model numbers are listed at https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/index.html#details (for example 28).
* Using the above values, calculate the total number of cores on the certified Appliance by multiplying the number of sockets by the number of cores (for example 4x28=112).
* Now divide the Appliance RAM by the total number of cores (not hyperthreads) to get the *memory-to-core* ratio (for example 3072 GiB/112 = approx. 27.43 GiB per core).

<<_sap_hana_core_to_memory_ratio_examples>> below has some current examples of SAP HANA memory-to-core ratios.

[[_sap_hana_core_to_memory_ratio_examples]]
.SAP HANA memory-to-core ratio examples
[cols="1,1,1,1,1,1", options="header"]
|===
| CPU Architecture
| Appliance Type
| Max Memory Size
| Sockets
| Cores per Socket
| SAP HANA memory-to-core ratio

| {skylake} | OLTP | 6 TiB / 6144 GiB | 4 | 28 | 54.86 GiB/core
| {skylake} | OLAP | 3 TiB / 3072 GiB | 4 | 28 | 27.43 GiB/core
|===


// TODO: Remove or change the following

From your memory requirement, calculate the RAM size the VM needs to be compliant with the appropriate memory-to-core ratio defined by SAP:

* To get the memory per socket, multiply the memory-to-core ratio by the number of cores (not threads) of a single socket in your host.
* Divide the memory requirement by the memory per socket, round the result up to the next full number, and multiply that number by the memory per socket again.


.Calculation example
====
* From an S/4HANA sizing you get a memory requirement for SAP HANA of 2000 GiB.
* Your CPUs have 28 cores per socket. The memory per socket is `28 cores * 54.86 GiB/core = 1536 GiB`.
* Divide your memory requirement `2000 GiB / 1536 GiB = 1.302` and round this result up to 2. Then multiply `2 * 1536 GiB = 3072 GiB`.
* 3072 GiB is now the memory size to use in the VM configuration as described in <<_sec_memory_backing>>.
====


===== Following the SAP HANA TDI "Phase 5" rules
** SAP HANA TDI "`Phase 5`" rules allow customers to deviate from the above described SAP HANA memory-to-core sizing ratios in certain scenarios.
The KVM implementation however must still adhere to the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* document at hand.
Details on SAP HANA TDI Phase 5 can be found in the blog post https://blogs.sap.com/2017/09/20/tdi-phase-5-new-opportunities-for-cost-optimization-of-sap-hana-hardware/["TDI Phase 5: New Opportunities for Cost Optimization of SAP HANA Hardware"] from SAP.
** Since SAP HANA TDI Phase 5 rules use SAPS-based sizing, SUSE recommends applying the same overhead as measured with SAP HANA on KVM for the respective KVM version/CPU architecture. SAPS values for servers can be requested from the respective hardware vendor.


The following SAP HANA sizing documentation should also be useful:

// Not Found: * SAP Best Practice "`Sizing Approaches for SAP HANA`": https://websmp203.sap-ag.de/~sapidb/011000358700000050632013E
* https://help.sap.com/viewer/eb3777d5495d46c5b2fa773206bbfb46/2.0.03/en-US/d4a122a7bb57101493e3f5ca08e6b039.html["SAP HANA Master Guide: Sizing SAP HANA"]
* http://sap.com/sizing["General SAP Sizing information"]


[[_sec_kvm_hypervisor_version]]
=== Configuring the KVM hypervisor version

The hypervisor must be configured according to the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* guide at hand and fulfill the following minimal requirements:

* {sles4sap} {slesProdVersion} ("Unlimited Virtual Machines" subscription)
** kernel (only major version 5.3, minimum package version 5.3.18-24.24.1)
** libvirt (only major version 6.0, minimum package version 6.0.0-13.3.1)
** qemu (only major version 4.2, minimum package version 4.2.1-11.10.1)


[[_sec_hypervisor_hardware]]
=== Hypervisor hardware

Use SAP HANA certified servers and storage as per the SAP HANA Hardware Directory at https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/.

[[_sec_guest_vm]]
=== Guest VM

The guest VM must:

* run {sles4sap} 15 SP2 or later.
* be a {sles} supported VM guest as per Section 7.1 "Supported VM Guests" of the https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-virt-support.html#virt-support-guests[SUSE Virtualization Guide].
* comply with the KVM limits as per the https://www.suse.com/releasenotes/x86_64/SUSE-SLES/15-SP2/#allArch-virtualization-kvm-limits["SUSE Linux Enterprise Server 15 SP2 release notes"].
* fulfill the SAP HANA Hardware and Cloud Measurement Tools (HCMT) storage KPIs as per {launchPadNotes}2493172[SAP Note 2493172 "SAP HANA Hardware and Cloud Measurement Tools"].
  Refer to <<_sec_storage>> for storage configuration details.
* be configured according to the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* document at hand.


[[_sec_hypervisor]]
== Setting up and configuring the hypervisor

The following sections describe how to set up and configure the hypervisor for a virtualized SAP HANA scenario.

[[_sec_kvm_hypervisor_installation]]
=== Installing the KVM hypervisor

For details, refer to section 6.4 "Installation of Virtualization Components" of the SUSE Virtualization Guide (https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-vt-installation.html#sec-vt-installation-patterns).

Install the KVM packages using the following Zypper patterns:

----
zypper in -t pattern kvm_server kvm_tools
----

In addition, it is also useful to install the `lstopo` tool, which is part of the `hwloc` package contained in the *HPC Module* for SUSE Linux Enterprise Server.
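Assuming the HPC Module is enabled for the system, the package can be installed and a first text-mode topology overview printed as follows (`lstopo-no-graphics` is the variant used later in this document):

----
zypper in hwloc

lstopo-no-graphics
----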
[[_sec_configure_networking_on_hypervisor]]
=== Configuring networking on the hypervisor

To achieve the maximum performance required for productive SAP HANA workloads, one of the host networking devices must be assigned directly to the KVM guest VM.
A Network Interface Card (NIC) with support for Single Root I/O Virtualization (SR-IOV) is required.
This avoids the overhead that would be incurred with emulated or paravirtualized I/O.

To check whether this technology is available, assuming that `17:00.0` is the address of the NIC on the PCI bus (as visible in the output of the `lspci` tool), the following command can be issued:

----
lspci -vs 17:00.0
17:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
        Subsystem: Intel Corporation Ethernet Converged Network Adapter X710-2
        Flags: bus master, fast devsel, latency 0, IRQ 247, NUMA node 0
        Memory at 9c000000 (64-bit, prefetchable) [size=8M]
        Memory at 9d008000 (64-bit, prefetchable) [size=32K]
        Expansion ROM at 9d680000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number d8-ef-c3-ff-ff-fe-fd-3c
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1a0] Transaction Processing Hints
        Capabilities: [1b0] Access Control Services
        Capabilities: [1d0] #19
        Kernel driver in use: i40e
        Kernel modules: i40e
----

The output should contain a line similar to the following: `Single Root I/O Virtualization (SR-IOV)`.
If such a line is not present, SR-IOV may need to be explicitly enabled in the BIOS.

[[_sec_assign_network_port_at_pci_nic_level]]
==== Preparing a Virtual Function (VF) for a guest VM

After checking that the NIC is SR-IOV capable, the host and the guest VM should be configured to use one of the available Virtual Functions (VFs) as (one of) the guest VM's network device(s).
More information about SR-IOV as a technology and how to properly configure everything that is necessary for it to work well in the general case can be found in the SUSE Virtualization Guide for SUSE Linux Enterprise Server 15 SP2 (https://documentation.suse.com/sles/15-SP2/single-html/SLES-virtualization),
specifically in section "Adding SR-IOV Devices" (https://documentation.suse.com/sles/15-SP2/single-html/SLES-virtualization/#sec-libvirt-config-io).


*Enabling PCI passthrough for the host kernel*

Make sure that the host kernel boot command line contains these two parameters: `intel_iommu=on iommu=pt`.
This is done by editing [path]_/etc/default/grub_:

* Append `intel_iommu=on iommu=pt` to the string that is assigned to the variable `GRUB_CMDLINE_LINUX_DEFAULT`.
* Then run `update-bootloader` (more detailed information is provided later in this document).

*Loading and configuring SR-IOV host drivers*

Before starting the VM, SR-IOV must be enabled on the desired NIC, and the VFs must be created.

Always make sure that the proper SR-IOV-capable driver is loaded. For example, for an *Intel Corporation Ethernet Controller X710* NIC, the driver resides in the `i40e` kernel module.
It can be loaded with the `modprobe` command, but chances are high that it is already loaded by default.

If the SR-IOV-capable module is not in use by default and it also fails to load with `modprobe`, this might mean that another driver, potentially one that is not SR-IOV-capable, is currently loaded.
In that case, the other driver should be removed with the `rmmod` command.

Once the proper module is loaded, the VFs can be created with the following command (this example creates four of them):

----
echo 4 > /sys/bus/pci/devices/0000\:17\:00.0/sriov_numvfs
----

Or, assuming that the designated NIC corresponds to the symbolic name `eth10`, use the following command:

----
echo 4 > /sys/class/net/eth10/device/sriov_numvfs
----

The procedure can be automated to run at boot time: Create the following `systemd` unit file [path]_/etc/systemd/system/after.local_:

----
[Unit]
Description=/etc/init.d/after.local Compatibility
After=libvirtd.service
Requires=libvirtd.service

[Service]
Type=oneshot
ExecStart=/etc/init.d/after.local
RemainAfterExit=true

[Install]
WantedBy=multi-user.target
----

After that, create the script [path]_/etc/init.d/after.local_:

----
#! /bin/sh
#
# Copyright (c) 2010 SuSE LINUX Products GmbH, Germany. All rights reserved.
# ...
echo 4 > /sys/class/net/eth10/device/sriov_numvfs
----

[[_sec_storage_hypervisor]]
=== Configuring storage on the hypervisor

As with compute resources, the storage used for running SAP HANA must also be SAP certified.
Therefore only the storage from SAP HANA Appliances or SAP HANA Certified Enterprise Storage (https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/#/solutions?filters=v:deCertified;storage) is supported.
In all cases the SAP HANA storage configuration recommendations from the respective hardware vendor and the SAP HANA Storage Requirements for TDI (https://archive.sap.com/kmuuid2/70c8e423-c8aa-3210-3fae-e043f5c1ca92/SAP%20HANA%20TDI%20-%20Storage%20Requirements.pdf) should be followed.

There are two supported storage options for the SAP HANA database: Fibre Channel (FC) storage and Network Attached Storage (NAS).

==== Network attached storage

The SAP HANA storage is attached via the NFSv4 protocol.
In this case, nothing needs to be configured on the hypervisor.
Do make sure though that the VM has access to one or more dedicated 10 Gbit Ethernet interfaces for the network traffic to the network-attached storage.

==== Fibre Channel storage

As described in <<_sec_configure_networking_on_hypervisor>>, to reach an adequate level of performance, the storage drives for the actual SAP HANA data are attached to the guest VM by directly assigning the SAN HBA controller to it.
One difference, though, is that there is no commonly available counterpart of SR-IOV for storage controllers.
Therefore, a full SAN HBA controller must be dedicated and directly assigned to the guest VM.

To figure out which SAN HBA should be used, check the available ones, for example with the `lspci` command:

----
lspci | grep -i "Fibre Channel"
85:00.0 Fibre Channel: QLogic Corp. ISP2722-based 16/32Gb Fibre Channel to PCIe Adapter (rev 01)
85:00.1 Fibre Channel: QLogic Corp. ISP2722-based 16/32Gb Fibre Channel to PCIe Adapter (rev 01)
ad:00.0 Fibre Channel: QLogic Corp. ISP2722-based 16/32Gb Fibre Channel to PCIe Adapter (rev 01)
ad:00.1 Fibre Channel: QLogic Corp. ISP2722-based 16/32Gb Fibre Channel to PCIe Adapter (rev 01)
----

The HBAs that are assigned to the guest VM must not be in use on the host.

The remaining storage configuration details, such as how to add the disks and the HBA controllers to the guest VM configuration file, and what to do with them from inside the guest VM itself, are available in <<_sec_storage>>.

[[_sec_hypervisor_operating_system_configuration]]
=== Configuring the hypervisor operating system

The hypervisor host operating system needs to be configured to assure compatibility and maximized performance for an SAP HANA VM.


[[_sec_vhostmd]]
==== Installing `vhostmd`

The hypervisor needs to have the `vhostmd` package installed and the corresponding `vhostmd` service enabled and started.
This is described in {launchPadNotes}1522993[SAP Note 1522993 - "Linux: SAP on SUSE KVM - Kernel-based Virtual Machine"].


[[_sec_tuned]]
==== Tuning the generic host with `tuned`

To apply some less specific, but nevertheless effective, tuning to the host, the *TuneD* tool (https://tuned-project.org/) can be used.

When it is installed (the package name is `tuned`), one of the preconfigured profiles can be selected, or a custom one created.
Specifically, the `virtual-host` profile should be chosen.
Do not use the `sap-hana` profile on the hypervisor.
This can be achieved with the following commands:

----
zypper in tuned

systemctl enable tuned

systemctl start tuned

tuned-adm profile virtual-host
----

The `tuned` daemon should now start automatically at boot time, and it should always load the `virtual-host` profile, so there is no need to add any of the above commands to any custom start-up script.
If in doubt, it is possible to check with the following command whether `tuned` is running and what the current profile is:

----
tuned-adm profile

Available profiles:
- balanced - General non-specialized tuned profile
...
- virtual-guest - Optimize for running inside a virtual guest
- virtual-host - Optimize for running KVM guests
Current active profile: virtual-host
----

[[_sec_verify_tuned_has_set_cpu_frequency_governor_and_performance_bias]]
===== Power management considerations

The CPU frequency governor should be set to *performance* to avoid latency issues caused by ramping the CPU frequency up and down in response to changes in the system's load.
The selected `tuned` profile should have done this already, and with the following command it is possible to verify that it actually did:

----
cpupower -c all frequency-info
----

The governor setting can be verified by looking at the *current policy*.

Additionally, the performance bias setting should be set to 0 (performance). The performance bias setting can be verified with the following command:

----
cpupower -c all info
----

Modern processors also attempt to save power when they are idle, by switching to a lower power state.
Unfortunately this incurs latency when switching in and out of these states.

To avoid that, and to achieve better and more consistent performance, the CPUs should not be allowed to go into too aggressive power saving modes (known as C-states).
It is therefore recommended that only C0 and C1 are used.

This can be enforced by adding the following parameter to the kernel boot command line: `intel_idle.max_cstate=1`.
To double check that only the desired C-states are actually available, the following command can be used:

----
cpupower idle-info
----

The idle state settings can be verified by looking at the line containing `Available idle states:`.


[[_sec_irqbalance]]
==== `irqbalance`

The `irqbalance` service should be disabled because it can cause latency issues when the _/proc/irq/*_ files are read.
To disable `irqbalance`, run the following commands:

----
systemctl disable irqbalance.service

systemctl stop irqbalance.service
----

[[_sec_no_ksm]]
==== Kernel Samepage Merging (KSM)

Kernel Samepage Merging (KSM, https://www.kernel.org/doc/html/latest/admin-guide/mm/ksm.html) is of no use here, because there is only one single VM. Thus it should be disabled.
The following command makes sure that it is turned off and that any sharing and de-duplication activity that may have happened, in case it was enabled, is reverted:

----
echo 2 > /sys/kernel/mm/ksm/run
----

[[_sec_customize_the_linux_kernel_boot_options]]
==== Customizing the Linux kernel boot options

To edit the boot options for the Linux kernel, perform the following steps:

. Edit [path]_/etc/default/grub_ and add the following boot options to the line *GRUB_CMDLINE_LINUX_DEFAULT* (a detailed explanation of these options will follow).
+
----
mitigations=auto kvm.nx_huge_pages=off numa_balancing=disable kvm_intel.ple_gap=0 transparent_hugepage=never intel_idle.max_cstate=1 default_hugepagesz=1GB hugepagesz=1GB hugepages=<number of hugepages> intel_iommu=on iommu=pt intremap=no_x2apic_optout
----
+
. Run the following command:
+
----
update-bootloader
----
. Reboot the system:
+
----
reboot
----


[[_sec_technical_explanation_of_the_above_described_configuration_settings]]
==== Technical explanation of the above described configuration settings

*Hardware vulnerability mitigations (mitigations=auto kvm.nx_huge_pages=off)*

In recent years, a class of side channel attacks exploiting the branch prediction and the speculative execution capabilities of modern CPUs has appeared.
On an affected CPU, these problems cannot be fixed, but their effect and their actual exploitability can be mitigated in software.
However, this sometimes has a non-negligible impact on performance.

To achieve the best possible security, the software mitigations for these vulnerabilities are enabled (`mitigations=auto`), with the only exception being the one that deals with Machine Check Error Avoidance on Page Size Change (CVE-2018-12207, also known as "`iTLB Multihit`"), which is disabled via `kvm.nx_huge_pages=off`.

//TODO: We probably want a more generic and little bit more detailed section about mitigations?

*Automatic NUMA balancing (numa_balancing=disable)*

Automatic NUMA balancing can result in increased system latency and should therefore be disabled.

*KVM PLE-GAP (kvm_intel.ple_gap=0)*

Pause Loop Exit (PLE) is a feature whereby a spinning guest CPU releases the physical CPU until a lock is free.
This is useful in cases where multiple virtual CPUs are using the same physical CPU, but it causes unnecessary delays when the system is not overcommitted.
Setting `kvm_intel.ple_gap=0` disables this feature.

*Transparent huge pages (transparent_hugepage=never)*

Because 1 GiB pages are used for the virtual machine, there is no additional benefit from having THP enabled.
Disabling it will avoid `khugepaged` interfering with the virtual machine while it scans for pages to promote to huge pages.
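Whether THP is actually disabled can be double-checked at runtime through sysfs; the active setting is shown in square brackets:

----
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
----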
*Processor C-states (intel_idle.max_cstate=1)*

Optimal performance is achieved by limiting the processor to states C0 (normal running state) and C1 (first lower power state).

Note that, while there is an exit latency associated with the C1 state, on hyperthread-enabled platforms it is offset by the fact that a core can borrow resources from its hyperthread sibling while that sibling is in the C1 state, and some CPUs can boost the CPU frequency higher if siblings are in the C1 state.

*Huge pages (default_hugepagesz=1GB hugepagesz=1GB hugepages=<number of hugepages>)*

1 GiB huge pages are used to reduce overhead and contention when the guest is updating its page tables.
This requires the allocation of 1 GiB huge pages on the host.
The number of pages to allocate depends on the memory size of the guest.

1 GiB pages are not pageable by the OS. Thus they always remain in RAM, and therefore the `locked` definition in libvirt XML files is not required.

It is also important to ensure the order of the huge page options. Specifically, the `hugepages=` option must be placed *after* the 1 GiB huge page size definitions.

.Calculating the `hugepages=` value
[NOTE]
====
The value for `hugepages=` should be calculated by taking the number of GiB of RAM minus approximately 8% for the hypervisor OS.
For example, 3 TiB RAM (3072 GiB) minus 8% results in approximately 2770 huge pages.
====

*PCI Passthrough (intel_iommu=on iommu=pt)*

To be able to directly assign host devices (like storage controllers and NIC Virtual Functions) with PCI Passthrough and SR-IOV, the IOMMU must be enabled.
On top of that, `iommu=pt` makes sure that the devices are set up for the best performance (that is, passthrough mode).

*Interrupt remapping (intremap=no_x2apic_optout)*

Interrupt remapping allows interrupts from devices to be intercepted, validated and routed to a specific CPU (for example, one where a virtual CPU of the guest VM that has the device assigned is running).
This parameter makes sure that this feature is always enabled.

[[_sec_guest_vm_xml_configuration]]
== Configuring the guest VM

This section describes the modifications required to the libvirt XML definition of the guest VM.
The libvirt XML can be edited using the following command:

----
virsh edit <guest VM name>
----

[[_sec_create_an_initial_guest_vm_xml]]
=== Creating an initial guest VM XML

Refer to section 9 "Guest Installation" of the SUSE Virtualization Guide (https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-kvm-inst.html).

[[_sec_global_vcpu_configuration]]
=== Configuring global vCPU settings

The virtual CPU configuration of the VM guest should reflect the host CPU configuration as closely as possible.
There cannot be any overcommitting of memory or CPU resources.

The CPU model should be set to `host-passthrough`, and any `check` should be disabled.
In addition, the `rdtscp`, `invtsc` and `x2apic` features are required.

[[_sec_memory_backing]]
=== Backing memory

Huge pages, sized 1 GiB (that is, 1048576 KiB), must be used for all the guest VM memory.
This guarantees optimal performance for the guest VM.

It is necessary that each NUMA cell of the guest VM has a whole number of huge pages assigned to it (that is, no fractions of huge pages).
All the NUMA cells should also have the same number of huge pages assigned to them (that is, the guest VM memory configuration must be balanced).

Therefore the number of huge pages needs to be divisible by the number of NUMA cells, as the sketch below shows.
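The arithmetic can also be scripted. The following sketch computes a balanced `hugepages=` value from the host's `MemTotal`, assuming 4 NUMA cells and the 91.75% share for the VM used in the worked example that follows:

----
#!/usr/bin/env bash
# Sketch: compute a balanced number of 1 GiB huge pages for the guest VM.
CELLS=4                     # number of NUMA cells, adjust to your host
MEMTOTAL_KIB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# 91.75% of MemTotal, expressed in 1 GiB (1048576 KiB) pages, per NUMA cell
PAGES_PER_CELL=$(( MEMTOTAL_KIB * 9175 / 10000 / 1048576 / CELLS ))
echo "hugepages=$(( PAGES_PER_CELL * CELLS ))"
----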
For example, if the host has 3169956100 KiB (approximately 3 TiB) of memory, we want to assign 91.75% of it to the VM (leaving the rest for the hypervisor, see <<_sec_memory_sizing>>), and there are 4 NUMA cells, each NUMA cell will have the following number of huge pages:

* (3169956100 * (91.75/100)) / 1048576 / 4 = 693

This means that, in total, there will need to be the following number of huge pages:

* 693 * 4 = 2772

This number must be passed to the host kernel command line parameter on boot (that is, `hugepages=2772`, see <<_sec_technical_explanation_of_the_above_described_configuration_settings>>).

Both the total amount of memory the guest VM should use and the fact that such memory must come from 1 GiB huge pages need to be specified in the guest VM configuration file.

It must also be ensured that the `memory` and the `currentMemory` elements have the same value, to disable memory ballooning, which, if enabled, would cause unacceptable latency:

----
<memory unit='KiB'>2906652672</memory>
<currentMemory unit='KiB'>2906652672</currentMemory>
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB'/>
  </hugepages>
</memoryBacking>
----

.Memory Unit
[NOTE]
====
The memory unit can be set to GiB to ease the memory computations.
====

[[_sec_vcpu_and_vnuma_topology]]
=== Mapping vCPU and vNUMA topology and pinning

It is important to map the host topology into the guest VM, as described below.
This allows HANA to spread its own workload threads across many virtual CPUs and NUMA nodes.

For example, for a 4-socket system with 28 cores per socket and hyperthreading enabled, the virtual CPU configuration will also have 4 sockets, 28 cores and 2 threads.

Always make sure that, in the guest VM configuration file:

* the `cpu` `mode` attribute is set to `host-passthrough`.
* the `cpu` `topology` attribute describes the vCPU NUMA topology of the guest, as discussed above.
* the attributes of the `numa` elements describe which vCPU number ranges belong to which NUMA cell. Care should be taken since these number ranges are not the same as on the host. Additionally:
** the `cell` elements describe how much RAM should be distributed per NUMA node. In this 4-node example, enter 25% (or 1/4) of the entire guest VM memory.
Also refer to <<_sec_memory_backing>> and <<_sec_memory_sizing>> of this paper for further details.
** each NUMA cell of the guest VM has 56 vCPUs.
** the distances between the cells are identical to those of the physical hardware (as per the output of the command `numactl --hardware`).

----
<cpu mode='host-passthrough' check='none'>
  <topology sockets='4' cores='28' threads='2'/>
  <numa>
    <!-- cell memory: 2772 GiB / 4 = 693 GiB = 726663168 KiB;
         distances as reported by "numactl --hardware" on the host -->
    <cell id='0' cpus='0-55' memory='726663168' unit='KiB'>
      <distances>
        <sibling id='0' value='10'/>
        <sibling id='1' value='21'/>
        <sibling id='2' value='21'/>
        <sibling id='3' value='21'/>
      </distances>
    </cell>
    <cell id='1' cpus='56-111' memory='726663168' unit='KiB'>
      <distances>
        <sibling id='0' value='21'/>
        <sibling id='1' value='10'/>
        <sibling id='2' value='21'/>
        <sibling id='3' value='21'/>
      </distances>
    </cell>
    <cell id='2' cpus='112-167' memory='726663168' unit='KiB'>
      <distances>
        <sibling id='0' value='21'/>
        <sibling id='1' value='21'/>
        <sibling id='2' value='10'/>
        <sibling id='3' value='21'/>
      </distances>
    </cell>
    <cell id='3' cpus='168-223' memory='726663168' unit='KiB'>
      <distances>
        <sibling id='0' value='21'/>
        <sibling id='1' value='21'/>
        <sibling id='2' value='21'/>
        <sibling id='3' value='10'/>
      </distances>
    </cell>
  </numa>
</cpu>
----

It is also necessary to pin virtual CPUs to physical CPUs, to limit the overhead caused by virtual CPUs being moved between physical CPUs by the host scheduler.
Similarly, the memory for each NUMA cell of the guest VM must be allocated only on the corresponding host NUMA node.

Note that KVM/QEMU uses a static hyperthread sibling CPU APIC ID assignment for virtual CPUs, irrespective of the actual physical CPU APIC ID values on the host.
For example, assuming that the first hyperthread sibling pair is CPU 0 and CPU 112 on the host, you will need to pin that sibling pair to vCPU 0 and vCPU 1.

It is recommended to pin the various sibling pairs of vCPUs to (the corresponding) sibling pairs of host CPUs.
For example, vCPU 0 should be pinned to pCPU 0 and 112, and the same applies to vCPU 1.
As long as both vCPUs always run on the same physical core, the host scheduler is allowed to execute them on either thread, for example in case only one is free while the other is busy executing host or hypervisor activities.

Using the above information, the CPU and memory pinning section of the guest VM XML can be created.
Below find a practical example based on the hypothetical example above.

Make sure to take note of the following configuration components:

* The `vcpu placement` element lists the total number of vCPUs in the guest.
* The `cputune` element contains the attributes describing the mappings of vCPUs to physical CPUs.
* The `numatune` element contains the attributes to describe the distribution of RAM across the virtual NUMA nodes (CPU sockets).
** The `mode` attribute should be set to `strict`.
** The appropriate number of nodes should be entered in the `nodeset` and `memnode` attributes. In this example, there are 4 sockets, therefore the values are `nodeset=0-3` and `cellid` 0 to 3.

----
<vcpu placement='static'>224</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='0,112'/>
  <vcpupin vcpu='1' cpuset='0,112'/>
  <vcpupin vcpu='2' cpuset='1,113'/>
  <vcpupin vcpu='3' cpuset='1,113'/>
[...]
  <vcpupin vcpu='222' cpuset='111,223'/>
  <vcpupin vcpu='223' cpuset='111,223'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='0-3'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <memnode cellid='1' mode='strict' nodeset='1'/>
  <memnode cellid='2' mode='strict' nodeset='2'/>
  <memnode cellid='3' mode='strict' nodeset='3'/>
</numatune>
----

The following script generates this section of the domain configuration according to the described specifications:

----
#!/usr/bin/env bash
NUM_VCPU=$(ls -d /sys/devices/system/cpu/cpu[0-9]* | wc -l)
echo "<vcpu placement='static'>${NUM_VCPU}</vcpu>"
echo "<cputune>"
THREAD_PAIRS="$(cat /sys/devices/system/cpu/cpu*/topology/core_cpus_list | sort -n | uniq )"
VCPU=0
for THREAD_PAIR in ${THREAD_PAIRS}; do
  for i in 1 2; do
    echo "  <vcpupin vcpu='${VCPU}' cpuset='${THREAD_PAIR}'/>"
    VCPU=$(( VCPU + 1 ))
  done
done
echo "</cputune>"
----

The following commands can be used to determine the CPU details on the hypervisor host:

----
lscpu --extended=CPU,SOCKET,CORE

lstopo-no-graphics
----

It is not necessary to isolate the guest VM's `iothreads`, nor to statically reserve any host CPUs for them or for any other kind of host activity.

[[_sec_network]]
=== Configuring networking

One of the Virtual Functions prepared in <<_sec_configure_networking_on_hypervisor>> must be added to the guest VM as (one of) its network adapter(s).
This can be done by putting the following details in the guest VM configuration file:

----
<devices>
  <interface type='hostdev' managed='yes'>
    <source>
      <address type='pci' domain='0x0000' bus='0x17' slot='0x02' function='0x0'/>
    </source>
  </interface>
</devices>
----

The various properties (for example `domain`, `bus`, etc.) of the `address` element should contain the proper values for pointing at the desired device (check with `lspci`).

[[_sec_storage]]
=== Configuring storage

The storage configuration is critical, as it plays an important role in terms of performance.

[[_sec_storage_configuration_for_operating_system_volumes]]
==== Configuring storage for operating system volumes

The performance of the storage where the operating system is installed is not critical for the performance of SAP HANA.
Therefore any KVM supported storage may be used to deploy the operating system itself. See an example below:

----
<devices>
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source dev='/dev/disk/by-id/...'/>
    <target dev='vda' bus='virtio'/>
  </disk>
</devices>
----

The `dev` attribute of the `source` element should contain the appropriate path.

[[_sec_storage_configuration_for_sap_hana_volumes]]
==== Configuring storage for SAP HANA volumes

The configuration depends on the type of storage used for the SAP HANA database.

In any case, the storage for SAP HANA must be able to fulfill the storage requirements for SAP HANA from within the VM.
The SAP HANA Hardware and Cloud Measurement Tools (HCMT) can be used to assess whether the storage meets the requirements.
For details on HCMT refer to {launchPadNotes}2493172[SAP Note 2493172 - "SAP HANA Hardware and Cloud Measurement Tools"].

===== Network attached storage

Follow the SAP HANA specific best practices of the storage system vendor.
Make sure though that the VM has access to one or more dedicated 10 Gbit Ethernet interfaces for the network traffic to the network attached storage.

===== Fibre Channel storage

Since storage controller passthrough is used (see <<_sec_storage_hypervisor>>), any LVM (Logical Volume Manager) and multipathing configuration should, if wanted, be made inside the guest VM, always following the storage layout recommendations from the appropriate hardware vendor.

The guest VM XML configuration must be based on the underlying storage configuration on the hypervisor (see <<_sec_storage_hypervisor>>).

Since the storage for HANA (the `/data`, `/log` and `/shared` volumes) is performance critical, it is recommended to take advantage of a SAN HBA that is passed through to the guest VM.

Note that it is not possible to use only one function of the adapter; both must always be attached to the guest VM.
An example guest VM configuration with storage passthrough configured would look like the below (adjust the `domain`, `bus`, `slot` and `function` attributes of the `address` elements to match the adapter you chose):

----
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x85' slot='0x00' function='0x0'/>
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x85' slot='0x00' function='0x1'/>
    </source>
  </hostdev>
</devices>
----

More details about how to directly assign PCI devices to a guest VM are described in section 14.7 "Adding a PCI Device" of the Virtualization Guide (https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-libvirt-config-virsh.html#sec-libvirt-config-pci-virsh).

[[_sec_vhostmd_guest]]
=== Setting up a `vhostmd` device

The `vhostmd` device is passed to the VM so that the `vm-dump-metrics` command can retrieve metrics about the hypervisor provided by `vhostmd`.
You can use either a vbd disk or a virtio-serial device (preferred) to set this up (see {launchPadNotes}1522993[SAP Note 1522993 - "Linux: SAP on SUSE KVM - Kernel-based Virtual Machine"] for details).


[[_sec_clocks_timers]]
=== Setting up clocks and timers

Make sure that the clocks and timers are set up as follows in the guest VM configuration file:

----
<clock offset='utc'>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
  <timer name='kvmclock' present='no'/>
</clock>
----

[[_sec_virtio_rng]]
=== Setting up the Virtio Random Number Generator (RNG) device

The host _/dev/urandom_ file should be passed through to QEMU as a source of entropy, using the virtio RNG device:

----
<devices>
  <rng model='virtio'>
    <backend model='random'>/dev/urandom</backend>
  </rng>
</devices>
----

[[_sec_features]]
=== Configuring special features

It is necessary to enable for the guest VM a set of optimizations that are specific to the case where the vCPUs are pinned and have (semi-)dedicated pCPUs all for themselves.
This is done by having the following in the guest VM configuration file:

----
<features>
  <acpi/>
  <apic/>
  <kvm>
    <hint-dedicated state='on'/>
  </kvm>
</features>
----

Note that this is a requirement for making it possible to load and use the "`cpuidle-haltpoll`" kernel module inside of the guest VM OS (see <<_sec_cpuidle_haltpoll>>).


[[_sec_guest_operating_system]]
== Installing the guest operating system

[[_sec_install_sles_for_sap_inside_the_guest_vm]]
=== Installing SUSE Linux Enterprise Server for SAP Applications inside the guest VM

Refer to the https://documentation.suse.com/sles-sap/15-SP2/[{sles4sap} documentation].


[[_sec_guest_operating_system_configuration_for_sap_hana]]
=== Configuring the guest operating system for SAP HANA

Install and configure {sles4sap} {slesProdVersion} and SAP HANA as described in:

* {launchPadNotes}1944799[SAP Note 1944799 - "SAP HANA Guidelines for SLES Operating System Installation"]
* {launchPadNotes}2684254[SAP Note 2684254 - "SAP HANA DB: Recommended OS settings for SLES 15 / SLES for SAP Applications 15"]

[[_sec_customizing_linux_cmdline_guest]]
==== Customizing the Linux kernel parameters of the guest

Like the hypervisor host, the VM also needs special kernel parameters to be set.
To edit the boot options for the Linux kernel, proceed as follows:

. Edit [path]_/etc/default/grub_ and add the following boot options to the line "`GRUB_CMDLINE_LINUX_DEFAULT`".
+
----
mitigations=auto kvm.nx_huge_pages=off intremap=no_x2apic_optout
----
+

A detailed explanation of these parameters has been given in <<_sec_technical_explanation_of_the_above_described_configuration_settings>>.

[[_sec_enabling_host_monitoring_guest]]
==== Enabling host monitoring

The VM needs to have the `vm-dump-metrics` package installed, which dumps the metrics provided by the `vhostmd` service running on the hypervisor. This enables SAP HANA to collect data about the hypervisor.
{launchPadNotes}1522993[SAP Note 1522993 - "Linux: SAP on SUSE KVM - Kernel-based Virtual Machine"] describes how to set up the virtual devices for `vhostmd` and how to configure it.
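Once the package is installed and the `vhostmd` device is attached to the VM, a quick functional check from inside the guest is to dump a few metrics (the exact list depends on the `vhostmd` configuration on the hypervisor):

----
vm-dump-metrics | head
----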
When using a virtual disk for `vhostmd`, the virtual disk device must be world-readable, which is ensured with the boot time configuration below.


[[_sec_configuring_guest_at_boot_time]]
==== Configuring the guest at boot time

The following settings need to be configured at boot time of the VM.
To persist these configurations, it is recommended to put the commands provided below into a script which is executed as part of the boot process.

===== Disabling `irqbalance`

The `irqbalance` service should be disabled because it can cause latency issues when the _/proc/irq/*_ files are read.
To disable `irqbalance`, run the following commands:

----
systemctl disable irqbalance.service
systemctl stop irqbalance.service
----

===== Activating and configuring `sapconf` or `saptune`

The following parameters need to be set in `sapconf` version 5. Edit the file `/etc/sysconfig/sapconf` to reflect the settings below, and then restart the `sapconf` service.

----
GOVERNOR=performance
PERF_BIAS=performance
MIN_PERF_PCT=100
FORCE_LATENCY=5
----

NOTE: When using `sapconf` version 5, stop and disable the `tuned` service and instead enable and start the `sapconf` service.

If you use `saptune`, configure it accordingly:

* Apply the `HANA` solution: `saptune solution apply HANA`
* Create the file `/etc/saptune/override/2684254` with the following content:
+
----
[cpu]
force_latency=5
----
* Re-apply the recommendations for SAP Note 2684254: `saptune note apply 2684254`

Detailed documentation on `saptune` is available in chapter https://documentation.suse.com/sles-sap/15-SP2/html/SLES-SAP-guide/cha-tune.html[Tuning systems with `saptune`] of the {sles4sap} Guide.



[[_sec_cpuidle_haltpoll]]
===== Activating and configuring `haltpoll`

The `cpuidle-haltpoll` driver lets an idle vCPU poll inside the guest for a short while before yielding to the hypervisor, which reduces wakeup latency.
Load the module and configure the polling window as follows:

----
POLL_NS=800000
GROW_START=200000
modprobe cpuidle-haltpoll
echo $POLL_NS > /sys/module/haltpoll/parameters/guest_halt_poll_ns
echo $GROW_START > /sys/module/haltpoll/parameters/guest_halt_poll_grow_start
----

===== Setting the clock source

The clock source needs to be set to `tsc`:

----
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
----

===== Disabling Kernel Samepage Merging

Kernel Samepage Merging (KSM) needs to be disabled, like on the hypervisor (see <<_sec_no_ksm>>):

----
echo 2 > /sys/kernel/mm/ksm/run
----

===== Implementing automatic configuration at boot time

The following script is provided as an example for implementing the above recommendations, to be executed at boot time of the VM.

.Script
----
#!/usr/bin/env bash
#
# Configure KVM guest for SAP HANA
#

POLL_NS=800000
GROW_START=200000

# disable irqbalance
systemctl disable --now irqbalance

# enable guest-side idle polling
modprobe cpuidle-haltpoll
echo $POLL_NS > /sys/module/haltpoll/parameters/guest_halt_poll_ns
echo $GROW_START > /sys/module/haltpoll/parameters/guest_halt_poll_grow_start

# Set clocksource to tsc
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource

# disable Kernel Samepage Merging; mode 2 also un-merges all previously merged pages
echo 2 > /sys/kernel/mm/ksm/run
# fix access to the vhostmd device, so that the <sid>adm user can read it

# the vhostmd device has exactly 256 blocks, try to catch that from /proc/partitions
VHOSTMD_DEVICE=$(grep " 256 " /proc/partitions | awk '{print $4}' )
if [ -n "$VHOSTMD_DEVICE" ]; then
  chmod o+r /dev/"$VHOSTMD_DEVICE"
else
  echo "Missing vhostmd device, please check your XML file."
fi
----

Both `sapconf` and `saptune` apply their settings at boot time automatically and do not need to be included in the script above.

[[_sec_guest_operating_system_storage_configuration_for_sap_hana_volumes]]
=== Configuring the guest operating system storage for SAP HANA volumes

* Follow the storage layout recommendations from the appropriate hardware vendors.
* Only use LVM (Logical Volume Manager) inside the VM for SAP HANA. Nested LVM is not to be used.


[[_sec_performance_considerations]]
== Performance considerations

The Linux kernel has code to mitigate existing vulnerabilities of the {skylake} CPUs. Our testing showed no visible impact of those mitigations with regard to SAP HANA performance, except for the https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/multihit.html[iTLB Multihit] mitigation. This mitigation can be controlled by the kernel parameter `kvm.nx_huge_pages` (see https://www.suse.com/support/kb/doc/?id=000019411[SUSE support document 7023735]).

In general, the setting of the parameter `kvm.nx_huge_pages` has an impact on performance.
The implications on performance need to be considered as laid out in the Skylake example below.

Performance deviations for virtualization as measured on Intel Skylake (bare metal to single VM):

* Setting `kvm.nx_huge_pages=off`
** The measured performance deviation for OLTP or mixed OLTP/OLAP workload is below 10%.
** The measured performance deviation for OLAP workload is below 5%.
* Setting `kvm.nx_huge_pages=auto`
** The measured performance deviation for OLTP or mixed OLTP/OLAP workload was impacted by this setting.
For S/4HANA standard workload, OLTP transactional request times show an overhead of up to 30 ms.
This overhead leads to an additional transactional throughput loss, but did not exceed 10%, running at a very high system load, when compared to the underlying bare metal environment.
** The measured performance deviation for OLAP workload is below 5%.
** During performance analysis with standard workload, most of the test cases stayed within the defined KPI of 10% performance degradation compared to bare metal.
However, there are low-level performance tests in the test suite exercising various HANA kernel components that exhibit a performance degradation of more than 10%.
This also indicates that there are particular scenarios which might not be suited for SAP HANA on SUSE KVM with `kvm.nx_huge_pages=auto`, especially workloads generating high resource utilization, which must be considered when sizing an SAP HANA instance in a SUSE KVM virtual machine.
A thorough test of the configuration under all workload conditions is highly recommended.
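The currently active value of this parameter can be inspected on the hypervisor at runtime through the `kvm` module parameter exposed in sysfs:

----
cat /sys/module/kvm/parameters/nx_huge_pages
----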
[[_sec_administration]]
== Administration

For a full explanation of administration commands, refer to the official SUSE Virtualization documentation such as:

* Section 10 "Basic VM Guest Management" and others in the SUSE Virtualization Guide for SUSE Linux Enterprise Server 15 (https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-libvirt-managing.html)
* SUSE Virtualization Best Practices for SUSE Linux Enterprise Server 15 (https://documentation.suse.com/sles/15-SP2/html/SLES-all/article-vt-best-practices.html)


[[_sec_useful_commands_on_the_hypervisor]]
=== Useful commands on the hypervisor

Check the kernel boot options used:

----
cat /proc/cmdline
----

Check the huge page status (this command can also be used to monitor the progress of the huge page allocation during VM start):

----
cat /proc/meminfo | grep Huge
----

List all VM guest domains configured on the hypervisor:

----
virsh list --all
----

Start a VM (note: VM start can take some minutes on systems with large RAM; check the progress with `cat /proc/meminfo | grep Huge`):

----
virsh start <VM/guest domain name>
----

Shut down a VM:

----
virsh shutdown <VM/guest domain name>
----

This is the location of the VM guest configuration files:

----
/etc/libvirt/qemu
----

This is the location of the VM log files:

----
/var/log/libvirt/qemu
----

[[_sec_useful_commands_inside_the_vm_guest]]
=== Useful commands inside the VM guest

Check that the L3 cache is visible in the guest:

----
lscpu | grep L3
----

Validate the guest and host CPU topology:

----
lscpu
----

[[_sec_examples]]
== Examples


[[_sec_example_guest_vm_xml]]
=== Example guest VM XML

.XML configuration example
[WARNING]
====
The XML file below is only an *example* showing the key configurations based on the above command outputs, to assist in understanding how to configure the XML.
The actual XML configuration must be based on your respective hardware configuration and VM requirements.
====

Points of interest in this example (refer to the detailed sections of the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* document at hand for a full explanation):

* Memory
** The hypervisor has 3 TiB RAM (or 3072 GiB), of which 2772 GiB has been allocated as 1 GiB huge pages; therefore 2772 GiB is the maximum VM size in this case
** 2772 GiB = 2906652672 KiB
** In the `numa` section, memory is split evenly over the 4 NUMA nodes (CPU sockets)
* CPU pinning
** Note the alternating CPU pinning on the hypervisor, see <<_sec_vcpu_and_vnuma_topology>> for details
** Note the topology of the guest VM mirrors the one of the hypervisor (4x28 CPU cores)
* Network I/O
** Virtual Functions of the physical network interface card have been added as PCI devices
* Storage I/O
** A single SAN HBA is passed through to the VM as `hostdev` device (one for each function/port)
** See <<_sec_storage>> for details
* `rng model='virtio'`, for details see <<_sec_virtio_rng>>
* `qemu:commandline` elements to describe CPU attributes, for details see <<_sec_global_vcpu_configuration>>


The following VM definition is an example for a VM configured to consume a 4-socket server with 3 TiB of main memory. It is taken from our actual validation machine.
Note that this file is abridged for clarity; the cuts are denoted by `[...]` marks.

----
# cat /etc/libvirt/qemu/SUSEKVM.xml
<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST.
Changes to this xml configuration should be made using:
  virsh edit SUSEKVM
or other application using the libvirt API.
-->
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>kvmvm11</name>
  <uuid>f529e0b0-93cc-4e83-87dc-65cb9922336d</uuid>
  <title>kvmvm11</title>
  <memory unit='KiB'>2906652672</memory>
  <currentMemory unit='KiB'>2906652672</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>224</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0,112'/>
    <vcpupin vcpu='1' cpuset='0,112'/>
[...]
    <vcpupin vcpu='222' cpuset='111,223'/>
    <vcpupin vcpu='223' cpuset='111,223'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0-3'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
    <memnode cellid='2' mode='strict' nodeset='2'/>
    <memnode cellid='3' mode='strict' nodeset='3'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x86_64-smm-ms-code.bin</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/kvmvm12_VARS.fd</nvram>
  </os>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='4' cores='28' threads='2'/>
    <numa>
[...]
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='kvmclock' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='block' device='disk'>
[...]
    </disk>
    <interface type='hostdev' managed='yes'>
[...]
    </interface>
    <hostdev mode='subsystem' type='pci' managed='yes'>
[...]
    </hostdev>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
    </rng>
[...]
  </devices>
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,migratable=off,+invtsc,l3-cache=on'/>
  </qemu:commandline>
</domain>
----
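After a definition like the example above has been adapted to the local hardware, it can be registered with libvirt and started. Note that `virsh start` takes the domain name from the `name` element (here `kvmvm11`), which in this example differs from the file name:

----
virsh define SUSEKVM.xml

virsh start kvmvm11
----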