Intel Optane DC Persistent Memory Devices on Cisco UCS Servers for Microsoft SQL 2019 Databases

Networking Solutions White Paper

Updated: February 17, 2020

Executive Summary

Storage has long been one of the most critical resources for enterprise applications. Over time, many features and advancements have enabled exponential growth in compute processing power, leaving storage performance far behind and creating a wide gap between compute and storage performance. The ever-increasing amount of data, and the need to access more of it quickly, have widened this gap further. To some extent, the performance gap between the two has been filled by introducing an intermediate temporary storage tier built on non-volatile memory devices. However, because of the huge volumes of data involved, modern applications such as large system databases, IoT applications, and big data analytical engines need higher memory capacity to deal with larger datasets, and they also need to keep larger amounts of data closer to the CPU.

Intel has recently introduced a disruptive data center persistent memory technology to address these challenges. This document provides steps for provisioning Intel Optane DC Persistent Memory Devices (Intel Optane PMEMs) using the Cisco UCS system and the operating system, along with some DCPMM and BIOS recommendations for specific SQL Server database deployments.

This document focuses on the following aspects:

•     Configuring Intel Optane DC Persistent Memory Devices using Cisco UCS Manager (UCSM) and Cisco Integrated Management Controller (IMC)

•     Steps for provisioning Intel Optane DC Persistent Memory Devices through the Windows Server 2019 and RHEL 8 operating systems, along with a few DCPMM and BIOS configuration recommendations for SQL Server database workloads

•     A few SQL Server database use cases that can benefit from Intel Optane DC Persistent Memory Devices

Audience & Scope

This document can be referenced by IT professionals, infrastructure administrators, and database specialists who plan, design, and implement Microsoft SQL Server database solutions using Intel Optane DC Persistent Memory Devices. Readers are expected to have some knowledge of Intel Optane persistent memory devices, Cisco UCS Manager, and Cisco IMC.

The document focuses on configuration recommendations for Intel Optane DC Persistent Memory Devices with SQL Server 2019 databases for specific use cases, on bare-metal deployments only. It does not cover other management and troubleshooting aspects of Intel Optane DC Persistent Memory Devices.

This paper does not capture performance details as the performance gains are workload dependent.

Technology Overview

Intel Optane Datacenter Persistent Memory

Intel Optane DC persistent memory is an innovative memory technology that delivers a unique combination of affordable large capacity and support for data persistence. Intel Optane DC Persistent Memory Modules are based on Intel 3D XPoint non-volatile memory technology, sit between the memory and storage tiers, and deliver the best of both worlds through the convergence of memory and storage product characteristics. This technology introduces architectural changes in servers: the DC Persistent Memory Modules are DDR4 socket compatible and can coexist with conventional DDR4 DRAM DIMMs on the same platform.

These devices can be used as regular memory or as persistent storage media, depending on how they are configured. Currently, Intel Optane DC Persistent Memory Devices are supported by Intel Xeon 2nd Generation Scalable processors and are available in three capacities: 128 GiB, 256 GiB, and 512 GiB. These capacities offer a much larger alternative to DRAM DIMMs, which currently cap at 128 GiB. The following image shows an Intel DC Persistent Memory Module.

Related image, diagram or screenshot

Operating modes of Intel DC Persistent Memory Modules

Intel DC Persistent Memory Modules (PMEMs) can be configured in Memory Mode, App Direct Mode, or a combination of both (also referred to as Mixed Mode or Dual Mode).

Memory Mode: In this mode, Intel Optane PMEMs act as volatile system memory, and because these devices are available in larger sizes, larger memory capacities become available to the operating system. They effectively serve as a memory extender operating at near-DRAM speed. The CPU memory controller uses the actual DRAM as a cache and the Intel Optane DC persistent memory as addressable main memory. As a result, the operating system sees the Intel Optane DC persistent memory as main memory, while the DRAM is invisible to the OS.

Because Intel Optane PMEMs are available in larger capacities than DDR4 DRAM devices, the system can have memory capacities of up to 6 TiB in a two-socket server and 12 TiB in a four-socket server. This huge memory capacity within a single server enables certain memory-bound applications, such as in-memory databases, traditional databases with large working sets, and big data applications, to deal with larger datasets. It also enables customers to consolidate many virtual machines onto a single server platform in virtualization environments.

App Direct Mode: In this mode, Intel Optane PMEM devices act as persistent storage media and provide blazing IO performance to applications. In App Direct mode, these devices can be configured in the following ways.

Interleaved sets: When Intel Optane PMEM devices are configured as interleaved sets, all the PMEM devices within a socket are consolidated and presented as a single logical disk to the operating system, enabling read/write operations to be striped across all the PMEM devices. For SQL Server database deployments, it is recommended to use PMEMs in this fashion for better IO performance and reduced maintenance.

Non-interleaved sets: When Intel Optane PMEMs are configured in a non-interleaved fashion, each individual PMEM device is presented to the operating system as an individual logical storage disk. This option gives more granularity and better control over data placement. For SQL Server databases, it gives more control over how data and log files are distributed across the individual PMEM disks.

In App Direct mode, Intel Optane PMEM storage can be used in two different ways. First, it can be used as regular block storage, in which the Intel Optane PMEM storage is exposed as block storage like traditional SATA/SAS and NVMe block devices. It can be formatted with traditional file systems like NTFS, ReFS, XFS, and EXT4, and can be consumed by any legacy application. This requires no application-level changes and therefore eases adoption of these devices.

Intel Optane PMEMs can also be used as byte-addressable devices when formatted with the Direct Access (DAX) file system option. DAX is a mechanism that enables operating systems to get direct access to files stored on persistent memory devices by memory mapping the stored files. Direct access to the PMEM devices bypasses the traditional kernel IO subsystem, resulting in breakthrough IO acceleration. Persistent-memory-aware applications, such as Microsoft SQL Server, benefit greatly from this type of access because read and write requests to such devices do not go through the file system software stack. Instead, such applications can access these devices directly using user-space load/store operations, the same memory operations involved in reading and writing pages from DRAM.

Note that DAX mode comes at the cost of allocating per-page metadata required for memory mapping of the files stored on these PMEM devices. For every 4 KiB of capacity, 64 bytes are required for the memory-mapping data structures. This additional storage can be allocated from either DRAM or a reserved portion of the persistent memory itself. For instance, the metadata storage requirement for a 3 TiB persistent volume is 48 GiB. On Linux systems, while creating namespaces on the PMEM devices, use the --map=mem option to consume DRAM and --map=dev to consume PMEM storage itself. Because Intel Optane PMEMs are available in larger capacities, the metadata for memory mapping can be stored on the PMEM volumes themselves. If there is enough DRAM available in the system, DRAM can be used instead, which also results in faster access to the metadata.

Mixed Mode: This mode is a combination of the Memory and App Direct modes. It allows you to configure a percentage of the DCPMM capacity in Memory Mode and the remainder in App Direct Mode. This mode provides customers the flexibility to split the persistent memory capacity to best fit their business needs.

Intel Optane DC Persistent Memory Devices can be managed, monitored, and configured in two ways:

1.     Infrastructure management tools such as Cisco UCS Manager and Cisco IMC.

2.     Operating system utilities such as ipmctl and ndctl (Linux) and PowerShell cmdlets (Windows).

Red Hat Enterprise Linux 8

RHEL 8 provides support for a wide range of new and innovative technologies, including Intel DC Persistent Memory. Support for PMEM as block storage has been available since RHEL 7.3 and has been enhanced to a great extent in RHEL 8, which supports the DAX option. The DAX option is supported by the XFS and EXT4 filesystems. An official support statement from Red Hat can be found at: https://access.redhat.com/articles/4070821. The ipmctl and ndctl utilities can be used to manage PMEM devices.

Windows Server 2019

Intel Optane Persistent Memory is fully supported in Windows Server 2019; both Memory and App Direct modes are supported. Persistent memory devices are discovered under a new media type called Storage Class Memory (SCM). In App Direct mode, the NTFS and ReFS filesystems support these devices when the PMEMs are configured as traditional block storage devices, while only NTFS supports the DAX capability. PowerShell cmdlets can be used to manage the Intel Optane DC Persistent Memory Devices.

SQL Server 2019

SQL Server 2019 is the latest release from Microsoft and includes support for persistent memory devices on both Linux and Windows operating systems. SQL Server Hybrid Buffer Pool and enlightenment mode are new features that leverage Intel DC Persistent Memory devices for blazing IO performance. Tail of Log caching (also referred to as transaction commit acceleration) is another use case that can leverage PMEMs and has been supported since SQL Server 2016 SP1. The upcoming sections provide more details on the different use cases that can benefit from Intel DC Persistent Memory devices.

Cisco UCSM

Cisco UCS Manager (UCSM) provides unified, embedded management for all software and hardware components in Cisco UCS. Using Single Connect technology, it manages, controls, and administers multiple chassis for thousands of virtual machines. Administrators use the software to manage the entire Cisco Unified Computing System as a single logical entity through an intuitive GUI, a CLI, or an XML API. Cisco UCS Manager resides on a pair of Cisco UCS 6300 or 6400 Series Fabric Interconnects using a clustered, active-standby configuration for high availability.

Cisco UCS Manager offers a unified, embedded management interface that integrates server, network, and storage. It performs auto-discovery to detect, inventory, manage, and provision system components that are added or changed. It offers a comprehensive set of XML APIs for third-party integration, exposes 9,000 points of integration, and facilitates custom development for automation, orchestration, and achieving new levels of system visibility and control.

Service profiles benefit both virtualized and non-virtualized environments and increase the mobility of non-virtualized servers, such as when moving workloads from server to server or taking a server offline for service or upgrade. Profiles can also be used in conjunction with virtualization clusters to bring new resources online easily, complementing existing virtual machine mobility.

For more information about Cisco UCS Manager, go to: https://www.cisco.com/c/en/us/products/servers-unified-computing/ucs-manager/index.html

Cisco Integrated Management Controller (IMC)

The Cisco® Integrated Management Controller is a baseboard management controller that provides embedded server management for Cisco UCS™ C-Series Rack Servers and Cisco S-Series Storage Servers. The Cisco IMC enables system management in the data center and across distributed branch-office locations. It supports multiple management interfaces, including a Web User Interface (Web UI), a Command-Line Interface (CLI), and an XML API that is consistent with the one used by Cisco UCS Manager. IMC also supports industry-standard management protocols, including Redfish v1.01, Simple Network Management Protocol Version 3 (SNMPv3), and Intelligent Platform Management Interface Version 2.0 (IPMIv2.0).

Cisco IMC and Cisco UCS Manager release 4.0(4) introduce support for Intel Optane DC Persistent Memory Devices on the Cisco UCS M5 servers that are based on Intel Xeon 2nd Generation Scalable processors. Cisco UCS persistent memory policies allow you to configure Intel Optane PMEM devices in the various modes discussed in the sections above, and to consume those policies in service profiles that are applied to blade or rack servers.

Intel persistent memory devices can also be configured using Cisco IMC when IMC is used to manage a Cisco UCS rack server in standalone mode (not UCSM managed). IMC supports various ways to configure Intel Optane DC Persistent Memory Devices, including the IMC GUI, CLI, and XML API.

Cisco UCS M5 Server for Intel Optane DC Persistent Memory Devices

Various Cisco UCS M5 rack and blade servers based on Intel 2nd Gen Scalable processors support Intel Optane PMEMs in all the operating modes described in the sections above. The following servers support Intel Optane PMEMs:

•     Cisco UCS C240 M5

•     Cisco UCS C220 M5

•     Cisco UCS B200 M5

•     Cisco UCS C480 M5

•     Cisco UCS B480 M5

•     Cisco UCS S3260 M5

Cisco UCSM and IMC—Intel Optane DC Persistent Memory Devices configuration

This section provides steps for configuring Intel Optane PMEMs using the Cisco UCSM GUI and Cisco IMC.

Note:       Intel Optane PMEMs can also be configured in various other ways, such as the Cisco IMC CLI, IMC XML API, and UCSM CLI. For complete management and troubleshooting steps for Intel Optane PMEMs, refer to: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/persistent-memory/b_Configuring_Managing_DC-Persistent-Memory-Modules.html

Configuring Intel Optane DC Persistent Memory Devices using Cisco UCSM policies

Cisco IMC and Cisco UCS Manager release 4.0(4) introduce support for Intel Optane PMEMs on the Cisco UCS M5 servers that are based on Intel Xeon 2nd Generation Scalable processors.

To ensure the best server performance, it is important to follow the memory performance guidelines and population rules before you install or replace persistent memory modules.

Refer to the following link for more details on the population guidelines for two-socket servers (C240 M5, C220 M5, and B200 M5): https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/hw/C240M5/install/C240M5/C240M5_chapter_010.html#concept_v1f_mtr_tgb

Refer to the following link for more details on the population guidelines for four-socket servers (C480 M5 and B480 M5): https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/hw/C480M5/install/C480M5/C480M5_chapter_011.html#concept_v1f_mtr_tgb

Once the server is populated with DDR4 DIMMs and Intel Optane PMEMs, follow the steps below to complete the configuration.

UCSM Policy for configuring Intel Optane DC Persistent Memory Devices

To configure Intel Optane PMEM devices using Cisco UCSM, a persistent memory policy needs to be created. This policy is consumed by a service profile, which is then associated with a rack or blade server. Below are the steps for creating a UCSM policy for Intel Optane DC Persistent Memory Device configuration.

1.     Log on to UCSM and navigate to Servers -> Policies -> root -> Persistent Memory Policy. Right-click Persistent Memory Policy and select the create option. In the window that opens, enter a name and description for the policy.

2.     Under the General tab, click the Add button to create a goal.

3.     Enter 100 under Memory Mode (%) to configure the Intel Optane PMEMs entirely as volatile memory. The "Persistent Memory Type" setting is not applicable in Memory Mode, and namespace creation is also not applicable when PMEMs are used in Memory Mode.

4.     Enter 0 under Memory Mode (%) to configure the Intel Optane PMEMs entirely as persistent memory.

5.     Select App Direct or App Direct NonInterleaved options depending on the workload requirement.

The following figure shows how to configure Intel Optane DC Persistent Memory Devices in interleaved App Direct mode using a UCSM policy.

Related image, diagram or screenshot

6.     It is a good practice to create namespaces using vendor-agnostic utilities like ndctl in RHEL and PowerShell in Windows. Hence, skip namespace creation at this stage. Creating namespaces is covered in detail in the corresponding Windows Server 2019 and RHEL 8 sections.

7.     To secure the Intel Optane PMEMs, click "Create Local Security," enter the Secure Passphrase and Deployed Secure Passphrase, and click OK.

8.     Click Ok to close the Create Policy window.

Using DCPMM Policy in Service Profile

Once the UCSM policy for Intel Optane DC Persistent Memory Devices is created, it needs to be referenced in a service profile, and the service profile is then applied to a server. Creating a full service profile is beyond the scope of this document (refer to the UCSM documentation). To use the DCPMM policy in a service profile, follow the steps below.

1.     Go to the Servers tab in UCSM and select the service profile in which you want to include the persistent memory policy.

2.     In the Work pane, click the Policies tab.

3.     In the Policies area, expand Persistent Memory Policy.

4.     From the Persistent Memory Policy drop-down list, select the persistent memory policy that you want to include in this service profile.

5.     Click Save Changes.

Finally, apply the service profile to a server that has the Intel Optane persistent memory DIMMs installed. The following screenshot shows the "DCPMM-AppDirect" persistent memory policy being applied to a sample service profile named 'TEST1'.

Related image, diagram or screenshot

The following two screenshots show more details and temperature statistics of the Intel Optane DCPMM devices collected by Cisco UCS Manager.

Related image, diagram or screenshot

Related image, diagram or screenshot

Once the service profile is applied to the server successfully, install the required operating system; namespaces can then be configured using the steps explained in the following sections.

Configuring Intel Optane DC Persistent Memory Devices using Cisco CIMC

This section provides steps to configure Intel Optane DC Persistent Memory Devices using Cisco IMC.

1.     Log on to the Cisco IMC web page of a standalone server and click Compute in the left pane. In the right-hand pane, click the Persistent Memory tab. This tab displays all the Intel Optane DC persistent memory devices under the DIMM Details section.

2.     Click "Configure Memory Usage" at the top. This opens a pop-up for configuring the Intel Optane PMEMs.

3.     In the Configure Memory Usage window, enter a value in the Memory Mode text box. Enter '100' to configure the Intel Optane PMEMs in Memory mode, or '0' to configure them in App Direct mode. The Persistent Memory Type option is enabled only when the Intel Optane PMEMs are configured in App Direct mode. From the drop-down list, select either the App Direct (interleaved) or App Direct Non Interleaved option based on your requirement, as shown in the figure below.

Related image, diagram or screenshot

4.     Click the 'Create Goal' button to create the goal.

5.     Do not create any namespaces now. As mentioned earlier in the UCSM section, it is recommended to create namespaces within the operating system.

6.     Click on ‘Save’ to save the configuration. IMC will now reboot the server to apply the configuration.

7.     Once the server has rebooted, you can view the region details when the Intel Optane PMEMs are configured in App Direct mode, as shown in the following screenshot.

Related image, diagram or screenshot

Once the server is online, install the required operating system; namespaces can then be configured using the steps explained in the sections below.

Operating System—Intel Optane DC Persistent Memory Devices configuration

Several operating systems and Linux distributions include support for both App Direct and Memory modes. Refer to the complete list of OS support for Intel Optane DC Persistent Memory Devices: https://www.intel.com/content/www/us/en/support/articles/000032860/memory-and-storage/data-center-persistent-memory.html

This section provides steps to provision Intel Optane PMEMs in Red Hat Enterprise Linux 8 and Windows Server 2019. In RHEL 8, the ipmctl and ndctl utilities are used to provision Intel Optane PMEMs, while in Windows Server 2019, PowerShell cmdlets are used. Make sure to install the ipmctl and ndctl tools in RHEL 8.

Note that creating namespaces is only applicable when Intel Optane PMEMs are configured in either "App Direct" or "App Direct Non-Interleaved" mode. Namespaces are not applicable when DCPMM is configured in Memory Mode.

Note that the prerequisite before proceeding with the following steps is to create a goal (which ultimately creates regions automatically) using either UCSM or IMC.

Configuring Intel Optane DC Persistent Memory Devices for RHEL

This section provides steps to create namespaces in Red Hat Enterprise Linux 8.

In the Linux distributions that support persistent memory devices, ipmctl and ndctl are the two tools used to fully manage goals and namespaces. However, since the goal is already configured at the platform level using UCSM or IMC, the following sections cover listing the configuration using ipmctl and managing namespaces using the ndctl tool.

Open a command shell as root and follow the steps below to create namespaces (persistent memory disks) in the RHEL 8 server.

1.     The ipmctl tool can be used to manage Intel Optane DC Persistent Memory Devices in Linux-based operating systems. Use the ipmctl show -topology command to list both the DDR4 DRAM and Intel Optane PMEM devices, as shown below.

Related image, diagram or screenshot

2.     The ipmctl show -dimm command displays the persistent memory modules discovered in the system and verifies that software can communicate with them.

Related image, diagram or screenshot

3.     To check the capacity provisioned for the different operating modes, use the ipmctl show -memoryresources command. The MemoryCapacity and AppDirectCapacity values can be used to determine whether the system is configured in Memory mode, App Direct mode, or Mixed mode. The screenshot below shows that the persistent memory modules are currently configured in App Direct mode.

Related image, diagram or screenshot

Note that ipmctl is a powerful utility that can be used to create and delete goals and to perform other troubleshooting actions as well. For the complete list of commands supported by ipmctl, execute ipmctl show -help.

Since the goal is already configured using Cisco UCSM/IMC at the infrastructure level, it is not advised to create a goal using the ipmctl tool. Therefore, the next step is to proceed with creating namespaces (partitions) using the ndctl utility.
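Before creating namespaces, it can help to confirm that the goal created the expected App Direct regions. A minimal sketch, assuming the ipmctl and ndctl packages are installed:

# List the App Direct regions created by the goal
ipmctl show -region

# The same information as seen by ndctl
ndctl list --regions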

4.     The ndctl tool gives better control and more options when provisioning persistent memory devices. Focus on the following options of the ndctl utility for optimal configuration of persistent memory devices.

•     --mode=fsdax: Filesystem-DAX mode is the default mode of a namespace when ndctl create-namespace is run with no options. It creates a block device (/dev/pmemX[.Y]) that supports the DAX capabilities of Linux filesystems (XFS and EXT4 to date). PMEM-aware applications are required in order to benefit from the direct access provided by the DAX option.

•     --mode=raw: Raw mode is effectively just a memory disk that does not support DAX. This mode is compatible with other operating systems but, again, does not support DAX operation. This mode can be used for legacy applications.

•     --map=mem: When PMEMs are configured in fsdax mode, the metadata required for memory mapping is stored in DRAM. Use this option when you have enough DRAM available in the system.

•     --map=dev: When PMEMs are configured in fsdax mode, the metadata required for memory mapping is stored on the PMEMs themselves.

Note that the --map option is applicable only when fsdax mode is used. Other modes do not maintain metadata and hence do not need additional storage.

5.     The following screenshot shows two namespaces created with the fsdax option, storing the metadata in DRAM (--map=mem). This was tested on a two-socket Cisco UCS C240 M5 server populated with 12x 64 GB DDR4 DIMMs and 12x 512 GB Intel Optane PMEMs. Verify that namespaces are created with the fsdax option in order to use the PMEMs as direct-access, byte-addressable storage.

Related image, diagram or screenshot
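The commands behind the screenshot above follow this general pattern (a minimal sketch; the region names region0 and region1 are assumptions and should be matched to the regions reported by ndctl list --regions):

# Create one fsdax namespace per region, keeping the page-mapping metadata in DRAM
ndctl create-namespace --region=region0 --mode=fsdax --map=mem
ndctl create-namespace --region=region1 --mode=fsdax --map=mem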

Note that multiple namespaces (partitions) can be created in a region. By default, the size of a namespace is the region size minus the size of the metadata. You can specify the size of a namespace by using the --size option.

6.     Once a namespace is created with the DAX option, the Linux PMEM driver creates a new device in /dev/pmemX.Y format, where X represents the region ID and Y represents the namespace number, which increases sequentially.

7.     To view the existing namespaces, execute ndctl list. To list all the namespaces in a given region, execute ndctl list --region <region Id>.

8.     Once a namespace is created on the PMEM device, format the volume with a filesystem. RHEL 8 supports the XFS and EXT4 filesystems with the fsdax option. The screenshot below shows formatting the PMEM devices with the XFS filesystem, mounting with the -o dax option, and finally setting a 2 MB stripe unit size, which is recommended for SQL Server database deployments.

Related image, diagram or screenshot
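The commands behind the screenshot above follow this general pattern (a hedged sketch; the device name and mount point are assumptions, and XFS reflink is disabled here because it is incompatible with DAX mounts on RHEL 8):

# Create an XFS filesystem with a 2 MiB stripe unit and reflink disabled
mkfs.xfs -f -d su=2m,sw=1 -m reflink=0 /dev/pmem0
mkdir -p /mnt/pmem0

# Mount with the dax option for direct access
mount -o dax /dev/pmem0 /mnt/pmem0

# Confirm the dax mount option is active
mount | grep pmem0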

9.     When planning to leverage Intel Optane PMEMs for SQL Server databases on Linux platforms, it is recommended to enable SQL Server trace flag 3979, as recommended by Microsoft: https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-pmem?view=sql-server-ver15
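One way to enable this as a startup trace flag on Linux is through the mssql-conf utility (a minimal sketch; the default installation path is assumed):

# Enable trace flag 3979 and restart the SQL Server service
sudo /opt/mssql/bin/mssql-conf traceflag 3979 on
sudo systemctl restart mssql-server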

For other SQL Server on Linux best practices, see: https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-performance-best-practices?view=sql-server-linux-2017

Configuring Intel Optane DC Persistent Memory Devices for Windows

The following section provides steps to create namespaces using PowerShell commands in Windows Server 2019.

Open PowerShell with administrator rights and follow the steps below to create namespaces (persistent memory disks) in the Windows Server 2019 server.

1.     First, list all the available Intel Optane PMEM devices in the Windows server by running "Get-PmemPhysicalDevice", as shown below.

Related image, diagram or screenshot

2.     Execute “Get-PmemUnusedRegion” to list the regions created by the goal that you set in the server platform using UCSM or IMC.

3.     The next step is to create a namespace, or persistent memory disk, using the New-PmemDisk cmdlet. It accepts a RegionId as input, which can be piped from Get-PmemUnusedRegion. If you do not pass a specific RegionId, it creates one persistent memory disk for each unused region.

Note that when New-PmemDisk is executed with the "-AtomicityType BlockTranslationTable" option, the App Direct storage device is used as a traditional storage device, where access to the device goes through the complete filesystem IO stack.

Related image, diagram or screenshot

Related image, diagram or screenshot
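The cmdlets behind the screenshots above generally look like the following (a hedged sketch; the region ID used here is an assumption):

# Create one persistent memory disk for every unused region
Get-PmemUnusedRegion | New-PmemDisk

# Or target a specific region; BlockTranslationTable exposes it as a traditional block device
New-PmemDisk -RegionId 1 -AtomicityType BlockTranslationTable

# Verify the resulting persistent memory disks
Get-PmemDisk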

4.     Once the persistent memory volumes are created as shown above, Device Manager and the Disk Management UI will list all the available persistent memory devices as shown below.

 

Related image, diagram or screenshot

5.     The next step is to initialize the disks, create partitions, and format the persistent memory devices with the DAX option. Microsoft recommends using the largest allocation unit size available for NTFS (2 MB in Windows Server 2019) when formatting PMEM devices, as referenced here: https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/hybrid-buffer-pool?view=sql-server-ver15#best-practices-for-hybrid-buffer-pool

Related image, diagram or screenshot

Related image, diagram or screenshot
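The commands behind the screenshots above generally follow this pattern (a hedged sketch; the disk number, drive letter, and volume label are assumptions):

# Initialize the persistent memory disk and create a partition
Initialize-Disk -Number 2 -PartitionStyle GPT
New-Partition -DiskNumber 2 -UseMaximumSize -DriveLetter D

# Format with NTFS, a 2 MB allocation unit size, and DAX enabled
Format-Volume -DriveLetter D -FileSystem NTFS -AllocationUnitSize 2MB -IsDAX $true -NewFileSystemLabel "PMEM-DAX"

# Check the DAX flag on the new volume
fsutil fsinfo volumeinfo D: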

6.     Verify that the device is formatted for DAX by using the fsutil.exe utility as shown above.

The persistent memory devices have now been configured with the DAX option and are ready to be consumed as storage by SQL Server databases.

SQL Server Database use cases

The different configuration modes of Intel Optane DC persistent memory allow it to be used for different database use cases. The following sections discuss a few use cases in which SQL Server databases can get maximum benefit from Intel Optane DC persistent memory devices configured in App Direct mode.

Persistent Memory Devices for Transaction Log Files and Tail of Log Caching

In large-scale transactional systems, the log commit rate is one of the important factors that can affect overall transactional throughput. The faster the log performance (hardening log buffers to persistent storage media), the higher the transaction throughput. Intel Optane DC Persistent Memory Devices can be leveraged to improve SQL Server transaction throughput by using them as the persistent storage medium.

Persistent Memory Devices as traditional Block storage

When Intel Optane DC Persistent Memory Devices are formatted without the DAX option, they can be accessed as regular block devices just like any other SATA/SAS/NVMe devices. Intel Optane PMEM devices, which are installed on the memory bus close to the CPUs, provide much lower device access latencies than today's fastest NVMe storage devices. This option also requires no application changes: legacy applications can leverage these high-performing persistent memory devices for data persistence without any modification.

Typically, in a traditional high-volume transactional system, the transaction log becomes the bottleneck, because log buffers need to be hardened to persistent media before any transaction commits. When the transaction log files of such databases are migrated from legacy block devices to persistent memory devices, transaction throughput can improve many-fold due to the lower access latencies of persistent memory devices.

Moving a user database transaction log file to a persistent memory device is a simple step, described here: https://docs.microsoft.com/en-us/sql/relational-databases/databases/move-user-databases?view=sql-server-ver15

Tail of Log

The Tail of Log caching feature (also known as transaction commit acceleration) was first introduced in SQL Server 2016 SP1 in combination with Windows Server 2016, when NVDIMMs were introduced to the market.

When persistent memory devices are formatted with a DAX-enabled filesystem, the operating system allows byte-addressable direct access to the Intel Optane PMEM devices, eliminating the kernel IO software stack. This allows PMEM-aware applications, such as SQL Server, to access these devices directly at much lower latencies using user-space load/store operations. It enables SQL Server to store log buffers (which typically range from 512 bytes to 60 KB) directly on PMEM storage, which gives DRAM-like performance. Since PMEM devices are natively persistent, there is no need to flush log buffers again to other persistent media (such as SSDs or HDDs). Flushing the log to traditional persistent devices (SSDs/HDDs) is typically a costly IO operation and has been the main bottleneck in high-transaction systems; it is now avoided to a large extent using PMEM storage, thereby increasing transaction throughput many-fold.

For more details on the Tail of Log caching implementation in SQL Server 2016, refer to: https://docs.microsoft.com/en-us/archive/blogs/bobsql/how-it-works-it-just-runs-faster-non-volatile-memory-sql-server-tail-of-log-caching-on-nvdimm.

This use case does not need much PMEM capacity, as Intel Optane DC Persistent Memory Devices are available in much bigger sizes than NVDIMMs, and Tail of Log caching needs only 20 MB of PMEM storage for the additional log file placed on PMEM. Hence, select the smallest-capacity (128 GiB) Intel Optane DC Persistent Memory Devices and populate them in a 2-1-1 fashion as explained here: https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/data-center-persistent-mem/Population-Configuration.pdf

Once the persistent memory volume is configured and formatted with the DAX option, all that remains is to add a new log file to the database using the same syntax as any other log file, with the file residing on the DAX volume. The log file on the DAX volume will be sized at 20 MB regardless of the size specified with the ADD FILE command, as shown below:

 

Related image, diagram or screenshot
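The T-SQL in the screenshot above follows the standard ADD FILE syntax; a hedged example is shown below (the database name and the DAX volume path are assumptions):

-- Add a small log file on the DAX volume to enable the persistent log buffer
ALTER DATABASE WideWorldImporters
ADD LOG FILE (NAME = DAXlog, FILENAME = 'D:\DAXLOG\daxlog.ldf', SIZE = 20MB);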

To disable the persistent log buffer feature, remove the log file from the DAX volume using the T-SQL commands below:

ALTER DATABASE <User-DB> SET SINGLE_USER

ALTER DATABASE <User-DB> REMOVE FILE <Name of DAXLOG>

ALTER DATABASE <User-DB> SET MULTI_USER

Any log records being kept in the log buffer will be written to disk, and at that point there is no unique data in the persistent log buffer, so it can be safely removed.

Note that because Intel Optane DC Persistent Memory Devices are available in bigger sizes, a complete database can now be stored on DAX-enabled PMEM storage volumes. The following sections discuss use cases and features that benefit from the large sizes of Intel Optane DC Persistent Memory Devices.

Hybrid Buffer Pool (HBP)

Hybrid Buffer Pool is about extending SQL Server's limited DRAM-based buffer pool to much bigger sizes by memory mapping database files placed on persistent memory devices, which offer near-DRAM performance and are natively persistent.

Hybrid Buffer Pool is a new feature introduced in SQL Server 2019. It requires persistent memory devices (PMEMs) to be formatted with the DAX option, and it allows the database engine to directly access data pages in database files stored on persistent memory devices. This option is currently supported with the NTFS, XFS, and EXT4 file systems.

When database files are stored on PMEM devices, SQL Server automatically detects whether the data files reside on an appropriately formatted PMEM device and performs memory mapping in user space. This mapping happens during startup, when a new database is attached, restored, or created, or when the hybrid buffer pool feature is enabled for a database.

In a traditional system without PMEM, SQL Server caches data pages in the buffer pool. With hybrid buffer pool, SQL Server skips performing a copy of the page into the DRAM-based portion of the buffer pool, and instead accesses the page directly on the database file that lives on a PMEM device. Read access to data files on PMEM devices for hybrid buffer pool is performed directly by following a pointer to the data pages on the PMEM device.

Only clean pages can be accessed directly on a PMEM device. When a page is marked as dirty, it is copied to the DRAM buffer pool before eventually being written back to the PMEM device and marked as clean again. This occurs during regular checkpoint operations. The hybrid buffer pool feature is available for both Windows and Linux. Read-heavy workloads with larger working sets that require more memory benefit most from this feature. The section below lists the steps for configuring Hybrid Buffer Pool.

Configuring Hybrid Buffer Pool

In SQL Server 2019, Hybrid Buffer Pool (HBP) must be configured first at instance level (server-scoped). Then HBP can be enabled or disabled at individual database level (database-scoped).

Use the following T-SQL command to first enable the Hybrid Buffer Pool at instance level.

ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED HYBRID_BUFFER_POOL = ON;

Note that HBP is disabled by default at the instance level. For the setting change to take effect, SQL Server instance must be restarted. A restart is needed to facilitate allocating enough hash pages, to account for total PMEM capacity on the server.

The next step is to enable HBP for individual user databases using the T-SQL command below. It is assumed that the database files for which HBP is enabled are stored on DAX-enabled persistent memory volumes.

ALTER DATABASE <user-DBName> SET MEMORY_OPTIMIZED = ON;

By default, the hybrid buffer pool is enabled for all user databases as well as the system databases once the instance-level setting is turned on. You can disable HBP for the system databases manually.

Note that if the instance-scoped setting for the hybrid buffer pool is set to disabled, the hybrid buffer pool will not be used by any user database.

HBP can be disabled at the instance level and for individual databases using the following commands.

When HBP is disabled at the instance level, it is required to restart the SQL instance for the change to take effect.

ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED HYBRID_BUFFER_POOL = OFF;

ALTER DATABASE <user-DBName> SET MEMORY_OPTIMIZED = OFF;

The following screenshot shows the T-SQL commands to enable HBP at the instance level and database level, and some important HBP-related messages captured in the SQL Server error log.

Related image, diagram or screenshot

Note:       Ensure that trace flag 1810 is enabled in order to view the memory-mapping information for database files in the SQL Server error log files.

For optimal performance, when creating user databases, make sure the database file sizes are specified in multiples of 2 MB, because the volumes are formatted with a 2 MB allocation unit size in Windows as well as RHEL.
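As an illustration (a hedged sketch; the database name, file paths, and sizes are assumptions chosen as multiples of 2 MB):

-- Create a database with file sizes and growth increments in multiples of 2 MB
CREATE DATABASE SampleHBPDB
ON (NAME = SampleHBPDB_data, FILENAME = 'D:\SQLData\SampleHBPDB.mdf', SIZE = 10240MB, FILEGROWTH = 2048MB)
LOG ON (NAME = SampleHBPDB_log, FILENAME = 'E:\SQLLog\SampleHBPDB.ldf', SIZE = 4096MB, FILEGROWTH = 2048MB);

ALTER DATABASE SampleHBPDB SET MEMORY_OPTIMIZED = ON;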

In Windows environments, enable the Lock Pages in Memory setting for the SQL Server service account using the Windows Local Group Policy Editor (gpedit.msc). Refer to the following link for configuring this option: https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/enable-the-lock-pages-in-memory-option-windows?view=sql-server-ver15

When you enable HBP for a database, it is recommended to store all the data and transaction log files on DAX-enabled storage volumes. Storing some data files on DAX-enabled volumes and others on traditional block storage volumes is not supported.

To view the current HBP configuration, execute the commands below to report HBP status at the instance level and for each database.

SELECT * FROM sys.server_memory_optimized_hybrid_buffer_pool_configuration;
go

SELECT name, is_memory_optimized_enabled FROM sys.databases;
go

SQL Server 2019 Enlightenment Mode

As discussed earlier, when Intel Optane DC Persistent Memory Devices are formatted with the DAX option in App Direct mode, they offer byte-addressable storage that bypasses the kernel IO stack. SQL Server 2019 leverages this capability (called enlightenment) to store data and log files directly on DAX-formatted volumes, thereby enabling faster access at much lower latencies and better transactional throughput. As of today, SQL Server 2019 enlightenment is fully supported on Linux operating systems only (refer here: https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-pmem?view=sql-server-ver15). To leverage this feature, create a database on DAX-formatted volumes and it automatically gets enabled for enlightenment.
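For example (a hedged sketch; the mount point /mnt/pmem0 is an assumption for an XFS volume mounted with the dax option, and the mssql user must have access to the directories):

-- Database files placed on a DAX-mounted volume are automatically enlightened
CREATE DATABASE EnlightenedDB
ON (NAME = EnlightenedDB_data, FILENAME = '/mnt/pmem0/data/EnlightenedDB.mdf')
LOG ON (NAME = EnlightenedDB_log, FILENAME = '/mnt/pmem0/log/EnlightenedDB.ldf');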

Conclusion

The new Intel Optane DC persistent memory redefines traditional architectures, offering a large and persistent memory tier at an affordable cost. With breakthrough IO performance and higher-capacity persistent memory, combined with 2nd Generation Intel Xeon Scalable processors, Intel Optane DC persistent memory accelerates IT transformation to support the demands of the data era, with faster-than-ever-before analytics, cloud services, and next-generation communication services. PMEM-aware applications, like SQL Server databases, will get maximum benefit from Intel Optane DC persistent memory. It is recommended to test applications with the Intel Optane DCPMM options before production rollout to validate the performance gains, as the mileage may vary based on parameters such as data volumes and user concurrency.

Appendix

Linux utilities

Refer to the GitHub links below for the ipmctl and ndctl utilities.

https://github.com/intel/ipmctl

https://github.com/pmem/ndctl

Recommended BIOS Setting when using Intel Optane DC Persistent Memory

For SQL Server database deployments on Cisco UCS M5 servers with Intel Optane Persistent Memory Modules in App Direct mode, the following BIOS options are recommended for high performance.

BIOS option: Recommended setting

Package C State Limit: C0 C1 State

Processor C-States (C, C1E, C3, C6, C7): Set processor C State

Sub NUMA Clustering: Disabled

IMC Interleave: Auto

XPT, UPI, LLC Prefetch: Enabled

Turbo Mode: Enabled

Hardware P-States: HWPM Native Mode

Power Performance Tuning: OS

Energy/Performance Bias Config: Balanced Performance

Processor EPP Profile: Performance
