SMB Multichannel and SMB Direct (RDMA)
SMB Multichannel and SMB Direct (often referred to simply as RDMA) are two technologies that enable high-performance workloads over the SMB network file system protocol. They differ in their benefits and in their implications for SMB client/server workloads. This article discusses both technologies and gives implementation recommendations for Fusion File Share.
SMB Multichannel
SMB Multichannel is part of the SMB 3.0 protocol and, as its name suggests, creates multiple connections for a single SMB session between client and server. This increases throughput by using multiple connections through either a single adapter or multiple adapters. With multiple adapters it also provides fault tolerance: if one adapter goes down, the client continues uninterrupted over the connections already established on another adapter. Finally, SMB Multichannel usually requires little or no configuration.
Although throughput and fault tolerance are often seen as the primary benefits of multichannel, preventing a CPU bottleneck is another important one, especially for workloads with many small IO operations. RSS-capable (Receive Side Scaling) network adapters allow SMB Multichannel to assign each TCP connection to a separate core, load-balancing the work across the available CPU resources. This prevents the bottleneck that arises when a single TCP connection carries the entire workload and saturates a single core.
Fusion File Share automatically detects RSS-capable NICs for SMB Multichannel. Occasionally, however, Fusion may not be able to detect an RSS-capable NIC because of a specific network driver. For this reason, we recommend explicitly enabling multichannel in the tsmb.conf file with:
listen = ANY,0.0.0.0,IPv4,445,DIRECT_TCP,RSS=2
By default, the Windows client creates four connections per RSS-capable interface, but it can be configured for as many as 16. Fusion accepts however many connections the client initiates. We recommend configuring the client for 16 connections per interface with:
PS C:\> Set-SmbClientConfiguration -ConnectionCountPerRssNetworkInterface 16
Verifying Multichannel
It is important to verify that multichannel is active, so that you know multiple TCP connections are in place to prevent a CPU bottleneck.
On the Windows client, first identify the adapter used for SMB traffic. In this example it is the 100 Gbps Mellanox ConnectX-5 adapter named ‘Slot 03 3 Port2’. Then check whether RSS is enabled on it.
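One way to perform these two checks is sketched below with the standard NetAdapter cmdlets. The adapter name is the one from this example; substitute your own.

```powershell
# List the adapters to find the one carrying SMB traffic
Get-NetAdapter | Format-Table Name, InterfaceDescription, LinkSpeed, Status

# Check whether RSS is enabled on that adapter (Enabled should be True)
Get-NetAdapterRss -Name 'Slot 03 3 Port2' | Format-List Name, Enabled
```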
Verify that the Windows SMB client has multichannel enabled:
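A minimal sketch of this check, assuming the in-box SmbShare module on the Windows client:

```powershell
# EnableMultiChannel should be True; the second property shows the
# per-interface connection count configured earlier
Get-SmbClientConfiguration |
    Format-List EnableMultiChannel, ConnectionCountPerRssNetworkInterface
```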
Next, verify that the SMB client properly detects the NIC’s RSS capability:
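The SMB client's own view of its interfaces can be listed as follows. This is a sketch; the exact output varies by system.

```powershell
# In the default table output, the "RSS Capable" column should read True
# for the interface used for SMB traffic
Get-SmbClientNetworkInterface
```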
After a share is mounted, write a file to the share and then verify the connection:
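A possible way to inspect the resulting connections is sketched below. Note that the TCP count covers connections to every SMB server the client is talking to, not only Fusion.

```powershell
# Show the multichannel connections the SMB client has established
Get-SmbMultichannelConnection

# Count established TCP connections to the SMB port; with multichannel
# active there should be more than one
@(Get-NetTCPConnection -RemotePort 445 -State Established -ErrorAction SilentlyContinue).Count
```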
SMB Direct (RDMA)
While SMB Multichannel creates multiple simultaneous TCP connections, SMB Direct bypasses the TCP stack and allows for remote direct memory access (RDMA) between client and server. For simplicity, we refer to SMB Direct as RDMA below.
RDMA provides not only high throughput but also low latency and, perhaps most importantly, low CPU utilization. As with multichannel, fault tolerance is achieved when multiple adapters are used.
Configuring RDMA should include verifying the MTU on every link between server and client, as well as applying the network adapter manufacturer’s recommendations, such as QoS settings.
Fusion uses SMB Multichannel to automatically detect RDMA-capable network adapters. In addition to enabling multichannel, tsmb.conf should therefore explicitly enable RDMA with a second listen line:
listen = ANY,0.0.0.0,IPv4,445,DIRECT_TCP,RSS=2
listen = ANY,0.0.0.0,RDMA_IPv4,445,SMBD
Similar to multichannel, Fusion will accept the number of RDMA connections initiated by the Windows client. We recommend the Windows client be configured for 8 RDMA connections per interface:
PS C:\> Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters" ConnectionCountPerRdmaNetworkInterface -Type DWORD -Value 8 -Force
If RDMA cannot be established between client and server for any reason, Fusion will fall back to the TCP stack. It is therefore important to verify RDMA, since you may otherwise be connected over TCP without noticing.
On the Windows client, first identify the adapter used for SMB traffic. In this example it is the 100 Gbps Mellanox ConnectX-5 adapter named ‘Slot 03 3 Port2’. Then check whether RDMA is enabled on it.
Next, verify that the SMB client properly detects the NIC’s RDMA capability:
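As a sketch, the adapter-level RDMA check might look like this, again using the adapter name from this example:

```powershell
# Enabled should be True for the adapter carrying SMB traffic
Get-NetAdapterRdma -Name 'Slot 03 3 Port2' | Format-List Name, Enabled
```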
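The same interface listing used for RSS also reports RDMA capability. A minimal sketch:

```powershell
# In the default table output, the "RDMA Capable" column should read True
# for the interface used for SMB traffic
Get-SmbClientNetworkInterface
```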
After a share is mounted, write a file to the share and then verify the connection:
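One way to confirm that traffic is actually using RDMA rather than the TCP fallback is sketched below. It assumes the Windows netstat -x option, which lists NetworkDirect connections; the results depend on there being active traffic to the share.

```powershell
# The "Client RDMA Capable" column should read True for the server in question
Get-SmbMultichannelConnection

# List NetworkDirect (RDMA) connections and keep those on the SMB port
netstat -xan | Select-String ':445'
```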
Important note: Even if an SMB client indicates that it can establish an RDMA connection to use SMB Direct (SMBD), our tests with a Windows SMB client show that it only establishes an RDMA connection when the client generates some form of metadata-intensive or data-intensive workload. This is implementation-specific behaviour of the SMB client and may be subject to change.