Having earlier run memcached performance tests on the Sun Fire X2270 Server running OpenSolaris, Sun Senior Staff Engineer Shanti reports on a similar test involving RHEL5 in a blog entitled OpenSolaris Beats Linux on Memcached.
Shanti writes that a 10GbE Intel Oplin card was used to achieve the high throughput rates possible with these servers, though using this card on Linux required rebuilding both the driver and the kernel:
- With the default ixgbe driver from the Red Hat distribution (version 1.3.30-k2 on kernel 2.6.18), the interface simply hung during the benchmark test.
- This led to downloading the driver from the Intel site (1.3.56.11-2-NAPI) and recompiling it. This version does work, giving a maximum throughput of 232K operations/sec on the same Linux kernel (2.6.18). This kernel version does not support multiple rings, however.
- Kernel version 2.6.29 includes support for multiple rings but still does not ship the latest ixgbe driver, which is 1.3.56-2-NAPI. Shanti reports that it was therefore necessary to download, build and install both this kernel and the driver. This combination worked well, giving a maximum throughput of 280K operations/sec with some tuning.
Results Comparison
The system running OpenSolaris and memcached 1.3.2 delivered a maximum throughput of 350K ops/sec as previously reported. The same system running RHEL5 (with kernel 2.6.29) and the same version of memcached resulted in 280K ops/sec. Shanti's conclusion: OpenSolaris outperforms Linux by 25%.
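The 25% figure follows directly from the two peak throughputs. As a quick check, here is a minimal Python sketch of the arithmetic; the helper name `percent_faster` is made up for illustration, and the throughput numbers are the ones reported above:

```python
def percent_faster(a_ops: float, b_ops: float) -> float:
    """Return how much faster a is than b, as a percentage of b."""
    return (a_ops / b_ops - 1.0) * 100.0


# Peak throughputs in operations/sec, as reported in the blog.
opensolaris_ops = 350_000  # OpenSolaris, memcached 1.3.2
linux_2629_ops = 280_000   # RHEL5, kernel 2.6.29, rebuilt driver
linux_2618_ops = 232_000   # RHEL5, kernel 2.6.18, rebuilt driver

print(f"OpenSolaris vs. Linux 2.6.29: "
      f"{percent_faster(opensolaris_ops, linux_2629_ops):.1f}% faster")
print(f"Linux 2.6.29 vs. Linux 2.6.18: "
      f"{percent_faster(linux_2629_ops, linux_2618_ops):.1f}% faster")
```

Read the other way around, the tuned Linux configuration reaches 80% of the OpenSolaris throughput (280K / 350K).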
Linux Tuning
The following Linux tunables were changed in an effort to attain the best performance:
- net.ipv4.tcp_timestamps = 0
- net.core.wmem_default = 67108864
- net.core.wmem_max = 67108864
- net.core.optmem_max = 67108864
- net.ipv4.tcp_window_scaling = 0
- net.core.netdev_max_backlog = 300000
- net.ipv4.tcp_max_syn_backlog = 200000
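Settings like these are typically applied with `sysctl -w` or persisted in `/etc/sysctl.conf`. The blog summary does not say which method was used, so the following is only a sketch under that assumption: it renders the list above in `sysctl.conf` syntax, and `render_sysctl_conf` is a hypothetical helper, not part of any tool.

```python
# Tunables exactly as listed above (name -> value).
LINUX_TUNABLES = {
    "net.ipv4.tcp_timestamps": 0,
    "net.core.wmem_default": 67108864,
    "net.core.wmem_max": 67108864,
    "net.core.optmem_max": 67108864,
    "net.ipv4.tcp_window_scaling": 0,
    "net.core.netdev_max_backlog": 300000,
    "net.ipv4.tcp_max_syn_backlog": 200000,
}


def render_sysctl_conf(tunables: dict) -> str:
    """Render tunables as 'key = value' lines, one per setting."""
    return "".join(f"{key} = {value}\n" for key, value in tunables.items())


if __name__ == "__main__":
    # Print the fragment; redirect to a file to keep it.
    print(render_sysctl_conf(LINUX_TUNABLES), end="")
```

The three 67108864-byte values are 64 MiB each. Saving the output to a file and loading it as root with `sysctl -p <file>` is one way to apply all seven settings at once.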
Shanti provides the ixgbe-specific settings that were used (2 transmit, 2 receive rings):
RSS = 2,2 InterruptThrottleRate = 1600,1600
OpenSolaris Tuning
The following settings in /etc/system were used to set the number of MSI-X interrupts:
- set ddi_msix_alloc_limit=4
- set pcplusmp:apic_intr_policy=1
The tests showed that 4 transmit and 4 receive rings gave the best performance for the ixgbe interface:
tx_queue_number=4, rx_queue_number=4
Shanti also reports binding the Crossbow threads to specific CPUs:
dladm set-linkprop -p cpus=12,13,14,15 ixgbe0
More Information
Sun Fire X2270 Server
OpenSolaris