When we launched the OneConnect® OCe14000, our latest Ethernet and Converged Network Adapters (CNAs), we touched on a number of performance and data center efficiency claims that are significant enough to expand on. The design goal of the OCe14000 family of adapters was to take the next step beyond what we had already delivered with our three previous generations of adapters, by meeting the performance, efficiency and scalability needs of the data center, Web-scale computing and evolving cloud networks. We believe we delivered on those goals and can point to some very impressive performance benchmarks.
A 4x Improvement in Packet Performance vs. Previous Generation Adapters

Fundamental to delivering high Network Interface Card (NIC) performance under all conditions is the ability to handle a high rate of incoming and outgoing Ethernet packets. This is often referred to as frame rate or packet rate and is expressed in terms of how many can be transferred per second, so the terms FPS or PPS (frames per second or packets per second) are often used. There are a few reasons why this is important. First, if the number of frames arriving at the receiver is higher than the NIC can process, the remainder are simply dropped. There are a number of application allowances and upper level protocol methods to work around dropped frames, all of which are less ideal than simply not dropping the frames in the first place. We designed the OCe14000 family to perform at 4x the frame rate of the previous OCe11100 family. That is not to say the OCe11100 family was slow; in fact, it had a higher frame rate than any other adapter with hardware storage offloads in the current market.

There is an industry standard testing procedure that provides guidelines for fair practices in testing Ethernet devices for frame rate. It dates back to 1999 and is published by the IETF as RFC2544. We use the RFC2544 guidelines to measure throughput through a device using a network load generator at various frame sizes. The load generator increases the load until it reaches a point where the device cannot return all the frames sent to it. This load level is the frame rate, and a calculation from the frame size and frame rate gives us the “no-drop” rate, measured as a percentage of 10Gb Ethernet (10GbE) line rate.

The first significant finding in the RFC2544 performance charts above is that the OCe14102 adapter provides a 4x greater frame rate. Second is the improvement in “no-drop” rate in the small-to-medium frame size range.
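The arithmetic behind these RFC2544 charts is simple to sketch. The snippet below (an illustration, not part of any test tooling) computes the theoretical maximum frame rate of a 10GbE link at a given frame size, accounting for the 20 bytes of per-frame wire overhead (8-byte preamble/start-of-frame delimiter plus 12-byte inter-frame gap), and expresses a measured rate as a percentage of line rate:

```python
# Theoretical maximum frame rate for 10GbE at a given frame size.
# Each frame on the wire carries 20 extra bytes: an 8-byte
# preamble/SFD plus a 12-byte inter-frame gap.
LINE_RATE_BPS = 10_000_000_000  # 10GbE
WIRE_OVERHEAD = 20              # preamble + inter-frame gap, in bytes

def max_frame_rate(frame_size: int) -> float:
    """Maximum frames per second a 10GbE link can carry."""
    return LINE_RATE_BPS / ((frame_size + WIRE_OVERHEAD) * 8)

def no_drop_rate_pct(measured_fps: float, frame_size: int) -> float:
    """A measured frame rate expressed as a percentage of line rate."""
    return 100.0 * measured_fps / max_frame_rate(frame_size)

# At the minimum 64-byte frame size, 10GbE tops out near 14.88M fps:
print(f"{max_frame_rate(64):,.0f} fps")  # -> 14,880,952 fps
```

This is why small frames are the hardest case: at 64 bytes the adapter must sustain nearly 15 million frames per second per port to hold line rate, while at 1518 bytes it only needs about 813,000.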
A higher no-drop rate can help relieve network congestion, since an adapter that is not dropping frames does not force retransmissions to recover lost frames.

Up to 1.5M I/O Operations Per Second (IOPS) Block Storage I/O Performance
The OCe14102 offers hardware offloaded and accelerated block protocol support for iSCSI and Fibre Channel over Ethernet (FCoE), with a maximum transaction performance of up to 1.5M IOPS. While 1.5 million IOPS sounds like a big number, and it is nearly 2x higher than that of the OCe11102 CNA that precedes it, that isn’t the whole story. This is where network and storage adapters are like fast cars: to accelerate quickly, you need lots of horsepower. Achieving 1.5 million IOPS at a 512-byte block size is an indicator of the horsepower our adapter has, but the real goal is to perform as many transactions as the 10GbE link bandwidth permits at the data block sizes that high performance applications require, which typically start at 4k or larger. So we can liken the need for an engine that takes our car to top speed as soon as we get on the freeway to needing the processing power in our adapter to reach line rate bandwidth at 4k block sizes. Think of our 1.5 million IOPS claim as just the smoky burnout in a drag race.

The chart below shows the relationship between IOPS and throughput on a unidirectional 10GbE link. The maximum link bandwidth is around 1150MB/s for FCoE at 10GbE. One port of the OCe14102 almost reaches the link bandwidth limit at 2k data block sizes and solidly reaches it at 4k and above. This is a key point: once data throughput reaches link bandwidth, no greater number of IOPS is possible unless another port is added, and high transaction applications such as Oracle RDBMS, SQL Server, MS Exchange and others all require high IOPS at data block sizes of 4k and larger.
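The IOPS-versus-bandwidth relationship is easy to work out on the back of an envelope. Assuming the roughly 1150MB/s of usable FCoE bandwidth on one 10GbE port mentioned above, this sketch shows how many IOPS it takes to saturate the link at each block size:

```python
# How many IOPS it takes to fill one 10GbE port at a given block size,
# assuming ~1150MB/s of usable FCoE bandwidth (figure from the text).
LINK_BANDWIDTH_MBPS = 1150

def iops_at_line_rate(block_size_bytes: int) -> float:
    """IOPS required to saturate the link at a given block size."""
    return LINK_BANDWIDTH_MBPS * 1_000_000 / block_size_bytes

for size in (512, 2048, 4096, 8192):
    print(f"{size:>5} bytes: {iops_at_line_rate(size):>12,.0f} IOPS to fill the link")
```

At 512 bytes the link could absorb well over 2 million IOPS, which is why raw IOPS headroom matters; at 4k, roughly 280,000 IOPS already saturates the port, so an adapter that can hold line rate there is delivering everything the link allows.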
Lower Server Power Consumption

The hardware offload technology of the OCe14102 can help lower server power consumption by up to 50 watts for a typical two-socket rack-optimized server. The basis for this claim comes from comparing server power consumption, measured at the AC receptacle, while I/O performance benchmarks are run. Modern multi-core servers are designed to be power efficient and employ technologies and profiles to save on power consumption. With every new generation, these servers are under market pressure to use less power at times of low activity and provide ever-increasing performance during peak load times. The effectiveness of these power saving strategies can easily be verified by monitoring a server during various levels of activity with a simple AC power measuring tool.

I/O operations can have a costly effect on server utilization as well as power consumption, depending on how much of the I/O’s upper level protocol work must be performed within the operating system. The OCe14000 series of adapters employs a number of power saving strategies, including the ability to offload much of the protocol processing from the server onto the adapter’s internal processors, allowing the server’s processors to remain in a low power consumption state even while processing very high levels of I/O through the adapters. Adapters that do not have hardware offloads for iSCSI and FCoE, or that process overlay network encapsulation in software (losing other adapter offloads), will be less server power efficient, and the server will consume much more power at an equivalent workload. One goal for an adapter should be server power efficiency: performing high performance I/O without triggering the server to consume large amounts of energy. The charts below compare the hardware offloaded FCoE protocol storage adapter function of the OCe14102 against the commodity, volume-leading 10GbE NIC that uses a software-based FCoE initiator.
The first chart below shows the relationship between the extra server CPU resources required to run software FCoE and the server AC input power consumption during equivalent activity loads. The second chart shows the relationship between I/O workload and server power efficiency. These server power efficiency results also hold for our hardware accelerated iSCSI function, and for our NIC Virtual Extensible LAN (VXLAN) and Network Virtualization using Generic Routing Encapsulation (NVGRE) features.
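One way to read charts like these is as a single efficiency metric: work delivered per watt of AC input power. The sketch below illustrates the calculation only; the adapter IOPS and wattage figures in it are hypothetical placeholders, not measured data:

```python
# Server power efficiency expressed as IOPS per watt of AC input.
# The numbers below are illustrative placeholders, not measurements.
def iops_per_watt(iops: float, server_watts: float) -> float:
    """Work delivered per watt of measured AC input power."""
    return iops / server_watts

# Hypothetical example: same I/O workload, but a software initiator
# burns extra CPU cycles, raising AC draw (e.g. by 50W) for the same work.
offloaded = iops_per_watt(280_000, 250.0)  # hardware offload keeps CPUs idle
software = iops_per_watt(280_000, 300.0)   # software FCoE adds CPU power cost
print(f"offloaded: {offloaded:.0f} IOPS/W, software: {software:.0f} IOPS/W")
```

With the workload held constant, any extra watts the software initiator pulls at the receptacle show up directly as a lower IOPS-per-watt figure, which is exactly the gap the second chart visualizes.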
With the next generation of any product, one would expect an improvement over the past. With the OCe14000 family of adapters, we have delivered notable gains in performance and efficiency while adding new features, some of which are still to come…