VNX iSCSI and TCP Delayed Acknowledgement

I recently sat in on an internal VNX (and CLARiiON) performance crash course that was put together to help our new hires get up to speed.  One of the things that stuck out to me was the subject of iSCSI and how it interacts with host TCP delayed acknowledgement (Delayed ACK).

 

Background Information

So what is delayed ACK?  As part of TCP, the destination server must acknowledge the data it receives so that the source server knows the information was successfully transmitted.  Acknowledging every packet individually adds a good amount of overhead, so in an effort to improve performance, TCP Delayed Acknowledgement (described in RFC 1122) allows the destination server to hold its acknowledgement briefly and respond to every 2nd packet instead.  Support for delayed ACK is enabled by default in many popular client operating systems, including Microsoft Windows and VMware ESX/ESXi.

 

The problem with this is that many storage arrays do not support delayed ACK for one reason or another (it usually has to do with chipset drivers).  What happens in this case is that the array sends a packet and then waits for an acknowledgement before sending a 2nd packet.  Meanwhile, the host is waiting for a 2nd packet before sending an acknowledgement.  This standoff between the array and the host lasts until the host's delayed ACK timer expires (usually around 200ms), at which point the host finally sends the acknowledgement and the transfer continues.  This wreaks havoc on performance if every packet has to wait 200 milliseconds before the next one is sent.  So if you’ve set up iSCSI and you are immediately seeing a performance issue, check your hosts to see if Delayed ACK is enabled, and turn it off to see if performance improves.
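
To get a feel for how much a stall like this can cost, here is a rough back-of-the-envelope sketch in Python.  It assumes the worst case described above, where only one packet is in flight at a time and every packet waits out the full delayed ACK timer.  The 1460-byte segment size and the 0.5 ms round-trip time are illustrative assumptions, not measurements from a VNX.

```python
def throughput_bytes_per_sec(segment_bytes: int, rtt_s: float, ack_delay_s: float = 0.0) -> float:
    """Throughput when only one segment may be in flight at a time.

    Each segment costs one round trip plus whatever time the receiver
    spends holding back its ACK (0 when delayed ACK is disabled).
    """
    return segment_bytes / (rtt_s + ack_delay_s)


if __name__ == "__main__":
    SEGMENT_BYTES = 1460   # assumed payload per segment (typical for a 1500-byte MTU)
    RTT_S = 0.0005         # assumed 0.5 ms LAN round trip
    ACK_DELAY_S = 0.200    # delayed ACK timeout, roughly 200 ms as noted above

    stalled = throughput_bytes_per_sec(SEGMENT_BYTES, RTT_S, ACK_DELAY_S)
    immediate = throughput_bytes_per_sec(SEGMENT_BYTES, RTT_S)
    print(f"with the 200 ms standoff: {stalled / 1024:.1f} KiB/s")
    print(f"with immediate ACKs:      {immediate / 1024:.1f} KiB/s")
    print(f"slowdown factor:          {immediate / stalled:.0f}x")
```

The absolute numbers are not meaningful on their own, since a healthy connection keeps many packets in flight.  The point is the ratio: a 200 ms wait dwarfs a sub-millisecond round trip.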

 

Disabling Delayed ACK in Microsoft Windows

 

In Microsoft Windows operating systems, you can simply set the TcpAckFrequency registry value to 1.  More information can be found in Microsoft KB 328890.  On a side note, I found that if the registry value is missing, you can create it in the path specified in the KB and then reboot the host.
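
As a rough illustration of the registry change, the Python sketch below only builds the `reg add` command line for one network interface; it does not touch the registry itself.  The key path and the DWORD type reflect the per-interface location as I understand it from the KB.  The helper name and the GUID in the example are hypothetical placeholders, so substitute the GUID of your iSCSI-facing interface and check the KB for your Windows version before running anything.

```python
INTERFACES_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"


def tcp_ack_frequency_command(interface_guid: str, frequency: int = 1) -> str:
    """Return a reg.exe command that sets TcpAckFrequency on one interface.

    A value of 1 asks Windows to acknowledge every segment immediately.
    """
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    guid = interface_guid.strip("{}")          # accept the GUID with or without braces
    key = f"{INTERFACES_KEY}\\{{{guid}}}"      # ...\Interfaces\{GUID}
    return f'reg add "{key}" /v TcpAckFrequency /t REG_DWORD /d {frequency} /f'


if __name__ == "__main__":
    # Hypothetical placeholder GUID; the real ones are listed under the Interfaces key.
    print(tcp_ack_frequency_command("00000000-0000-0000-0000-000000000000"))
```

Printing the command instead of running it makes it easy to review before pasting it into an elevated prompt on the host.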

 

Disabling Delayed ACK in VMware ESX and ESXi

VMware has created KB 1002598 to address this as well.  This adjustment is made per adapter instance and you can change this setting on a discovery address, a target, or (in my case) globally.  Once you’ve made your change, reboot the host and enjoy the performance boost.

 

I hope you’ve found this information useful.  It may not solve your iSCSI performance problem, but it is a good place to start.

New code to make your VNX better!

To state it right off the bat, this code does not include the features I talked about in earlier posts, but it is still a very important update.  Yesterday marked the release of the latest update to the VNX, with the general availability of VNX OE File 7.0.53.1 and OE Block 05.31.000.5.720 (both of which are available on the VNX product support page or by using the Unisphere Service Manager (USM) tool).

 

So you may be asking yourself, if this doesn’t come with all those cool features Sean talked about last week, why should I bother upgrading?  Well, I’m glad you asked.  In addition to the many bug fixes incorporated in this service pack, this release contains 3 very important updates.

 

The first is support for the latest VMAX Enginuity code version, 5876.82.57, which was also released recently.  The 2nd enhancement covers those using iSCSI; anyone actively using iSCSI on their VNX should read ETA emc291837.  The 3rd and final fix eliminates the erroneous over-temperature alerts that were being reported on some power supplies, an issue previously covered in Primus emc278973.

 

As with all new code releases, I encourage everyone to upgrade as soon as possible and to not fall too far behind the latest code levels.  I have started a discussion on ECN in case you have questions about this release and the fixes contained within.