Wednesday, December 15, 2010

(Discovered) Best Practice Config for OpenFiler iSCSI Storage for ESXi4

Update: OpenFiler performance and reliability remained low. I've since moved to another platform, and no, this isn't an ad so I'm not saying where I ended up.

That title sucks, but I'm hoping Google will like it.

The OpenFiler iSCSI box at the Eastern site bailed, due largely to congestion on the still-not-redundant switch we have in place (new stuff still in shipping, argh). After the fallout I found some config tips for the OF 2.3 server. This thing's a physical box with nothing but iSCSI on it, so things may be different if you're simulating some ESXi-ESXi virtual worlds in your mini-lab at work.
  1. Disable Delayed ACK in the iSCSI storage adapter's advanced properties on the ESXi host. (scroll down) (via)
  2. Reduce the VMs-per-LUN count to cut down on reservations and reservation conflicts. Smaller LUNs, kids, despite what you hear. (via)
  3. Change the default iSCSI timeout on your ESX server to 14 seconds: esxcfg-advcfg -s 14000 /VMFS3/HBTokenTimeout (via, via, via)
  4. Keep your snapshots down: they increase load, with the obvious impact. (via)
  5. Consider upgrading the drivers for your Intel e1000 or ixgb NICs: code here, Intel instructions here, OpenFiler-derived procedure here.
  6. Don't use more than one NIC (bonding) or more than one interface (different subnets) at a time with OpenFiler. (via, via, instructions for OF)
  7. Target tuning (via):
    1. DefaultTime2Wait should be changed to "10"
    2. ImmediateData should be changed to "Yes"
    3. MaxRecvDataSegmentLength should be changed to "262144"
    4. MaxXmitDataSegmentLength should be changed to "262144"
  8. LUN tuning
    1. R/W mode should be changed to "write-back"
      (it should go without saying that a BBU is essential for your RAID controller)
    2. Transfer mode should be changed to "fileio"
      (this is controversial)
  9. NEVER, EVER share LUNs between ESX hosts using OpenFiler and IET.
    (Not sure how doable this one is in an ESXi cluster, but that's the suggestion.) (via)
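
For anyone who wants to check a box against items 7 and 8 without clicking through the GUI, here's a rough Python sketch that audits IET-style config text against those values. Assumptions on my part, not from the sources above: that the target settings end up as plain "Key Value" lines and a "Lun ..." line in ietd.conf-style syntax, that the GUI's "R/W mode" and "Transfer mode" correspond to the IOMode and Type options on the Lun line, and that there is one target with one LUN. The function name, the sample IQN and the device path are made up for illustration.

```python
# Recommended values from items 7 and 8 of the list above.
RECOMMENDED = {
    "DefaultTime2Wait": "10",
    "ImmediateData": "Yes",
    "MaxRecvDataSegmentLength": "262144",
    "MaxXmitDataSegmentLength": "262144",
}
# Assumed mapping: "Transfer mode" -> Type, "R/W mode" -> IOMode (wb = write-back).
RECOMMENDED_LUN = {"Type": "fileio", "IOMode": "wb"}


def audit_target_config(text):
    """Return a list of mismatches found in ietd.conf-style text.

    Hypothetical helper: assumes a single Target with a single Lun line.
    """
    seen = {}
    lun_opts = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if key == "Lun":
            # Assumed form: Lun <n> Path=...,Type=...,IOMode=...
            fields = rest.split(None, 1)
            opts = fields[1] if len(fields) > 1 else ""
            for pair in opts.split(","):
                if "=" in pair:
                    name, value = pair.split("=", 1)
                    lun_opts[name.strip()] = value.strip()
        elif key in RECOMMENDED:
            seen[key] = rest.strip()

    problems = []
    for wanted, found in ((RECOMMENDED, seen), (RECOMMENDED_LUN, lun_opts)):
        for key, want in wanted.items():
            got = found.get(key)
            if got is None:
                problems.append(f"{key} is not set (want {want})")
            elif got.lower() != want.lower():
                problems.append(f"{key} is {got} (want {want})")
    return problems


# Made-up sample stanza with several settings left at non-recommended values.
SAMPLE = """
Target iqn.2006-01.com.openfiler:tsn.example
    Lun 0 Path=/dev/vg0/lun0,Type=blockio
    ImmediateData No
    MaxRecvDataSegmentLength 262144
"""

for problem in audit_target_config(SAMPLE):
    print(problem)
```

Feed it the contents of the real config file instead of SAMPLE and an empty result means the target matches the list. It only compares text, so it says nothing about whether the running target has actually picked the values up.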
So in my case, we're looking to implement a few of those. Hopefully we won't see this kind of issue again any time soon, but watch this space in case something turns up after the upgrade/tuning.

The good news is that OF is considering another iSCSI target, and the changes are supposed to be very beneficial for performance and for us.
