ESXi 6.5 vmw_ahci SSD extreme high latency and vm freeze issues

It looks like this is fixed in 6.5 Update 1, although some readers have reported that it is not.

Since upgrading my homelab to ESXi 6.5 I have been bugged by a strange issue.
All my SATA SSDs showed really high latency and my VMs randomly froze whenever I started a big file copy. Latencies were in the 5,000 to 15,000 ms range! Yes, eww!

In the logs on my ESXi host I saw random warnings like this:

Warning
Lost access to volume 583fe5a3-45d215a5-8eb9-00151791b8b6 (SSD01) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
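
To get a feel for how often this happens, the warnings can be counted per datastore. This is only a minimal sketch: the message format is taken from the warning above, `count_lost_access` is a hypothetical helper name, and the log path in the usage comment is an assumption that may differ on your host.

```shell
# Hypothetical helper: reads log lines on stdin and prints
# "<count> <datastore label>" for every "Lost access to volume" event.
count_lost_access() {
    awk '
        /Lost access to volume/ {
            label = $0
            # Strip everything up to and including the opening parenthesis
            # that follows the volume UUID; skip lines without a label.
            if (sub(/.*Lost access to volume [^ ]+ \(/, "", label) == 0) next
            # Strip the closing parenthesis and the rest of the message.
            sub(/\).*/, "", label)
            count[label]++
        }
        END { for (l in count) print count[l], l }
    '
}

# Usage on the host (log path is an assumption, adjust as needed):
#   count_lost_access < /var/log/hostd.log
```

A count that keeps growing during file copies points at the storage path rather than at a single VM.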

With ESXi 6.5 the AHCI driver has been updated to a newer version: the native driver, vmw_ahci, is used out of the box.

[root@vDrone-ESX01:~] esxcli software vib list | grep ahci
sata-ahci 3.0-22vmw.650.0.0.4564106 VMW VMwareCertified 2016-11-16
vmw-ahci 1.0.0-32vmw.650.0.0.4564106 VMW VMwareCertified 2016-11-16
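
The VIB list shows what is installed, not which module is actually active. A minimal sketch for checking that, assuming `esxcli system module list` prints a three-column table (Name, Is Loaded, Is Enabled); `module_state` is a hypothetical helper name, not part of esxcli.

```shell
# Hypothetical helper: reads `esxcli system module list` output on stdin
# and prints "<loaded> <enabled>" for the module named in $1.
module_state() {
    awk -v mod="$1" '$1 == mod { print $2, $3 }'
}

# Usage on the host:
#   esxcli system module list | module_state vmw_ahci
```

An output of `true true` would mean the native driver is both loaded and enabled.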

So I tried disabling this driver to let the host fall back to the sata-ahci driver.
Via SSH: esxcli system module set --enabled=false --module=vmw_ahci
After entering this command, reboot your host.
Using the host client you can now check which driver the AHCI adapter is using.
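
The whole sequence can be sketched as a small script. It is an illustration under assumptions, not a tested procedure: the maintenance mode step is included because one commenter below reported that the change only stuck after entering maintenance mode, and `run` and `switch_to_legacy_ahci` are hypothetical helper names. By default the script only prints the commands so they can be reviewed first.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set,
# so the sequence can be reviewed before touching a host.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps described above.
switch_to_legacy_ahci() {
    # Assumption: maintenance mode first (shut down or move the VMs before this).
    run esxcli system maintenanceMode set --enable true
    # Disable the native AHCI driver so the host falls back to sata-ahci.
    run esxcli system module set --enabled=false --module=vmw_ahci
    # The change only takes effect after a reboot.
    run reboot
}

# Usage: review the plan first, then run it for real.
#   switch_to_legacy_ahci
#   DRY_RUN=0 switch_to_legacy_ahci
```

Remember to take the host out of maintenance mode again after the reboot.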

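
The same check can also be done from the shell instead of the host client. A minimal sketch, assuming `esxcli storage core adapter list` prints the HBA name in the first column and the driver in the second; `adapter_drivers` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `esxcli storage core adapter list` output
# on stdin and prints "<hba> <driver>" for every vmhba entry,
# skipping the header and separator lines.
adapter_drivers() {
    awk '$1 ~ /^vmhba[0-9]+$/ { print $1, $2 }'
}

# Usage on the host:
#   esxcli storage core adapter list | adapter_drivers
```

If the adapter behind your SSD still lists vmw_ahci as its driver after the reboot, the change did not take effect.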

I booted up all my VMs and presto, no more high latency or freezes. It looks like the new vmw_ahci driver is not that stable yet on some Intel chipsets; in my case, an X99 chipset.

28 thoughts on “ESXi 6.5 vmw_ahci SSD extreme high latency and vm freeze issues”

  1. F5SSE

    WOW, thanks so very much for this blog post!!! I had been pulling my hair out over the slowness issues on my two ESX hosts for more than a week. I just made these changes to both hosts; each displayed more adapters after the reboot, performance has been so much better, and there are no more event log entries about losing access to a VMFS volume! Thanks again!!

  2. Tim Zuidema

    Thanks a lot for the post. I had the same problem on ESXi 6.5 with a Samsung SSD.

  3. Guy

    Hello,

    Thanks very much for this post.

    I had the same issue and thought it came from my SSD, but it did not 🙂

  4. P

    After some struggles and consulting Google for longer than I would have liked, this post finally helped me resolve the issues I had with two different SSDs as ESXi datastores.

    Thanks for posting this, it’s much appreciated!

  5. max raufer

    Thanks for this, I had the same problem. I disabled the vmw_ahci module and rebooted the host and it came up fine, but when I checked, the storage adapter was again using the vmw_ahci driver. Should it not be using a different one, and will I run into the same SSD issues again after a while?

      1. max raufer

        Yes, the command was right and it did work, but only after I put the host in maintenance mode. After the reboot the sata-ahci driver is being used. I’ll monitor for SSD errors for a while, hope they’re gone.
        Actually, the errors (when using vmw-ahci driver) eventually caused “loss” of data stores (local and network) and VM freezes but after reboot all was back – until enough errors accumulated to cause the same problem again. So no actual data was lost but it was highly inconvenient.
        Thanks for your post.

  6. culture

    The same problem here. NUC6. Unfortunately it seems new releases get through with plenty of issues :/

    How would you re-enable it if a new release fixes it?

    1. LaurensvanDuijn Post author

      Via SSH: esxcli system module set --enabled=true --module=vmw_ahci 🙂

  7. Steffen

    Thanks a lot mate, it really helped me as my lab environment was constantly freezing due to this problem. I’ve had the problem since last year and almost downgraded my server. Great work!!!!

  8. Pingback: Slow SSD transfers with ESXi 6.5 on a NUC | 新视漫影 :: NVACG :: [backup port: 5000]

  9. RobL

    This has fixed the speed issue with my SSD drives but I have an additional 4 port Syba PCIe controller card which is now not detected after disabling vmw_ahci. Has anyone else got an add-in card which is still working after running the above command?

    1. LaurensvanDuijn Post author

      Probably not, the driver of your card is in the VMW_AHCI driver and not in the native.
      You could try to make a custom ESXi 6.5 image with the VMW_AHCI driver of ESXi6.0U3. #NoSupport 🙂

      1. RobL

        Thanks for the reply. Does anyone know if this is logged as a bug report with VMware? It’s a pretty fundamental issue for the usability of 6.5. Before finding this post, I spent at least 3 days trying to fix the issue, replacing hardware and generally going round in circles. They should pull 6.5 as a download until this is fixed.

        1. LaurensvanDuijn Post author

          Yes, I reported it, but the thing is that the problems only seem to happen with non-HCL hardware.

  10. Pingback: Installing ESXi 6.5 on an Intel NUC | kurokobo.com

  11. Nague

    Thanks so much! You saved my day!
    By the way, copy/pasting the command didn’t work for me. I used this one:
    esxcli system module set -e=false -m=vmw_ahci

  12. BJ

    I’ve got a gen5 NUC (NUC5i5MYHE) with a Crucial SSD (MX200 500GB) connected via M.2, with the BIOS mapping it to SATA. ESXi boots from a USB stick.
    After the update to 6.5, I had the same problems: slow and buggy. When I tried your solution, it did not find the SSD at all. After a lot of trial and error I found out that the custom NUC drivers/VIB need to be installed for it to fall back to the old AHCI driver.

    If you have the problem, go to: https://vibsdepot.v-front.de/wiki/index.php/Sata-xahci
    get the latest drivers/mappings package and install it:
    esxcli software vib install -d /vmfs/volumes/… (zip-file)
    then disable vmw_ahci and reboot.

  13. Ronald

    Hi, thank you very much for this! You saved my day!!
    I also had some trouble with the command copied from your post. The right format is:
    esxcli system module set –enabled=false –module=vmw_ahci
    Works like a charm.
    Thanks
    Best
    Ronald

  14. Ronald

    Ahh… there is a formatting issue in the comment field. :-/
    It must be a double hyphen (--) before enabled and module.

    cu

  15. Pingback: ESXi 6.5: improved by switching from vmw-ahci to sata-ahci! | Dark Night Memory

  16. Wolfgang Busch

    Great help…
    It saved me when I already wanted to send back the whole server.
    The hardware is a current Supermicro server board with an Intel SSD and a Toshiba SSD. After disabling the vmw-ahci driver all is fine now. Really bad bug in the current ESXi version!

    Thanks again for the post!!
    Wolf / Germany

  17. Oliver

    There is some news that Update 1 of ESXi 6.5 (6.5U1) should have solved the issue, as it comes with a newer AHCI driver according to the official release notes. It didn’t in my case, and I’ll follow up on this post’s recommendations later.

    By the way, during debugging I figured out that the disks became much faster with the configuration below (latency went from 12,000 ms to 4,000 ms, which is of course still a lot):
    – Change controller to SATA (default SCSI)
    – Thick provisioning, lazy or full

    Oliver / Switzerland

  18. Oliver

    I was happy a bit too early. The disks became faster, yes, but the system itself became rather unstable. It is rebooting every 2 to 3 hours. I’ll roll back.
