I was deploying a new Horizon 7 environment where we had to enable RSA for two-factor authentication on the VMware UAG. Normally this is an out-of-the-box option, but I ran into a weird issue.
I could deploy the UAG fine on 3.6, 3.7.2 or 3.8. Configuring the Horizon Edge went as expected and worked fine. Then I would set up the SecurID adapter.
This also went fine and RSA worked… However, after a reboot of the UAG, the RSA functionality would stop working.
Let the troubleshooting begin
So I thought, let’s apply the config again… Wrong :(
I could not even disable the adapter or change the config. Looking in authbroker.log, which can be found in /opt/vmware/gateway/logs, I came across the following error.
So I gave up, destroyed the UAG and redeployed. That is how the UAG’s lifecycle works: better to redeploy than to put a lot of time into it. I redeployed UAG 3.6, configured it and it worked again. Then that voice in the back of my head started telling me: reboot it! So I did… and behold, RSA died again :(
I redeployed again and again, and the same thing happened every time on UAG 3.6. Alright, let’s deploy UAG 3.8. Sounded like a foolproof plan… I configured the whole thing again, but now I ran into another issue: I could not select SecurID in the Horizon Edge… Must be a new version thing, right? So I destroyed the UAG 3.8 and deployed 3.7.2. This must be the version for success.
With 3.7.2 configured, everything seemed to work. RSA worked, desktops showed. Until I rebooted the UAG again! RSA stopped working… So now I kinda went bonkers and wanted a workaround to fix this without redeploying. There is no documentation on this, so we had to go old skool.
Found out that the RSA config is saved in /opt/vmware/gateway/data/authbroker/states/AP/1.
Now you can delete the following files.
After deleting these files you can apply the RSA configuration again from the GUI and it will work.
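Before deleting anything, it can help to see what is actually sitting in that state directory. The snippet below is only a rough sketch: the directory path is the one from this post, while the function name `list_state_files` is made up for illustration, and it assumes a Python 3 interpreter is available where you run it. It only lists files and deletes nothing.

```python
from datetime import datetime
from pathlib import Path

# Directory named in this post; everything else here is illustrative.
STATE_DIR = Path("/opt/vmware/gateway/data/authbroker/states/AP/1")


def list_state_files(state_dir):
    """Return (name, size_in_bytes, last_modified) for each regular file in state_dir."""
    entries = []
    for path in sorted(Path(state_dir).iterdir()):
        if path.is_file():  # skip subdirectories
            info = path.stat()
            entries.append(
                (path.name, info.st_size, datetime.fromtimestamp(info.st_mtime))
            )
    return entries


if __name__ == "__main__":
    if STATE_DIR.is_dir():
        for name, size, modified in list_state_files(STATE_DIR):
            print(f"{modified:%Y-%m-%d %H:%M:%S}  {size:>8}  {name}")
    else:
        print(f"{STATE_DIR} not found; run this on the UAG itself")
```

Comparing the listing before and after a reboot might also show which files change when the RSA config breaks.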
Tailing the authbroker.log shows that the RSA config is loaded again and indeed it’s working.
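If you would rather filter the log than scroll through it, something along these lines could work. It is a minimal sketch, not an official tool: the log path comes from this post, but the function name `matching_lines` and the keywords are my own assumptions about what is worth searching for, so adjust them to whatever your log actually contains. It can also be run against a copy of the log on another machine.

```python
from pathlib import Path

# Log location mentioned in this post.
LOG_FILE = Path("/opt/vmware/gateway/logs/authbroker.log")


def matching_lines(log_path, keywords=("securid", "error", "exception")):
    """Return the log lines that contain any keyword (case-insensitive).

    The default keywords are guesses, not strings known to appear in the log.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            lowered = line.lower()
            if any(k in lowered for k in wanted):
                hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    if LOG_FILE.is_file():
        # Show only the 20 most recent matches.
        for line in matching_lines(LOG_FILE)[-20:]:
            print(line)
    else:
        print(f"{LOG_FILE} not found; point LOG_FILE at a copy of the log")
```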
However… when I reboot the UAG it’s broken again. At this point, it was time to call in a favor from VMware GSS. We opened a ticket and are waiting on a permanent fix. For now, we either avoid rebooting or fix it via the CLI after a reboot. I will update this post when we know more!
The problem is the documentation… The UAG info box says: “Fill in the external address of the UAG”, while the PowerShell template states: “Fill in UAG IP”. When using the IP in both fields, the config works and survives a reboot.
This is where the documentation is different.