UPDATE: Since this post was written I have performed some more work with FSLogix and drafted an updated post; please read that one.
I’ve been doing some work with FSLogix; it’s a good product and it’s pretty simple to configure. The logs are easy enough to find, but in my opinion there are a few things missing from the offering. One of the main gaps is guidance on sizing the backend storage.
But before we dive in, a bit of background.
Terminal Services does not play well with Outlook in cached mode. For those of you who do not know, cached mode is when your Outlook client pulls copies of your emails down onto your computer for a specified period of time. This uses quite a lot of disk space and increases the work the disk has to do. On a single PC or virtual machine that is not much of a problem, but when you start looking at multiple virtual machines with hundreds of users, it becomes one.
The solution to this problem has always been Outlook in online mode. That is great when your Exchange server is next door, but with the rapid adoption of Office 365; Houston, we have a problem!
You have two choices: drag all your emails directly over your internet pipe in online mode, or implement some sort of persistent storage location for the Outlook cache. It’s worth noting that Microsoft will not support performance-related issues with Outlook .ost files stored on an SMB share, so unless you tread carefully that’s not an option.
The solution: FSLogix Profile and Office 365 Containers. This software allows the user’s profile and the user’s Office application data (Outlook, OneDrive, the Skype GAL and Windows Search) to roam with the user on compatible operating systems.
Due to the fact that the containers are VHD files stored on the network instead of the SMB shares, there is only a single connection that is always active to the VHD, meaning there is a smaller overhead on the file transfers. This all makes sense, but coming back to the point: what about sizing?
What sort of backend storage requirements are there for FSLogix? There is nothing on the website, so what metrics should you be looking at to gauge whether your deployment is struggling? That’s what this article is all about.
Identifying an Issue:
If you are using Citrix/Terminal Services in your environment and logins are slow with FSLogix, I recommend you run perfmon and capture some key indicators over the course of a day.
The metrics I captured were as follows (from both the Physical Disk and Logical Disk objects where applicable):
- Current Disk Queue Length
- Average Disk Queue Length
- Disk Reads/sec
- Disk Writes/sec
- Average Disk Reads/sec
- Average Disk Writes/sec
These metrics will give you an indication of whether the disks in the machine hosting your container VHDs are struggling; a rough sketch of how you might summarise the captured data follows below. If anyone needs to know how to set up perfmon to do this, let me know and I’ll write another post.
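As a rough illustration, here is a minimal Python sketch of how you might summarise a perfmon CSV export once collection has finished. The file name and the counter column headings are assumptions; they will depend on the server name and which disk instances you actually collected.

```python
import csv

# Assumed CSV export of the perfmon data collector set.
# Column headings are illustrative; perfmon names columns after the
# machine, object, instance and counter you collected.
CSV_PATH = "fslogix_disk_counters.csv"
QUEUE_COL = r"\\FILESERVER\PhysicalDisk(_Total)\Current Disk Queue Length"
READS_COL = r"\\FILESERVER\PhysicalDisk(_Total)\Disk Reads/sec"
WRITES_COL = r"\\FILESERVER\PhysicalDisk(_Total)\Disk Writes/sec"


def summarise(path):
    queues, reads, writes = [], [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                q = float(row[QUEUE_COL])
                r = float(row[READS_COL])
                w = float(row[WRITES_COL])
            except (KeyError, ValueError):
                # perfmon leaves blank cells for missed samples; skip them.
                continue
            queues.append(q)
            reads.append(r)
            writes.append(w)
    return {
        "avg_disk_queue_length": sum(queues) / len(queues),
        "max_reads_per_sec": max(reads),
        "max_writes_per_sec": max(writes),
        "max_total_iops": max(r + w for r, w in zip(reads, writes)),
    }


if __name__ == "__main__":
    print(summarise(CSV_PATH))
```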
These are the metrics I captured, specifically showing average disk reads and writes per second, average disk queue length and current disk queue length.
Profile Container Server
Office 365 Container Server
You can see from these metrics that the average disk queue length sits at around 10 for both servers. These particular machines had four disks striped together into a single volume, so to work out the real queue length per disk, divide the logical volume’s queue length by the number of disks.
A queue length of 10 across 4 disks works out at 2.5 per disk. That’s too high and will certainly impact performance.
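To make that arithmetic explicit, here is a tiny sketch. The threshold of roughly 2 outstanding I/Os per disk is a common rule of thumb, not an FSLogix figure.

```python
def per_disk_queue_length(volume_queue_length, disk_count):
    """Spread a logical volume's queue length across its underlying disks."""
    return volume_queue_length / disk_count


# The example above: a queue of 10 across a 4-disk stripe = 2.5 per disk,
# which is over the commonly quoted ceiling of ~2 outstanding I/Os per disk.
print(per_disk_queue_length(10, 4))  # 2.5
```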
In this particular example, concurrency sits at around 450 users during this busy period.
Profile Container Server
A maximum of 94 reads a second and 24 writes a second gives us 94 + 24 = 118 input/output operations per second. The I/O is not being dealt with quickly enough, so it is being queued.
This storage is capped because it is hosted in a cloud datacenter environment:
4 x 500 IOPS disks, each with a 60 MB/s cap.
A total of 2,000 IOPS and 240 MB/s of transfer rate.
Office 365 Container Server
A maximum of 95 writes a second and 93 reads a second gives us 95 + 93 = 188 input/output operations per second. The I/O is not being dealt with quickly enough, so it is being queued.
This storage is also capped because it is hosted in a cloud datacenter environment:
4 x 500 IOPS disks, each with a 60 MB/s cap.
A total of 2,000 IOPS and 240 MB/s of transfer rate.
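Putting the two servers side by side, here is a small sketch that simply restates the sums above against the provisioned caps; all figures come straight from the measurements and disk specifications already quoted.

```python
# Peak reads/writes per second captured above, per container server.
servers = {
    "profile_container": {"peak_reads": 94, "peak_writes": 24},
    "office365_container": {"peak_reads": 93, "peak_writes": 95},
}

DISKS = 4
IOPS_PER_DISK = 500            # provisioned IOPS cap per disk
THROUGHPUT_PER_DISK_MB = 60    # provisioned throughput cap per disk (MB/s)

for name, peaks in servers.items():
    peak_iops = peaks["peak_reads"] + peaks["peak_writes"]
    print(f"{name}: peak {peak_iops} IOPS against "
          f"{DISKS * IOPS_PER_DISK} IOPS / "
          f"{DISKS * THROUGHPUT_PER_DISK_MB} MB/s provisioned")
```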
Next Steps:
Two new file servers were built with more capable disks to enable a better experience. Comparing the metrics above against the new storage yields the information below.
New Profile Container Server
The queue length stays consistently low on the newly provisioned storage. The disks are able to process requests more quickly, so the queue peaks and then falls away again; that’s how we like it.
This storage is also capped because it is hosted in a cloud datacenter environment:
4 x 5,000 IOPS disks, each with a 200 MB/s cap.
A total of 20,000 IOPS and 600 MB/s of transfer rate.
New Office 365 Container Server
As with the profile container server, the queue length stays consistently low on the newly provisioned storage. The disks process requests more quickly, so the queue peaks and then falls away again; that’s how we like it.
This storage is also capped because it is hosted in a cloud datacenter environment:
4 x 5,000 IOPS disks, each with a 200 MB/s cap.
A total of 20,000 IOPS and 600 MB/s of transfer rate.
Performance and Sizing Guidelines
Performance with these new servers is much better: logins are quicker and Outlook is far more responsive when its cache is redirected to the FSLogix VHD.
So what does the user profile look like?
- Entire User Profile is redirected
- Outlook cache is redirected (6 months)
- OneDrive cache is redirected
- Skype GAL is redirected
My recommendations are based on my own experience and nothing more.
I’d like FSLogix to come up with a more solid recommendation for backend storage sizing, as at the moment there is none that I know of, and I feel that when you sell software it’s imperative you can specify what’s needed to support it.
FSLogix Profile Containers:
Plan for 12 IOPS per user
FSLogix O365 Containers:
Plan for 44 IOPS per user
These recommendations are based on the user profile above and don’t directly apply to .ost files redirected to an SMB share; the VHD(X) connection does alleviate some of the overhead of the SMB connection.
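As a minimal sketch of how you might apply these per-user figures when sizing backend storage for a new deployment (the 450-user concurrency is just the example from earlier in this post; everything else comes straight from the recommendations above):

```python
PROFILE_IOPS_PER_USER = 12    # FSLogix Profile Container recommendation above
OFFICE365_IOPS_PER_USER = 44  # FSLogix Office 365 Container recommendation above


def required_backend_iops(concurrent_users):
    """Rough backend IOPS to plan for, per container file server role."""
    return {
        "profile_container_server": concurrent_users * PROFILE_IOPS_PER_USER,
        "office365_container_server": concurrent_users * OFFICE365_IOPS_PER_USER,
    }


# Example: roughly 450 concurrent users, as in the busy period above.
print(required_backend_iops(450))
# {'profile_container_server': 5400, 'office365_container_server': 19800}
```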
I welcome queries or feedback. Thanks for reading.
Thanks for this post! Hopefully FSLogix will publish something on that. Office 365 is a big challenge in this kind of setup.
Cheers Ben, I hear something is in the pipeline. For me, I needed something now for a project that’s already running.
Wondering if you’ve compared fslogix containers to liquidware’s layers for o365.
I’ve been looking at both solutions, but the additional cost for our Citrix environment always drives me away. But we’ve been pushing O365 and OneDrive so hard I may have to fight for the budget to get one of these solutions.
Hi Jason
I’ve not had the time to test both yet. My recommendation, if money is tight, is to speak to FSLogix first. Happy to make an introduction. There are some alternatives: Citrix App Layering has a user layer now, which is worth a look and would cover Outlook caching. There is also a OneDrive mapping script in the TechNet library which is great. Let me know if you want links. Also, what OS and Citrix license edition are you running?
You mention “Due to the fact that the containers are VHD files stored on the network instead of the SMB shares ” but I believe the VHD files are still stored on an SMB Share on your file server.
I guess your point is there is only one disk file per user so there is only one SMB connection per user.
You are correct Daeus, but that is exactly what I was getting at: there is only one file-open action in the SMB stack to mount the VHD.
Hey Leee,
Thanks for the write up. We are currently FSLogix customers, and as we continue to add more users to our VDI environment I grow concerned about performance issues like the ones you mentioned here, and also about HA/redundancy. Having a single point of failure (like a Windows-based profile server) for my entire VDI environment is a little scary. I’m sure some of this can be mitigated with something like DFS-N/R, but even then there would still be some kind of outage once you include profiles corrupting due to a crash. Thoughts on redundancy?
On performance, without a good way to load balance your profile servers I feel this solution doesn’t scale very well. You mentioned “Two new file servers were built with more capable disks to enable a better experience,” but you also said the profile servers were in a hosted cloud environment. When you built these new servers, did you ask your cloud provider to create them on better storage?
Hi Calib, this was actually on Azure and was a Scale-Out File Server. With a Scale-Out File Server the underlying storage is replicated between two nodes at block level and the cluster share is made continuously available; in this scenario the load is balanced between the two nodes. Alternatively, a standard file server cluster built on correctly sized storage with continuous availability would also give you high availability. How many users are you talking about?
Unfortunately this appears to be the Achilles’ heel of FSLogix. It’s great in concept, but it introduces a single point of failure unless you’re replicating your file servers, which in turn doubles your storage costs. You also can’t distribute the storage across multiple file servers, at least not in a way I’ve found.
We’re using it now, but OneDrive and FSLogix still have some limitations on Windows 2016, and we’re using exorbitant amounts of storage. Users have become data hoarders. It would be more manageable if Windows 2016 had the ability to use Files On-Demand, but that appears to be a Windows 2019 thing only.
Spot on Cambo, Server 2019 supports Files On-Demand because the filesystem needed some additions; this was never going to be back-ported to 2016. Check out Bvckup 2 for replicating your VHDs to a different location, or read up on Cloud Cache; personally I prefer the replication to a secondary location.
Hi Leee,
I am doing some desk research on the IOPS and storage requirements for FSLogix.
And came across this kb article: https://support.fslogix.com/index.php/knowledge-base?view=kb&kbartid=187
Can you explain the big difference in required IOPS (2 vs 44)?
Hi Marc, I don’t work for FSLogix so I can’t really comment on the source for the article that you’ve referenced.
The testing I performed was conducted using LoginVSI in Azure with Premium Storage. I used perfmon to record the storage metrics, and Jim Moyle and I analysed the figures (Jim does work for FSLogix).
The results are published as they stand. I’d use my figures if I were you.
The only difference I can really see between the two is that I used a 2GB mailbox and wasn’t sending or receiving mail, just caching down the emails from Exchange. That may account for the variance; that being said, I’m not sure how you can accurately calculate IOPS based on mails sent/received.
I hope this helps.