Lab results: Benchmarking a Nutanix Block

Last weekend I spent some time in our solutions center at work playing around with a Nutanix block.

I thought it would be interesting to benchmark the system to see how much performance you can get out of it and would like to share the results with you in this post.

Hardware & Software Setup:
NX3050
NOS 4.1.2
vSphere 5.5U2
4 Nodes each with:
2x Intel E5-2670 @ 2.6GHz
256GB RAM
2x 800GB SSDs
4x 1TB HDDs
2x 10GbE

Test Setup:
-Four Windows Server 2012R2 VMs (one running on each node)
-8 vCPUs & 16GB RAM for each VM
-8 additional VMDKs per VM, distributed across multiple PVSCSI controllers
-IOmeter 1.1.0 installed, controlled from one master server to ensure a simultaneous start time (see the scripting sketch after this list)
-Two workload profiles with reasonably realistic real-world settings (4k / 60% read / 95% random & 8k / 50% read / 80% random)
-8 outstanding IOs per worker
-10 minutes runtime for each test
-Multiple runs for each profile
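
For readers who want to build something similar, here is a minimal Python sketch of how the runs could be driven from the master server via IOmeter's batch mode (/c for the config, /r for the result file). This is not the exact tooling I used; the install path, the .icf file names and the result file names are assumptions, and the access specifications inside the .icf files would carry the block sizes plus read/random percentages listed above.

```python
# Minimal sketch (assumed paths and file names): driving the two workload
# profiles from the master server via IOmeter's batch mode.
import subprocess

IOMETER_EXE = r"C:\IOmeter\IOmeter.exe"   # assumed install path

# The two workload profiles from the test setup, prepared as .icf files
PROFILES = [
    {"name": "4k_60read_95random", "icf": "4k_60read_95random.icf"},
    {"name": "8k_50read_80random", "icf": "8k_50read_80random.icf"},
]

RUNS_PER_PROFILE = 3   # "multiple runs for each profile"

def run_profile(icf: str, result_csv: str) -> None:
    """Start IOmeter with a prepared config (/c) and write results to CSV (/r)."""
    subprocess.run([IOMETER_EXE, "/c", icf, "/r", result_csv], check=True)

if __name__ == "__main__":
    for profile in PROFILES:
        for run in range(1, RUNS_PER_PROFILE + 1):
            result = f"results_{profile['name']}_run{run}.csv"
            print(f"Starting {profile['name']} run {run} ...")
            run_profile(profile["icf"], result)
```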

Platform specific settings:
-Dedup & Compression disabled
-CVM settings: 8 vCPUs & 32GB RAM

Results:

[Chart: 4k profile results (4k_iometer_ntnx)]

[Chart: 8k profile results (8k_iometer_ntnx)]

It's important to look not only at the number of IOPS but also at the latency. With the “# of Outstanding I/Os” setting in IOmeter you can heavily influence the number of IOPS you get as a result. In my opinion it's important to choose a value that keeps the latency at an acceptable level. There is little sense in getting 20k more IOPS if the drawback is a latency of 100ms, for example, which would make a production environment practically unusable.
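
To make the relationship between outstanding I/Os, IOPS and latency a bit more tangible, Little's Law gives a quick sanity check: average latency ≈ total outstanding I/Os / IOPS. Below is a small sketch with made-up numbers; the IOPS value is hypothetical and "one worker per data VMDK" is an assumption, not something stated in the setup above.

```python
# Little's Law sanity check: average latency ≈ outstanding I/Os / IOPS.
# The IOPS figure below is an arbitrary example, not a measured result.

def avg_latency_ms(total_outstanding_ios: int, iops: float) -> float:
    """Average latency (ms) implied by a given queue depth and throughput."""
    return total_outstanding_ios / iops * 1000.0

vms = 4
workers_per_vm = 8        # assumption: one IOmeter worker per data VMDK
oio_per_worker = 8        # "# of Outstanding I/Os" setting used in the tests

total_oio = vms * workers_per_vm * oio_per_worker   # 256 outstanding I/Os
example_iops = 100_000                              # hypothetical cluster-wide result

print(f"{avg_latency_ms(total_oio, example_iops):.2f} ms")  # -> 2.56 ms
```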
Anyway, I thought it could be interesting to have another data point in a different latency region (maybe for comparison with other products that don't reach such nice latency numbers with the same settings), so I ran the first test again with 32 OIOs:

[Chart: 4k profile at 32 outstanding I/Os (4k_iometer_ntnx_oio)]

Findings (in no particular order):
-For the Nutanix platform a single test run of each workload profile would have been sufficient: the results of the individual runs were very close together, and the results of the individual IOmeter workers/threads were very similar as well.

-Depending on the workload, the CVMs on the Nutanix nodes utilized their allocated 8 vCPUs very heavily during the benchmark (between 80-100%). As my nodes only had 8 physical cores per socket, this led to a total vSphere cluster CPU usage of ~50% just for the CVMs (see the quick calculation after this list). With the current node models offering up to 24 cores per socket this wouldn't be such a big deal.

-After each IOmeter run completed, I noticed that the Nutanix nodes still showed a lot of activity on the SSDs and on the network. Depending on the workload it took up to 20 minutes until the systems were idle again.
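
As a quick back-of-the-envelope check of the CVM CPU overhead mentioned in the second finding (the 24-core figure is just the comparison from that finding, not something I tested):

```python
# Back-of-the-envelope calculation of the CVM CPU overhead described above.

nodes = 4
sockets_per_node = 2
cores_per_socket = 8        # Intel E5-2670 on these NX3050 nodes
cvm_vcpus = 8               # per CVM, one CVM per node
cvm_utilization = 1.0       # worst case observed: ~100% busy

cluster_cores = nodes * sockets_per_node * cores_per_socket   # 64 physical cores
cvm_core_demand = nodes * cvm_vcpus * cvm_utilization         # up to 32 cores

print(f"CVM share of cluster cores: {cvm_core_demand / cluster_cores:.0%}")  # -> 50%

# With newer node models at e.g. 24 cores per socket, the same CVM demand
# would only be 32 / (4 * 2 * 24) ≈ 17% of the cluster's physical cores.
cores_per_socket_new = 24
new_share = cvm_core_demand / (nodes * sockets_per_node * cores_per_socket_new)
print(f"CVM share with 24-core sockets: {new_share:.0%}")  # -> 17%
```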

Comments

  1. We’ve been running almost identical lab tests this week. We upgraded the NOS from 4.1.1 to 4.1.2, but since then we’ve had major issues with iometer tests running like a dog for the first 2-3 minutes of the test IF WE HAD LEFT I/O QUIET FOR 20 MINUTES PRIOR TO THE TEST. I’d be interested to see whether you have the same issue. Running iometer tests straight after a previous test gave great results. It looked very much like something was going “cold” after 20 minutes of inactivity.
