Windows Server 2012 storage = awesome sauce

We've been playing with the Windows Server 2012 release candidate on a new NAS system, and Storage Spaces plus deduplication make for an impressive combination (see screenshot).

89% deduplication rate
We copied a week's worth of database and disk-image backups from a few servers to a deduplication-enabled volume on the test system. This amounted to 845 GiB of raw, uncompressed data files. After waiting a bit for the deduplication to kick in, we ended up with roughly 90% savings in space (89% according to the screenshot).
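To put that savings figure in perspective, here is a quick back-of-the-envelope sketch in Python. The helper names are made up for illustration; the only inputs taken from the test are the 845 GiB of raw data and the 89% (screenshot) and roughly 90% (rounded) savings figures.

```python
def stored_size_gib(raw_gib: float, savings_pct: float) -> float:
    """Space actually consumed on disk after deduplication."""
    return raw_gib * (1 - savings_pct / 100)


def dedup_ratio(savings_pct: float) -> float:
    """Express a savings percentage as an N:1 reduction ratio."""
    return 100 / (100 - savings_pct)


raw = 845  # GiB copied to the volume (from the test above)
for pct in (89, 90):
    print(f"{pct}% savings: {stored_size_gib(raw, pct):.1f} GiB on disk, "
          f"about {dedup_ratio(pct):.1f}:1")
```

Either way, the week of backups shrinks to well under 100 GiB on disk.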

This is the kind of result usually seen on purpose-built and reassuringly expensive deduplication appliances such as those from Data Domain.

The data copy process itself was also quite interesting. We configured twelve 2 TB 7200 RPM drives into a Windows Storage Spaces pool, and set up a 5 TB NTFS volume on them in parity mode. Storage Spaces gives you much of the flexibility of something like ZFS or Drobo: you create a pool of raw disks, and can carve it up into thin-provisioned volumes with different RAID and size policies. These volumes can be formatted with NTFS, ReFS, or shared out as raw iSCSI to other systems. Disks of different sizes can be added or removed and the pool will re-balance data automatically.
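For anyone who hasn't looked at how parity protection works, here is a minimal Python sketch of generic single-parity (RAID-5-style) striping. It is not Storage Spaces' actual on-disk layout, which I haven't seen documented in detail; the three-data-disk stripe and the `xor_blocks` helper are assumptions made purely for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three hypothetical data disks plus one parity disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding it from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

The XOR itself is cheap, which fits with parity math not being the expensive part of a parity volume on modern CPUs.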

We copied the files from another NAS using ROBOCOPY with two threads, and the Windows Server 2012 system was able to write out the data at full network speed (about 120 MiB/s) while using just 2% of a single Xeon E5-2620. Parity calculations are not a bottleneck here. Microsoft also supposedly has some tricks in Storage Spaces to prevent the "software RAID-5 write hole" for parity volumes, a la ZFS. The actual deduplication process took a few hours after the data was ingested, as it is a post-process system that runs at a low priority in the background.
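As a rough sanity check on the ingest, the sketch below estimates how long 845 GiB takes at about 120 MiB/s. It assumes the rate was sustained for the whole copy, which we did not measure precisely, and the function name is just for illustration.

```python
def copy_time_hours(size_gib: float, rate_mib_s: float) -> float:
    """Hours needed to move size_gib at a sustained rate_mib_s."""
    seconds = size_gib * 1024 / rate_mib_s  # 1 GiB = 1024 MiB
    return seconds / 3600


print(f"{copy_time_hours(845, 120):.1f} hours")  # prints "2.0 hours"
```

So the copy is on the order of two hours, about the same order as the few hours the background deduplication pass took afterwards.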

There are caveats with the new deduplication feature, making it unsuitable for things like live VM disks or live databases. But it's certainly great for backup data, archival data, and general purpose file sharing. Management of the Storage Spaces and Deduplication features is dead-simple through the GUI, with sensible defaults. There is also a wealth of PowerShell commands to let you dig into the details not exposed in the GUI.
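Backup data tends to deduplicate well because each night's backup repeats most of the previous night's content. The toy Python sketch below shows the idea with fixed-size chunks and a hash-indexed chunk store. It is an illustration only, not how Windows implements it: the 4 KiB chunk size, the synthetic "backups", and the `dedup_store` function are all assumptions, and real deduplication engines often use variable-size chunking instead.

```python
import hashlib

CHUNK = 4096  # hypothetical fixed chunk size, chosen only for this example


def dedup_store(files, chunk_size=CHUNK):
    """Return (logical_bytes, stored_bytes) after chunk-level deduplication."""
    store = {}  # chunk hash -> chunk length, one entry per unique chunk
    logical = 0
    for data in files:
        logical += len(data)
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            store.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return logical, sum(store.values())


# Seven synthetic nightly "backups" of 100 chunks each; only one chunk
# differs from the base image on any given night.
base = [bytes([n]) * CHUNK for n in range(100)]
backups = []
for night in range(7):
    chunks = list(base)
    chunks[night] = bytes([200 + night]) * CHUNK
    backups.append(b"".join(chunks))

logical, stored = dedup_store(backups)
print(f"savings: {100 * (1 - stored / logical):.0f}%")
```

Because the chunks are only hashed and indexed after the files land on disk, a post-process design like this costs nothing at write time, at the price of temporarily needing the full undeduplicated space.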

Finally, you can't beat the cost, which is basically "free" if you were already buying Windows Server 2012 anyway.
