Wednesday, August 17, 2016

Sparse Files and Multiple Avenues of Attack

I'm working with sparse files today. I have a 1TB qcow2 file that contains all of about 100GB of actual data. The file is currently residing on a FreeNAS box, and I needed to get it to a Proxmox box elsewhere on the network. The network is full gigabit, but the FreeNAS box itself is a Frankenstein box, cobbled together out of whatever hardware we had available. I think there's a Barton core Athlon XP in there. Yes, it's 2016, and I just said Barton core Athlon XP.

It does have a couple of gigabit cards in there for expansion, but whether or not the rest of the system can saturate a gigabit link is another story. But then, there is yet one more piece to this. Does scp respect sparse files?

The short answer is no, it doesn't. But there's more to this adventure. Just for grins, I went ahead with the scp anyway. This is a local, gigabit network, so I used -c arcfour to ease the pressure on the CPU. I ended up getting between 60 and 70 MB/s across the network, and scp estimated that this would take about 3.5 hours to complete.

scp was launched in screen from the Proxmox machine, so I let that run. I hopped over to the FreeNAS box and checked gstat. The disks were bored. Whatever the bottleneck was, it wasn't the disks. I'm not sure I'd be so quick to jump all over the CPU for being the bottleneck either. It handled itself pretty well (even though I did end up abandoning the scp attempt).

But given that the disks weren't stressed, I decided to get a copy to an external drive that I had plugged in. Since this was FreeNAS I just made the external drive its own zpool, and cp'd the disk image to the external drive. Before doing that, I did a quick du -h on the file to verify the size. Roughly 100GB.
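For anyone who wants the same check from a script, here is a minimal Python sketch of the comparison du is doing: the size the file claims to have versus the space it actually occupies. The function names are made up for this example, and it assumes a Unix-like system (Linux or FreeBSD) where st_blocks is counted in 512-byte units.

```python
import os

def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    apparent = st.st_size            # the length the file claims to have
    allocated = st.st_blocks * 512   # assumption: st_blocks is in 512-byte units
    return apparent, allocated

def describe(path):
    """Format both sizes in decimal gigabytes for a quick eyeball check."""
    apparent, allocated = file_sizes(path)
    return (f"{path}: apparent {apparent / 1e9:.1f} GB, "
            f"on disk {allocated / 1e9:.1f} GB")
```

On a sparse file the second number comes out smaller than the first. On a compressed file system like ZFS it shrinks further, because it reflects the blocks actually stored.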

I'll admit, I got sloppy and left the machines to their own devices. I puttered around the house, made breakfast, watched some Voyager, talked with the wife about after-school activities for the kids (Fencing for my 7-year-old! Woooo!). When I came back about 2 hours later, the copy command had finished, and there was still an hour and a half left on the scp command.

I hopped into the FreeNAS web GUI, exported the external drive, and since I wasn't in the office I had to email for some help getting the external drive from FreeNAS to Proxmox.

That didn't take very long at all. I imported the zpool in Proxmox from the command line, and began copying the image from the external drive to the image directory where the qcow2 file would live from here on out. I also took the opportunity to cancel the scp command. There was basically no way that the scp command with an hour and a half left on it was going to beat out the cp command. Beyond that, the actual file size of the scp'd file was well past 750GB, more than seven times the size of the actual data in the file.

And to no one's surprise, the cp command finished in like 15-20 minutes. The interesting piece that I wasn't expecting was that the actual size of the file had gone from ~100GB to almost 240GB. I knew that the cp command was respecting the sparse file, because partway through the copy the listed size was at about 600GB while the actual size on disk was only a bit over 200GB, which is when I first noticed the discrepancy. If cp weren't preserving the holes, listed size and size on disk would be the same.
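To make "respecting sparse files" a little more concrete, here is a rough Python sketch of the idea: when a chunk of the source is all zeros, skip ahead in the destination instead of writing the zeros out. This is an illustration only, not a claim about what cp does internally. The sparse_copy name and the 1 MiB chunk size are arbitrary choices for the example, and whether the skipped regions really end up as holes depends on the destination file system.

```python
import os

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch

def sparse_copy(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst, seeking over all-zero chunks instead of writing them."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:
                break
            if block.count(0) == len(block):
                # Nothing but zero bytes: advance the write position and
                # let the file system leave a hole if it supports them.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # If the source ended in zeros, nothing was written at the tail,
        # so extend the destination to its full length explicitly.
        dst.truncate(dst.tell())
```

A tool that doesn't do something like this writes every zero out in full, which is how a transfer can end up hundreds of gigabytes larger than the data it actually carries.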

It took me well longer than it should have, but I figured it out a few minutes later. The origin file system was ZFS, the external drive file system was ZFS, the destination file system was ext4.

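Duh. lz4 compression. The ~100GB that du showed me on the ZFS side was the size after compression. ext4 doesn't compress anything, so the same data takes up almost 240GB there. But that is a wonderful example of how awesome compression is in ZFS.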
