Testing netCDF Formats in Python


I had been using various versions of netCDF3 to store Lagrangian drifter tracks. My group had discussed moving to netCDF4 to reduce storage through compression and potentially to use multiple unlimited dimensions. One reason I held back was that I had read I would not be able to use MFDataset to combine drifter tracks if I switched to netCDF4. When I recently started running simulations with more and more drifters, things were really getting bogged down, so I decided it was worth the time to investigate.

Big picture: netCDF4-CLASSIC works well for my purposes so far and is a big improvement over netCDF3.

netCDF4-CLASSIC combines many of the good parts of netCDF3 with some of the abilities of netCDF4. It allows compression. It does not allow multiple unlimited dimensions, but I haven't needed that. It does allow the use of MFDataset for aggregation. It also sped up saving by a huge amount (about 96% less save time in the test below); for some reason, saving had been getting massively bogged down with the large number of drifters used recently.

I opted to stick with the 64-bit format for saving the tracks and to use lossless compression.

Test

I ran the shelf_transport project with 1 km initial drifter spacing throughout the domain and no volume transport for 2004-01-01 (drifters run for 30 days). This ran 267622 surface drifters, with the tracks sampled every 48 minutes. Once a drifter exits the domain, a NaN is stored for that drifter at that time and at every later time.

This produces a large amount of output, which had really slowed down getting through runs when using netCDF3_64BIT.
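The NaN convention can be sketched with NumPy as follows. This is only an illustration under assumed names, not the tracking code itself: `mask_after_exit`, `track`, and `inside` are hypothetical, with `inside` marking at each output time whether a drifter is still in the domain.

```python
import numpy as np


def mask_after_exit(track, inside):
    """Return a copy of track with NaN from the first exit onward.

    track  : (ntrac, nt) positions (hypothetical layout)
    inside : (ntrac, nt) booleans, True while the drifter is in the domain
    """
    track = np.asarray(track, dtype=float)
    inside = np.asarray(inside, dtype=bool)
    # Cumulative count of "outside" steps; positive from the first exit onward
    exited = np.cumsum(~inside, axis=-1) > 0
    return np.where(exited, np.nan, track)
```

For example, a drifter with positions `[1, 2, 3, 4]` that leaves the domain at the third output time comes back as `[1, 2, nan, nan]`, even if a later position would fall inside the domain again.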

Results

Note that the simulation run time does not include the time for saving the tracks. Also, I ran the netCDF4 test after the other two, while 7 other simulations were running, so I suspect its longer run time is mostly due to sharing memory. Regardless, the file size is the same for netCDF4 and netCDF4-CLASSIC. Better timing tests could be done in the future.


                          netCDF3_64BIT   netCDF4-CLASSIC   % decrease (netCDF3_64BIT to netCDF4-CLASSIC)   netCDF4
Simulation run time [s]   1038            1038              0                                               1423
File save time [s]        3527            131               96                                              152
File size [GB]            3.6             2.1               42                                              2.1
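The percent decrease column compares netCDF4-CLASSIC against netCDF3_64BIT. A small sketch of that arithmetic, with a helper name (`percent_decrease`) introduced here just for illustration:

```python
def percent_decrease(old, new):
    """Percent reduction going from old to new, relative to old."""
    return 100.0 * (old - new) / old


# Values taken from the results table (netCDF3_64BIT, then netCDF4-CLASSIC)
save_time = round(percent_decrease(3527, 131))  # file save time [s]
file_size = round(percent_decrease(3.6, 2.1))   # file size [GB]
run_time = round(percent_decrease(1038, 1038))  # simulation run time [s]
```

Rounded to whole percentages, this gives 96 for the save time, 42 for the file size, and 0 for the simulation run time, matching the table.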

More Information