Contribute¶
This project follows the standard GitHub workflow for open-source projects, as described by other projects such as matplotlib or scikit-image.
Testing¶
To run self-contained tests, from Python:
import hdf5plugin.test
hdf5plugin.test.run_tests()
Or, from the command line:
python -m hdf5plugin.test
To also run tests relying on actual HDF5 files, run from the source directory:
python test/test.py
This tests the installed version of hdf5plugin.
Building documentation¶
Documentation relies on Sphinx.
To build documentation, run from the project root directory:
python setup.py build
PYTHONPATH=build/lib.<os>-<machine>-<pyver>/ sphinx-build -b html doc/ build/html
Guidelines to add a compression filter¶
This briefly describes the steps to add an HDF5 compression filter to the zoo.
Add the source of the HDF5 filter and compression algorithm code in a src/[filter] subdirectory. Preferably use git subtree; otherwise copy the files there (including the license file). Use a released version of the filter and compression library. git subtree command:
git subtree add --prefix=src/[filter] [git repository] [release tag] --squash
Update setup.py to build the filter dynamic library by adding an extension using the HDF5PluginExtension class (a subclass of setuptools.Extension), which adds the extra files and compile options needed to enable dynamic loading of the filter. The name of the extension should be hdf5plugin.plugins.libh5<filter_name>.
Add a “CONSTANT” in src/hdf5plugin/__init__.py, named after the FILTER_NAME, whose value is the HDF5 filter ID (see HDF5 registered filters).
Add a "<filter_name>": <FILTER_ID_CONSTANT> entry in hdf5plugin.FILTERS. You must use the same filter_name as in the extension in setup.py (without the libh5 prefix). The names in FILTERS are used to find the available filter libraries.
In case of import errors related to HDF5-related undefined symbols, add any missing functions to src/hdf5_dl.c.
Add a compression options helper function named <filter_name>_options in src/hdf5plugin/__init__.py, which should return a dictionary containing the values for the compression and compression_opts arguments of h5py.Group.create_dataset. This is intended to ease the usage of compression_opts.
Add tests:
In test/test.py, a test that reads a compressed file produced with another software.
In src/hdf5plugin/test.py, tests that write data using the compression filter and the compression options helper function, then read the data back.
Update the doc/information.rst file to document:
The version of the HDF5 filter that is embedded in hdf5plugin.
The license of the filter (by adding a link to the license file).
Update the doc/usage.rst file to document:
The hdf5plugin.<FilterName> compression argument helper class.
Update doc/contribute.rst to document the format of compression_opts expected by the filter (see Low-level compression filter arguments).
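As a rough illustration of the constant, the FILTERS entry and the options helper described in the steps above, here is a minimal sketch for a hypothetical filter. The name myfilter, the ID 32999 and the single level option are made up for this example and are not part of hdf5plugin; a real filter would use its registered HDF5 filter ID and whatever integers it expects in compression_opts.

```python
# Hypothetical filter used only for illustration; not part of hdf5plugin.
MYFILTER = 32999  # assumed ID; a real filter uses its registered HDF5 filter ID

# Maps filter names to filter IDs. The key must match the extension name
# in setup.py without the "libh5" prefix (here: hdf5plugin.plugins.libh5myfilter).
FILTERS = {"myfilter": MYFILTER}


def myfilter_options(level=3):
    """Return keyword arguments for h5py.Group.create_dataset.

    `level` is an assumed single integer option of the hypothetical filter.
    """
    return {"compression": MYFILTER, "compression_opts": (int(level),)}
```

The returned dictionary is meant to be unpacked into the call, for example group.create_dataset("data", data=array, **myfilter_options(level=5)).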
Low-level compression filter arguments¶
Compression filters can be configured with the compression_opts argument of the h5py.Group.create_dataset method by providing a tuple of integers.
The meaning of those integers is filter dependent and is described below.
bitshuffle¶
compression_opts: (block_size, lz4 compression)
block size: Number of elements (not bytes) per block. It MUST be a multiple of 8. Default: 0 for a block size of about 8 kB.
lz4 compression: 0: disabled (default), 2: enabled.
By default the filter uses bitshuffle, but does NOT compress with LZ4.
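A minimal sketch of building this tuple in Python. The helper name bitshuffle_opts is hypothetical (it is not an hdf5plugin function); it only enforces the multiple-of-8 rule stated above, and rejecting negative sizes is an added assumption.

```python
def bitshuffle_opts(block_size=0, lz4=False):
    """Build the (block_size, lz4 compression) tuple for the bitshuffle filter."""
    # 0 selects the default block size; other values must be multiples of 8
    if block_size < 0 or block_size % 8 != 0:
        raise ValueError("block_size must be 0 or a positive multiple of 8")
    # 0 leaves LZ4 compression disabled, 2 enables it
    return (block_size, 2 if lz4 else 0)
```

The resulting tuple would then be passed as compression_opts to h5py.Group.create_dataset, with compression set to the bitshuffle filter ID.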
blosc¶
compression_opts: (0, 0, 0, 0, compression level, shuffle, compression)
First 4 values are reserved.
compression level: From 0 (no compression) to 9 (maximum compression). Default: 5.
shuffle: Shuffle filter:
0: no shuffle
1: byte shuffle
2: bit shuffle
compression: The compressor blosc ID:
0: blosclz (default)
1: lz4
2: lz4hc
3: snappy
4: zlib
5: zstd
By default the filter uses byte shuffle and blosclz.
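The seven-integer layout can be assembled with a small helper such as the sketch below. The function and dictionary names are hypothetical, not part of hdf5plugin; the numeric values are the ones listed above.

```python
# Compressor IDs and shuffle modes as listed above
BLOSC_COMPRESSORS = {
    "blosclz": 0, "lz4": 1, "lz4hc": 2, "snappy": 3, "zlib": 4, "zstd": 5,
}
BLOSC_SHUFFLES = {"none": 0, "byte": 1, "bit": 2}


def blosc_opts(level=5, shuffle="byte", compressor="blosclz"):
    """Build the 7-integer compression_opts tuple for the blosc filter."""
    if not 0 <= level <= 9:
        raise ValueError("level must be between 0 and 9")
    # The first four values are reserved and left at 0
    return (0, 0, 0, 0, level,
            BLOSC_SHUFFLES[shuffle], BLOSC_COMPRESSORS[compressor])
```

Called without arguments, the helper reproduces the defaults described above (level 5, byte shuffle, blosclz).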
lz4¶
compression_opts: (block_size,)
block size: Number of bytes per block. Default: 0 for a block size of 1 GB. It MUST be < 1.9 GB.
zfp¶
For more information, see zfp modes and hdf5-zfp generic interface.
The first value of compression_opts is mode. The following values depend on the value of mode:
Fixed-rate mode: (1, 0, rateHigh, rateLow, 0, 0)
The rate, i.e., the number of compressed bits per value, is a double stored as:
rateHigh: High 32-bit word of the rate double.
rateLow: Low 32-bit word of the rate double.
Fixed-precision mode: (2, 0, prec, 0, 0, 0)
prec: Number of uncompressed bits per value.
Fixed-accuracy mode: (3, 0, accHigh, accLow, 0, 0)
The accuracy, i.e., the absolute error tolerance, is a double stored as:
accHigh: High 32-bit word of the accuracy double.
accLow: Low 32-bit word of the accuracy double.
Expert mode: (4, 0, minbits, maxbits, maxprec, minexp)
minbits: Minimum number of compressed bits used to represent a block.
maxbits: Maximum number of bits used to represent a block.
maxprec: Maximum number of bit planes encoded.
minexp: Smallest absolute bit plane number encoded.
Reversible mode: (5, 0, 0, 0, 0, 0)
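The sketch below shows one possible way to produce the two 32-bit words for the fixed-rate and fixed-accuracy modes with Python's struct module. The function names are hypothetical. It splits the double in native byte order, which is an assumption: this matches a filter that reads the two consecutive integers back as a single double in memory, and on a little-endian machine the first word then holds the low-order bits. Check the hdf5-zfp generic interface before relying on the word order.

```python
import struct


def split_double(value):
    """Split a double into two unsigned 32-bit words, in native byte order."""
    # "=d" and "=II" both describe 8 bytes in the machine's byte order
    return struct.unpack("=II", struct.pack("=d", float(value)))


def zfp_rate_opts(rate):
    """Fixed-rate mode: mode 1, then the two words of the rate double."""
    word0, word1 = split_double(rate)
    return (1, 0, word0, word1, 0, 0)


def zfp_accuracy_opts(accuracy):
    """Fixed-accuracy mode: mode 3, then the two words of the accuracy double."""
    word0, word1 = split_double(accuracy)
    return (3, 0, word0, word1, 0, 0)


def zfp_precision_opts(prec):
    """Fixed-precision mode: mode 2, then the precision as a plain integer."""
    return (2, 0, int(prec), 0, 0, 0)
```

Packing the two words back with struct.pack("=II", word0, word1) and unpacking them as "=d" returns the original value, which is a quick way to check the split on a given machine.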