Contribute
This project follows the standard open-source GitHub workflow, which is described in other projects such as matplotlib or scikit-image.
Testing
To run self-contained tests, from Python:
import hdf5plugin.test
hdf5plugin.test.run_tests()
Or, from the command line:
python -m hdf5plugin.test
To also run tests relying on actual HDF5 files, run from the source directory:
python test/test.py
This tests the installed version of hdf5plugin.
Building documentation
Documentation relies on Sphinx.
To build documentation, run from the project root directory:
python setup.py build
PYTHONPATH=build/lib.<os>-<machine>-<pyver>/ sphinx-build -b html doc/ build/html
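The <os>-<machine>-<pyver> part of the build directory depends on the platform and the Python version. As a convenience, the following sketch lists the candidate directories. find_build_lib is a hypothetical helper, not part of hdf5plugin, and it assumes the build output lives under build/lib.* as in the command above.

```python
from pathlib import Path


def find_build_lib(root="."):
    """Return the build/lib.* directories found under root, sorted by name.

    Hypothetical helper: it only assumes the layout used in the command
    above, i.e. <root>/build/lib.<os>-<machine>-<pyver>/.
    """
    build_dir = Path(root, "build")
    return sorted(p for p in build_dir.glob("lib.*") if p.is_dir())


# Example, from the project root after `python setup.py build`:
#     for path in find_build_lib():
#         print(path)
```

The printed path can then be used as the PYTHONPATH value for sphinx-build.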
Guidelines to add a compression filter
This briefly describes the steps to add an HDF5 compression filter to the zoo.
Add the source of the HDF5 filter and compression algorithm code in a subdirectory in src/[filter]. Best is to use git subtree, else copy the files there (including the license file). A released version of the filter + compression library should be used.
git subtree command: git subtree add --prefix=src/[filter] [git repository] [release tag] --squash
Update setup.py to build the filter dynamic library by adding an extension using the HDF5PluginExtension class (a subclass of setuptools.Extension), which adds extra files and compile options to enable dynamic loading of the filter. The name of the extension should be hdf5plugin.plugins.libh5<filter_name>.
Add a “CONSTANT” in src/hdf5plugin/__init__.py named with the FILTER_NAME, whose value is the HDF5 filter ID (see HDF5 registered filters).
Add a "<filter_name>": <FILTER_ID_CONSTANT> entry in hdf5plugin.FILTERS. You must use the same filter_name as in the extension in setup.py (without the libh5 prefix). The names in FILTERS are used to find the available filter libraries.
In case of import errors related to HDF5-related undefined symbols, add any missing functions to src/hdf5_dl.c.
Add a compression options helper function named <filter_name>_options in hdf5plugin/__init__.py, which should return a dictionary containing the values for the compression and compression_opts arguments of h5py.Group.create_dataset. This is intended to ease the usage of compression_opts.
Add tests:
In test/test.py, for testing reading a compressed file that was produced with another software.
In src/hdf5plugin/test.py, for tests that write data using the compression filter and the compression options helper function and read back the data.
Update the doc/information.rst file to document:
The version of the HDF5 filter that is embedded in hdf5plugin.
The license of the filter (by adding a link to the license file).
Update the doc/usage.rst file to document:
The hdf5plugin.<FilterName> compression argument helper class.
Update doc/contribute.rst to document the format of compression_opts expected by the filter (see Low-level compression filter arguments).
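To make the constant, the FILTERS entry and the options helper from the steps above more concrete, here is a minimal sketch for a made-up filter. Everything specific in it is an assumption for illustration: the filter name myfilter, the ID value 256 (a placeholder, not a registered HDF5 filter ID) and the single level option do not correspond to any real filter.

```python
# Hypothetical filter ID constant: a placeholder value, NOT a registered ID.
MYFILTER = 256

# Maps filter names to filter IDs, in the spirit of hdf5plugin.FILTERS.
# The key must match the extension name without the "libh5" prefix.
FILTERS = {"myfilter": MYFILTER}


def myfilter_options(level=5):
    """Return keyword arguments for h5py.Group.create_dataset.

    `level` is a hypothetical option of the made-up filter.
    """
    level = int(level)
    if not 0 <= level <= 9:
        raise ValueError("level must be in the range 0-9")
    return {"compression": MYFILTER, "compression_opts": (level,)}


# Intended usage with h5py (not executed here):
#     group.create_dataset("data", data=array, **myfilter_options(level=3))
```

Returning a dictionary lets callers unpack the helper's result directly into create_dataset, so they never have to build the compression_opts tuple by hand.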
Low-level compression filter arguments
Compression filters can be configured with the compression_opts argument of the h5py.Group.create_dataset method by providing a tuple of integers.
The meaning of those integers is filter dependent and is described below.
bitshuffle
compression_opts: (block_size, lz4 compression)
block size: Number of elements (not bytes) per block. It MUST be a multiple of 8. Default: 0 for a block size of about 8 kB.
lz4 compression: 0: disabled (default), 2: enabled.
By default the filter uses bitshuffle, but does NOT compress with LZ4.
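As an illustration, the tuple described above could be assembled as follows. bitshuffle_opts is a hypothetical helper written for this example, not a function of hdf5plugin; it only encodes the two values listed above.

```python
def bitshuffle_opts(block_size=0, lz4=False):
    """Build the (block_size, lz4 compression) tuple for bitshuffle.

    Hypothetical helper: block_size is a number of elements (0 selects
    the default), lz4 toggles LZ4 compression (encoded as 2 when enabled).
    """
    block_size = int(block_size)
    if block_size < 0 or block_size % 8 != 0:
        raise ValueError("block_size must be a non-negative multiple of 8")
    return (block_size, 2 if lz4 else 0)


# Default: bitshuffle only, no LZ4 compression.
default_opts = bitshuffle_opts()

# Blocks of 1024 elements, with LZ4 compression enabled.
lz4_opts = bitshuffle_opts(block_size=1024, lz4=True)
```

The resulting tuple is what would be passed as compression_opts, together with the bitshuffle filter ID as compression.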
blosc
compression_opts: (0, 0, 0, 0, compression level, shuffle, compression)
First 4 values are reserved.
compression level: From 0 (no compression) to 9 (maximum compression). Default: 5.
shuffle: Shuffle filter:
0: no shuffle
1: byte shuffle
2: bit shuffle
compression: The compressor blosc ID:
0: blosclz (default)
1: lz4
2: lz4hc
3: snappy
4: zlib
5: zstd
By default the filter uses byte shuffle and blosclz.
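The seven-value tuple is easy to get wrong by hand, so a small builder can help. The sketch below is a hypothetical helper (not part of hdf5plugin) that only uses the values listed above; the string names for the shuffle modes are chosen for this example.

```python
# Compressor IDs as listed above.
BLOSC_COMPRESSORS = {
    "blosclz": 0, "lz4": 1, "lz4hc": 2, "snappy": 3, "zlib": 4, "zstd": 5,
}

# Shuffle modes as listed above (the string keys are illustrative names).
BLOSC_SHUFFLES = {"none": 0, "byte": 1, "bit": 2}


def blosc_opts(level=5, shuffle="byte", compressor="blosclz"):
    """Build the blosc compression_opts tuple (hypothetical helper)."""
    level = int(level)
    if not 0 <= level <= 9:
        raise ValueError("compression level must be in the range 0-9")
    # The first 4 values are reserved and set to 0.
    return (0, 0, 0, 0, level,
            BLOSC_SHUFFLES[shuffle], BLOSC_COMPRESSORS[compressor])


# Defaults: level 5, byte shuffle, blosclz.
default_opts = blosc_opts()

# Maximum compression with bit shuffle and zstd.
zstd_opts = blosc_opts(level=9, shuffle="bit", compressor="zstd")
```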
lz4
compression_opts: (block_size,)
block size: Number of bytes per block. Default: 0 for a block size of 1 GB. It MUST be < 1.9 GB.
zfp
For more information, see zfp modes and hdf5-zfp generic interface.
The first value of compression_opts is mode. The following values depend on the value of mode:
Fixed-rate mode: (1, 0, rateHigh, rateLow, 0, 0). The rate, i.e., the number of compressed bits per value, is a double stored as:
rateHigh: High 32-bit word of the rate double.
rateLow: Low 32-bit word of the rate double.
Fixed-precision mode: (2, 0, prec, 0, 0, 0)
prec: Number of uncompressed bits per value.
Fixed-accuracy mode: (3, 0, accHigh, accLow, 0, 0). The accuracy, i.e., the absolute error tolerance, is a double stored as:
accHigh: High 32-bit word of the accuracy double.
accLow: Low 32-bit word of the accuracy double.
Expert mode: (4, 0, minbits, maxbits, maxprec, minexp)
minbits: Minimum number of compressed bits used to represent a block.
maxbits: Maximum number of bits used to represent a block.
maxprec: Maximum number of bit planes encoded.
minexp: Smallest absolute bit plane number encoded.
Reversible mode: (5, 0, 0, 0, 0, 0)
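Since the rate and accuracy are doubles split over two 32-bit integers, a sketch of the conversion may help. The helper names are hypothetical, and the word order is an assumption: the code reads the 8 bytes of the double in native byte order as two consecutive unsigned 32-bit integers and uses them as the first and second word. On a little-endian machine the first word then holds the numerically least significant bits, so check this against the hdf5-zfp generic interface before relying on it.

```python
import struct


def split_double(value):
    """Split a double into two unsigned 32-bit words.

    Assumption: the words follow the native in-memory layout of the
    double (first 4 bytes, then last 4 bytes).
    """
    return struct.unpack("=II", struct.pack("=d", float(value)))


def zfp_fixed_rate_opts(rate):
    """compression_opts for fixed-rate mode (mode 1)."""
    first, second = split_double(rate)
    return (1, 0, first, second, 0, 0)


def zfp_fixed_precision_opts(prec):
    """compression_opts for fixed-precision mode (mode 2)."""
    return (2, 0, int(prec), 0, 0, 0)


def zfp_fixed_accuracy_opts(tolerance):
    """compression_opts for fixed-accuracy mode (mode 3)."""
    first, second = split_double(tolerance)
    return (3, 0, first, second, 0, 0)


# Example: 8 compressed bits per value.
rate_opts = zfp_fixed_rate_opts(8.0)
```

Packing the two words back with struct.pack("=II", ...) and unpacking as "=d" recovers the original double, which is a quick way to check a tuple built by hand.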