{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Abstract\n", "\n", "[**hdf5plugin**](https://github.com/silx-kit/hdf5plugin) is a *Python* package (1) providing a set of [**HDF5**](https://portal.hdfgroup.org/display/HDF5/) compression filters (namely: Blosc, Blosc2, BitShuffle, BZip2, FciDecomp, LZ4, SZ, SZ3, ZFP, ZStandard) and (2) enabling their use from the *Python* programming language with [**h5py**](https://docs.h5py.org/), a thin, pythonic wrapper around `libHDF5`.\n", "\n", "This presentation illustrates how to use **hdf5plugin** for reading and writing compressed datasets from *Python* and gives an overview of the different HDF5 compression filters it provides.\n", "\n", "License: [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbsphinx": "hidden", "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "# Notebook requirements\n", "# A recent version of Pillow is required!\n", "#%pip install numpy h5py hdf5plugin h5glance rise jupyterlab matplotlib ipympl Pillow" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbsphinx": "hidden", "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "# Creates data.h5 used for demos\n", "from matplotlib import pyplot as plt\n", "import h5py\n", "import hdf5plugin\n", "import numpy\n", "from PIL import Image\n", "import urllib.request\n", "\n", "url = \"https://www.desy.de/e409/e116959/e119238/media/7795/1963_Luftbild_DESY-Gelaende_1284_gf_01sw_a3.jpg\"\n", "filename = urllib.request.urlretrieve(url)[0]\n", "image = numpy.array(Image.open(filename).convert(\"L\"))\n", "plt.imshow(image, cmap=\"gray\")\n", "\n", "h5file = h5py.File(\"data.h5\", mode=\"w\")\n", "h5file[\"copyright\"] = \"DESY\"\n", "h5file.attrs[\"url\"] = url\n", "h5file.create_dataset(\"/data\", data=image)\n", "h5file.create_dataset(\n", " 
\"/compressed_data\",\n", " data=image,\n", " chunks=image.shape,\n", " **hdf5plugin.Blosc2('lz4', filters=hdf5plugin.Blosc2.BITSHUFFLE)\n", ")\n", "h5file.close()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Restart kernel once the file is created!**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import os\n", "os._exit(0) # Makes the kernel restart" ] }, { "cell_type": "markdown", "metadata": { "nbsphinx": "hidden", "slideshow": { "slide_type": "slide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " \n", "\n", "