Save data as OME-ZARR - Halfway to I2K 2025: OME-ZARR basics

OME-ZARR is an emerging standard for storing bioimaging data in a cloud-friendly format. In this tutorial, we will learn how to save microscopy images as OME-ZARR files using Python.

We are going to explore two of the many libraries that support writing OME-ZARR files, ngff-zarr and ome-zarr-py:

import ngff_zarr as nz
from ome_zarr.writer import write_image
from ome_zarr.io import parse_url
from ome_zarr.format import FormatV05
from ome_zarr.scale import Scaler
from dask import array as da
import zarr
import shutil
from bioio import BioImage

from skimage import data

A simple example: ngff-zarr

Before we go big, let’s test and interact with OME-ZARR data from a few common Python libraries. We use the cells3d dataset from skimage as an example. The data comes in ZCYX order:

image = data.cells3d()
image.shape

(60, 2, 256, 256)

We first lazily convert the image to an NgffImage like this:

ngff_image = nz.to_ngff_image(
    data=image,
    dims=['z', 'c', 'y', 'x'],
    name='cells3d')
ngff_image

NgffImage(data=dask.array<array, shape=(60, 2, 256, 256), dtype=uint16, chunksize=(60, 2, 256, 256), chunktype=numpy.ndarray>, dims=['z', 'c', 'y', 'x'], scale={'z': 1.0, 'y': 1.0, 'x': 1.0}, translation={'z': 0.0, 'y': 0.0, 'x': 0.0}, name='cells3d', axes_units=None, axes_orientations=None, computed_callbacks=[])

We can also inspect what ngff_zarr intends to do with the data when saving it as OME-ZARR:

ngff_image.data

Chunking

That’s a bit boring. The real strength of OME-ZARR is its chunked nature, which splits large datasets into smaller pieces (chunks) that can be accessed independently. Let’s specify chunk sizes when converting to an NgffImage. For this, we turn the image into a lazy dask array first, and then pass it on to to_ngff_image.

lazy_array = da.from_array(image, chunks=(10, 1, 64, 64))  # z, c, y, x
reordered_array = lazy_array.transpose(1, 0, 2, 3)  # c, z, y, x

ZCYX is a somewhat uncommon dimension order, so we transpose the array to CZYX and pass the matching dimension names when converting to an NgffImage. After the transpose, the chunk size is (1, 10, 64, 64), meaning that each chunk contains 1 channel, 10 z-slices, and a 64x64 pixel area in y and x.
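To get a feeling for what a chunk size means in practice, the following minimal sketch counts how many chunks the cells3d array is split into. It uses plain Python only; the helper name `chunk_grid` is hypothetical and not part of dask or ngff-zarr.

```python
import math


def chunk_grid(shape, chunks):
    """Return the number of chunks along each axis (ceiling division)."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


# Shape and chunk size in the original z, c, y, x order
shape_zcyx = (60, 2, 256, 256)
chunks_zcyx = (10, 1, 64, 64)

# Transposing to c, z, y, x reorders shape and chunk size in the same way
order = (1, 0, 2, 3)
shape_czyx = tuple(shape_zcyx[i] for i in order)
chunks_czyx = tuple(chunks_zcyx[i] for i in order)

grid = chunk_grid(shape_czyx, chunks_czyx)
n_chunks = math.prod(grid)
print(f"shape={shape_czyx}, chunks={chunks_czyx}, grid={grid}, total={n_chunks}")
```

Smaller chunks give finer-grained access but produce more pieces to manage, so the chunk size is a trade-off rather than a value with a single correct answer.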

Let’s assume that we know the pixel scaling for the image data. We can specify this as well when converting to an NgffImage:

ngff_image = nz.to_ngff_image(
    data=reordered_array,
    dims=['c', 'z', 'y', 'x'],
    name='cells3d',
    scale={'z': 0.5, 'y': 0.5, 'x': 0.5}
    )
ngff_image.data

Multiscales

Another important feature of OME-ZARR is the support for multiscale data. This means that multiple resolutions of the same image are stored together, which allows for efficient visualization and analysis of large images. Conveniently, ngff_zarr and ome_zarr_py can generate the multiscale images automatically and calculate the correct metadata (scale, etc.) for us.

In other words, when zooming out, lower resolution versions of the image can be used, which are faster to load and render. Let’s try it!

ngff_multiscales = nz.to_multiscales(
    data=ngff_image,
    scale_factors = [2, 4]
)

ngff_multiscales

Multiscales(images=[NgffImage(data=dask.array<rechunk-merge, shape=(2, 60, 256, 256), dtype=uint16, chunksize=(2, 60, 128, 128), chunktype=numpy.ndarray>, dims=['c', 'z', 'y', 'x'], scale={'z': 0.5, 'y': 0.5, 'x': 0.5}, translation={'z': 0.0, 'y': 0.0, 'x': 0.0}, name='cells3d', axes_units=None, axes_orientations=None, computed_callbacks=[]), NgffImage(data=dask.array<setitem, shape=(2, 30, 128, 128), dtype=uint16, chunksize=(2, 30, 128, 128), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 1.0, 'y': 1.0, 'x': 1.0}, translation={'z': 0.25, 'y': 0.25, 'x': 0.25}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[]), NgffImage(data=dask.array<setitem, shape=(2, 15, 64, 64), dtype=uint16, chunksize=(2, 15, 64, 64), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 2.0, 'y': 2.0, 'x': 2.0}, translation={'z': 0.75, 'y': 0.75, 'x': 0.75}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[])], metadata=Metadata(axes=[Axis(name='c', type='channel', unit=None, orientation=None), Axis(name='z', type='space', unit=None, orientation=None), Axis(name='y', type='space', unit=None, orientation=None), Axis(name='x', type='space', unit=None, orientation=None)], datasets=[Dataset(path='scale0/cells3d', coordinateTransformations=[Scale(scale=[1.0, 0.5, 0.5, 0.5], type='scale'), Translation(translation=[0.0, 0.0, 0.0, 0.0], type='translation')]), Dataset(path='scale1/cells3d', coordinateTransformations=[Scale(scale=[1.0, 1.0, 1.0, 1.0], type='scale'), Translation(translation=[0.0, 0.25, 0.25, 0.25], type='translation')]), Dataset(path='scale2/cells3d', coordinateTransformations=[Scale(scale=[1.0, 2.0, 2.0, 2.0], type='scale'), Translation(translation=[0.0, 0.75, 0.75, 0.75], type='translation')])], coordinateTransformations=None, omero=None, name='cells3d', version='0.4', type='itkwasm_gaussian', metadata=MethodMetadata(description='Smoothed with a discrete gaussian filter to generate a scale space, 
ideal for intensity images. ITK-Wasm implementation is extremely portable and SIMD accelerated.', method='itkwasm_downsample.downsample', version='1.8.0')), scale_factors=[2, 4], method=<Methods.ITKWASM_GAUSSIAN: 'itkwasm_gaussian'>, chunks={'c': 128, 'z': 128, 'y': 128, 'x': 128})
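The shapes, scales, and translations in the output above follow a simple pattern. The sketch below reproduces them in plain Python. The translation rule used here, (factor - 1) / 2 times the base pixel size, is our reading of the printed values and not a documented ngff-zarr formula; the function name `pyramid_levels` is hypothetical, and floor division of the shape is an assumption that happens to be exact for this dataset.

```python
def pyramid_levels(shape, base_scale, scale_factors, spatial_dims=("z", "y", "x")):
    """Compute shape, scale and translation per level for the spatial axes.

    Level 0 is the full-resolution image (factor 1); each entry in
    scale_factors is the total downsampling relative to level 0.
    """
    levels = []
    for factor in [1] + list(scale_factors):
        levels.append(
            {
                "factor": factor,
                # Fewer pixels along every downsampled axis
                "shape": {d: shape[d] // factor for d in spatial_dims},
                # Each pixel covers a larger physical extent
                "scale": {d: base_scale[d] * factor for d in spatial_dims},
                # Centre of the first coarse pixel, shifted relative to level 0
                "translation": {
                    d: (factor - 1) / 2 * base_scale[d] for d in spatial_dims
                },
            }
        )
    return levels


levels = pyramid_levels(
    shape={"z": 60, "y": 256, "x": 256},
    base_scale={"z": 0.5, "y": 0.5, "x": 0.5},
    scale_factors=[2, 4],
)
for level in levels:
    print(level)
```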

Exercise

Explore a bit how the multiscale data is structured.

How do you find the individual scales?
How do you find the scale factors?

ngff_multiscales.images

[NgffImage(data=dask.array<rechunk-merge, shape=(2, 60, 256, 256), dtype=uint16, chunksize=(2, 60, 128, 128), chunktype=numpy.ndarray>, dims=['c', 'z', 'y', 'x'], scale={'z': 0.5, 'y': 0.5, 'x': 0.5}, translation={'z': 0.0, 'y': 0.0, 'x': 0.0}, name='cells3d', axes_units=None, axes_orientations=None, computed_callbacks=[]),
 NgffImage(data=dask.array<setitem, shape=(2, 30, 128, 128), dtype=uint16, chunksize=(2, 30, 128, 128), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 1.0, 'y': 1.0, 'x': 1.0}, translation={'z': 0.25, 'y': 0.25, 'x': 0.25}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[]),
 NgffImage(data=dask.array<setitem, shape=(2, 15, 64, 64), dtype=uint16, chunksize=(2, 15, 64, 64), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 2.0, 'y': 2.0, 'x': 2.0}, translation={'z': 0.75, 'y': 0.75, 'x': 0.75}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[])]

Saving to disk

Finally, we can save the multiscale OME-ZARR to disk like this. An important parameter here is version, which specifies the OME-NGFF version to use. Currently, version 0.5 is the latest released version of the OME-NGFF specification.

nz.to_ngff_zarr(
    store='cells3d.ome.zarr',
    multiscales=ngff_multiscales,
    version='0.5'
)

Lastly, use your file browser of choice to navigate to the saved cells3d.ome.zarr folder and explore its contents. You should see a structure similar to this:

cells3d.ome.zarr/
├── zarr.json
├── scale0/
│   ├── zarr.json
│   └── cells3d/
│       └── c/
│           ├── 0/
│           └── 1/
├── scale1/
│   ├── zarr.json
│   └── cells3d/
│       └── c/
│           ├── 0/
│           └── 1/
...

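This layout reflects both the multiscale structure (scale0, scale1, ...) and the chunking we specified when creating the NgffImage: c/ is the prefix under which Zarr v3 stores chunks, and the 0/ and 1/ folders below it are the chunk indices along the first (channel) axis, which has a chunk size of 1.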
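If you prefer to inspect the folder from Python rather than a file browser, a small helper based only on the standard library can print the layout. This is a minimal sketch: `list_tree` is a hypothetical helper, and it treats the store as an ordinary directory, which only works for stores saved on the local file system.

```python
from pathlib import Path


def list_tree(root, max_depth=3):
    """Return indented lines describing the folder structure below root."""
    root = Path(root)
    lines = [f"{root.name}/"]

    def walk(folder, depth):
        # Stop descending once the requested depth is exceeded
        if depth > max_depth:
            return
        for entry in sorted(folder.iterdir()):
            indent = "    " * depth
            if entry.is_dir():
                lines.append(f"{indent}{entry.name}/")
                walk(entry, depth + 1)
            else:
                lines.append(f"{indent}{entry.name}")

    walk(root, 1)
    return lines


if __name__ == "__main__":
    store = Path("cells3d.ome.zarr")  # the folder written in the step above
    if store.exists():
        print("\n".join(list_tree(store)))
```

Increasing max_depth shows the nested chunk folders, which is a quick way to see how a different chunk size changes what is written to disk.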

Exercise

Play around with different chunk sizes and see how this affects the structure of the saved OME-ZARR file.

Optional: Repeat with ome-zarr-py

The ome-zarr-py library provides similar functionality for saving OME-ZARR files. Let’s repeat the steps above with this library.

ome-zarr-py mirrors the read/write functionality of zarr-python more closely, which may or may not be an advantage depending on your use case. First, specify a target directory for the OME-ZARR file. Any existing copy is deleted first so that we start from a clean, empty location:

target_folder = r'cells3d_ome_zarr_py.ome.zarr'
shutil.rmtree(target_folder, ignore_errors=True)

store = parse_url(target_folder, mode='w').store
root = zarr.group(store=store)

Multiscales and metadata

Applying the scaling and setting the scale information is slightly different here. First, we need to create a Scaler object, which takes care of the downsampling for us. The downscale parameter controls the downsampling factor between each scale level, and max_layer specifies how many levels to create. The method parameter defines the downsampling method to use.

max_layer = 2
factor = 2
scaler = Scaler(
    downscale=factor,
    max_layer=max_layer,
    method='local_mean'
)

If we want to pass scale information, we need to create the relevant metadata structure ourselves and pass it to the writer:

scale = [1.0, 0.5, 0.5, 0.5]  # c, z, y, x

transformations = []
for i in range(0, max_layer + 1):
    # Pixel size grows with each downsampling step, so multiply by the factor.
    # This assumes every spatial axis is downsampled; check the written array shapes.
    scales = [1] + [s * (factor ** i) for s in scale[1:]]
    transformations.append(
        [{'type': 'scale', 'scale': scales}]
    )

transformations

[[{'type': 'scale', 'scale': [1, 0.5, 0.5, 0.5]}],
 [{'type': 'scale', 'scale': [1, 1.0, 1.0, 1.0]}],
 [{'type': 'scale', 'scale': [1, 2.0, 2.0, 2.0]}]]
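Because each pyramid level has fewer pixels covering the same physical extent, the pixel size stored in the metadata has to grow with the level, and only along the axes that were actually downsampled. Which axes a given downsampler shrinks is best verified from the shapes of the written arrays. The sketch below, with the hypothetical helper `build_scale_transformations`, builds the metadata for two assumed cases: all spatial axes halved per level, and only y and x halved.

```python
def build_scale_transformations(base_scale, per_axis_factor, n_levels):
    """Build one 'scale' coordinate transformation per pyramid level.

    per_axis_factor gives the downsampling applied to each axis between
    consecutive levels; use 1 for axes that are not downsampled.
    """
    transformations = []
    for level in range(n_levels):
        scales = [s * f**level for s, f in zip(base_scale, per_axis_factor)]
        transformations.append([{"type": "scale", "scale": scales}])
    return transformations


base_scale = [1.0, 0.5, 0.5, 0.5]  # c, z, y, x

# Assumed case 1: z, y and x are all halved at every level
isotropic = build_scale_transformations(base_scale, [1, 2, 2, 2], n_levels=3)

# Assumed case 2: only y and x are halved, z keeps its full resolution
yx_only = build_scale_transformations(base_scale, [1, 1, 2, 2], n_levels=3)

for name, result in (("isotropic", isotropic), ("yx_only", yx_only)):
    print(name)
    for level in result:
        print("  ", level)
```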

Exercise

Use the write_image function to save the multiscale OME-ZARR file and the metadata we created to disk.

write_image?

[]

Reading

While you can use the libraries above to read OME-ZARR, there is merit in using a more general library that can read multiple formats. If you work with several formats, this keeps your workflows format-agnostic. One such library is BioIO, the successor of the well-known aicsimageio library. Here’s how to read an OME-ZARR file with BioIO:

image = BioImage('cells3d.ome.zarr/')
image

<BioImage [plugin: bioio-ome-zarr installed at 2025-11-06 14:52:44.497175, Image-is-in-Memory: False]>

Exercise

Try to find out the scaling information from the loaded image.
