Simple benchmarks

Here, we perform simple benchmarks to demonstrate basic performance.

import anndata as ad
import scanpy as sc
adata = sc.datasets.pbmc3k()
AnnData object with n_obs × n_vars = 2700 × 32738
    var: 'gene_ids'

Reading & writing

Let us start by writing and reading anndata's native HDF5-based file format, .h5ad:

adata.write_h5ad('test.h5ad')
CPU times: user 93.9 ms, sys: 17.4 ms, total: 111 ms
Wall time: 118 ms
adata = ad.read_h5ad('test.h5ad')
CPU times: user 51.2 ms, sys: 13.3 ms, total: 64.5 ms
Wall time: 64.1 ms

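Reading and writing .h5ad files is much faster than the same operations on loom files, which are timed below: about 118 ms versus 3.31 s for writing, and 64 ms versus 1.28 s for reading. The efficiency gain is due to the explicit storage of the sparse matrix structure.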
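To make "explicit storage of the sparse matrix structure" concrete, here is a minimal pure-Python sketch of compressed sparse row (CSR) storage. It assumes a CSR-style layout with three arrays (`data`, `indices`, `indptr`). The helper names `dense_to_csr` and `csr_to_dense` are hypothetical and written only for illustration; they are not part of anndata. The point is that only the nonzero entries and their positions are kept, which matters for count matrices that are mostly zeros.

```python
# Illustrative sketch only: hypothetical helpers, not anndata's implementation.

def dense_to_csr(rows):
    """Convert a list of equal-length rows into CSR arrays."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # the nonzero value itself
                indices.append(col)   # its column index
        indptr.append(len(data))      # where the next row starts in `data`
    return data, indices, indptr


def csr_to_dense(data, indices, indptr, n_cols):
    """Rebuild the dense rows from the three CSR arrays."""
    rows = []
    for start, end in zip(indptr[:-1], indptr[1:]):
        row = [0] * n_cols
        for k in range(start, end):
            row[indices[k]] = data[k]
        rows.append(row)
    return rows


# A tiny, mostly-zero "count matrix" (3 observations x 6 variables).
counts = [
    [0, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 2],
]

data, indices, indptr = dense_to_csr(counts)
dense_cells = sum(len(row) for row in counts)
stored_values = len(data) + len(indices) + len(indptr)
print(f"dense cells: {dense_cells}, values stored in CSR form: {stored_values}")
```

For this toy matrix the saving is small, but the stored size grows with the number of nonzero entries rather than with the full 2700 × 32738 shape, so the gap widens quickly for sparse data.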

adata.write_loom('test.loom')
CPU times: user 2.82 s, sys: 457 ms, total: 3.27 s
Wall time: 3.31 s
adata = ad.read_loom('test.loom')
CPU times: user 1.05 s, sys: 221 ms, total: 1.28 s
Wall time: 1.28 s
/Users/alexwolf/repos/anndata/anndata/_core/ ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
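The "CPU times" and "Wall time" lines in this section have the format printed by IPython's `%%time` cell magic. Outside a notebook, a comparable measurement can be sketched with the standard library alone. The context manager `timed` below is a hypothetical helper introduced for illustration. It reports process CPU time (user plus system, via `time.process_time`) and wall-clock time (via `time.perf_counter`), and times a stand-in workload rather than actual file I/O.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Measure CPU and wall-clock time of the enclosed block (illustrative helper)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
        if results is not None:
            results[label] = {"cpu": cpu, "wall": wall}
        print(f"{label}: CPU time {cpu * 1e3:.1f} ms, wall time {wall * 1e3:.1f} ms")


timings = {}

# Stand-in workload; in this section the body would be a read or write call.
with timed("sum of squares", timings):
    total = sum(i * i for i in range(100_000))
```

A single run like this is noisy. Repeating the measurement a few times and comparing the fastest runs gives a more stable picture when benchmarking file formats.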