

anndata - Annotated data

anndata is a Python package for handling annotated data matrices in memory and on disk, positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface.

If you use anndata in your work, please cite the anndata pre-print as follows:

anndata: Annotated data

Isaac Virshup, Sergei Rybakov, Fabian J. Theis, Philipp Angerer, F. Alexander Wolf

bioRxiv 2021 Dec 19. doi: 10.1101/2021.12.16.473007.

You can cite the scverse publication as follows:

The scverse project provides a computational ecosystem for single-cell omics data analysis

Isaac Virshup, Danila Bredikhin, Lukas Heumos, Giovanni Palla, Gregor Sturm, Adam Gayoso, Ilia Kats, Mikaela Koutrouli, Scverse Community, Bonnie Berger, Dana Pe’er, Aviv Regev, Sarah A. Teichmann, Francesca Finotello, F. Alexander Wolf, Nir Yosef, Oliver Stegle & Fabian J. Theis

Nat Biotechnol. 2023 Apr 10. doi: 10.1038/s41587-023-01733-8.


Muon paper published 2022-02-02

Muon has been published in Genome Biology [^cite_bredikhin22]. Muon is a framework for multimodal data built on top of AnnData.

Check out Muon and its data structure MuData.

COVID-19 datasets distributed as h5ad 2020-04-01

In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files.

Latest additions

Version 0.9

0.9.1 2023-04-11


0.9.0 2023-04-11



Breaking changes

Other updates


Bug fixes

Version 0.8

0.8.1 the future

Bug fixes

  • Fix warning from rename_categories #790 I Virshup

  • Remove backwards compat checks for categories in uns when we can tell the file is new enough #790 I Virshup

  • Categorical arrays are now created with a Python bool instead of a numpy.bool_ #856


0.8.0 14th March, 2022

IO Specification


The on-disk format of AnnData objects has been updated with this release. Previous releases of anndata will not be able to read all files written by this version.

For discussion of possible future solutions to this issue, see #698

Internal handling of IO has been overhauled. This should make it much easier to support new datatypes, use partial access, and use AnnData internally in other formats.

  • Each element should be tagged with an encoding_type and encoding_version. See updated docs on the file format

  • Support for nullable integer and boolean data arrays. More data types to come!

  • Experimental support for low level access to the IO API via read_elem() and write_elem()
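To make the tagging idea above more concrete, here is a pure-Python sketch of how readers could be looked up by an element's encoding_type and encoding_version. This is not anndata's implementation: the registry, the function names, the dictionary layout of an "element", and the version strings are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry: (encoding_type, encoding_version) -> reader function.
READERS: Dict[Tuple[str, str], Callable[[dict], Any]] = {}


def register_reader(encoding_type: str, encoding_version: str):
    """Decorator that records a reader for one (type, version) pair."""
    def decorator(func: Callable[[dict], Any]) -> Callable[[dict], Any]:
        READERS[(encoding_type, encoding_version)] = func
        return func
    return decorator


@register_reader("array", "0.2.0")  # version string is illustrative
def read_array(elem: dict) -> list:
    return list(elem["data"])


@register_reader("categorical", "0.2.0")  # version string is illustrative
def read_categorical(elem: dict) -> list:
    # Codes index into the categories; a negative code marks a missing value.
    cats = elem["categories"]
    return [cats[c] if c >= 0 else None for c in elem["codes"]]


def read_element(elem: dict) -> Any:
    """Dispatch on the element's tags; fail loudly on unknown encodings."""
    attrs = elem["attrs"]
    key = (attrs["encoding_type"], attrs["encoding_version"])
    if key not in READERS:
        raise ValueError(f"No reader registered for encoding {key}")
    return READERS[key](elem)


# Usage: a made-up stored element tagged as a categorical array.
stored = {
    "attrs": {"encoding_type": "categorical", "encoding_version": "0.2.0"},
    "codes": [0, 1, -1],
    "categories": ["a", "b"],
}
decoded = read_element(stored)
```

One appeal of this kind of design is that supporting a new datatype only requires registering another reader, while files with an unrecognized tag produce an explicit error instead of being misread.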


Features

  • Added PyTorch dataloader AnnLoader and lazy concatenation object AnnCollection. See the tutorials #416 S Rybakov

  • Compatibility with h5ad files written from Julia #569 I Kats

  • Many logging messages that should have been warnings are now warnings #650 I Virshup

  • Significantly more efficient anndata.read_umi_tools() #661 I Virshup

  • Fixed deepcopy of a copy of a view retaining sparse matrix view mixin type #670 M Klein

  • In many cases X can now be None #463 R Cannoodt #677 I Virshup. Remaining work is documented in #467.

  • Removed hard xlrd dependency I Virshup

  • obs and var dataframes are no longer copied by default on AnnData instantiation #371 I Virshup

Bug fixes

  • Fixed issue where .copy was creating sparse matrix views when copying #670 michalk8

  • Fixed issue where .X matrix read in from zarr would always have float32 values #701 I Virshup

  • Raw.to_adata() now includes obsp in the output #404 G Eraslan


Dependencies

  • xlrd dropped as a hard dependency

  • Now requires h5py v3.0.0 or newer