anndata - Annotated data#
anndata is a Python package for handling annotated data matrices in memory and on disk, positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface.
Discuss development on GitHub.
Read the documentation.
Ask questions on the scverse Discourse.
Install via pip install anndata or conda install anndata -c conda-forge.
See Scanpy’s documentation for usage related to single cell data. anndata was initially built for Scanpy.
anndata is part of the scverse project (website, governance) and is fiscally sponsored by NumFOCUS. Please consider making a tax-deductible donation to help the project pay for developer time, professional services, travel, workshops, and a variety of other needs.

Citation#
If you use anndata in your work, please cite the anndata pre-print as follows:
anndata: Annotated data
Isaac Virshup, Sergei Rybakov, Fabian J. Theis, Philipp Angerer, F. Alexander Wolf
bioRxiv 2021 Dec 19. doi: 10.1101/2021.12.16.473007.
You can cite the scverse publication as follows:
The scverse project provides a computational ecosystem for single-cell omics data analysis
Isaac Virshup, Danila Bredikhin, Lukas Heumos, Giovanni Palla, Gregor Sturm, Adam Gayoso, Ilia Kats, Mikaela Koutrouli, Scverse Community, Bonnie Berger, Dana Pe’er, Aviv Regev, Sarah A. Teichmann, Francesca Finotello, F. Alexander Wolf, Nir Yosef, Oliver Stegle & Fabian J. Theis
Nat Biotechnol. 2023 Apr 10. doi: 10.1038/s41587-023-01733-8.
Latest additions#
Version 0.11#
0.11.0 the future#
Features
Bugfix
Documentation
Performance
Breaking
Version 0.10#
0.10.4 the future#
Bugfix
Only try to use Categorical.map(na_action=…) in actually supported Pandas ≥2.1 #1226 @flying-sheep
AnnData.__sizeof__() support for backed datasets #1230 @Neah-Ko
adata[:, []] now returns an AnnData object empty on the appropriate dimensions instead of erroring #1243 @ilan-gold
Documentation
Performance
0.10.3 2023-10-31#
Bugfix
Prevent pandas from causing infinite recursion when setting a slice of a categorical column #1211 @flying-sheep
Documentation
Stop showing “Support for Awkward Arrays is currently experimental” warnings when reading, concatenating, slicing, or transposing AnnData objects #1182 @flying-sheep
Other updates
Fail canary CI job when tests raise unexpected warnings. #1182 @flying-sheep
0.10.2 2023-10-11#
Bugfix
Added compatibility layer for packages relying on anndata._core.sparse_dataset.SparseDataset. Note that this API is deprecated and new code should use CSRDataset, CSCDataset, and sparse_dataset() instead. #1185 @ivirshup
Handle deprecation warning from pd.Categorical.map thrown during anndata.concat #1189 @flying-sheep @ivirshup
Fixed extra steps being included in IO tracebacks #1193 @flying-sheep
as_dense argument of write_h5ad no longer writes an array without encoding metadata #1193 @flying-sheep
Performance
Improved performance of concat_on_disk with dense arrays in some cases #1169 @selmanozleyen
0.10.1 2023-10-08#
Bugfix
0.10.0 2023-10-06#
Features
GPU Support
Dense and sparse CuPy arrays are now supported #1066 @ivirshup
Once you have CuPy arrays in your anndata, use it with: rapids-singlecell from v0.9+
anndata now has GPU enabled CI. Made possible by a grant from CZI’s EOSS program and managed via Cirun #1066 #1084 @Zethson @ivirshup
Out of core
Concatenate on-disk anndata objects with anndata.experimental.concat_on_disk() #955 @selmanozleyen
AnnData can now hold dask arrays with scipy.sparse.spmatrix chunks #1114 @ivirshup
Public API for interacting with on disk sparse arrays: sparse_dataset(), CSRDataset, and CSCDataset #765 @ilan-gold @ivirshup
Improved performance for simple slices of OOC sparse arrays #1131 @ivirshup
Improved errors and warnings
Improved error messages when combining dataframes with duplicated column names #1029 @ivirshup
Improved warnings when modifying views of AlignedMappings #1016 @flying-sheep @ivirshup
AnnDataReadErrors have been removed. The original error is now thrown with additional information in a note #1055 @ivirshup
Documentation
Added zarr examples to file format docs #1162 @ivirshup
Breaking changes
anndata.AnnData.transpose() no longer copies unnecessarily. If you rely on the copying behavior, call .copy() on the resulting object. #1114 @ivirshup
Other updates
Bump minimum python version to 3.9 #1117 @flying-sheep
Deprecations
Deprecate anndata.read, which was just an alias for anndata.read_h5ad() #1108 @ivirshup.
dtype argument to AnnData constructor is now deprecated #1153 @ivirshup
Bug fixes
Fix shape inference on initialization when X=None is specified #1121 @flying-sheep
See Release notes for more.
News#
Muon paper published 2022-02-02#
Muon has been published in Genome Biology [^cite_bredikhin22].
Muon is a framework for multimodal data built on top of AnnData.
COVID-19 datasets distributed as h5ad 2020-04-01#
In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files: covid19cellatlas.org.