
anndata - Annotated Data

Install via pip install anndata or conda install anndata -c bioconda.

Report issues and see the code on GitHub.

AnnData provides a scalable way of keeping track of data together with learned annotations. It is used within Scanpy, for which it was initially developed. Both packages have been introduced in Genome Biology (2018).

See all releases here. The following lists selected improvements.

December 16, 2018: on GitHub and 0.6.16

  1. layers(), inspired by .loom files, allows lossless reading of their information via read_loom()
  2. initialization from pandas DataFrames
  3. iteration over chunks via chunked_X() and chunk_X()
  4. support for reading zarr files: read_zarr()
  5. changed default compression to None in write_h5ad() to speed up reading and writing; disk space use is usually less critical (v0.6.16)
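
The idea behind chunk-wise iteration can be sketched in plain Python. This is only an illustration of the concept and does not call anndata: the helper name iter_row_chunks and the (chunk, start, end) tuple it yields are assumptions made for this example, not the documented behavior of chunked_X().

```python
from typing import Iterator, List, Sequence, Tuple

Row = Sequence[float]


def iter_row_chunks(
    rows: Sequence[Row], chunk_size: int
) -> Iterator[Tuple[List[Row], int, int]]:
    """Yield (chunk, start, end) for consecutive blocks of rows.

    Hypothetical helper: only one block of rows is materialized at a time,
    which is the point of iterating over a large matrix in chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        end = min(start + chunk_size, len(rows))
        yield list(rows[start:end]), start, end


# Toy 5 x 2 matrix standing in for a data matrix (rows are observations).
matrix = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0], [5.0, 5.0]]

# Collect the row ranges covered by each chunk of at most 2 rows.
chunk_bounds = [(start, end) for _, start, end in iter_row_chunks(matrix, 2)]
```

The last chunk is shorter than chunk_size when the number of rows is not a multiple of it, so callers should rely on the reported bounds rather than on a fixed chunk length.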

May 1, 2018: version 0.6

  1. compatibility with Seurat converter
  2. tremendous speedup for concatenate()
  3. bug fix for deep copy of unstructured annotation after slicing
  4. bug fix for reading HDF5 stored single-category annotations
  5. ‘outer join’ concatenation: fills missing entries with zeros when concatenating sparse data and with NaNs when concatenating dense data
  6. better memory efficiency in loom exports
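
To make the ‘outer join’ behavior concrete, here is a minimal plain-Python sketch. It does not use anndata: the function outer_join_concat, the (var_names, rows) representation, and the first-seen ordering of variables are all assumptions for illustration. Only the fill rule (zeros for sparse data, NaNs for dense data) comes from the note above.

```python
import math
from typing import List, Sequence, Tuple

Dataset = Tuple[List[str], List[List[float]]]  # (variable names, rows)


def outer_join_concat(
    datasets: Sequence[Dataset], fill_value: float
) -> Tuple[List[str], List[List[float]]]:
    """Stack rows of all datasets over the union of their variables.

    Hypothetical helper: entries for variables a dataset lacks are set
    to fill_value.
    """
    # Union of variable names, kept in first-seen order (an assumption).
    all_vars: List[str] = []
    for var_names, _ in datasets:
        for name in var_names:
            if name not in all_vars:
                all_vars.append(name)

    combined: List[List[float]] = []
    for var_names, rows in datasets:
        column = {name: j for j, name in enumerate(var_names)}
        for row in rows:
            combined.append(
                [row[column[n]] if n in column else fill_value for n in all_vars]
            )
    return all_vars, combined


batch_a: Dataset = (["g1", "g2"], [[1.0, 2.0]])
batch_b: Dataset = (["g2", "g3"], [[3.0, 4.0]])

# Sparse-style fill: missing entries become zeros.
sparse_vars, sparse_like = outer_join_concat([batch_a, batch_b], fill_value=0.0)

# Dense-style fill: missing entries become NaN.
dense_vars, dense_like = outer_join_concat([batch_a, batch_b], fill_value=math.nan)
```

Zeros keep a sparse matrix sparse, whereas NaN in a dense matrix marks a value as missing rather than measured as zero.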

February 9, 2018: version 0.5

  1. inform about duplicates in var_names and resolve them using var_names_make_unique()
  2. automatically remove unused categories after slicing
  3. read/write .loom files using loompy 2
  4. fixed read/write for a few text file formats
  5. read UMI tools files: read_umi_tools()
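
Resolving duplicate variable names can be sketched as follows. This is a plain-Python illustration, not anndata's implementation: the helper make_names_unique, the "-" separator, and the numbering scheme are assumptions chosen for this example, and var_names_make_unique() may differ in detail.

```python
from typing import Dict, List, Set


def make_names_unique(names: List[str], join: str = "-") -> List[str]:
    """Return names with repeated entries disambiguated by a numeric suffix.

    Hypothetical helper: the first occurrence of a name is kept unchanged,
    and later occurrences get join + counter appended.
    """
    taken: Set[str] = set(names)      # names that must not be reused
    seen: Set[str] = set()            # original names encountered so far
    next_suffix: Dict[str, int] = {}  # next counter to try per name
    unique: List[str] = []

    for name in names:
        if name not in seen:
            seen.add(name)
            unique.append(name)
            continue
        k = next_suffix.get(name, 1)
        candidate = f"{name}{join}{k}"
        # Skip suffixes that would collide with an existing name.
        while candidate in taken:
            k += 1
            candidate = f"{name}{join}{k}"
        taken.add(candidate)
        next_suffix[name] = k + 1
        unique.append(candidate)
    return unique


deduplicated = make_names_unique(["a", "b", "a", "a", "c"])
no_collision = make_names_unique(["a", "a-1", "a"])
```

Checking candidates against the set of existing names avoids creating a new duplicate when a suffixed name such as "a-1" is already present in the input.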

December 23, 2017: version 0.4

  1. read/write .loom files
  2. scalability beyond dataset sizes that fit into memory: see this blog post
  3. AnnData has a raw attribute that simplifies storing the data matrix when you consider it “raw”: see the clustering tutorial