Release Notes

Version 0.7

On master

Upcoming changes

0.7.3 2020-05-20

Bug fixes

  • Fixed bug where graphs used too much memory when copying #381 I Virshup

0.7.2 2020-05-15

Concatenation overhaul I Virshup

  • Elements of uns can now be merged, see #350

  • Outer joins now work for layers and obsm, see #352

  • Fill value for outer joins can now be specified

  • Expect improvements in performance, see #303
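To make the outer-join and fill-value behaviour concrete, here is a minimal pure-Python sketch. It is a toy model, not anndata's implementation: the function name concat_outer and the dict-of-lists table layout are assumptions made for illustration only.

```python
def concat_outer(tables, fill_value=None):
    """Concatenate row-wise tables with an outer join over column names.

    Each table is a dict mapping a column name to a list of per-row values.
    Hypothetical helper for illustration; not part of anndata.
    """
    # Collect the union of column names in first-seen order.
    columns = []
    for table in tables:
        for name in table:
            if name not in columns:
                columns.append(name)

    result = {name: [] for name in columns}
    for table in tables:
        # Number of rows in this table (0 if the table has no columns).
        n_rows = len(next(iter(table.values()), []))
        for name in columns:
            # Columns missing from this table are padded with fill_value.
            result[name].extend(table.get(name, [fill_value] * n_rows))
    return result


a = {"g1": [1, 2], "g2": [3, 4]}
b = {"g2": [5], "g3": [6]}
merged = concat_outer([a, b], fill_value=0)
```

With an inner join only g2 would survive; the outer join keeps every column and pads the gaps with the chosen fill value.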

Functionality

  • obsp and varp can now be transposed #370 A Wolf

  • obs_names_make_unique() is now better at making values unique, and will warn if ambiguities arise #345 M Weiden

  • obsp is now preferred for storing pairwise relationships between observations. In practice, this means there will be deprecation warnings and reformatting applied to objects which stored connectivities under uns["neighbors"]. Square matrices in uns will no longer be sliced (use .{obs,var}p instead). #337 I Virshup

  • ImplicitModificationWarning is now exported #315 P Angerer

  • Better support for ndarray subclasses stored in AnnData objects #335 michalk8
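The make-unique behaviour mentioned in the list above can be sketched in plain Python. This is an illustrative assumption about how suffixing and ambiguity warnings might work, not the library's code: the function name make_unique, the "-" separator default, and the warning text are all hypothetical.

```python
import warnings
from collections import Counter


def make_unique(names, join="-"):
    """Append numeric suffixes to repeated names (hypothetical sketch)."""
    counts = Counter(names)
    taken = set(names)   # every name that already exists or has been issued
    seen = set()         # names whose first occurrence has been passed
    next_suffix = {}     # next counter to try for each duplicated name
    result = []
    for name in names:
        if counts[name] == 1 or name not in seen:
            # The first occurrence keeps its original name.
            seen.add(name)
            result.append(name)
            continue
        n = next_suffix.get(name, 1)
        candidate = f"{name}{join}{n}"
        while candidate in taken:
            # The suffixed name collides with an existing one: warn and skip it.
            warnings.warn(f"{candidate!r} already exists; trying the next suffix")
            n += 1
            candidate = f"{name}{join}{n}"
        taken.add(candidate)
        next_suffix[name] = n + 1
        result.append(candidate)
    return result


plain = make_unique(["a", "b", "a"])
ambiguous = make_unique(["a", "b", "a", "a-1"])
```

In the second call the natural candidate "a-1" is already present, so the sketch warns and moves on to "a-2" rather than producing a duplicate.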

Bug fixes

  • Fixed inplace modification of Index objects by the make unique function #348 I Virshup

  • Passing ambiguous keys to obs_vector() and var_vector() now throws errors #340 I Virshup

  • Fixed instantiating AnnData objects from DataFrame #316 P Angerer

  • Fixed indexing into AnnData objects with arrays like adata[adata[:, gene].X > 0] #332 I Virshup

  • Fixed type of version #315 P Angerer

  • Fixed deprecated import from pandas #319 P Angerer

0.7.0 2020-01-22

Warning

Breaking changes introduced between 0.6.22.post1 and 0.7:

  • Elements of AnnData objects no longer have their dimensionality reduced when the main object is subset. This is to maintain consistency when subsetting. See discussion in #145.

  • Internal modules like anndata.core are private and their contents are not stable: See #174.

  • The old deprecated attributes .smp*, .add and .data have been removed.

View overhaul #164

  • Indexing into a view no longer keeps a reference to the intermediate view, see #62.

  • Views are now lazy. Elements of a view of an AnnData object are not indexed until they’re accessed.

  • Indexing with scalars no longer reduces dimensionality of contained arrays, see #145.

  • All elements of AnnData should now follow the same rules about how they’re subset, see #145.

  • Can now index by observations and variables at the same time.
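The view rules above can be illustrated with a small toy model. The classes below (Matrix2D, LazyView) and the helper _as_positions are hypothetical and are not anndata code; they only sketch three ideas from the list: scalar indices keep both dimensions, a view of a view points back at the root object rather than the intermediate view, and no data is copied until it is requested.

```python
def _as_positions(idx, length):
    """Normalise an int, slice, or iterable of ints to a list of positions."""
    if isinstance(idx, int):
        # A scalar becomes a length-1 selection, so dimensionality is kept.
        return [range(length)[idx]]
    if isinstance(idx, slice):
        return list(range(length)[idx])
    return list(idx)


class LazyView:
    """Stores only index positions; data is read from the root on demand."""

    def __init__(self, parent, row_pos, col_pos):
        self.parent = parent
        self.row_pos = row_pos
        self.col_pos = col_pos

    @property
    def shape(self):
        return (len(self.row_pos), len(self.col_pos))

    def __getitem__(self, index):
        rows, cols = index
        # Compose the positions against the root, dropping the intermediate view.
        new_rows = [self.row_pos[i] for i in _as_positions(rows, len(self.row_pos))]
        new_cols = [self.col_pos[j] for j in _as_positions(cols, len(self.col_pos))]
        return LazyView(self.parent, new_rows, new_cols)

    def to_rows(self):
        # Elements are only indexed here, when they are actually accessed.
        return [[self.parent.rows[i][j] for j in self.col_pos] for i in self.row_pos]


class Matrix2D:
    """Toy stand-in for a 2-D container."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    @property
    def shape(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def __getitem__(self, index):
        rows, cols = index
        return LazyView(
            self,
            _as_positions(rows, self.shape[0]),
            _as_positions(cols, self.shape[1]),
        )


m = Matrix2D([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = m[1:, :]   # view over the last two rows
w = v[0, 1]    # scalar indices on both axes at once; still two-dimensional
```

Here w has shape (1, 1) rather than collapsing to a scalar, and w.parent is m, not v.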

IO overhaul #167

  • Reading and writing have been overhauled for simplification and speed.

  • In typical use cases, time and memory usage can be about half of what they were previously.

  • Zarr backend now supports sparse arrays, and generally is closer to having the same features as HDF5.

  • Backed mode should see significant speed and memory improvements for access along compressed dimensions and IO. PR #241.

  • Categoricals can now be ordered (PR #230) and written to disk with a large number of categories (PR #217).

Mapping attributes overhaul (obsm, varm, layers, …)

  • New attributes obsp and varp have been added for two dimensional arrays where each axis corresponds to a single axis of the AnnData object. PR #207.

  • These are intended to store values like cell-by-cell graphs, which are currently stored in uns.

  • Sparse arrays are now allowed as values in all mapping attributes.

  • DataFrames are now allowed as values in obsm and varm.

  • All mapping attributes now share an implementation and will have the same behaviour. PR #164.
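As a rough illustration of what "each axis corresponds to a single axis of the AnnData object" implies for obsp-like containers, the toy mapping below checks that every value is n × n and subsets both axes together. The class PairwiseMapping and its subset method are hypothetical and do not reflect anndata's shared implementation.

```python
from collections.abc import MutableMapping


class PairwiseMapping(MutableMapping):
    """Toy obsp-like mapping: every stored value must have shape (n, n)."""

    def __init__(self, n):
        self.n = n
        self._data = {}

    def __setitem__(self, key, value):
        rows = [list(r) for r in value]
        # Reject anything whose two axes do not both match n.
        if len(rows) != self.n or any(len(r) != self.n for r in rows):
            raise ValueError(f"value for {key!r} must have shape ({self.n}, {self.n})")
        self._data[key] = rows

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def subset(self, positions):
        """Subset rows and columns by the same positions."""
        positions = list(positions)
        out = PairwiseMapping(len(positions))
        for key, rows in self._data.items():
            out[key] = [[rows[i][j] for j in positions] for i in positions]
        return out


p = PairwiseMapping(3)
p["connectivities"] = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
s = p.subset([0, 2])
```

Because both axes refer to the same set of observations, dropping an observation removes its row and its column at once, which is what keeps a cell-by-cell graph consistent under subsetting.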

Miscellaneous improvements

  • Mapping attributes now have IPython tab completion (e.g. typing adata.obsm[" and pressing Tab can provide suggestions) PR #183.

  • AnnData attributes are now delete-able (e.g. del adata.raw) PR #242.

  • Many bug fixes

Version 0.6

0.6.* 2019-*-*

  • better support for aligned mappings (obsm, varm, layers) 0.6.22 #155 thanks to I Virshup

  • convenience accessors obs_vector(), var_vector() for 1d arrays. 0.6.21 #144 thanks to I Virshup

  • compatibility with Scipy >=1.3 by removing IndexMixin dependency. 0.6.20 #151 thanks to P Angerer

  • bug fix for second-indexing into views. 0.6.19 0ab553f thanks to P Angerer

  • bug fix for reading excel files. 0.6.19 90bea2c thanks to A Wolf

  • changed default compression to None in write_h5ad() to speed up read and write, since disk space use is usually less critical. 0.6.16 21d8033 thanks to A Wolf

  • maintain dtype upon copy. 0.6.13 534bea4 thanks to A Wolf

  • layers, inspired by .loom files, allow lossless reading of their information via read_loom(). 0.6.7–0.6.9 #46 & #48 thanks to S Rybakov

  • support for reading zarr files: read_zarr() 0.6.7 #38 thanks to T White

  • initialization from pandas DataFrames 0.6. 648bcc8 thanks to A Wolf

  • iteration over chunks chunked_X() and chunk_X() 0.6.1 #20 thanks to S Rybakov
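Chunked iteration, as in the last item above, amounts to walking the rows in fixed-size blocks. The generator below is a generic sketch of that pattern; the name iter_chunks and the (chunk, start, end) tuple layout are assumptions for illustration, not a statement of what chunked_X() returns.

```python
def iter_chunks(rows, chunk_size):
    """Yield (chunk, start, end) for consecutive blocks of rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        # The final block may be shorter than chunk_size.
        end = min(start + chunk_size, len(rows))
        yield rows[start:end], start, end


chunks = list(iter_chunks([1, 2, 3, 4, 5], chunk_size=2))
```

Processing one block at a time keeps peak memory proportional to the chunk size rather than to the full matrix.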

0.6.0 2018-05-01

  • compatibility with Seurat converter

  • tremendous speedup for concatenate()

  • bug fix for deep copy of unstructured annotation after slicing

  • bug fix for reading HDF5 stored single-category annotations

  • 'outer join' concatenation: adds zeros for concatenation of sparse data and NaNs for dense data

  • better memory efficiency in loom exports

Version 0.5

0.5.0 2018-02-09

Version 0.4

0.4.0 2017-12-23

  • read/write .loom files

  • scalability beyond dataset sizes that fit into memory: see this blog post

  • AnnData has a raw attribute, which simplifies storing the data matrix when you consider it raw: see the clustering tutorial