Release notes#
Version 0.10#
0.10.10.dev20+g072c433 2024-11-07#
Bug fixes#
Documentation#
- Correct anndata.AnnData.X type to include CSRDataset and CSCDataset as possible types @ilan-gold (#1616)
Features#
- Add support for ellipsis indexing of the AnnData object @ilan-gold (#1729)
0.10.9 2024-08-28#
Bug fixes#
- Fix writing large number of columns for h5 files @ilan-gold @selmanozleyen (#1147)
- Add warning for setting X on a view with repeated indices @ilan-gold (#1501)
- Coerce numpy.matrix classes to arrays when trying to store them in AnnData @flying-sheep (#1516)
- Fix for setting a dense X view with a sparse matrix @ilan-gold (#1532)
- Upper bound numpy for gpu installation on account of cupy/cupy#8391 @ilan-gold (#1540)
- Upper bound dask on account of #1579 @ilan-gold (#1580) 
- Ensure setting pandas.DataFrame.index on a view of an AnnData instantiates the DataFrame from the view @ilan-gold (#1586)
- Disallow using DataFrames with multi-index columns @ilan-gold (#1589)
Development Process#
Documentation#
- Add callback typing for read_dispatched() and write_dispatched() @ilan-gold (#1557)
Performance#
- Support for concat_on_disk outer join @ilan-gold (#1504)
0.10.8 2024-06-20#
Bug fixes#
- Write out 64bit indptr when appropriate for concat_on_disk() #1493 @ilan-gold
- Support for Numpy 2 #1499 @flying-sheep 
- Fix sparse_dataset() docstring test on account of new scipy version #1514 @ilan-gold
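The 64bit indptr fix above comes down to a size check. In a compressed sparse matrix the last indptr entry equals the number of stored values, so a concatenated result needs 64-bit offsets once the combined count no longer fits in a signed 32-bit integer. A minimal sketch of that check in plain Python follows; `indptr_dtype_for` is a hypothetical helper for illustration, not part of anndata, and the actual logic in concat_on_disk() may differ.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def indptr_dtype_for(nnz_per_matrix):
    """Pick an offset dtype name for concatenating several sparse matrices.

    Hypothetical helper: the final indptr entry of the result equals the
    total number of stored values, so that total must fit in the dtype.
    """
    total_nnz = sum(nnz_per_matrix)
    return "int64" if total_nnz > INT32_MAX else "int32"


# Two small matrices fit comfortably in 32-bit offsets.
small = indptr_dtype_for([1_000, 2_000])

# Two matrices with 2**30 stored values each exceed the int32 range together.
large = indptr_dtype_for([2**30, 2**30])
```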
Documentation#
- Improved example for sparse_dataset() #1468 @ivirshup
0.10.7 2024-04-09#
Bug fixes#
- Handle upstream numcodecs bug where read-only string arrays cannot be encoded @ivirshup #1421
- Use in-memory sparse matrix directly to fix compatibility with scipy 1.13 @ilan-gold #1435
Performance#
- Remove vindex for subsetting dask.array.Array because of its slowness and memory consumption @ilan-gold #1432
0.10.6 2024-03-11#
Bug fixes#
- Defer import of zarr in test helpers, as scanpy CI job relies on them #1343 @ilan-gold 
- Writing a dataframe with non-unique column names now throws an error, instead of silently overwriting #1335 @ivirshup 
- Bring optimization from #1233 to indexing on the whole AnnData object, not just the sparse dataset itself #1365 @ilan-gold
- Fix mean slice length checking to use improved performance when indexing backed sparse matrices with boolean masks along their major axis #1366 @ilan-gold 
- Fixed overflow occurring when writing dask arrays with sparse chunks by always writing dask arrays with 64 bit indptr and indices, and adding an overflow check to the .append method of sparse on disk structures #1348 @ivirshup
- Modified ValueError message for invalid .X during construction to show a more helpful list instead of ambiguous __name__ #1395 @eroell
- Pin array-api-compat!=1.5 to avoid incorrect implementation of asarray #1411 @ivirshup
Documentation#
Development#
0.10.5 2024-01-25#
Bug fixes#
- Fix outer concatenation along variables when only a subset of objects had an entry in layers #1291 @ivirshup 
- Fix comparison of >2d arrays in uns during concatenation #1300 @ivirshup
- Fix bug (introduced in 0.10.4) where indexing an AnnData with list[bool] would return the wrong result #1332 @ivirshup
Documentation#
- Re-add search-as-you-type, this time via readthedocs-sphinx-search #1311 @flying-sheep
Performance#
- BaseCompressedSparseDataset’s indptr is cached #1266 @ilan-gold
- Improved performance when indexing backed sparse matrices with boolean masks along their major axis #1233 @ilan-gold 
0.10.4 2024-01-04#
Bug fixes#
- Only try to use Categorical.map(na_action=…) in actually supported Pandas ≥2.1 #1226 @flying-sheep
- AnnData.__sizeof__() support for backed datasets #1230 @Neah-Ko
- adata[:, []] now returns an AnnData object empty on the appropriate dimensions instead of erroring #1243 @ilan-gold
- adata.X[mask] works in newer numpy versions when X is backed #1255 @ilan-gold
- adata.X[...] fixed for X as a BaseCompressedSparseDataset with zarr backend #1265 @ilan-gold
- Improve read/write error reporting #1273 @flying-sheep 
Documentation#
- Improve aligned mapping error messages #1252 @flying-sheep 
0.10.3 2023-10-31#
Bug fixes#
- Prevent pandas from causing infinite recursion when setting a slice of a categorical column #1211 @flying-sheep 
Documentation#
- Stop showing “Support for Awkward Arrays is currently experimental” warnings when reading, concatenating, slicing, or transposing AnnData objects #1182 @flying-sheep 
Other updates#
- Fail canary CI job when tests raise unexpected warnings. #1182 @flying-sheep 
0.10.2 2023-10-11#
Bug fixes#
- Added compatibility layer for packages relying on anndata._core.sparse_dataset.SparseDataset. Note that this API is deprecated and new code should use CSRDataset, CSCDataset, and sparse_dataset() instead. #1185 @ivirshup
- Handle deprecation warning from pd.Categorical.map thrown during anndata.concat #1189 @flying-sheep @ivirshup
- Fixed extra steps being included in IO tracebacks #1193 @flying-sheep 
- as_dense argument of write_h5ad no longer writes an array without encoding metadata #1193 @flying-sheep
Performance#
- Improved performance of concat_on_disk with dense arrays in some cases #1169 @selmanozleyen
0.10.1 2023-10-08#
Bug fixes#
0.10.0 2023-10-06#
Features#
GPU Support
- Dense and sparse CuPy arrays are now supported #1066 @ivirshup
  - Once you have CuPy arrays in your anndata, use it with: rapids-singlecell from v0.9+
- anndata now has GPU enabled CI. Made possible by a grant from CZI’s EOSS program and managed via Cirun #1066 #1084 @Zethson @ivirshup
Out of core
- Concatenate on-disk anndata objects with anndata.experimental.concat_on_disk() #955 @selmanozleyen
- AnnData can now hold dask arrays with scipy.sparse.spmatrix chunks #1114 @ivirshup
- Public API for interacting with on disk sparse arrays: sparse_dataset(), CSRDataset, and CSCDataset #765 @ilan-gold @ivirshup
- Improved performance for simple slices of OOC sparse arrays #1131 @ivirshup 
Improved errors and warnings
- Improved error messages when combining dataframes with duplicated column names #1029 @ivirshup 
- Improved warnings when modifying views of AlignedMappings #1016 @flying-sheep @ivirshup
- AnnDataReadErrors have been removed. The original error is now thrown with additional information in a note #1055 @ivirshup
Documentation#
- Added zarr examples to file format docs #1162 @ivirshup 
Breaking changes#
- anndata.AnnData.transpose() no longer copies unnecessarily. If you rely on the copying behavior, call .copy on the resulting object. #1114 @ivirshup
Other updates#
- Bump minimum python version to 3.9 #1117 @flying-sheep 
Deprecations#
- Deprecate anndata.read, which was just an alias for anndata.read_h5ad() #1108 @ivirshup.
- dtype argument to AnnData constructor is now deprecated #1153 @ivirshup
Bug fixes#
- Fix shape inference on initialization when X=None is specified #1121 @flying-sheep
Version 0.9#
0.9.2 2023-07-25#
Bug fixes#
- Views of awkward.Arrays now work with awkward>=2.3 #1040 @ivirshup
- Fix ufuncs of views like adata.X[:10].cov(axis=0) returning views #1043 @flying-sheep
- Fix instantiating AnnData where .X is a DataFrame with an integer valued index #1002 @flying-sheep
- Fix read_zarr() when used on zarr.Group #1057 @ivirshup
0.9.1 2023-04-11#
Bug fixes#
0.9.0 2023-04-11#
Features#
- Added experimental support for dask arrays #813 @syelman @rahulbshrestha 
- obsm, varm and uns can now hold AwkwardArrays #647 @giovp, @grst, @ivirshup
- Added experimental functions anndata.experimental.read_dispatched() and anndata.experimental.write_dispatched() which allow customizing IO with a callback #873 @ilan-gold @ivirshup
- Better error messages during IO #734 @flying-sheep, @ivirshup 
- Unordered categorical columns are no longer cast to object during anndata.concat() #763 @ivirshup
Documentation#
- New tutorials for experimental features 
- File format description now includes a more formal specification #882 @ivirshup 
- Interoperability: new page on interoperability with other packages #831 @ivirshup 
- Expanded docstring with more documentation for the backed argument of anndata.read_h5ad() #812 @jeskowagner
- Documented how to use alternative compression methods for the h5ad file format, see AnnData.write_h5ad() #857 @nigeil
Breaking changes#
Other updates#
Deprecations#
- AnnData.concatenate() is now deprecated in favour of anndata.concat() #845 @ivirshup
Bug fixes#
- Fix warning from rename_categories #790 I Virshup
- Remove backwards compat checks for categories in uns when we can tell the file is new enough #790 I Virshup
- Categorical arrays are now created with a python bool instead of a numpy.bool_ #856
- Fixed order dependent outer concatenation bug #904 @ivirshup, reported by @szalata 
- Fixed bug in renaming categories #790 @ivirshup, reported by @perrin-isir 
- Fixed IO bug when keys in uns ended in _categories #806 @ivirshup, reported by @Hrovatin
- Fixed raw.to_adata not populating obs aligned values when raw was assigned through the setter #939 @ivirshup
Version 0.8#
0.8.0 14th March, 2022#
IO Specification#
Warning
The on disk format of AnnData objects has been updated with this release.
Previous releases of anndata will not be able to read all files written by this version.
For discussion of possible future solutions to this issue, see #698
Internal handling of IO has been overhauled.
This should make it much easier to support new datatypes, use partial access, and use AnnData internally in other formats.
- Each element should be tagged with an encoding_type and encoding_version. See updated docs on the file format
- Support for nullable integer and boolean data arrays. More data types to come! 
- Experimental support for low level access to the IO API via read_elem() and write_elem()
Features#
- Added PyTorch dataloader AnnLoader and lazy concatenation object AnnCollection. See the tutorials #416 S Rybakov
- Compatibility with h5ad files written from Julia #569 I Kats
- Many logging messages that should have been warnings are now warnings #650 I Virshup 
- Significantly more efficient anndata.read_umi_tools() #661 I Virshup
- Fixed deepcopy of a copy of a view retaining sparse matrix view mixin type #670 M Klein 
- In many cases X can now be None #463 R Cannoodt #677 I Virshup. Remaining work is documented in #467.
- Removed hard xlrd dependency I Virshup
- obs and var dataframes are no longer copied by default on AnnData instantiation #371 I Virshup
Bug fixes#
Dependencies#
- xlrd dropped as a hard dependency
- Now requires h5py v3.0.0 or newer
Version 0.7#
0.7.8 9 November, 2021#
Bug fixes#
- Re-include test helpers #641 I Virshup 
0.7.7 9 November, 2021#
Bug fixes#
- Fixed propagation of import error when importing write_zarr but not all dependencies are installed #579 R Hillje
- Fixed issue with .uns sub-dictionaries being referenced by copies #576 I Virshup
- Fixed out-of-bounds integer indices not raising IndexError #630 M Klein
- Fixed backed SparseDataset indexing with scipy 1.7.2 #638 I Virshup
Development processes#
- Use PEPs 621 (standardized project metadata), 631 (standardized dependencies), and 660 (standardized editable installs) #639 I Virshup 
0.7.6 11 April, 2021#
Features#
- Added anndata.AnnData.to_memory() for returning an in memory object from a backed one #470 #542 V Bergen I Virshup
- anndata.AnnData.write_loom() now writes obs_names and var_names using the Index’s .name attribute, if set #538 I Virshup
Bug fixes#
- Fixed bug where np.str_ column names errored at write time #457 I Virshup
- Fixed “value.index does not match parent’s axis 0/1 names” error triggered when a data frame is stored in obsm/varm after obs_names/var_names is updated #461 G Eraslan 
- Fixed adata.write_csvs when adata is a view #462 I Virshup
- Fixed null values being converted to strings when strings are converted to categorical #529 I Virshup 
- Fixed handling of compression key word arguments #536 I Virshup 
- Fixed copying a backed AnnData from changing which file the original object points at #533 ilia-kats
- Fixed a bug where calling AnnData.concatenate on an AnnData with no variables would error #537 I Virshup
Deprecations#
- Passing positional arguments to anndata.read_loom() besides the path is now deprecated #538 I Virshup
- anndata.read_loom() arguments obsm_names and varm_names are now deprecated in favour of obsm_mapping and varm_mapping #538 I Virshup
0.7.5 12 November, 2020#
Functionality#
- Added ipython tab completion and a useful return from .keys to adata.uns #415 I Virshup
Bug fixes#
0.7.4 10 July, 2020#
Concatenation overhaul #378 I Virshup#
- New function anndata.concat() for concatenating AnnData objects along either observations or variables
- New documentation section: Concatenation 
Functionality#
- AnnData object created from dataframes with sparse values will have sparse .X #395 I Virshup
Bug fixes#
0.7.3 20 May, 2020#
Bug fixes#
- Fixed bug where graphs used too much memory when copying #381 I Virshup 
0.7.2 15 May, 2020#
Concatenation overhaul I Virshup#
Functionality#
- obs_names_make_unique() is now better at making values unique, and will warn if ambiguities arise #345 M Weiden
- obsp is now preferred for storing pairwise relationships between observations. In practice, this means there will be deprecation warnings and reformatting applied to objects which stored connectivities under uns["neighbors"]. Square matrices in uns will no longer be sliced (use .{obs,var}p instead). #337 I Virshup
- ImplicitModificationWarning is now exported #315 P Angerer
- Better support for ndarray subclasses stored in AnnData objects #335 michalk8
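To illustrate the kind of ambiguity the obs_names_make_unique() entry above refers to, here is a simplified pure-Python sketch of suffix-based deduplication. It is not the library's implementation: the function name `make_unique` and the "-" joiner are assumptions for illustration. The point is that a generated name such as "a-1" can collide with a name already present, so the next free number has to be tried.

```python
def make_unique(names, join="-"):
    """Return a list where repeated names get a numeric suffix.

    Hypothetical helper: the first occurrence of a name is kept as is,
    later occurrences become name + join + n for the next free n.
    """
    taken = set(names)   # every name that is already in use
    seen = set()         # names whose first occurrence was emitted
    counters = {}        # last suffix number tried per name
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
            continue
        n = counters.get(name, 0)
        while True:
            n += 1
            candidate = f"{name}{join}{n}"
            if candidate not in taken:
                break
        counters[name] = n
        taken.add(candidate)
        result.append(candidate)
    return result


plain = make_unique(["a", "a", "b", "a"])
# "a-1" already exists as a real name here, so the duplicate skips to "a-2".
ambiguous = make_unique(["a", "a", "a-1"])
```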
Bug fixes#
- Fixed inplace modification of Index objects by the make unique function #348 I Virshup
- Passing ambiguous keys to obs_vector() and var_vector() now throws errors #340 I Virshup
- Fix instantiating AnnData objects from DataFrame #316 P Angerer
- Fixed indexing into AnnData objects with arrays like adata[adata[:, gene].X > 0] #332 I Virshup
- Fixed type of version #315 P Angerer 
0.7.0 22 January, 2020#
Warning
Breaking changes introduced between 0.6.22.post1 and 0.7:
- Elements of AnnDatas don’t have their dimensionality reduced when the main object is subset. This is to maintain consistency when subsetting. See discussion in #145.
- Internal modules like anndata.core are private and their contents are not stable: See #174.
- The old deprecated attributes .smp*, .add and .data have been removed.
View overhaul #164#
- Indexing into a view no longer keeps a reference to intermediate view, see #62. 
- Views are now lazy. Elements of view of AnnData are not indexed until they’re accessed. 
- Indexing with scalars no longer reduces dimensionality of contained arrays, see #145. 
- All elements of AnnData should now follow the same rules about how they’re subset, see #145. 
- Can now index by observations and variables at the same time. 
IO overhaul #167#
- Reading and writing has been overhauled for simplification and speed. 
- Time and memory usage can be half of previous in typical use cases 
- Zarr backend now supports sparse arrays, and generally is closer to having the same features as HDF5. 
- Backed mode should see significant speed and memory improvements for access along compressed dimensions and IO. PR #241. 
- Categoricals can now be ordered (PR #230) and written to disk with a large number of categories (PR #217).
Mapping attributes overhaul (obsm, varm, layers, …)#
- New attributes obsp and varp have been added for two dimensional arrays where each axis corresponds to a single axis of the AnnData object. PR #207.
- These are intended to store values like cell-by-cell graphs, which are currently stored in uns.
- Sparse arrays are now allowed as values in all mapping attributes. 
- All mapping attributes now share an implementation and will have the same behaviour. PR #164. 
Miscellaneous improvements#
Version 0.6#
0.6.0 1 May, 2018#
- compatibility with Seurat converter 
- tremendous speedup for concatenate()
- bug fix for deep copy of unstructured annotation after slicing 
- bug fix for reading HDF5 stored single-category annotations 
- 'outer join' concatenation: adds zeros for concatenation of sparse data and nans for dense data
- better memory efficiency in loom exports 
Version 0.5#
0.5.0 9 February, 2018#
- inform about duplicates in var_names and resolve them using var_names_make_unique()
- automatically remove unused categories after slicing 
- read/write .loom files using loompy 2 
- fixed read/write for a few text file formats 
- read UMI tools files: read_umi_tools()
Version 0.4#
0.4.0 23 December, 2017#
- read/write .loom files 
- scalability beyond dataset sizes that fit into memory: see this blog post 
- AnnData has a raw attribute, which simplifies storing the data matrix when you consider it raw: see the clustering tutorial