anndata.experimental.sparse_dataset

anndata.experimental.sparse_dataset(group)

Generates a backed mode-compatible sparse dataset class.

Parameters:
group h5py.Group | zarr.Group

The backing group store.

Return type:

CSRDataset | CSCDataset

Returns:

Sparse dataset class.

Example

First we’ll need a stored dataset:

>>> import scanpy as sc
>>> import h5py
>>> from anndata.experimental import sparse_dataset, read_elem
>>> sc.datasets.pbmc68k_reduced().raw.to_adata().write_h5ad("pbmc.h5ad")

Initialize a sparse dataset from storage:

>>> f = h5py.File("pbmc.h5ad", "r")
>>> X = sparse_dataset(f["X"])
>>> X
CSRDataset: backend hdf5, shape (700, 765), data_dtype float32

Indexing returns in-memory sparse matrices:

>>> X[100:200]
<100x765 sparse matrix of type '<class 'numpy.float32'>'
    with 25003 stored elements in Compressed Sparse Row format>
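
A row slice such as X[100:200] only needs a contiguous piece of the stored arrays, which is what makes indexing a backed CSR dataset cheap. The following is a conceptual sketch in plain Python, not anndata's implementation; the helper name csr_row_slice and the small example matrix are assumptions made for illustration.

```python
def csr_row_slice(data, indices, indptr, start, stop):
    """Return the CSR triple (data, indices, indptr) for rows start:stop.

    Hypothetical helper: it illustrates the CSR layout only and is not
    part of anndata or scipy.
    """
    # indptr[i] is the offset in data/indices where row i begins, so the
    # requested rows occupy one contiguous range [lo, hi).
    lo, hi = indptr[start], indptr[stop]
    sub_data = data[lo:hi]
    sub_indices = indices[lo:hi]
    # Shift the row offsets so the first selected row starts at 0.
    sub_indptr = [p - lo for p in indptr[start:stop + 1]]
    return sub_data, sub_indices, sub_indptr


# A 4 x 3 matrix in CSR form:
# [[1, 0, 2],
#  [0, 0, 3],
#  [0, 0, 0],
#  [4, 5, 0]]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
indices = [0, 2, 2, 0, 1]
indptr = [0, 2, 3, 3, 5]

# Select rows 1 and 2; only data[2:3] and indices[2:3] are touched.
sub_data, sub_indices, sub_indptr = csr_row_slice(data, indices, indptr, 1, 3)
print(sub_data, sub_indices, sub_indptr)
```

With an on-disk store, the two list slices would correspond to partial reads of the stored arrays, so the cost scales with the number of stored elements in the selected rows rather than with the size of the full matrix.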

These can also be used inside an AnnData object, with no need for backed mode:

>>> from anndata import AnnData
>>> adata = AnnData(
...     layers={"backed": X}, obs=read_elem(f["obs"]), var=read_elem(f["var"])
... )
>>> adata.layers["backed"]
CSRDataset: backend hdf5, shape (700, 765), data_dtype float32

Indexing through the AnnData object (i.e., accessing a view) brings the selection into memory:

>>> adata[adata.obs["bulk_labels"] == "CD56+ NK"].layers["backed"]
<31x765 sparse matrix of type '<class 'numpy.float32'>'
    with 7340 stored elements in Compressed Sparse Row format>