anndata.experimental.read_lazy
- anndata.experimental.read_lazy(store, *, load_annotation_index=True)
Lazily read in on-disk/in-cloud AnnData stores, including obs and var. No array data should need to be read into memory, with the exception of ak.Array, scalars, and some older-encoding arrays.
- Parameters:
- store
str | Path | MutableMapping | Group | Dataset
A store-like object to be read in. If it is a zarr.Group, it is best for it to be consolidated.
- load_annotation_index
bool (default: True)
Whether to load the real {obs,var} index into memory. If False, a range index is used for the {obs,var} xarray.Dataset so that the index is not read into memory: the real index is inserted as {obs,var}_names in the object but is not one of the coords, thereby preventing read operations. Access to adata.obs.index then gives only the dummy index, not the "real" file-backed index.
- Return type:
AnnData
- Returns:
A lazily read-in AnnData object.
Examples
Preparing example objects
>>> import anndata as ad
>>> from urllib.request import urlretrieve
>>> import scanpy as sc
>>> base_url = "https://datasets.cellxgene.cziscience.com"
>>> def get_cellxgene_data(id_: str):
...     out_path = sc.settings.datasetdir / f"{id_}.h5ad"
...     if out_path.exists():
...         return out_path
...     file_url = f"{base_url}/{id_}.h5ad"
...     sc.settings.datasetdir.mkdir(parents=True, exist_ok=True)
...     urlretrieve(file_url, out_path)
...     return out_path
>>> path_b_cells = get_cellxgene_data("a93eab58-3d82-4b61-8a2f-d7666dcdb7c4")
>>> path_fetal = get_cellxgene_data("d170ff04-6da0-4156-a719-f8e1bbefbf53")
>>> b_cells_adata = ad.experimental.read_lazy(path_b_cells)
>>> fetal_adata = ad.experimental.read_lazy(path_fetal)
>>> print(b_cells_adata)
AnnData object with n_obs × n_vars = 146 × 33452
    obs: 'donor_id', 'self_reported_ethnicity_ontology_term_id', 'organism_ontology_term_id', ...
>>> print(fetal_adata)
AnnData object with n_obs × n_vars = 344 × 15585
    obs: 'nCount_Spatial', 'nFeature_Spatial', 'Cluster', 'adult_pred_type'...
This functionality is compatible with anndata.concat():
>>> ad.concat([b_cells_adata, fetal_adata], join="outer")
AnnData object with n_obs × n_vars = 490 × 33452
    obs: 'donor_id', 'self_reported_ethnicity_ontology_term_id', 'organism_ontology_term_id'...