anndata.experimental.read_lazy
- anndata.experimental.read_lazy(store, *, load_annotation_index=True)
Lazily read in on-disk/in-cloud AnnData stores, including
obs and var. No array data should need to be read into memory, with the exception of ak.Array, scalars, and some older-encoding arrays.
- Parameters:
- store
PathLike[str] | str | MutableMapping | Group | File | Group. A store-like object to be read in. If it is a
zarr.Group, it is best for it to be consolidated. If a path to an .h5ad file is provided, the open HDF5 file will be attached to the AnnData at the file attribute, and it will be the user's responsibility to close it when done with the returned object. For this reason, it is recommended to use an h5py.File as the store argument when working with h5 files. It must remain open for at least as long as the returned object is in use.
- load_annotation_index
bool (default: True). Whether or not to use a range index for the
{obs,var} xarray.Dataset so as not to load the index into memory. If False, the real index will be inserted as {obs,var}_names in the object but will not be one of the coords, thereby preventing read operations. Access to adata.obs.index will also only give the dummy index, and not the "real" index that is file-backed.
- Return type:
AnnData
- Returns:
A lazily read-in AnnData object.
Examples
Preparing example objects
>>> import anndata as ad
>>> import pooch
>>> import scanpy as sc
>>> base_url = "https://datasets.cellxgene.cziscience.com"
>>> # To update hashes: pooch.retrieve(url, known_hash=None) prints the new hash
>>> def get_cellxgene_data(id_: str, hash_: str):
...     return pooch.retrieve(
...         f"{base_url}/{id_}.h5ad",
...         known_hash=hash_,
...         fname=f"{id_}.h5ad",
...         path=sc.settings.datasetdir,
...     )
>>> path_b_cells = get_cellxgene_data(
...     "a93eab58-3d82-4b61-8a2f-d7666dcdb7c4",
...     "sha256:dac90fe2aa8b78aee2c1fc963104592f8eff7b873ca21d01a51a5e416734651c",
... )
>>> path_fetal = get_cellxgene_data(
...     "d170ff04-6da0-4156-a719-f8e1bbefbf53",
...     "sha256:d497eebca03533919877b6fc876e8c9d8ba063199ddc86dd9fbcb9d1d87a3622",
... )
>>> b_cells_adata = ad.experimental.read_lazy(path_b_cells)
>>> fetal_adata = ad.experimental.read_lazy(path_fetal)
>>> print(b_cells_adata)
AnnData object with n_obs × n_vars = 146 × 33452
    obs: 'donor_id', 'self_reported_ethnicity_ontology_term_id', 'organism_ontology_term_id', ...
>>> print(fetal_adata)
AnnData object with n_obs × n_vars = 344 × 15585
    obs: 'nCount_Spatial', 'nFeature_Spatial', 'Cluster', 'adult_pred_type'...
This functionality is compatible with anndata.concat():
>>> ad.concat([b_cells_adata, fetal_adata], join="outer")
AnnData object with n_obs × n_vars = 490 × 33452
    obs: 'donor_id', 'self_reported_ethnicity_ontology_term_id', 'organism_ontology_term_id'...