anndata.experimental.AnnCollection
- class anndata.experimental.AnnCollection(adatas, *, join_obs='inner', join_obsm=None, join_vars=None, label=None, keys=None, index_unique=None, convert=None, harmonize_dtypes=True, indices_strict=True)
Lazily concatenate AnnData objects along the obs axis.
This class doesn’t copy data from underlying AnnData objects, but lazily subsets using a joint index of observations and variables. It also allows on-the-fly application of prespecified converters to .obs attributes of the AnnData objects.
Subsetting of this object returns an AnnCollectionView, which provides views of .obs, .obsm, .layers, .X from the underlying AnnData objects.
- Parameters:
- adatas
Sequence[AnnData] | dict[str, AnnData]
The objects to be lazily concatenated. If a Mapping is passed, keys are used for the keys argument and values are concatenated.
- join_obs
Optional[Literal['inner', 'outer']] (default: 'inner')
If “inner” is specified, all .obs attributes from adatas will be inner joined and copied to this object. If “outer” is specified, all .obs attributes from adatas will be outer joined and copied to this object. For “inner” and “outer”, subset objects will access .obs of this object, not the original .obs attributes of adatas. If None, nothing is copied to this object’s .obs; a subset object will directly access .obs attributes of adatas (with proper reindexing and dtype conversions). For None, the inner join rule is used to select columns of .obs of adatas.
- join_obsm
Optional[Literal['inner']] (default: None)
If “inner” is specified, all .obsm attributes from adatas will be inner joined and copied to this object. Subset objects will access .obsm of this object, not the original .obsm attributes of adatas. If None, nothing is copied to this object’s .obsm; a subset object will directly access .obsm attributes of adatas (with proper reindexing and dtype conversions). For both options the inner join rule for the underlying .obsm attributes is used.
- join_vars
Optional[Literal['inner']] (default: None)
Specify how to join adatas along the var axis. If None, assumes all adatas have the same variables. If “inner”, the intersection of all variables in adatas will be used.
- label
str | None (default: None)
Column in .obs to place batch information in. If it’s None, no column is added.
- keys
Sequence[str] | None (default: None)
Names for each object being added. These values are used for column values for label or appended to the index if index_unique is not None. Defaults to incrementing integer labels.
- index_unique
str | None (default: None)
Whether to make the index unique by using the keys. If provided, this is the delimiter used to build “{orig_idx}{index_unique}{key}”. When None, the original indices are kept.
- convert
Callable | dict[str, Callable | dict[str, Callable]] | None (default: None)
You can pass a function or a Mapping of functions which will be applied to the values of attributes (.obs, .obsm, .layers, .X) or to specific keys of these attributes in the subset object. Specify an attribute and a key (if needed) as keys of the passed Mapping and a function to be applied as a value.
- harmonize_dtypes
bool (default: True)
If True, all retrieved arrays from subset objects will have the same dtype.
- indices_strict
bool (default: True)
If True, arrays from the subset objects will always have the same order of indices as in the selection used to subset. This parameter can be set to False if the order in the returned arrays is not important, for example, when using them for stochastic gradient descent. In this case the performance of subsetting can be a bit better.
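The interplay of keys, label and index_unique described above can be sketched in plain Python. This is only an illustration of the documented naming rule, not anndata's implementation; build_labels_and_index and the cell names are hypothetical.

```python
def build_labels_and_index(obs_names_per_object, keys=None, index_unique=None):
    """Hypothetical helper mimicking the documented naming rule."""
    if keys is None:
        # Documented default: incrementing integer labels.
        keys = list(range(len(obs_names_per_object)))
    labels, index = [], []
    for key, names in zip(keys, obs_names_per_object):
        for name in names:
            # Value that would go into the `label` column for this row.
            labels.append(key)
            # "{orig_idx}{index_unique}{key}" when a delimiter is given,
            # otherwise the original index is kept unchanged.
            if index_unique is None:
                index.append(name)
            else:
                index.append(f"{name}{index_unique}{key}")
    return labels, index


labels, index = build_labels_and_index(
    [["c0", "c1"], ["c0"]], keys=["pbmc68k", "pbmc3k"], index_unique="-"
)
print(labels)  # ['pbmc68k', 'pbmc68k', 'pbmc3k']
print(index)   # ['c0-pbmc68k', 'c1-pbmc68k', 'c0-pbmc3k']
```

Without index_unique, the two "c0" entries would stay identical, which is why a delimiter is useful when the concatenated objects share observation names.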
Examples
>>> from anndata.experimental import AnnCollection
>>> from scanpy.datasets import pbmc68k_reduced, pbmc3k_processed
>>> adata1, adata2 = pbmc68k_reduced(), pbmc3k_processed()
>>> adata1.shape
(700, 765)
>>> adata2.shape
(2638, 1838)
>>> dc = AnnCollection([adata1, adata2], join_vars='inner')
>>> dc
AnnCollection object with n_obs × n_vars = 3338 × 208
  constructed from 2 AnnData objects
    view of obsm: 'X_pca', 'X_umap'
    obs: 'n_genes', 'percent_mito', 'n_counts', 'louvain'
>>> batch = dc[100:200]  # AnnCollectionView
>>> batch
AnnCollectionView object with n_obs × n_vars = 100 × 208
    obsm: 'X_pca', 'X_umap'
    obs: 'n_genes', 'percent_mito', 'n_counts', 'louvain'
>>> batch.X.shape
(100, 208)
>>> len(batch.obs['louvain'])
100
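To make the idea of a joint index more concrete, here is a toy sketch of lazy concatenation along the observation axis, including the effect of indices_strict. It assumes plain Python lists as stand-ins for AnnData objects; LazyConcat, resolve and take are hypothetical names, and the real class may work differently internally.

```python
from bisect import bisect_right
from itertools import accumulate


class LazyConcat:
    """Toy stand-in for lazy concatenation (hypothetical, not anndata code)."""

    def __init__(self, sources):
        self.sources = sources  # kept by reference, never copied
        # Cumulative row counts, e.g. sizes [3, 2] -> offsets [3, 5].
        self.offsets = list(accumulate(len(s) for s in sources))

    def __len__(self):
        return self.offsets[-1] if self.offsets else 0

    def resolve(self, i):
        """Map a global row position to (source number, local position)."""
        if not 0 <= i < len(self):
            raise IndexError(i)
        src = bisect_right(self.offsets, i)
        start = self.offsets[src - 1] if src else 0
        return src, i - start

    def take(self, positions, indices_strict=True):
        # Group the requested rows by source so each source is read once.
        per_source = {}
        for out_pos, i in enumerate(positions):
            src, local = self.resolve(i)
            per_source.setdefault(src, []).append((out_pos, local))
        rows = []
        for src in sorted(per_source):
            for out_pos, local in per_source[src]:
                rows.append((out_pos, self.sources[src][local]))
        if indices_strict:
            # Extra step: restore the order of the original selection.
            rows.sort(key=lambda pair: pair[0])
        return [value for _, value in rows]


dc = LazyConcat([["a0", "a1", "a2"], ["b0", "b1"]])
print(dc.take([4, 0, 3]))                        # ['b1', 'a0', 'b0']
print(dc.take([4, 0, 3], indices_strict=False))  # ['a0', 'b1', 'b0']
```

In this sketch the non-strict path skips the final reordering, which is one plausible reason why relaxing the order can make subsetting slightly cheaper.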
Attributes
Dict of all accessible attributes and their keys.
On the fly converters for keys of attributes and data matrix.
True if adatas have backed AnnData objects, False otherwise.
Number of observations.
Number of variables/features.
One-dimensional annotation of observations.
Multi-dimensional annotation of observations.
Shape of the lazily concatenated data matrix.
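The nested structure accepted for converters (a single function, or a Mapping keyed by attribute and optionally by key) can be illustrated with a small lookup routine. This is a sketch under assumptions: apply_convert is a hypothetical helper written for this example, and plain lists stand in for arrays.

```python
def apply_convert(convert, attr, value, key=None):
    """Hypothetical helper: pick the converter for attr (and key), apply it."""
    if convert is None:
        return value
    if callable(convert):
        # A single function applies to every retrieved value.
        return convert(value)
    func = convert.get(attr)
    if isinstance(func, dict):
        # Nested Mapping: converters for specific keys of the attribute.
        func = func.get(key)
    # Values without a matching converter are returned unchanged.
    return func(value) if callable(func) else value


codes = {"A": 0, "B": 1}
convert = {
    # Applied to the whole data matrix.
    "X": lambda rows: [[float(v) for v in row] for row in rows],
    # Applied only to the 'louvain' key of .obs.
    "obs": {"louvain": lambda labels: [codes[label] for label in labels]},
}

print(apply_convert(convert, "obs", ["B", "A"], key="louvain"))  # [1, 0]
print(apply_convert(convert, "obs", [10, 20], key="n_genes"))    # [10, 20]
print(apply_convert(convert, "X", [[1, 2]]))                     # [[1.0, 2.0]]
```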
Methods