Accessors and paths
anndata.acc provides accessors that create references
to axis-aligned 1D and 2D arrays in AnnData objects.
You can use these to drive e.g. plotting or validation code.
For these purposes, they are
easy to create:
The central A object is an accessor for the whole AnnData object, and allows you to create AdRef objects, which are references to arrays spanning one or two dimensions of an AnnData object (without being bound to a specific object):

>>> from anndata.acc import A
>>> A.X[:, "gene-3"]  # reference to `adata[:, "gene-3"].X` as 1D vector
A.X[:, 'gene-3']
>>> type(A.X[:, "gene-3"])
<class 'anndata.acc.AdRef'>
… and to use:
>>> import scanpy as sc
>>> adata = sc.datasets.pbmc3k_processed()
E.g. to check if adata.varm["PCs"] has a column at index 30 (i.e. at least 31 columns):

>>> A.varm["PCs"][:, 30] in adata
True
or to extract the referenced vector:
>>> ref = A.obs["louvain"]
>>> adata[ref].categories[:2]
Index(['CD4 T cells', 'CD14+ Monocytes'], dtype=...)
introspectible:
AdRefs have the AdRef.dims, AdRef.idx, and AdRef.acc attributes, allowing you to inspect all relevant properties.

>>> pc0 = A.obsm["pca"][:, 0]
>>> pc0
A.obsm['pca'][:, 0]
>>> pc0.idx
0
>>> pc0.acc
A.obsm['pca']
>>> A.var["symbol"].dims
{'var'}
>>> pc0.acc.k
'pca'
convenient:
Want to reference multiple vectors from the same object? Pass a list of indices to the vector accessor:
>>> A.obsp["connectivities"][:, ["cell0", "cell1"]]
[A.obsp['connectivities'][:, 'cell0'], A.obsp['connectivities'][:, 'cell1']]
extensible: see extending accessors.
API & Glossary
The central accessor is A:
See AdAcc for examples of how to use it to create references (i.e., AdRefs).
AdRef | A reference to a 1D or 2D array along one or two dimensions of an AnnData object.
- reference
An instance of AdRef. References a 1D or 2D array in AnnData objects. It is independent of individual objects and can be inspected, checked for equality, used as a mapping key, or applied to concrete objects, e.g. via ref in adata or adata[ref]. An example of this would be A.obsm["d"][:, 2] but not A.obsm["d"], which is a reference accessor.
- accessor
An instance of any of the *Acc classes, i.e. AdAcc, or subclasses of MapAcc or RefAcc. Can be descended into via attribute access or indexing to get deeper accessors (e.g. A → A.obs) or references (e.g. A.obs.index, A.obs["c"]). The presence of an accessor in an AnnData object can also be checked via acc in adata.
- reference accessor
RefAcc subclasses directly create references (AdRef instances). They can be accessed from these references using the AdRef.acc attribute, and are therefore useful in match statements or isinstance() checks:

RefAcc(*, ref_class) | Abstract base class for reference accessors.

Class | has attributes | available as A.??? | Example reference creation
LayerAcc | | | A.X[:, :], A.layers["c"][:, "g0"]
MetaAcc | | | A.obs["a"], A.var["b"]
MultiAcc | | | A.obsm["d"][:, 2]
GraphAcc | | | A.obsp["e"][:, "c1"], A.varp["e"]["g0", :]
- mapping accessor
MapAcc subclasses can be indexed with a string to create reference accessors. For example, A.layers and A.obsm are both MapAccs, while A.layers["a"] is a LayerAcc and A.obsm["b"] is a MultiAcc. MapAccs are mostly useful for extending, but might be useful for APIs that need to refer to a Mapping of arrays:

MapAcc() | Accessor for mapping containers.
LayerMapAcc(*, ref_class[, ref_acc_cls]) | Accessor for layers (A.layers).
MultiMapAcc(dim, *, ref_class[, ref_acc_cls]) | Accessor for multi-dimensional array containers (A.obsm/A.varm).
GraphMapAcc(dim, *, ref_class[, ref_acc_cls])
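The glossary describes a reference as independent of any concrete object, comparable, usable as a mapping key, and applicable to an object on demand. The following toy model in plain Python sketches why that combination is convenient. It is not how anndata implements AdRef: ToyRef, is_in, resolve, and the nested-dict container are hypothetical stand-ins invented for this illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ToyRef:
    """Hypothetical stand-in for a reference: it names an array slot but holds no data."""

    container: str  # e.g. "obsm"
    key: str  # e.g. "pca"
    column: int  # column index within the 2D array

    def is_in(self, data: dict) -> bool:
        """Check whether the referenced column exists in a concrete container."""
        arr = data.get(self.container, {}).get(self.key)
        return arr is not None and 0 <= self.column < arr.shape[1]

    def resolve(self, data: dict) -> np.ndarray:
        """Extract the referenced column as a 1D vector."""
        return data[self.container][self.key][:, self.column]


# Two references built independently compare equal and hash alike,
# because the frozen dataclass derives __eq__ and __hash__ from its fields.
ref = ToyRef("obsm", "pca", 2)
same = ToyRef("obsm", "pca", 2)
labels = {ref: "PC 3"}

# The same reference can be applied to different concrete containers.
small = {"obsm": {"pca": np.arange(6.0).reshape(3, 2)}}
large = {"obsm": {"pca": np.arange(12.0).reshape(3, 4)}}
column = ref.resolve(large) if ref.is_in(large) else None
```

Here the reference is created once, used as a dictionary key for plot labels, and only later checked against and applied to a container that actually has the column.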
Extending accessors
There are three layers of extensibility:
subclass AdRef and create a new AdAcc instance that produces references of the new class:

    import scanpy as sc
    from matplotlib import pyplot as plt
    from anndata.acc import AdAcc, AdRef

    class MplRef(AdRef, str):
        """Matplotlib will only treat strings as references, so we subclass `str`."""

        def __new__(cls, acc, idx) -> "MplRef":
            # build the string value from the default AdRef representation
            obj = str.__new__(cls, str(AdRef(acc, idx)))
            AdRef.__init__(obj, acc, idx)
            return obj

    A = AdAcc(ref_class=MplRef)
    adata = sc.datasets.pbmc3k_processed()
    plt.scatter(*A.obsm["X_umap"][:, [0, 1]], c=A.obs["n_counts"], data=adata)
subclass one or more of the reference accessors, and create a new AdAcc instance:

>>> from anndata.acc import AdAcc, AdRef, MetaAcc
>>>
>>> class TwoDRef(AdRef):
...     """A reference able to refer to multiple metadata columns."""
...     ...
>>>
>>> class MyMetaAcc(MetaAcc):
...     def __getitem__(self, k):
...         if isinstance(k, list):
...             # override default behavior of returning a list of refs
...             return self.ref_class(self, k)
...         return super().__getitem__(k)
>>>
>>> A = AdAcc(ref_class=TwoDRef, meta_cls=MyMetaAcc)
>>> A.obs[["a", "b"]]
A.obs[['a', 'b']]
subclass AdAcc to add new accessors:

>>> from dataclasses import dataclass, field
>>> from anndata.acc import AdAcc, MetaAcc
>>>
>>> @dataclass(frozen=True)
... class EHRAcc(AdAcc):
...     tem: MetaAcc = field(init=False)
...     def __post_init__(self) -> None:
...         super().__post_init__()
...         tem = MetaAcc("tem", ref_class=self.ref_class)
...         object.__setattr__(self, "tem", tem)  # necessary because it’s frozen
>>>
>>> A = EHRAcc()
>>> A.tem["visit_id"]
A.tem['visit_id']
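The MplRef example in this section subclasses str because of how matplotlib's data keyword works: when data is passed, string arguments are looked up as data[s] before plotting. The sketch below shows that mechanism on its own, with a plain dict standing in for the AnnData object. The dict keys and variable names are illustrative only, and the Agg backend is chosen here just so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
from matplotlib import pyplot as plt

# A plain dict stands in for the AnnData object; the keys are illustrative.
table = {
    "x": [0.0, 1.0, 2.0],
    "y": [2.0, 1.0, 0.0],
    "score": [0.1, 0.5, 0.9],
}

fig, ax = plt.subplots()
# Each string argument is replaced by table[<string>] before plotting.
points = ax.scatter("x", "y", c="score", data=table)

# The plotted coordinates are the looked-up values, not the strings.
offsets = points.get_offsets()
```

Since MplRef instances are strings, matplotlib presumably applies the same lookup to them, which ends up indexing the AnnData object with the reference (adata[ref]).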