
Annotation Module¶
flyqma.annotation provides several tools for labeling distinct subpopulations of cells within an image. Subpopulations are identified on the basis of their clonal marker expression level using a novel unsupervised classification strategy. Please see the Fly-QMA manuscript for a detailed description of the annotation strategy and its various parameters.
-
class
flyqma.annotation.labelers.
AttributeLabeler
(label, attribute, labels)[source]¶ Assigns label to cell measurement data based on an existing attribute.
Attributes:
label (str) - name of label field to be added
attribute (str) - existing cell attribute used to determine labels
labeler (vectorized func) - callable that maps attribute values to labels
-
class
flyqma.annotation.labelers.
CelltypeLabeler
(label='celltype', attribute='genotype', labels=None)[source]¶ Assigns <celltype> to cell measurement data based on <genotype> attribute.
Attributes:
label (str) - name of label field to be added
attribute (str) - existing cell attribute used to determine labels
labeler (vectorized func) - callable that maps attribute values to labels
-
class
flyqma.annotation.annotation.
Annotation
(attribute, sampler_type='radial', sampler_kwargs={}, min_num_components=3, max_num_components=10, num_labels=3)[source]¶ Object for assigning labels to measurements. Object is trained on one or more graphs by fitting a bivariate mixture model and using a model selection procedure to select an optimal number of components.
The trained model may then be used to label measurements in other graphs, either through direct prediction via the bivariate mixture model or through a hybrid prediction combining the bivariate mixture model with a marginal univariate model.
Attributes:
classifier (Classifier derivative) - callable object
attribute (str) - attribute used to determine labels
sampler_type (str) - either ‘radial’, ‘neighbors’ or ‘community’
sampler_kwargs (dict) - keyword arguments for sampler
min_num_components (int) - minimum number of mixture components
max_num_components (int) - maximum number of mixture components
num_labels (int) - maximum number of unique labels to be assigned
Parameters:
kwargs: keyword arguments for Classifier
-
annotate
(graph, bivariate_only=False, threshold=0.8, alpha=0.9, sampler_type=None, sampler_kwargs=None)[source]¶ Annotate graph of measurements.
Args:
graph (spatial.WeightedGraph)
bivariate_only (bool) - if True, only use posteriors evaluated using the bivariate mixture model. Otherwise, use the marginal univariate posterior by default, replacing uncertain values with their counterparts estimated by the bivariate model.
threshold (float) - minimum marginal posterior probability of a given label before spatial context is considered
alpha (float) - attenuation factor
sampler_type (str) - either ‘radial’, ‘neighbors’ or ‘community’
sampler_kwargs (dict) - keyword arguments for sampling
Returns:
labels (np.ndarray[int]) - labels for each measurement in graph
-
combine_posteriors
(posterior, marginal_posterior, threshold=0.8)[source]¶ Replace uncertain marginal posterior probabilities with their more certain bivariate counterparts. If the maximum marginal posterior probability for a given sample does not meet the specified threshold while the maximum bivariate posterior probability does, the latter value is used. Otherwise, the marginal value is used.
Args:
posterior (np.ndarray[float]) - posterior probabilities of each label
marginal_posterior (np.ndarray[float]) - marginal posterior probabilities of each label
threshold (float) - minimum marginal posterior probability of a given label before spatial context is considered
Returns:
combined (np.ndarray[float])
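The selection rule described for combine_posteriors can be sketched in plain Python. This is a minimal illustration written from the description above, not the library's implementation; the name combine_posteriors_sketch and the list-of-rows inputs are assumptions made for the example.

```python
def combine_posteriors_sketch(posterior, marginal_posterior, threshold=0.8):
    """Row-wise choice between bivariate and marginal posteriors (illustrative)."""
    combined = []
    for bivariate_row, marginal_row in zip(posterior, marginal_posterior):
        marginal_uncertain = max(marginal_row) < threshold
        bivariate_confident = max(bivariate_row) >= threshold
        # Use the bivariate row only when the marginal row is uncertain
        # and the bivariate row is confident; otherwise keep the marginal row.
        if marginal_uncertain and bivariate_confident:
            combined.append(list(bivariate_row))
        else:
            combined.append(list(marginal_row))
    return combined


# Hypothetical posteriors for three samples and two labels.
bivariate = [[0.95, 0.05], [0.55, 0.45], [0.90, 0.10]]
marginal = [[0.60, 0.40], [0.70, 0.30], [0.85, 0.15]]
combined = combine_posteriors_sketch(bivariate, marginal)
```

Only the first sample switches to the bivariate estimate; the other two keep their marginal values.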
-
static
diffuse_posteriors
(graph, posterior, alpha=0.9)[source]¶ Diffuse estimated posterior probabilities of each label along the weighted edges of the graph.
Args:
graph (Graph) - graph connecting adjacent measurements
posterior (np.ndarray[float]) - posterior probability of each label
alpha (float) - attenuation factor
Returns:
diffused_posteriors (np.ndarray[float])
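One standard way to diffuse per-node estimates along weighted edges is the label-spreading iteration F ← αSF + (1 − α)Y, where S is a row-normalized weight matrix and Y holds the initial posteriors. The sketch below assumes that scheme; this page does not state which update Fly-QMA uses, and the function name, the (i, j, weight) edge format and the num_iter parameter are all hypothetical.

```python
def diffuse_posteriors_sketch(edges, posterior, alpha=0.9, num_iter=200):
    """Label-spreading style diffusion over weighted, undirected edges.

    edges: iterable of (i, j, weight) tuples (assumed format)
    posterior: list of per-node probability rows
    """
    n = len(posterior)
    num_labels = len(posterior[0])

    # Row-normalize the weights so each node averages over its neighbors.
    weights = [dict() for _ in range(n)]
    for i, j, w in edges:
        weights[i][j] = weights[i].get(j, 0.0) + w
        weights[j][i] = weights[j].get(i, 0.0) + w
    for i in range(n):
        total = sum(weights[i].values())
        if total > 0:
            weights[i] = {j: w / total for j, w in weights[i].items()}

    diffused = [list(row) for row in posterior]
    for _ in range(num_iter):
        updated = []
        for i in range(n):
            if not weights[i]:
                # Isolated nodes keep their original estimate.
                updated.append(list(posterior[i]))
                continue
            row = []
            for k in range(num_labels):
                neighbor_avg = sum(w * diffused[j][k] for j, w in weights[i].items())
                # alpha attenuates the neighbor contribution at each step.
                row.append(alpha * neighbor_avg + (1 - alpha) * posterior[i][k])
            updated.append(row)
        diffused = updated
    return diffused


# An uncertain node (index 1) flanked by two confident neighbors.
edges = [(0, 1, 1.0), (1, 2, 1.0)]
posterior = [[0.9, 0.1], [0.5, 0.5], [0.9, 0.1]]
diffused = diffuse_posteriors_sketch(edges, posterior)
```

In this toy chain the uncertain middle node is pulled toward the label favored by its neighbors, while every row still sums to one.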
-
evaluate_marginal_posterior
(sample, margin)[source]¶ Evaluates posterior probability of each label using only the specified marginal distribution.
Args:
sample (np.ndarray[float]) - sample values
margin (int) - index of desired margin
Returns:
marginal_posterior (np.ndarray[float])
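For a one-dimensional Gaussian mixture, the posterior probability of a label follows from Bayes' rule: compute each component's weighted density at the sample value, normalize, and sum the components assigned to each label. The sketch below illustrates that calculation only; the function names and the hypothetical component_to_label dictionary are assumptions, not the library's interface.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def label_posterior(x, weights, means, stds, component_to_label, num_labels):
    """Posterior probability of each label given a scalar value x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    evidence = sum(joint)
    posterior = [0.0] * num_labels
    for component, value in enumerate(joint):
        # Components sharing a label pool their responsibilities.
        posterior[component_to_label[component]] += value / evidence
    return posterior


# Hypothetical three-component model merged into two labels.
weights, means, stds = [0.5, 0.3, 0.2], [0.0, 1.0, 5.0], [1.0, 1.0, 1.0]
component_to_label = {0: 0, 1: 0, 2: 1}
low = label_posterior(0.2, weights, means, stds, component_to_label, 2)
high = label_posterior(5.0, weights, means, stds, component_to_label, 2)
```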
-
classmethod
from_data
(data, attribute, xykey=None, **kwargs)[source]¶ Instantiate annotation object from measurement data.
Args:
data (pd.DataFrame) - measurement data containing <attribute>, as well as <xykey> fields
attribute (str) - name of attribute used to classify cells
xykey (list) - name of attributes defining measurement x/y position
kwargs: keyword arguments for Annotation
Returns:
annotator (Annotation derivative)
-
classmethod
from_layer
(layer, attribute, **kwargs)[source]¶ Instantiate from layer.
Args:
layer (data.Layer) - image layer instance
attribute (str) - name of attribute used to classify cells
kwargs: keyword arguments for Annotation
Returns:
annotator (Annotation derivative)
-
get_sample
(graph, sampler_type, sampler_kwargs)[source]¶ Get sample to be annotated. A sample consists of a column of measured levels adjoined to a column of levels averaged over the neighborhood of each measurement.
Args:
graph (spatial.WeightedGraph)
sampler_type (str) - either ‘radial’, ‘neighbors’ or ‘community’
sampler_kwargs (dict) - keyword arguments for sampling
Returns:
sample (np.ndarray[float]) - sampled levels
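The two-column sample described for get_sample can be illustrated with a small plain-Python sketch. It assumes a hypothetical neighbors dictionary keyed by node index and excludes each node from its own neighborhood; whether the library includes the node itself is not stated here.

```python
def build_sample(values, neighbors):
    """Pair each measured value with the mean over its neighborhood."""
    sample = []
    for node, value in enumerate(values):
        neighborhood = [values[j] for j in neighbors.get(node, [])]
        # Fall back to the node's own value when it has no neighbors.
        averaged = sum(neighborhood) / len(neighborhood) if neighborhood else value
        sample.append((value, averaged))
    return sample


# Three measurements along a chain: 0 - 1 - 2.
values = [1.0, 2.0, 4.0]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
sample = build_sample(values, neighbors)
```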
-
-
class
flyqma.annotation.annotation.
AnnotationIO
[source]¶ Methods for saving and loading an Annotation instance.
-
classmethod
load
(path)[source]¶ Load annotator from file.
Args:
path (str) - path to annotation directory
Returns:
annotator (Annotation derivative)
-
property
parameters
¶ Dictionary of parameter values.
Mixture Models¶
Tools for fitting univariate and bivariate Gaussian mixture models.
-
class
flyqma.annotation.mixtures.univariate.
MixtureProperties
[source]¶ Properties for Gaussian mixture models.
-
property
AIC
¶ AIC score.
-
property
BIC
¶ BIC score.
-
property
bounds
¶ Lower and upper bounds of support.
-
property
component_pdfs
¶ Returns stacked array of component PDFs.
-
property
components
¶ Individual model components.
-
property
lbound
¶ Lower bound of support.
-
property
log_likelihood
¶ Maximized log likelihood.
-
property
means
¶ Mean value of each component.
-
property
num_components
¶ Number of model components.
-
property
num_samples
¶ Number of samples.
-
property
pdf
¶ Gaussian Mixture PDF.
-
property
scale_factor
¶ Scaling factor for log-transformed support.
-
property
stds
¶ Standard deviation of each component.
-
property
support
¶ Distribution support.
-
property
support_size
¶ Size of support.
-
property
ubound
¶ Upper bound of support.
-
class
flyqma.annotation.mixtures.univariate.
UnivariateMixture
(*args, values=None, **kwargs)[source]¶ Univariate Gaussian mixture model.
Attributes:
values (array like) - values to which model was fit
Inherited attributes:
See sklearn.mixture.GaussianMixture
-
estimate_required_samples
(SNR=5.0)[source]¶ Returns minimum number of averaged samples required to achieve the specified signal-to-noise ratio (SNR).
-
classmethod
from_logsample
(sample, n=3, max_iter=10000, tol=1e-08, covariance_type='diag', n_init=10)[source]¶ Instantiate from log-transformed sample.
-
classmethod
from_parameters
(mu, sigma, weights=None, values=None, **kwargs)[source]¶ Instantiate model from parameter vectors.
-
classmethod
from_sample
(sample, n, **kwargs)[source]¶ Instantiate from log-normally distributed sample.
-
multi_logsample
(N, m=10)[source]¶ Returns <N> log-transformed samples as well as <N> log-transformed samples averaged over <m> other samples from the same component.
-
-
class
flyqma.annotation.mixtures.bivariate.
BivariateMixture
(*args, values=None, **kwargs)[source]¶ Bivariate Gaussian mixture model.
Inherited attributes:
values (array like) - values to which model was fit
See sklearn.mixture.GaussianMixture
-
class
flyqma.annotation.mixtures.bivariate.
BivariateMixtureProperties
[source]¶ Extension properties for bivariate mixtures.
-
property
extent
¶ Extent for x and y axes.
-
property
supportx
¶ Support for first dimension.
-
class
flyqma.annotation.mixtures.visualization.
BivariateVisualization
[source]¶ Visualization methods for bivariate mixture models.
-
property
tick_positions
¶ Tick positions.
Model Selection¶
Tools for statistical model selection.
-
class
flyqma.annotation.model_selection.univariate.
SelectionIO
[source]¶ Methods for saving and loading a model selection instance.
-
class
flyqma.annotation.model_selection.univariate.
UnivariateModelSelection
(values, attribute, min_num_components=3, max_num_components=8, num_labels=3, models=None)[source]¶ Class for performing univariate mixture model selection. The optimal model is chosen based on BIC score.
-
property
AIC
¶ AIC scores for each model.
-
property
AIC_optimal
¶ Model with AIC optimal number of components.
-
property
BIC
¶ BIC scores for each model.
-
property
BIC_optimal
¶ Model with BIC optimal number of components.
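The AIC and BIC scores used for model selection follow the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters, n the number of samples and L the maximized likelihood; lower scores are better. The sketch below applies them to hypothetical log-likelihood values and is not taken from the library's code.

```python
import math


def information_criteria(log_likelihood, num_parameters, num_samples):
    """Return (AIC, BIC) for a fitted model; lower is better for both."""
    aic = 2 * num_parameters - 2 * log_likelihood
    bic = num_parameters * math.log(num_samples) - 2 * log_likelihood
    return aic, bic


def num_free_parameters(num_components):
    # 1-D Gaussian mixture: a mean and a variance per component,
    # plus (num_components - 1) independent mixing weights.
    return 3 * num_components - 1


# Hypothetical maximized log-likelihoods for 3, 4 and 5 components.
log_likelihoods = {3: -1520.0, 4: -1500.0, 5: -1495.0}
num_samples = 1000

scores = {
    n: information_criteria(ll, num_free_parameters(n), num_samples)
    for n, ll in log_likelihoods.items()
}
aic_optimal = min(scores, key=lambda n: scores[n][0])
bic_optimal = min(scores, key=lambda n: scores[n][1])
```

With these made-up numbers AIC prefers five components while BIC prefers four, reflecting the heavier complexity penalty that BIC applies.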
-
static
fit_model
(values, num_components, num_labels, **kwargs)[source]¶ Fit model with specified number of components.
-
property
models
¶ List of models ordered by number of components.
-
property
parameters
¶ Dictionary of instance parameters.
-
class
flyqma.annotation.model_selection.bivariate.
BivariateModelSelection
(values, attribute, min_num_components=3, max_num_components=8, num_labels=3, models=None)[source]¶ Bivariate extension for model selection.
Label Assignment¶
Tools for unsupervised classification of cell measurements.
-
class
flyqma.annotation.classification.classifiers.
Classifier
(values, attribute=None, num_labels=3, log=True, cmap=None)[source]¶ Classifier base class. Children of this class must possess a means attribute, as well as a predict method.
Attributes:
values (array like) - basis for clustering
attribute (str or list) - attribute(s) used to determine labels
log (bool) - indicates whether clustering performed on log values
num_labels (int) - number of output labels
classifier (vectorized func) - maps value to label_id
labels (np.ndarray[int]) - predicted labels
cmap (matplotlib.colors.ColorMap) - colormap for label_id
parameters (dict) - {param name: param value} pairs
fig (matplotlib.figures.Figure) - histogram figure
-
build_classifier
()[source]¶ Build function that returns the most probable label for each of a series of values.
-
build_colormap
(cmap, vmin=-1)[source]¶ Build normalized colormap for class labels.
Args:
cmap (matplotlib.colormap)
vmin (float) - lower bound for colorscale
Returns:
colormap (func) - function mapping class labels to colors
-
evaluate_classifier
(data)[source]¶ Assign class labels to <data>.
Args:
data (pd.DataFrame) - must contain necessary attributes
Returns:
labels (np.ndarray[int])
-
classmethod
from_grouped_measurements
(data, attribute, groupby=None, **kwargs)[source]¶ Fit classifier to data grouped by a specified feature.
Args:
data (pd.DataFrame) - measurement data
groupby (str) - attribute used to group measurement data
attribute (str or list) - attribute(s) on which to cluster
kwargs: keyword arguments for classifier
Returns:
classifier (Classifier derivative)
-
classmethod
from_measurements
(data, attribute, **kwargs)[source]¶ Fit classifier to data.
Args:
data (pd.DataFrame) - measurement data
attribute (str or list) - attribute(s) on which to cluster
kwargs: keyword arguments for classifier
Returns:
classifier (Classifier derivative)
-
-
class
flyqma.annotation.classification.classifiers.
ClassifierIO
[source]¶ Methods for saving and loading classifier objects.
-
classmethod
load
(path)[source]¶ Load classifier from file.
Args:
path (str) - path to classifier directory
Returns:
classifier (Classifier derivative)
-
save
(dirpath, data=False, image=True, extension=None, **kwargs)[source]¶ Save classifier to specified path.
Args:
dirpath (str) - directory in which classifier is to be saved
data (bool) - if True, save training data
image (bool) - if True, save labeled histogram image
extension (str) - directory name extension
kwargs: keyword arguments for image rendering
-
class
flyqma.annotation.classification.classifiers.
ClassifierProperties
[source]¶ Properties for classifier objects.
-
property
centroids
¶ Means of each component (not log transformed).
-
property
component_groups
¶ List of lists of components for each label.
-
property
component_to_label
¶ Returns dictionary mapping components to labels. Mapping is achieved by k-means clustering the model centroids (linear scale).
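The idea of merging mixture components into labels by clustering their centroids can be sketched with a small one-dimensional k-means. This is an illustration only: the helper names, the deterministic initialization, and the convention that label 0 is the lowest-valued cluster are assumptions rather than details confirmed by this page.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm in one dimension; returns the final cluster centers."""
    ordered = sorted(values)
    if k == 1:
        return [sum(ordered) / len(ordered)]
    # Deterministic start: evenly spaced order statistics.
    centers = [ordered[round(i * (len(ordered) - 1) / (k - 1))] for i in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


def map_components_to_labels(centroids, num_labels):
    """Map each component index to the rank of its nearest cluster center."""
    centers = sorted(kmeans_1d(centroids, num_labels))
    # Label 0 is the lowest-valued cluster, label 1 the next, and so on.
    return {
        component: min(range(num_labels), key=lambda j: abs(c - centers[j]))
        for component, c in enumerate(centroids)
    }


# Hypothetical centroids of a five-component model (linear scale).
centroids = [1.0, 1.2, 5.0, 5.5, 20.0]
component_to_label = map_components_to_labels(centroids, num_labels=3)
```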
-
property
num_samples
¶ Number of samples.
-
property
order
¶ Ordered component indices (low to high).
-
property
values
¶ Values for classifier.
-
class
flyqma.annotation.classification.kmeans.
KMeansClassifier
(values, num_components=3, groups=None, log=True, **kwargs)[source]¶ K-means classifier.
Attributes:
groups (dict) - {cluster_id: label_id} pairs for merging clusters
component_to_label (vectorized func) - maps cluster_id to label_id
km (sklearn.cluster.KMeans) - kmeans object
classifier (vectorized func) - maps value to label_id
labels (np.ndarray[int]) - predicted labels
Inherited attributes:
values (array like) - basis for clustering
attribute (str or list) - attribute(s) on which to cluster
log (bool) - indicates whether clustering performed on log values
cmap (matplotlib.colors.ColorMap) - colormap for label_id
parameters (dict) - {param name: param value} pairs
fig (matplotlib.figures.Figure) - histogram figure
-
property
means
¶ Mean of each cluster.
-
class
flyqma.annotation.classification.mixtures.
BivariateMixtureClassifier
(values, num_components=3, num_labels=3, fit_kw={}, model=None, **kwargs)[source]¶ Bivariate mixed log-normal model classifier.
Attributes:
model (mixtures.BivariateMixture) - frozen bivariate mixture model
Inherited attributes:
values (np.ndarray[float]) - basis for clustering
attribute (list) - attributes on which to cluster
num_labels (int) - number of labels
num_components (int) - number of mixture components
classifier (vectorized func) - maps values to labels
labels (np.ndarray[int]) - predicted labels
log (bool) - indicates whether clustering performed on log values
cmap (matplotlib.colors.ColorMap) - colormap for labels
parameters (dict) - {param name: param value} pairs
-
class
flyqma.annotation.classification.mixtures.
MixtureModelIO
[source]¶ Methods for saving and loading classifier objects.
-
classmethod
load
(path)[source]¶ Load classifier from file.
Args:
path (str) - path to classifier directory
Returns:
classifier (Classifier derivative)
-
save
(dirpath, data=False, image=True, extension=None, **kwargs)[source]¶ Save classifier to specified path.
Args:
dirpath (str) - directory in which classifier is to be saved
data (bool) - if True, save training data
image (bool) - if True, save labeled histogram image
extension (str) - directory name extension
kwargs: keyword arguments for image rendering
-
class
flyqma.annotation.classification.mixtures.
UnivariateMixtureClassifier
(values, num_components=3, num_labels=3, fit_kw={}, model=None, **kwargs)[source]¶ Univariate mixed log-normal model classifier.
Attributes:
model (mixtures.UnivariateMixture) - frozen univariate mixture model
num_components (int) - number of mixture components
classifier (vectorized func) - maps values to labels
labels (np.ndarray[int]) - predicted labels
Inherited attributes:
values (np.ndarray[float]) - basis for clustering
num_labels (int) - number of output labels
log (bool) - indicates whether clustering performed on log values
cmap (matplotlib.colors.ColorMap) - colormap for labels
parameters (dict) - {param name: param value} pairs
-
build_classifier
()[source]¶ Build function that returns the most probable label for each of a series of values.
-
build_posterior
()[source]¶ Build function that returns the posterior probability of each label given a series of values.
-
static
fit
(values, num_components=3, **kwargs)[source]¶ Fit univariate Gaussian mixture model.
Args:
values (np.ndarray[float]) - 1D array of log-transformed values
num_components (int) - number of model components
kwargs: keyword arguments for fitting
Returns:
model (mixtures.UnivariateMixture)
-
property
means
¶ Mean of each component.
-
property
num_components
¶ Number of model components.
-
-
class
flyqma.annotation.classification.visualization.
MixtureVisualization
[source]¶ Methods for visualizing a mixture-model based classifier.
-
property
component_cdfs
¶ Returns weighted CDF of each component over support.
-
property
component_pdfs
¶ Weighted component PDFs over support.
-
property
ecdf
¶ Empirical CDF over support.
-
property
epdf
¶ Empirical PDF over support.
-
property
esupport
¶ Empirical support vector (sorted values).
-
property
label_colors
¶ RGB color for each class label.
-
property
pdf
¶ Model PDF over support.
-
property
support
¶ Model support.
-
property
support_labels
¶ Labels for support vector.
Spatial Analysis¶
Tools for analyzing the 2D spatial arrangement of cells.
-
class
flyqma.annotation.spatial.triangulation.
LocalTriangulation
(*args, **kwargs)[source]¶ Triangulation with edge distance filter.
Attributes:
edge_list (np.ndarray[int]) - (from, to) node pairs
edge_lengths (np.ndarray[float]) - euclidean length of each edge
-
property
angle_threshold
¶ Predicted upper bound on edge angles.
-
property
angles
¶ Angle on [0, 2π] interval.
-
property
edge_angles
¶ Angular distance of each edge about origin.
-
property
edge_radii
¶ Minimum node radius in each edge.
-
property
edges
¶ Filtered edges.
-
classmethod
filter_edges
(nodes, edges, lengths, max_length=0.1)[source]¶ Returns all edges shorter than <max_length>, with at least one edge containing each node.
-
filter_longest_edge
(edges, edge_lengths)[source]¶ Returns all edges except the longest edge in each triangle.
-
classmethod
filter_outliers
(nodes, edges, lengths)[source]¶ Returns all edges whose lengths are not outliers, with at least one edge containing each node.
-
static
find_disconnected_nodes
(nodes, edges)[source]¶ Returns boolean array of nodes not included in edges.
-
property
hull
¶ Convex hull.
-
static
is_outlier
(points, threshold=3.0)[source]¶ Returns a boolean array with True if points are outliers and False otherwise.
Args:
points (np.ndarray[float]) - 1-D array of observations
threshold (float) - Maximum modified z-score. Observations with a modified z-score (based on the median absolute deviation) greater than this value are classified as outliers.
Returns:
mask (np.ndarray[bool])
References:
Boris Iglewicz and David Hoaglin (1993), “Volume 16: How to Detect and Handle Outliers”, The ASQC Basic References in Quality Control: Statistical Techniques, Edward F. Mykytka, Ph.D., Editor.
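The modified z-score from the reference above is M_i = 0.6745 (x_i − median) / MAD, where MAD is the median absolute deviation. A minimal plain-Python sketch of the test follows; the function name and the handling of a zero MAD are assumptions for illustration, not the library's code.

```python
import statistics


def is_outlier_sketch(points, threshold=3.0):
    """Flag points whose modified z-score exceeds the threshold."""
    median = statistics.median(points)
    abs_deviation = [abs(p - median) for p in points]
    mad = statistics.median(abs_deviation)
    if mad == 0:
        # Degenerate case: at least half of the points equal the median.
        return [False] * len(points)
    return [0.6745 * d / mad > threshold for d in abs_deviation]


# Hypothetical edge lengths with one obvious outlier.
mask = is_outlier_sketch([1.0, 2.0, 3.0, 4.0, 100.0])
```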
-
property
nodes
¶ All nodes.
-
property
num_triangles
¶ Number of triangles.
-
property
radii
¶ Radius.
-
property
size
¶ Number of points.
-
class
flyqma.annotation.spatial.graphs.
CommunityDetection
[source]¶ Methods for detecting communities in a Graph.
-
class
flyqma.annotation.spatial.graphs.
Graph
(data, xykey=None)[source]¶ Object provides an undirected unweighted graph connecting adjacent cells.
Attributes:
data (pd.DataFrame) - cell measurement data (nodes)
xykey (list) - attribute keys for node x/y positions
G (nx.Graph) - undirected graph instance
nodes (np.ndarray[int]) - node indices
edges (np.ndarray[int]) - pairs of connected node indices
node_map (vectorized func) - maps positional index to node index
position_map (vectorized func) - maps node index to positional index
tri (matplotlib.tri.Triangulation) - triangulation of node positions
-
get_correlations
(attribute, log=True)[source]¶ Returns SpatialCorrelation object for <attribute>.
Args:
attribute (str) - name of attribute
log (bool) - if True, log-transform attribute values
Returns:
correlations (SpatialCorrelation)
-
-
class
flyqma.annotation.spatial.graphs.
GraphVisualizationMethods
[source]¶ Methods for visualizing a Graph instance.
-
label_triangles
(label_by='genotype')[source]¶ Label each triangle with most common node attribute value.
Args:
label_by (str) - node attribute used to label each triangle
Returns:
labels (np.ndarray[int]) - labels for each triangle
-
plot_edges
(ax=None, **kwargs)[source]¶ Plot triangulation edges.
Args:
ax (matplotlib.axes.AxesSubplot)
kwargs: keyword arguments for matplotlib.pyplot.triplot
-
plot_triangles
(label_by='genotype', cmap=None, ax=None, **kwargs)[source]¶ Plot triangle faces using tripcolor.
Args:
label_by (str) - data attribute used to color each triangle
cmap (matplotlib.colors.ColorMap) - colormap for attribute values
ax (matplotlib.axes.AxesSubplot)
kwargs: keyword arguments for plt.tripcolor
-
show
(ax=None, colorby=None, disconnect=False, **kwargs)[source]¶ Visualize graph.
Args:
ax (matplotlib.axes.AxesSubplot) - if None, create figure
colorby (str) - node attribute used to assign node/edge colors
disconnect (bool) - if True, remove edges between nodes whose colorby values differ
kwargs: keyword arguments for NetworkxGraphVisualization.draw
-
-
class
flyqma.annotation.spatial.graphs.
NetworkxGraphVisualization
(G, pos)[source]¶ Object for visualizing a NetworkX Graph object.
Attributes:
G (nx.Graph) - networkx graph object
pos (np.ndarray[float]) - 2D node positions
-
draw
(ax=None, colorby=None, edge_color='k', node_color='k', cmap=None, **kwargs)[source]¶ Draw graph.
Args:
ax (matplotlib.axes.AxesSubplot) - axis on which to draw graph
colorby (str) - node attribute on which nodes/edges are colored
edge_color, node_color (str) - edge/node colors, overrides colorby
node_cmap (matplotlib.colors.ColorMap) - node colormap
-
-
class
flyqma.annotation.spatial.graphs.
SpatialProperties
[source]¶ Spatial properties for Graph objects.
-
property
distance_matrix
¶ Euclidean distance matrix between all nodes.
-
property
edge_lengths
¶ Unique edge lengths.
-
static
evaluate_fluctuations
(values)[source]¶ Construct pairwise fluctuation matrix for <values>.
Args:
values (1D np.ndarray[float]) - attribute values
Returns:
fluctuations (2D np.ndarray[float]) - pairwise fluctuations
-
get_fluctuations_matrix
(attribute, log=True)[source]¶ Returns normalized pairwise fluctuations of <attribute> value for each node in the graph.
Args:
attribute (str) - name of attribute
log (bool) - if True, log-transform attribute values
Returns:
fluctuations (2D np.ndarray[float]) - pairwise fluctuations
-
static
get_matrix_upper
(matrix)[source]¶ Return upper triangular portion of a 2-D matrix.
Parameters:
matrix (2D np.ndarray)
Returns:
upper (1D np.ndarray) - upper triangle, ordered row then column
-
property
median_edge_length
¶ Median edge length.
-
property
node_positions
¶ Assign 2D coordinate positions to nodes.
-
property
node_positions_arr
¶ N x 2 array of node coordinates, ordered by positional index.
-
property
unique_distances
¶ Upper triangular portion of euclidean distance matrix.
-
class
flyqma.annotation.spatial.graphs.
TopologicalProperties
[source]¶ Topological properties for Graph objects.
-
property
adjacency
¶ Adjacency matrix ordered by <self.nodes>.
-
property
adjacency_positional
¶ Adjacency matrix ordered by positional index in <self.data>.
-
property
edge_list
¶ Distance-filtered edges as (from, to) tuples.
-
property
edges
¶ Distance-filtered edges.
-
property
nodes
¶ Unique nodes in graph.
-
property
nodes_order
¶ Indices that sort nodes by positional index in <self.data>.
-
property
num_nodes
¶ Number of nodes.
-
class
flyqma.annotation.spatial.graphs.
WeightFunction
(data, weighted_by='r', distance=False)[source]¶ Object for weighting graph edges by similarity.
Attributes:
data (pd.DataFrame) - nodes data
weighted_by (str) - node attribute used to assess similarity
values (pd.Series) - node attribute values
distance (bool) - if True, weights edges by distance
-
assess_weights
(edges, logratio=False)[source]¶ Evaluate edge weights normalized by mean difference in node values.
Args:
edges (list of (i, j) tuples) - edges between nodes i and j
logratio (bool) - if True, weight edges by logratio
Returns:
weights (np.ndarray[float]) - edge weights
-
-
class
flyqma.annotation.spatial.graphs.
WeightedGraph
(data, weighted_by, xykey=None, logratio=True, distance=False)[source]¶ Object provides an undirected weighted graph connecting adjacent cells. Edge weights are evaluated based on the similarity of expression between pairs of connected nodes. Node similarity is based on the cell measurement data attribute specified by the ‘weighted_by’ parameter.
Attributes:
weighted_by (str) - data attribute used to weight edges
imap (spatial.InfoMap) - community detection
community_labels (np.ndarray[int]) - community label for each node
logratio (bool) - if True, weight edges by log ratio
distance (bool) - if True, weights edges by distance rather than similarity
Inherited attributes:
data (pd.DataFrame) - cell measurement data (nodes)
xykey (list) - attribute keys for node x/y positions
nodes (np.ndarray[int]) - node indices
edges (np.ndarray[int]) - pairs of connected node indices
node_map (vectorized func) - maps positional index to node index
position_map (vectorized func) - maps node index to positional index
tri (matplotlib.tri.Triangulation) - triangulation of node positions
-
property
edge_list
¶ Distance-filtered edges as (from, to, weight) tuples.
-
class
flyqma.annotation.spatial.correlation.
CharacteristicLength
(correlation, fraction_of_max=0.01)[source]¶ Class for determining the characteristic length over which correlations decay.
-
property
characteristic_length
¶ Characteristic decay length.
-
property
x_normed
¶ Distance vector normalized by maximum value.
-
property
yp
¶ Predicted correlation values.
-
class
flyqma.annotation.spatial.correlation.
CorrelationVisualization
[source]¶ Visualization methods for SpatialCorrelation.
-
class
flyqma.annotation.spatial.correlation.
SpatialCorrelation
(d_ij=None, C_ij=None)[source]¶ Container for correlations between 1-D timeseries.
Attributes:
d_ij (np array) - pairwise separation distances between measurements
C_ij (np array) - normalized pairwise fluctuations between measurements
-
class
flyqma.annotation.spatial.infomap.
CommunityAggregator
(infomap)[source]¶ Tool for hierarchical aggregation of communities.
-
class
flyqma.annotation.spatial.infomap.
InfoMap
(edges, **kwargs)[source]¶ Object for performing infomap flow-based community detection.
Attributes:
infomap (infomap.Infomap) - infomap object
node_to_module (dict) - {node: module} pairs
classifier (vectorized func) - maps nodes to modules
aggregator (CommunityAggregator)
-
build_classifier
()[source]¶ Construct node to module classifier.
Returns:
node_to_module (dict) - {node: module} pairs
classifier (vectorized func) - maps nodes to modules
-
static
build_network
(edges, twolevel=False, N=25)[source]¶ Compile InfoMap object from graph edges.
Args:
twolevel (bool) - if True, perform two-level clustering
N (int) - number of trials
-
property
max_depth
¶ Maximum tree depth.
-
-
class
flyqma.annotation.spatial.sampling.
CommunitySampler
(graph, attr, depth=1.0, log=True, twolevel=False)[source]¶ Class for sampling node attributes averaged over local community.
Attributes:
graph (spatial.Graph) - graph instance
G (nx.Graph) - graph with node attribute
attr (str) - attribute to be averaged over neighbors
depth (int) - mean correlation lifetime
level (int) - hierarchical level at which clusters are merged
log (bool) - if True, log-transform values before averaging
twolevel (bool) - if True, use two-level community clustering
-
autocorrelate
(include_distances=False)[source]¶ Returns autocorrelation versus community depth.
Args:
include_distances (bool) - if True, also return mean separation distances
Returns:
levels (list) - clustering depths, starting from finest resolution
correlations (list) - mean correlation within communities
<optional> distances (list) - mean pairwise separation distance
-
average_over_neighbors
()[source]¶ Average attribute value over all members of the community encompassing each node.
-
property
averaged_attr
¶ Name of averaged attribute.
-
property
clustering_level
¶ Highest clustering level at which the mean correlation remains above <self.depth> multiples of the decay constant.
-
property
neighbors
¶ Dictionary of neighbor indices keyed by node indices.
-
property
size_attr
¶ Neighborhood size attribute name.
-
property
z_attr
¶ Name of z-scored attribute.
-
-
class
flyqma.annotation.spatial.sampling.
NeighborSampler
(graph, attr, depth=1, log=True)[source]¶ Class for sampling node attributes averaged over neighbors.
Attributes:
graph (spatial.Graph) - graph instance
G (nx.Graph) - graph with node attribute
attr (str) - attribute to be averaged over neighbors
depth (int) - maximum number of edges connecting neighbors
log (bool) - if True, log-transform values before averaging
-
property
G
¶ NetworkX graph instance.
-
property
attr_used
¶ Name of attribute used to access graph data.
-
property
averaged_attr
¶ Name of averaged attribute.
-
property
data
¶ Graph data.
-
property
keys
¶ List of attribute names.
-
classmethod
multisample
(attr, *graphs, **kwargs)[source]¶ Generate composite sample from one or more <graphs>.
Args:
attr (str) - attribute to be averaged over neighbors
graphs (spatial.Graph) - one or more graph instances
kwargs: keyword arguments for sampler
Returns:
sample (np.ndarray[float]) - 2D array of sampled values, first column contains cell measurements while the second column contains measurements averaged over the neighbors of each cell
keys (list of str) - attribute keys for sampled data
-
property
neighbors
¶ Dictionary of neighbor indices keyed by node indices.
-
property
node_values
¶ Vector of attribute values for each node.
-
property
node_values_dict
¶ Dictionary of attribute values, keyed by node index.
-
property
num_nodes
¶ Number of nodes.
-
property
sample
¶ Returns bivariate sample combining each node’s attribute value with the average attribute value in its neighborhood.
-
property
size_attr
¶ Neighborhood size attribute name.
-
class
flyqma.annotation.spatial.sampling.
RadialSampler
(graph, attr, depth=1.0, log=True)[source]¶ Class for sampling node attributes averaged within a predetermined radius of each node.
Attributes:
graph (spatial.Graph) - graph instance
G (nx.Graph) - graph with node attribute
attr (str) - attribute to be averaged over neighbors
depth (int) - hierarchical level to which communities are merged
log (bool) - if True, log-transform values before averaging
length_scale (float) - characteristic length scale of the graph
radius (float) - radius of sampling region surrounding each measurement
-
average_over_neighbors
()[source]¶ Average attribute value over all nodes within the specified radius of each node.
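Averaging within a fixed radius can be sketched as follows. The example assumes plain (x, y) tuples for positions and counts each node as part of its own neighborhood (distance zero); both choices are illustrative assumptions and the function name is hypothetical.

```python
import math


def average_within_radius(positions, values, radius):
    """Mean value over all nodes within `radius` of each node (itself included)."""
    averaged = []
    for xi, yi in positions:
        members = [
            v for (xj, yj), v in zip(positions, values)
            if math.hypot(xi - xj, yi - yj) <= radius
        ]
        averaged.append(sum(members) / len(members))
    return averaged


# Two nearby cells and one distant cell.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
values = [1.0, 3.0, 10.0]
averaged = average_within_radius(positions, values, radius=1.5)
```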
-
property
averaged_attr
¶ Name of averaged attribute.
-
property
distance_matrix
¶ Euclidean distance matrix between nodes (ordered by position in <self.data>).
-
property
neighbors
¶ Dictionary of neighbor positional indices keyed by node indices.
-
property
size_attr
¶ Neighborhood size attribute name.
-
-
flyqma.annotation.spatial.timeseries.
apply_custom_roller
(func, x, **kwargs)[source]¶ Apply function to rolling window.
Args:
func (function) - function applied to each window, returns 1 x N_out
x (np.ndarray) - ordered samples, length N
kwargs: keyword arguments for window specification
Returns:
fx (np.ndarray) - function output for each window, N/resolution x N_out
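A rolling-window application of a function can be sketched as below. The window_size and resolution keyword names are borrowed from the other functions on this page and are assumed here; the sketch also evaluates full windows only, so its output is shorter than the N/resolution rows quoted above. How the library treats the edges of the series is not stated.

```python
def apply_roller_sketch(func, x, window_size=3, resolution=1):
    """Apply func to each full window of x, stepping by `resolution` samples."""
    outputs = []
    for start in range(0, len(x) - window_size + 1, resolution):
        outputs.append(func(x[start:start + window_size]))
    return outputs


def window_mean(window):
    return sum(window) / len(window)


x = [1, 2, 3, 4, 5, 6]
dense = apply_roller_sketch(window_mean, x, window_size=3, resolution=1)
sparse = apply_roller_sketch(window_mean, x, window_size=3, resolution=2)
```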
-
flyqma.annotation.spatial.timeseries.
bootstrap
(x, func=<function mean>, confidence=95, N=1000)[source]¶ Returns point estimate obtained by parametric bootstrap.
Args:
x (np.ndarray) - ordered samples, length N
func (function) - function applied to each bootstrap sample
confidence (float) - confidence interval, between 0 and 100
N (int) - number of bootstrap samples
Returns:
interval (np.ndarray) - confidence interval bounds
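For readers unfamiliar with the technique, a percentile bootstrap can be sketched as follows. This assumes resampling with replacement and percentile bounds, which may not match the library's internals; `bootstrap_interval`, `n_resamples`, and `seed` are hypothetical names.

```python
import numpy as np

def bootstrap_interval(x, func=np.mean, confidence=95, n_resamples=1000, seed=None):
    """Percentile bootstrap confidence interval for func(x) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    # resample with replacement and evaluate the statistic on each resample
    estimates = np.array([
        func(rng.choice(x, size=x.size, replace=True))
        for _ in range(n_resamples)
    ])
    # lower and upper percentiles bounding the central `confidence` percent
    lower = (100 - confidence) / 2
    upper = 100 - lower
    return np.percentile(estimates, [lower, upper])

x = np.arange(100.0)
interval = bootstrap_interval(x, func=np.mean, confidence=95, seed=0)
```

Passing a `seed` makes the interval reproducible, which is convenient when comparing intervals across windows.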
-
flyqma.annotation.spatial.timeseries.
detrend_signal
(x, window_size=99, order=1)[source]¶ Detrend and scale fluctuations using a univariate spline (first-order by default).
Args:
x (np.ndarray) - ordered samples
window_size (int) - size of interpolation window for lowpass filter
order (int) - spline order
Returns:
residuals (np array) - detrended residuals
trend (np array) - spline fit to signal
-
flyqma.annotation.spatial.timeseries.
get_binned_mean
(x, window_size=100)[source]¶ Returns mean values within non-overlapping sequential windows.
Args:
x (np.ndarray) - ordered samples, length N
window_size (int) - size of window, W
Returns:
means (np.ndarray) - bin means, N/W x 1
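Binned means over non-overlapping windows can be computed with a reshape, as in the sketch below. Dropping trailing samples that do not fill a complete bin is an assumption of this sketch, and `binned_mean` is a hypothetical name.

```python
import numpy as np

def binned_mean(x, window_size=100):
    """Mean of x within sequential non-overlapping bins (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n_bins = x.size // window_size
    # assumption: discard trailing samples that do not fill a complete bin
    trimmed = x[:n_bins * window_size]
    # one row per bin, then average across each row
    return trimmed.reshape(n_bins, window_size).mean(axis=1)

means = binned_mean(np.arange(10), window_size=3)  # bins [0,1,2], [3,4,5], [6,7,8]
```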
-
flyqma.annotation.spatial.timeseries.
get_rolling_gaussian
(x, window_size=100, resolution=10)[source]¶ Returns Gaussian fit within sliding window.
Args:
x (np.ndarray) - ordered samples
window_size (int) - size of window
resolution (int) - sampling interval
Returns:
model (scipy.stats.norm) - normal distribution fit to the samples within each window
-
flyqma.annotation.spatial.timeseries.
get_rolling_mean
(x, **kw)[source]¶ Compute rolling mean. This implementation permits flexible sampling intervals and multi-dimensional time series, but is slower than get_running_mean for 1D time series.
Args:
x (np.ndarray) - ordered samples, length N
kw: keyword arguments for window specification
Returns:
means (np.ndarray) - moving average of x, N/resolution x 1
-
flyqma.annotation.spatial.timeseries.
get_rolling_mean_interval
(x, window_size=100, resolution=1, confidence=95, nbootstraps=1000)[source]¶ Evaluate confidence interval for moving average of ordered values.
Args:
x (np.ndarray) - ordered samples, length N
window_size (int) - size of window, W
resolution (int) - sampling interval
confidence (float) - confidence interval, between 0 and 100
nbootstraps (int) - number of bootstrap samples
Returns:
interval (np.ndarray) - confidence interval bounds, N/resolution x 2
-
flyqma.annotation.spatial.timeseries.
get_rolling_window
(x, window_size=100, resolution=1)[source]¶ Return array slices within a rolling window.
Args:
x (np.ndarray) - ordered samples, length N
window_size (int) - size of window, W
resolution (int) - sampling interval
Returns:
windows (np.ndarray) - sampled values, N/resolution x W
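One way to build such windows in NumPy is `sliding_window_view` (available in NumPy 1.20 and later). The sketch below is an assumption about the approach, not the library's code; it keeps only complete windows, so it returns fewer rows than the N/resolution stated above when the window is wider than the sampling interval.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_window(x, window_size=100, resolution=1):
    """Complete windows of x, sampled every `resolution` steps (hypothetical helper)."""
    x = np.asarray(x)
    # all complete windows, shape (N - W + 1, W); this is a read-only view
    windows = sliding_window_view(x, window_size)
    # keep every `resolution`-th window
    return windows[::resolution]

windows = rolling_window(np.arange(6), window_size=3, resolution=2)
# a rolling mean follows by averaging across each window
rolling_means = windows.mean(axis=1)
```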
-
flyqma.annotation.spatial.timeseries.
get_running_mean
(x, window_size=100)[source]¶ Returns running mean for a 1D vector. This is the fastest implementation, but is limited to one-dimensional arrays and doesn’t permit interval specification.
Args:
x (np.ndarray) - ordered samples, length N
window_size (int) - size of window, W
Returns:
means (np.ndarray) - moving average of x
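A fast running mean for a 1D vector is commonly implemented with a cumulative sum, as sketched below. Whether the library uses this approach, and how it treats the array edges, are not stated here; this sketch returns only the N - W + 1 values for complete windows, and `running_mean` is a hypothetical name.

```python
import numpy as np

def running_mean(x, window_size=100):
    """Running mean of a 1D array over complete windows (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # prepend a zero so that differences of the cumulative sum give window sums
    csum = np.cumsum(np.insert(x, 0, 0.0))
    return (csum[window_size:] - csum[:-window_size]) / window_size

means = running_mean([1, 2, 3, 4, 5], window_size=2)
```

The cumulative-sum form does a constant amount of work per sample regardless of window size, which is why it tends to outperform explicit windowing for long 1D series.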
-
flyqma.annotation.spatial.timeseries.
plot_mean
(ax, x, y, label=None, ma_type='sliding', window_size=100, resolution=1, line_color='k', line_width=1, line_alpha=1, linestyle=None, markersize=2, smooth=False, **kw)[source]¶ Plot moving average.
Args:
ax (matplotlib.axes.AxesSubplot) - axis to which line is added
x, y (array like) - timeseries data
label (str) - data label
ma_type (str) - type of average used, either ‘sliding’, ‘binned’, or ‘savgol’
window_size (int) - size of window
resolution (int) - sampling interval
line_color, line_width, line_alpha, linestyle - formatting parameters
smooth (bool) - if True, apply secondary savgol filter
Returns:
line (matplotlib.lines.Line2D)
-
flyqma.annotation.spatial.timeseries.
plot_mean_interval
(ax, x, y, ma_type='sliding', window_size=100, resolution=10, nbootstraps=1000, confidence=95, color='grey', alpha=0.25, error_bars=False, lw=0.0)[source]¶ Adds confidence interval for line average (sliding window or binned) to existing axes.
Args:
ax (axes) - axis to which interval is added
x, y (array like) - data
ma_type (str) - type of average used, either ‘sliding’ or ‘binned’
window_size (int) - size of sliding window or bin (num of cells)
resolution (int) - sampling resolution for confidence interval
nbootstraps (int) - number of bootstraps
confidence (float) - confidence interval, between 0 and 100
color, alpha - formatting parameters
-
flyqma.annotation.spatial.timeseries.
savgol
(x, window_size=100, polyorder=1)[source]¶ Perform Savitzky-Golay filtration of 1-D array.
Args:
x (np.ndarray) - ordered samples
window_size (int) - filter size
polyorder (int) - polynomial order
Returns:
trend (np.ndarray) - smoothed values