
Data Module¶
flyqma.data
provides three levels of organization for managing cell measurement data:
Layer
: a 2D cross sectional image of an eye disc
Stack
: a set of layers obtained from the same eye disc
Experiment
: a collection of eye discs obtained under similar conditions
Images¶
Images are 2D arrays of pixel intensities recorded within one or more fluorescence channels.
-
class
flyqma.data.images.
ImageMultichromatic
(im, labels=None)[source]¶ Object represents a multichromatic image.
Attributes:
im (np.ndarray[float]) - 2D array of pixel values in WHC format
Inherited attributes:
shape (array like) - image dimensions
mask (np.ndarray[bool]) - image mask
labels (np.ndarray[int]) - segment ID mask
-
class
flyqma.data.images.
ImageScalar
(im, labels=None)[source]¶ Object represents a monochrome image.
Attributes:
im (np.ndarray[float]) - 2D array of pixel values
shape (array like) - image dimensions
mask (np.ndarray[bool]) - image mask
labels (np.ndarray[int]) - segment ID mask
-
clahe
(factor=8, clip_limit=0.01, nbins=256)[source]¶ Run CLAHE on reflection-padded image.
Args:
factor (float or int) - number of segments per dimension
clip_limit (float) - clip limit for CLAHE
nbins (int) - number of grey-scale bins for histogram
-
preprocess
(median_radius=2, gaussian_sigma=(2, 2), clip_limit=0.03, clip_factor=20)[source]¶ Preprocess image.
Args:
median_radius (int) - median filter size, px
gaussian_sigma (tuple) - gaussian filter size, px std dev
clip_limit (float) - CLAHE clip limit
clip_factor (int) - CLAHE clip factor
-
show
(segments=True, cmap=None, vmin=0, vmax=1, figsize=(10, 10), ax=None, **kwargs)[source]¶ Render image.
Args:
segments (bool) - if True, include cell segment contours
cmap (matplotlib.colors.Colormap or str) - colormap or RGB channel
vmin, vmax (float) - bounds for color scale
figsize (tuple) - figure size
ax (matplotlib.axes.AxesSubplot) - if None, create axis
kwargs: keyword arguments for add_contours
Returns:
fig (matplotlib.figure.Figure)
-
Layers¶
Layers are 2D cross sectional images of an eye disc.
-
class
flyqma.data.layers.
Layer
(path, im=None, annotator=None)[source]¶ Object represents a single imaged layer.
Attributes:
measurements (pd.DataFrame) - raw cell measurement data
data (pd.DataFrame) - processed cell measurement data
path (str) - path to layer directory
_id (int) - layer ID, must be an integer value
subdirs (dict) - {name: path} pairs for all subdirectories
metadata (dict) - layer metadata
labels (np.ndarray[int]) - segment ID mask
annotator (Annotation) - object that assigns labels to measurements
graph (Graph) - graph connecting cell centroids
include (bool) - if True, layer was manually marked for inclusion
Inherited attributes:
im (np.ndarray[float]) - 3D array of pixel values
shape (array like) - image dimensions
mask (np.ndarray[bool]) - image mask
labels (np.ndarray[int]) - segment ID mask
Properties:
color_depth (int) - number of fluorescence channels
num_cells (int) - number of cells detected by segmentation
bg_key (str) - key for channel used to generate segmentation
is_segmented (bool) - if True, layer has been segmented
has_trained_annotator (bool) - if True, layer has a trained annotator
-
build_graph
(weighted_by, **graph_kw)[source]¶ Compile weighted graph connecting adjacent cells.
Args:
weighted_by (str) - attribute used to weight edges
graph_kw: keyword arguments, including:
xykey (list) - attribute keys for node x/y positions
logratio (bool) - if True, weight edges by log ratio
distance (bool) - if True, weights edges by distance
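The logratio option compares the weighted_by attribute of the two cells joined by each edge. The snippet below is a minimal sketch of one plausible weighting, assuming the weight is the absolute log ratio of the two values; the exact formula used by Fly-QMA is not stated here, and edge_log_ratio and the sample values are hypothetical.

```python
import math


def edge_log_ratio(value_a, value_b):
    """Absolute log ratio of two positive attribute values (assumed form)."""
    return abs(math.log(value_a / value_b))


# Hypothetical per-cell attribute values keyed by segment ID.
levels = {0: 1.0, 1: 2.0, 2: 2.0}

# Hypothetical edges between adjacent cells.
edges = [(0, 1), (1, 2)]

# Cells with similar values get weights near zero; dissimilar cells get larger weights.
weights = {(a, b): edge_log_ratio(levels[a], levels[b]) for a, b in edges}
```

Under this assumption the weight is symmetric, so swapping the two cells on an edge gives the same value.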
-
initialize
()[source]¶ Initialize layer directory by:
Creating a layer directory
Removing existing segmentation directory
Saving metadata to file
-
process_measurements
(measurements)[source]¶ Augment measurements by:
1. incorporating manual selection boundary
2. correcting for fluorescence bleedthrough
3. assigning measurement labels
4. marking clone boundaries
5. assigning label concurrency information
Operations 3-5 require construction of a WeightedGraph object.
Args:
measurements (pd.DataFrame) - raw measurement data
Returns:
data (pd.DataFrame) - processed measurement data
-
-
class
flyqma.data.layers.
LayerAnnotation
[source]¶ Annotation related methods for Layer class.
-
annotate
()[source]¶ Annotate measurement data in place, also labeling boundaries between labeled regions and marking regions in which each label occurs.
-
apply_annotation
(label='genotype', **kwargs)[source]¶ Assign labels to cell measurements in place.
Args:
label (str) - attribute name for predicted genotype
kwargs: keyword arguments for Annotator.annotate()
-
apply_concurrency
(basis='genotype', min_pop=5, max_distance=10, **kwargs)[source]¶ Add boolean ‘concurrent_<basis>’ field to measurement data for each unique value of <basis> attribute.
Args:
basis (str) - attribute on which concurrency is established
min_pop (int) - minimum population size for inclusion of cell type
max_distance (float) - maximum distance threshold for inclusion
kwargs: keyword arguments for ConcurrencyLabeler
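To illustrate the idea of concurrency, the sketch below flags each cell as concurrent with a label when at least one cell carrying that label lies within max_distance. This is an assumed reading of the parameters, not the ConcurrencyLabeler implementation; label_concurrency, the record layout, and the 'x' and 'y' keys are hypothetical.

```python
import math


def label_concurrency(cells, basis='genotype', min_pop=5, max_distance=10.0):
    """Add a boolean 'concurrent_<value>' key to each cell record, in place."""
    # Group cells by their label value.
    by_label = {}
    for cell in cells:
        by_label.setdefault(cell[basis], []).append(cell)

    for value, members in by_label.items():
        if len(members) < min_pop:
            continue  # assumed: labels with too few cells are skipped
        for cell in cells:
            # Distance from this cell to the nearest cell carrying the label.
            nearest = min(
                math.hypot(cell['x'] - m['x'], cell['y'] - m['y'])
                for m in members
            )
            cell[f'concurrent_{value}'] = nearest <= max_distance
    return cells


cells = [
    {'x': 0.0, 'y': 0.0, 'genotype': 0},
    {'x': 3.0, 'y': 4.0, 'genotype': 1},
    {'x': 50.0, 'y': 0.0, 'genotype': 1},
]
label_concurrency(cells, min_pop=1, max_distance=10.0)
```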
-
mark_boundaries
(basis='genotype', max_edges=0)[source]¶ Mark boundaries between cells with disparate labels by assigning a boundary label to all cells that share an edge with another cell with a different label.
Args:
basis (str) - attribute used to define label
max_edges (int) - maximum number of edges for interior cells
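The boundary rule can be sketched on a plain edge list. The example assumes that a cell counts as a boundary cell when it has more than max_edges neighbors with a different label; that reading of max_edges is an assumption, and flag_boundary_cells is a hypothetical helper rather than part of Fly-QMA.

```python
def flag_boundary_cells(labels, edges, max_edges=0):
    """Return {cell_id: bool}, True where a cell borders differently labeled cells."""
    # Count, for every cell, the edges that lead to a cell with another label.
    foreign = {cell: 0 for cell in labels}
    for a, b in edges:
        if labels[a] != labels[b]:
            foreign[a] += 1
            foreign[b] += 1
    return {cell: count > max_edges for cell, count in foreign.items()}


# Four cells in a row: two labeled 'A' followed by two labeled 'B'.
labels = {0: 'A', 1: 'A', 2: 'B', 3: 'B'}
edges = [(0, 1), (1, 2), (2, 3)]

boundary = flag_boundary_cells(labels, edges)
```

Only the two cells on either side of the A/B interface are flagged; raising max_edges to 1 would leave all four cells as interior.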
-
show_annotation
(channel, label, interior_only=False, selection_only=False, cmap=None, figsize=(8, 4), **kwargs)[source]¶ Visualize annotation by overlaying <label> attribute on the image of the specified fluorescence <channel>.
Args:
channel (str) - fluorescence channel to visualize
label (str) - attribute containing cell type labels
interior_only (bool) - if True, exclude border regions
selection_only (bool) - if True, only add contours within ROI
cmap (matplotlib.colors.ListedColormap) - color scheme for celltype labels
figsize (tuple) - figure dimensions
kwargs: keyword arguments for plt.scatter
Returns:
fig (matplotlib.figure.Figure)
-
train_annotator
(attribute, save=False, logratio=True, num_labels=3, **kwargs)[source]¶ Train an Annotation model on the measurements in this layer.
Args:
attribute (str) - measured attribute used to determine labels
save (bool) - if True, save model selection routine
logratio (bool) - if True, weight edges by relative attribute value
num_labels (int) - number of allowable unique labels
kwargs: keyword arguments for Annotation, including:
sampler_type (str) - either ‘radial’, ‘neighbors’, ‘community’
sampler_kwargs (dict) - keyword arguments for sampler
min_num_components (int) - minimum number of mixture components
max_num_components (int) - maximum number of mixture components
addtl_kwargs: keyword arguments for Classifier
Returns:
selector (ModelSelection object)
-
-
class
flyqma.data.layers.
LayerCorrection
[source]¶ Bleedthrough correction related methods for Layer class.
-
class
flyqma.data.layers.
LayerIO
[source]¶ Methods for saving and loading Layer objects and their subcomponents.
-
load
(use_cache=True, graph=True)[source]¶ Load layer.
Args:
use_cache (bool) - if True, use cached measurement data, otherwise re-process the measurement data
graph (bool) - if True, load weighted graph
-
save
(segmentation=True, measurements=True, processed_data=True, annotator=False, segmentation_image=False, annotation_image=False)[source]¶ Save segmentation parameters and results.
Args:
segmentation (bool) - if True, save segmentation
measurements (bool) - if True, save measurement data
processed_data (bool) - if True, save processed measurement data
annotator (bool) - if True, save annotator
segmentation_image (bool) - if True, save segmentation image
annotation_image (bool) - if True, save annotation image
-
-
class
flyqma.data.layers.
LayerMeasurement
[source]¶ Measurement related methods for Layer class.
-
apply_normalization
(data)[source]¶ Normalize fluorescence intensity measurements by measured background channel intensity.
Args:
data (pd.DataFrame) - processed cell measurement data
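Normalization by the background channel amounts to a per-cell division. The sketch below uses plain dictionaries in place of a DataFrame to stay dependency-free; the 'ch0' and 'ch2' keys, the '_normalized' suffix, and normalize_by_background are hypothetical names chosen for illustration.

```python
def normalize_by_background(records, channels, bg_key):
    """Return copies of the records with '<channel>_normalized' values added."""
    normalized = []
    for record in records:
        new_record = dict(record)  # leave the input record untouched
        for channel in channels:
            new_record[f'{channel}_normalized'] = record[channel] / record[bg_key]
        normalized.append(new_record)
    return normalized


# Two hypothetical cells; 'ch2' plays the role of the background channel.
records = [
    {'ch0': 0.5, 'ch2': 0.25},
    {'ch0': 0.2, 'ch2': 0.4},
]
result = normalize_by_background(records, channels=['ch0'], bg_key='ch2')
```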
-
import_segmentation_mask
(path, channel, save=True, save_image=True)[source]¶ Import external segmentation mask and use it to generate measurements.
Provided mask must contain a 2-D array of non-negative integers in which a value of zero denotes the image background.
Args:
path (str) - path to segmentation mask
channel (int) - fluorescence channel used for segmentation
save (bool) - if True, copy segmentation to stack directory
save_image (bool) - if True, save segmentation image
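Before importing an external mask it can help to check that it meets the stated requirements. The following is a minimal sketch using nested lists so that it runs without dependencies (a real mask would normally be a NumPy array); validate_segmentation_mask is a hypothetical helper, not a Fly-QMA function.

```python
def validate_segmentation_mask(mask):
    """Check a 2-D integer mask and return its segment IDs (background excluded)."""
    # Every row must have the same length for the mask to be 2-D.
    widths = {len(row) for row in mask}
    if len(widths) != 1:
        raise ValueError('mask must be a non-empty rectangular 2-D array')

    values = {value for row in mask for value in row}
    if any(not isinstance(value, int) or value < 0 for value in values):
        raise ValueError('mask values must be non-negative integers')

    # Zero denotes the image background, so it is not a segment ID.
    return sorted(values - {0})


mask = [
    [0, 0, 1],
    [2, 2, 1],
]
segment_ids = validate_segmentation_mask(mask)
```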
-
measure
()[source]¶ Measure properties of cell segments. Raw measurements are stored in the ‘measurements’ attribute, while processed measurements are stored in the ‘data’ attribute.
-
segment
(channel, preprocessing_kws={}, seed_kws={}, seg_kws={}, min_area=250)[source]¶ Identify nuclear contours by running watershed segmentation on specified background channel.
Args:
channel (int) - channel index on which to segment image
preprocessing_kws (dict) - keyword arguments for image preprocessing
seed_kws (dict) - keyword arguments for seed detection
seg_kws (dict) - keyword arguments for segmentation
min_area (int) - threshold for minimum segment size, px
Returns:
background (ImageScalar) - background image (after processing)
-
-
class
flyqma.data.layers.
LayerProperties
[source]¶ Properties for Layer class:
color_depth (int) - number of fluorescence channels
num_cells (int) - number of cells detected by segmentation
bg_key (str) - key for channel used to generate segmentation
has_image (bool) - if True, image is loaded into memory
is_segmented (bool) - if True, layer has been segmented
has_trained_annotator (bool) - if True, layer has a trained annotator
-
property
bg_key
¶ DataFrame key for background channel.
-
property
color_depth
¶ Number of color channels.
-
property
has_image
¶ True if image is available.
-
property
has_trained_annotator
¶ Returns True if trained annotator is available.
-
property
is_segmented
¶ True if measurement data are available.
-
property
num_cells
¶ Number of cells detected by segmentation.
-
class
flyqma.data.layers.
LayerROI
[source]¶ ROI related methods for Layer class.
-
define_roi
(data)[source]¶ Adds a “selected” attribute to measurements dataframe. The attribute is True for cells that fall within the ROI.
Args:
data (pd.DataFrame) - processed measurement data
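One common way to decide whether a cell falls within an ROI is a ray-casting test against the ROI vertices. The sketch below assumes the ROI is available as a polygon and that each cell has a centroid; whether Fly-QMA uses this exact test is not stated here, and the 'centroid_x' and 'centroid_y' keys and both function names are hypothetical.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mark_selected(cells, vertices):
    """Add a boolean 'selected' key to each cell record, in place."""
    for cell in cells:
        cell['selected'] = point_in_polygon(
            cell['centroid_x'], cell['centroid_y'], vertices
        )
    return cells


roi = [(0, 0), (10, 0), (10, 10), (0, 10)]
cells = mark_selected(
    [{'centroid_x': 5, 'centroid_y': 5}, {'centroid_x': 15, 'centroid_y': 5}],
    roi,
)
```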
-
import_roi_mask
(path, save=True)[source]¶ Import external ROI mask and use it to label measurement data.
Provided mask must contain a 2-D boolean array with the same dimensions as the raw image. True values denote the ROI. The mask may only contain a single contiguous ROI.
Args:
path (str) - path to ROI mask
save (bool) - if True, copy ROI mask to stack directory
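Because the mask may only contain a single contiguous ROI, it may be worth counting regions before import. This is a minimal sketch using a breadth-first flood fill over nested lists, assuming 4-connectivity (the connectivity rule Fly-QMA applies is not stated here); count_regions is a hypothetical helper.

```python
from collections import deque


def count_regions(mask):
    """Count 4-connected regions of True values in a 2-D boolean mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Found an unvisited True pixel: flood-fill its whole region.
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    in_bounds = 0 <= ni < rows and 0 <= nj < cols
                    if in_bounds and mask[ni][nj] and not seen[ni][nj]:
                        seen[ni][nj] = True
                        queue.append((ni, nj))
    return regions


single_roi = [[False, True, True], [False, True, False]]
split_roi = [[True, False, True]]
```

A mask for which count_regions returns anything other than 1 would not satisfy the single-ROI requirement.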
-
classmethod
mask_to_vertices
(mask)[source]¶ Convert boolean mask to a list of vertices defining the border around the largest contiguous region.
Args:
mask (np.ndarray[bool]) - ROI mask, where True denotes the region. Note that the mask may only contain one contiguous component.
Returns:
vertices (np.ndarray[int]) - N x 2 array of vertices
-
-
class
flyqma.data.layers.
LayerVisualization
[source]¶ Methods for visualizing a layer.
-
build_attribute_mask
(attribute, interior_only=False, selection_only=False, **kwargs)[source]¶ Use <attribute> value for each segment to construct an image mask.
Args:
attribute (str) - attribute used to label each segment
interior_only (bool) - if True, excludes clone borders
selection_only (bool) - if True, only include selected region
Returns:
mask (np.ma.MaskedArray) - masked image in which foreground segments are replaced with the attribute values
-
build_classifier_mask
(classifier, interior_only=False, selection_only=False, **kwargs)[source]¶ Use segment <classifier> to construct an image mask.
Args:
classifier (annotation.Classifier object)
interior_only (bool) - if True, excludes clone borders
selection_only (bool) - if True, only include selected region
Returns:
mask (np.ma.MaskedArray) - masked image in which foreground segments are replaced with the assigned labels
-
Stacks¶
Stacks are sets of layers obtained from the same eye disc.
-
class
flyqma.data.stacks.
Stack
(path, bit_depth=None)[source]¶ Object represents a 3D RGB image stack.
Attributes:
path (str) - path to stack directory
_id (str) - stack ID
stack (np.ndarray[float]) - 3D RGB image stack
shape (tuple) - stack dimensions, (depth, X, Y, 3)
bit_depth (int) - bit depth of raw tif image
stack_depth (int) - number of layers in stack
color_depth (int) - number of fluorescence channels in stack
annotator (Annotation) - object that assigns labels to measurements
metadata (dict) - stack metadata
tif_path (str) - path to multilayer RGB tiff file
layers_path (str) - path to layers directory
annotator_path (str) - path to annotation directory
-
aggregate_measurements
(selected_only=False, exclude_boundary=False, raw=False, use_cache=True)[source]¶ Aggregate measurements from each included layer.
Args:
selected_only (bool) - if True, exclude cells not marked for inclusion
exclude_boundary (bool) - if True, exclude cells on clone boundaries
raw (bool) - if True, aggregate raw measurements
use_cache (bool) - if True, use available cached measurement data
Returns:
data (pd.DataFrame) - measurement data (None if unavailable)
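The effect of the selected_only and exclude_boundary flags can be sketched with plain records standing in for DataFrame rows. The 'selected' key follows the attribute added by define_roi; the 'boundary' and 'layer' keys and the aggregate helper are hypothetical.

```python
def aggregate(layers, selected_only=False, exclude_boundary=False):
    """Flatten {layer_id: [cell records]} into one list tagged with the layer ID."""
    rows = []
    for layer_id, cells in layers.items():
        for cell in cells:
            if selected_only and not cell['selected']:
                continue  # cell lies outside the ROI
            if exclude_boundary and cell['boundary']:
                continue  # cell sits on a clone boundary
            rows.append({'layer': layer_id, **cell})
    return rows


layers = {
    0: [
        {'selected': True, 'boundary': False},
        {'selected': False, 'boundary': False},
    ],
    1: [{'selected': True, 'boundary': True}],
}
everything = aggregate(layers)
curated = aggregate(layers, selected_only=True, exclude_boundary=True)
```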
-
property
bit_depth
¶ Bit depth of raw image.
-
property
color_depth
¶ Number of fluorescence channels in stack.
-
property
filename
¶ Stack filename.
-
property
included
¶ Indices of included layers.
-
initialize
(bit_depth)[source]¶ Initialize stack directory.
Args:
bit_depth (int) - bit depth of raw tif (e.g. 12 or 16)
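The bit depth determines the largest raw intensity an image can hold. The sketch below shows one conventional way to rescale raw integers to floats between 0 and 1, assuming division by the maximum representable value; the exact scaling Fly-QMA applies is not stated here, and scale_intensities is a hypothetical helper.

```python
def scale_intensities(raw_values, bit_depth):
    """Rescale raw integer intensities to floats in [0, 1] (assumed convention)."""
    max_value = 2 ** bit_depth - 1  # e.g. 4095 for a 12-bit image
    return [value / max_value for value in raw_values]


scaled = scale_intensities([0, 2048, 4095], bit_depth=12)
```

Supplying the wrong bit depth (for example 16 for a 12-bit image) would compress all values toward zero under this convention.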
-
property
is_annotated
¶ True if annotation is complete.
-
property
is_initialized
¶ Returns True if Stack has been initialized.
-
property
is_segmented
¶ True if segmentation is complete.
-
load_layer
(layer_id=0, graph=True, use_cache=True, full=True)[source]¶ Load individual layer.
Args:
layer_id (int) - layer index
graph (bool) - if True, load layer graph
use_cache (bool) - if True, use cached layer measurement data
full (bool) - if True, load fully labeled RGB image
Returns:
layer (Layer)
-
segment
(channel, preprocessing_kws={}, seed_kws={}, seg_kws={}, min_area=250, save=True)[source]¶ Segment all layers using watershed strategy.
Args:
channel (int) - channel index on which to segment image
preprocessing_kws (dict) - keyword arguments for image preprocessing
seed_kws (dict) - keyword arguments for seed detection
seg_kws (dict) - keyword arguments for segmentation
min_area (int) - threshold for minimum segment size, px
save (bool) - if True, save measurement data for each layer
-
property
selector_path
¶ Path to model selection object.
-
property
stack_depth
¶ Number of layers in stack.
-
train_annotator
(attribute, save=False, logratio=True, num_labels=3, **kwargs)[source]¶ Train an Annotation model on all layers in this stack.
Args:
attribute (str) - measured attribute used to determine labels
save (bool) - if True, save annotator and model selection routine
logratio (bool) - if True, weight edges by relative attribute value
num_labels (int) - number of allowable unique labels
kwargs: keyword arguments for Annotation, including:
sampler_type (str) - either ‘radial’, ‘neighbors’, ‘community’
sampler_kwargs (dict) - keyword arguments for sampler
min_num_components (int) - minimum number of mixture components
max_num_components (int) - maximum number of mixture components
addtl_kwargs: keyword arguments for Classifier
-
-
class
flyqma.data.stacks.
StackIO
[source]¶ Methods for saving and loading a Stack instance.
-
static
from_silhouette
(filepath, bit_depth)[source]¶ Initialize stack from silhouette <filepath>.
Args:
filepath (str) - path to silhouette file
bit_depth (int) - bit depth of raw tif (e.g. 12 or 16)
Returns:
stack (flyqma.Stack)
-
Experiments¶
Experiments are collections of stacks obtained under similar conditions.
-
class
flyqma.data.experiments.
Experiment
(path)[source]¶ Object represents a collection of 3D RGB image stacks collected under the same experimental conditions.
Attributes:
path (str) - path to experiment directory
_id (str) - name of experiment
stack_ids (list of str) - unique stack ids within experiment
stack_dirs (dict) - {stack_id: stack_directory} pairs
count (int) - counter for stack iteration
-
aggregate_measurements
(selected_only=False, exclude_boundary=False, raw=False, use_cache=True)[source]¶ Aggregate measurements from each stack.
Args:
selected_only (bool) - if True, exclude cells outside the ROI
exclude_boundary (bool) - if True, exclude cells on the border of labeled regions
raw (bool) - if True, use raw measurements from included discs
use_cache (bool) - if True, use available cached measurement data
Returns:
data (pd.DataFrame) - curated cell measurement data, which is None if no measurement data are found
-
initialize
(bit_depth)[source]¶ Initialize a collection of image stacks.
Args:
bit_depth (int) - bit depth of raw tif (e.g. 12 or 16). Value will be read from the stack metadata if None is provided. An error is raised if no value is found.
-
property
is_initialized
¶ Returns True if Experiment has been initialized.
-
Silhouette Interface¶
Fly-QMA provides several tools for seamlessly exchanging data with NU FlyEye Silhouette.
-
class
flyqma.data.silhouette_read.
ReadSilhouette
(path)[source]¶ Read-only interface to a FlyEye Silhouette file.
Attributes:
path (str) - path to Silhouette file
feed (dict) - feed file containing layer IDs
feud (dict) - feud file containing cell type labels
Properties:
is_flipped_about_yz (bool) - if True, invert about YZ plane
is_flipped_about_xy (bool) - if True, invert about XY plane
-
class
flyqma.data.silhouette_read.
ReadSilhouetteData
(path, recompile=False)[source]¶ Read-only interface to data within a FlyEye Silhouette file.
Upon instantiation, individual cell measurements are aggregated into a data.cells.Cells compatible DataFrame.
Measurement data must be read on a layer-by-layer basis the first time a Silhouette object is instantiated. Following this initial reading, the aggregated measurement data are serialized and stored within the silhouette file. These serialized measurements may then be accessed directly during future use. The recompile flag indicates whether the serialized measurements should be ignored upon instantiation.
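The compile-once behavior described above can be sketched as a small caching pattern. A dictionary stands in for the silhouette file here, and load_measurements, compile_measurements, and the 'measurements' key are hypothetical names used only for illustration.

```python
def load_measurements(store, compile_measurements, recompile=False):
    """Return cached measurements, compiling them only when needed."""
    if recompile or 'measurements' not in store:
        # Slow path: read every layer and cache the aggregated result.
        store['measurements'] = compile_measurements()
    return store['measurements']


calls = []


def compile_measurements():
    """Stand-in for the layer-by-layer read; records each invocation."""
    calls.append(1)
    return [{'layer': 0, 'segment_id': 1}]


store = {}
first = load_measurements(store, compile_measurements)   # compiles
second = load_measurements(store, compile_measurements)  # served from cache
third = load_measurements(store, compile_measurements, recompile=True)  # compiles again
```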
Attributes:
df (pd.DataFrame) - cell measurement data
Inherited attributes:
path (str) - path to Silhouette file
feed (dict) - feed file containing layer IDs
feud (dict) - feud file containing cell type labels
-
property
labels
¶ pd.Series of labels keyed by (layer_id, segment_id).
-
load
(recompile=False)[source]¶ Read all contour and orientation data from silhouette file.
Args:
recompile (bool) - if True, recompile measurements from all layers
-
static
parse_contour
(contour)[source]¶ Convert contour to list format.
Args:
contour (dict) - contour from silhouette file
Returns:
ctr_list (list) - values in data.cells.Cells compatible list format
-
read_contours
(all_labels={}, include_unlabeled=False)[source]¶ Read contours from silhouette file.
Args:
all_labels (dict) - {layer_id: {contour_id: label}} for each layer
include_unlabeled (bool) - if True, include unlabeled segments
Returns:
df (pd.DataFrame) - data.cells.Cells compatible dataframe of contours
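The nested all_labels mapping and the flat (layer_id, segment_id) keying used by the labels property describe the same information. A minimal sketch of the conversion, with flatten_labels as a hypothetical helper and made-up label values:

```python
def flatten_labels(all_labels):
    """Convert {layer_id: {contour_id: label}} to {(layer_id, contour_id): label}."""
    return {
        (layer_id, contour_id): label
        for layer_id, contours in all_labels.items()
        for contour_id, label in contours.items()
    }


all_labels = {
    0: {1: 'wt', 2: 'mutant'},
    1: {1: 'wt'},
}
flat = flatten_labels(all_labels)
```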
-
class
flyqma.data.silhouette_write.
WriteSilhouette
[source]¶ Methods for writing a stack to Silhouette readable format.
The Silhouette container includes a FEED file:
FEED.json :
{ "orientation": {"flip_about_xy": false, "flip_about_yz": false}, "layer_ids": [ 0,1,2... ], "params": { param_name: param_value ... } }
-
load_silhouette_labels
()[source]¶ Load manually assigned labels from file.
Returns:
labels (pd.Series) - labels keyed by (layer_id, segment_id)
-
property
silhouette_path
¶ Path to Silhouette directory.
-
write_silhouette
(dst=None, label=None, include_image=True, channel_dict=None)[source]¶ Write silhouette file.
Args:
dst (str) - destination directory
label (str) - field containing cell type annotations
include_image (bool) - save RGB image of each layer
channel_dict (dict) - RGB channel names, keyed by channel index. If none provided, defaults to the first three channels in RGB order.
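The FEED file that accompanies the layers can be assembled with the standard json module. The sketch below follows the key names shown for FEED.json; build_feed is a hypothetical helper and the parameter value is made up.

```python
import json


def build_feed(layer_ids, params=None, flip_about_xy=False, flip_about_yz=False):
    """Assemble a FEED dictionary with the documented top-level keys."""
    return {
        'orientation': {
            'flip_about_xy': flip_about_xy,
            'flip_about_yz': flip_about_yz,
        },
        'layer_ids': list(layer_ids),
        'params': dict(params or {}),
    }


feed = build_feed(range(3), params={'example_param': 1})
feed_text = json.dumps(feed, indent=2)  # text that would be written to FEED.json
```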
-
-
class
flyqma.data.silhouette_write.
WriteSilhouetteLayer
[source]¶ Methods for writing a Layer to Silhouette readable format. A layer file is structured as follows:
LAYER_ID.json :
{
"id": LAYER_ID,
"imageFilename": "LAYER_ID.png",
"contours": [ ... contours ... ]
}
Each contour is structured as follows:
{"centroid": [CONTOUR_CENTROID_X, CONTOUR_CENTROID_Y], "color_avg": {"b": X, "g": X, "r": X}, "color_std": {"b": X, "g": X, "r": X}, "id": CONTOUR_ID, "pixel_count": CONTOUR_AREA, "points": [[x1, y1], [x2, y2] ... ]}
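A layer file with this structure can be assembled from plain dictionaries. The sketch below mirrors the key names listed above; build_contour and build_layer are hypothetical helpers and all numeric values are made up.

```python
import json


def build_contour(contour_id, centroid, color_avg, color_std, pixel_count, points):
    """Assemble one contour dictionary using the documented keys."""
    return {
        'centroid': list(centroid),
        'color_avg': dict(color_avg),
        'color_std': dict(color_std),
        'id': contour_id,
        'pixel_count': pixel_count,
        'points': [list(point) for point in points],
    }


def build_layer(layer_id, contours):
    """Assemble the top-level layer dictionary."""
    return {
        'id': layer_id,
        'imageFilename': f'{layer_id}.png',
        'contours': list(contours),
    }


contour = build_contour(
    contour_id=7,
    centroid=(12.5, 30.0),
    color_avg={'b': 0.1, 'g': 0.4, 'r': 0.2},
    color_std={'b': 0.01, 'g': 0.05, 'r': 0.02},
    pixel_count=250,
    points=[(10, 28), (15, 28), (15, 32), (10, 32)],
)
layer_text = json.dumps(build_layer(0, [contour]))
```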
-
build_contours
(channel_dict)[source]¶ Convert dataframe to a list of contours (Silhouette format).
Args:
channel_dict (dict) - RGB channel names, keyed by channel index
Returns:
contours (list) - list of contour dictionaries
-
write_silhouette
(dst, layer_id=None, include_image=True, channel_dict=None)[source]¶ Write silhouette compatible JSON to target directory.
Args:
dst (str) - destination directory
layer_id (int) - ID optionally used to override true layer ID
include_image (bool) - save layer image as png
channel_dict (dict) - RGB channel names, keyed by channel index. If none provided, defaults to the first three channels in RGB order.
-