cytoflow.views.mst#
Plots a minimum spanning tree of a statistic. Particularly useful for
visualizing the results of clustering operations such as KMeansOp
and SOMOp.
MSTView – plots the minimum spanning tree.
- class cytoflow.views.mst.MSTView[source]#
Bases: HasStrictTraits
A view that creates a minimum spanning tree view of a statistic.
Set `statistic` to the name of the statistic to plot; set `feature` to the name of that statistic’s feature you’d like to analyze. Then, set `locations` to another statistic whose features are the locations (in any number of dimensions) of the nodes in the tree – usually these are cluster centroids from `KMeansOp` or `SOMOp` (see the example below). The view computes a minimum-spanning tree containing the nodes and lays it out in two dimensions.
There are three different ways of plotting the value at each location in the tree:
- Setting `style` to ``heat`` (the default) will produce an MST with a circle at each vertex, where the color of the circle is related to the intensity of the value of `feature`. (In this scenario, `variable` is ignored.)
- Setting `style` to ``pie`` will draw a pie plot at each location. If `variable` is set, then the values of `variable` are used as the categories of the pie, and the arc length of each slice of pie is related to the intensity of the value of `feature`. If `variable` is unset, then `feature` is ignored and the features of the statistic are used as the categories.
- Setting `style` to ``petal`` will draw a “petal plot” at each location. If `variable` is set, then the values of `variable` are used as the categories, but unlike a pie plot, the arc width of each slice is equal. Instead, the radius of the pie slice scales with the square root of the intensity, so that the relationship between area and intensity remains the same. If `variable` is unset, then `feature` is ignored and the features of the statistic are used as the categories.
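The difference between the pie and petal geometries can be sketched in a few lines of plain Python. This is only an illustration of the relationships described above, not the view's actual drawing code: the helper names `pie_angles`, `petal_radii` and `petal_areas` are hypothetical, and normalizing the radius to the largest value is an assumption.

```python
import math

def pie_angles(values):
    """Arc angle (in degrees) of each pie slice, proportional to its value."""
    clipped = [max(v, 0.0) for v in values]  # negative data is clipped to 0
    total = sum(clipped)
    if total == 0:
        return [0.0 for _ in clipped]
    return [360.0 * v / total for v in clipped]

def petal_radii(values, max_radius=1.0):
    """Radius of each petal: equal arc width, radius grows with sqrt(value).

    Assumption: radii are normalized so the largest value gets max_radius.
    """
    clipped = [max(v, 0.0) for v in values]
    largest = max(clipped, default=0.0)
    if largest == 0:
        return [0.0 for _ in clipped]
    return [max_radius * math.sqrt(v / largest) for v in clipped]

def petal_areas(radii):
    """Area of each petal (a circular sector): 0.5 * r**2 * theta."""
    theta = 2.0 * math.pi / len(radii)  # every petal has the same arc width
    return [0.5 * r ** 2 * theta for r in radii]

values = [4.0, 1.0, -2.0, 3.0]  # made-up intensities, one per category
angles = pie_angles(values)
radii = petal_radii(values)
areas = petal_areas(radii)
```

Because the petal area grows with the square of the radius, taking the square root of the intensity keeps each petal's area proportional to its (clipped) value, just as a pie slice's area is proportional to its arc angle.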
Warning
If `style` is ``pie`` or ``petal``, then negative data will be clipped to 0!
Optionally, you can set `size_function` to scale the circles (or pies or petals) by a function computed on `Experiment.data`. (Often set to `len` to scale by the number of events in each cluster.)
Note
If you’d like to select events based on this view (by drawing a polygon around the nodes of the tree), you can do that with `SOMOp`.
- statistic#
The statistic to plot. Must be a key in `Experiment.statistics`.
- Type:
Str
- locations#
A statistic whose levels are the same as `statistic` and whose features are the dimensions of the locations of each node to plot.
- Type:
Str
Note
If `style` is ``heat``, then the levels of `statistic` must be the same as the levels of `locations`. If `style` is ``pie`` or ``petal``, the levels of `statistic` must be the levels of `locations` plus `variable`.
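The level rule in the note above can be expressed as a small check. This is a sketch only: `levels_match` is a hypothetical helper, not part of cytoflow, and it assumes that when `variable` is unset the levels must match exactly, whatever the style.

```python
def levels_match(statistic_levels, locations_levels, style, variable=None):
    """Hypothetical check of the level rule for `statistic` and `locations`."""
    expected = set(locations_levels)
    # For pie and petal plots, `variable` is one extra level of `statistic`
    if style in ("pie", "petal") and variable:
        expected = expected | {variable}
    return set(statistic_levels) == expected

# Level names borrowed from the example on this page; that the centroid
# statistic has the single level "KMeans" is an assumption.
pie_ok = levels_match(["KMeans", "Dox"], ["KMeans"], "pie", variable="Dox")
heat_ok = levels_match(["KMeans", "Dox"], ["KMeans"], "heat")
```

Under these assumptions `pie_ok` is true, while `heat_ok` is false because the extra `Dox` level has no counterpart in the locations.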
- locations_level#
Which level in the `locations` statistic is different at each location? The values of the others must be specified in the `plot_name` parameter of `plot`. Optional if there is only one level in `locations`.
- Type:
Str
- locations_features#
Which features in `locations` to use. By default, use all of them.
- Type:
List(Str)
Warning
The `KMeansOp` statistic is mostly locations, but also has a Proportion feature. You likely don’t want to use it as a location for laying out the minimum spanning tree!
- variable#
The variable used for plotting pie and petal plots. Must be left empty for a heatmap.
- Type:
Str
- feature#
The column in the statistic to plot (often a channel name.)
- Type:
Str
- style#
What kind of plot to make?
- Type:
Enum(``heat``, ``pie``, ``petal``) (default = ``heat``)
- scale#
For a heat map, how should the color of `feature` be scaled before plotting? For pie and petal maps, how should the input data be normalized to [0,1] before plotting?
- Type:
{‘linear’, ‘log’, ‘logicle’}
- size_function#
If set, separate the `Experiment` into subsets by levels of `locations`, compute a function on them, and scale the size of each tree node by those values. The callable should take a single `pandas.DataFrame` argument and return a positive `float` or a value that can be cast to `float` (such as `int`). Of particular use is `len`, which will scale the nodes by the number of events in each subset.
- Type:
Callable (default: None)
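As a rough illustration of what such a callable might look like, the sketch below applies `len` and a custom function to a stand-in `pandas.DataFrame`. The function name `bright_event_count`, the threshold of 1000 and the data values are all made up for the example; only the channel name is borrowed from the example on this page.

```python
import pandas as pd

def bright_event_count(events: pd.DataFrame) -> float:
    """Hypothetical size function: count events with Y2-A above a threshold."""
    # Add 1 so the result stays positive even if no event passes the threshold
    return float((events["Y2-A"] > 1000.0).sum() + 1)

# Stand-in for the events that fall into one cluster (values are made up)
cluster_events = pd.DataFrame({"Y2-A": [50.0, 1500.0, 2500.0, 800.0]})

size_by_count = len(cluster_events)                  # what size_function = len returns
size_by_bright = bright_event_count(cluster_events)  # two bright events, plus 1
```

Either callable satisfies the contract described above: it takes one `pandas.DataFrame` and returns a positive number that can be cast to `float`.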
- metric#
What metric should be used to compute distance in the tree? Must be one of ``braycurtis``, ``canberra``, ``chebyshev``, ``cityblock``, ``correlation``, ``cosine``, ``dice``, ``euclidean``, ``hamming``, ``jaccard``, ``jensenshannon``, ``kulczynski1``, ``mahalanobis``, ``matching``, ``minkowski``, ``rogerstanimoto``, ``russellrao``, ``seuclidean``, ``sokalmichener``, ``sokalsneath``, ``sqeuclidean``, ``yule``. Suggestion: use ``euclidean`` for small numbers of dimensions (location features) and ``cosine`` for larger numbers.
- Type:
Str (default: ``euclidean``)
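To make the role of the metric concrete, here is a minimal pure-Python sketch of a minimum spanning tree built with Prim's algorithm and a pluggable distance function. It is not the view's implementation (which is not shown on this page); the function names and the centroid coordinates are hypothetical.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.dist(a, b)

def cosine(a, b):
    """Cosine distance: 1 - cos(angle between a and b); undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def minimum_spanning_tree(points, metric=euclidean):
    """Prim's algorithm; returns one (i, j) index pair per tree edge."""
    n = len(points)
    if n == 0:
        return []
    in_tree = {0}  # start growing the tree from the first point
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge joining a node in the tree to one outside it
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: metric(points[e[0]], points[e[1]]),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Made-up cluster centroids in two dimensions
centroids = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
tree_edges = minimum_spanning_tree(centroids)
```

Passing `metric=cosine` instead would connect nodes by the angle between their location vectors rather than by straight-line distance, which is one reason the choice of metric can change the shape of the tree.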
- subset#
An expression that specifies the subset of the statistic to plot. Passed unmodified to `pandas.DataFrame.query`.
- Type:
Note
`MSTView` is not a subclass of `BaseView` or any of its descendants. It implements the `IView` interface but does not use `seaborn.FacetGrid` for laying out its plots.
Examples
Make a little data set.
>>> import cytoflow as flow
>>> import_op = flow.ImportOp()
>>> import_op.tubes = [flow.Tube(file = "Plate01/RFP_Well_A3.fcs",
...                              conditions = {'Dox' : 10.0}),
...                    flow.Tube(file = "Plate01/CFP_Well_A4.fcs",
...                              conditions = {'Dox' : 1.0})]
>>> import_op.conditions = {'Dox' : 'float'}
>>> ex = import_op.apply()
Compute some KMeans clusters
>>> km = flow.KMeansOp(name = "KMeans",
...                    channels = ["V2-A", "Y2-A", "B1-A"],
...                    scale = {"V2-A" : "logicle",
...                             "Y2-A" : "logicle",
...                             "B1-A" : "logicle"},
...                    num_clusters = 20)
>>> km.estimate(ex)
>>> ex2 = km.apply(ex)
Add a statistic
>>> ex3 = flow.ChannelStatisticOp(name = "ByDox",
...                               channel = "Y2-A",
...                               by = ["KMeans", "Dox"],
...                               function = len).apply(ex2)
Plot the minimum spanning tree
>>> flow.MSTView(statistic = "ByDox",
...              locations = "KMeans",
...              locations_features = ["V2-A", "Y2-A", "B1-A"],
...              feature = "Y2-A",
...              variable = "Dox",
...              style = "pie").plot(ex3)
- plot(experiment, plot_name=None, **kwargs)[source]#
Plot a chart of a variable’s values against a statistic.
- Parameters:
experiment (Experiment) – The `Experiment` to plot using this view.
plot_name (str) – If this `IView` can make multiple plots, `plot_name` is the name of the plot to make. Must be one of the values retrieved from `enum_plots`.
title (str) – Set the plot title
legend (bool) – Plot a legend or color bar? Defaults to ``True``.
legendlabel (str) – Set the label for the color bar or legend
palette (palette name) – Colors to use for the different levels of the hue variable. Should be something that can be interpreted by `seaborn.color_palette`. If plotting a heat map, this should be a continuous color map (‘viridis’ is the default.) Otherwise, choose either a discrete color map (‘deep’ is the default) or a continuous color map from which equi-spaced colors will be drawn.
radius (float) – The radius of the circle or pie plots, on a scale from 0 to 1.
All other parameters are passed to the `matplotlib.patches.Circle` or
`matplotlib.patches.Wedge` constructors (i.e., they should be patch attributes).