Writing new cytoflow modules
Creating a new module in cytoflow ranges from easy (for simple things) to quite involved. I like to think that cytoflow follows the Perl philosophy of making the easy jobs easy and the hard jobs possible.
With that in mind, let’s look at the process of creating a new module, progressing from easy to involved.
Basics
All the APIs (both public and internal) are built using Traits. For operations and views in the cytoflow package, basic working knowledge of traits is sufficient.
For GUI work, trait notification is used extensively.
The GUI wrappers also use TraitsUI because it makes wrapping traits with UI elements easy. Have a look at the TraitsUI documentation for views, handlers, and of course the trait editors.
Finally, there are some principles that I expect new modules contributed to this codebase to follow:
- Check for pathological errors and fail early. I really dislike the tendency of a number of libraries to fail with cryptic errors. (I’m looking at you, pandas.) Check for obvious errors and raise a CytoflowOpError or CytoflowViewError. If the problem is non-fatal, warn with CytoflowOpWarning or CytoflowViewWarning. The GUI will also know how to handle these gracefully.
- Separate experimental data from module state. There are workflows that require estimating parameters with one data set, then applying those operations to another. Make sure your module supports them.
- Estimate slow but apply fast. The GUI re-runs modules’ apply() methods automatically when parameters change. That means that the apply() method must run very quickly.
- Write tests. I hate writing unit tests, but they are indispensable for catching bugs. Even if a view’s tests are just smoke tests (“It plots something and doesn’t crash”), that’s better than nothing.
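As a rough illustration of the first principle, the sketch below validates its parameters before doing any work. The exception and warning classes are local stand-ins named after the cytoflow ones (the real classes live in the cytoflow package and may take different arguments), and check_threshold_params() is a hypothetical helper, not part of the library.

```python
import warnings


# Local stand-ins so the sketch runs on its own; a real module would use the
# classes that cytoflow provides instead of defining these.
class CytoflowOpError(Exception):
    pass


class CytoflowOpWarning(UserWarning):
    pass


def check_threshold_params(channels, channel, threshold):
    """Hypothetical helper: validate parameters before touching any data."""
    if not channel:
        raise CytoflowOpError("Must specify a channel")

    if channel not in channels:
        raise CytoflowOpError(
            "Channel '{}' isn't in the experiment (have: {})"
            .format(channel, ", ".join(channels)))

    if threshold is None:
        raise CytoflowOpError("Must specify a threshold")

    # Suspicious but not fatal: warn and carry on.
    if threshold < 0:
        warnings.warn("Threshold {} is negative; is that intended?"
                      .format(threshold),
                      CytoflowOpWarning)
```

The point is that each failure names the offending parameter, so the user never has to decode an error raised several layers down in a dependency.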
New operations
The base operation API is fairly simple:
- id - a required traits.Constant containing the UID of the operation
- friendly_id - a required traits.Constant containing a human-readable name
- apply() - takes an Experiment and returns a new Experiment with the operation applied. apply() should clone() the old experiment, then modify and return the clone. Don’t forget to add the operation to the new Experiment’s history. A good example of a simple operation is RatioOp.

  Note
  Be aware of the deep parameter for clone()! It defaults to True – only set it to False if you are only adding columns to the Experiment.

  Note
  The resulting Experiment must have a pandas.RangeIndex for its index – several modules rely on this! If you add or remove events from the Experiment, make sure you call pandas.DataFrame.reset_index on Experiment.data to make the index monotonic again.
- estimate() - You may also wish to estimate the operation’s parameters from a data set. Crucially, this might not be the data set you are eventually applying the operation to. If your operation relies on estimating parameters, implement the estimate() function. This may involve selecting a subset of the data in the Experiment, or it may involve loading in an additional FCS file. A good example of the former is KMeansOp; a good example of the latter is AutofluorescenceOp.

  You may also find that you wish to estimate different parameter sets for different sub-populations (as encoded in the Experiment’s conditions). By convention, the conditions that you want to estimate different parameters for are passed using a trait named by, which takes a list of conditions and groups the data by unique combinations of those conditions’ values before estimating a parameter set for each. Look at KMeansOp for an example of this behavior.
- default_view() - for some operations, you may want to provide a default view. This view may just be a base view parameterized in a particular way (like the HistogramView that is the default view of BinningOp), or it may be a visualization of the parameters estimated by the estimate() function (like the default view of AutofluorescenceOp). In many cases, the view returned by this function is linked back to the operation that produced it.
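To make the estimate-then-apply pattern concrete, here is a minimal sketch that works on bare pandas DataFrames rather than real Experiment objects. Everything in it is an assumption for illustration: CenterOp is a hypothetical operation, it is a plain Python class rather than a traits.HasTraits subclass, and it skips clone() and the history bookkeeping. It shows parameters being estimated per group (using by) from one data set, applied to a different one, and the index being reset after events are dropped.

```python
import pandas as pd


class CenterOp:
    """Hypothetical operation: subtract a per-group median from a channel."""

    def __init__(self, name, channel, by=None):
        self.name = name          # name of the column that apply() adds
        self.channel = channel    # channel to center
        self.by = by or []        # conditions to group by when estimating
        self._medians = {}        # module state: group key -> median

    def estimate(self, data):
        """Slow step: learn one parameter per unique combination of `by`."""
        if self.channel not in data.columns:
            raise ValueError("Channel {} not in data".format(self.channel))
        if self.by:
            grouped = data.groupby(self.by)[self.channel].median()
            self._medians = {k if isinstance(k, tuple) else (k,): v
                             for k, v in grouped.items()}
        else:
            self._medians = {(): data[self.channel].median()}

    def apply(self, data):
        """Fast step: look up stored parameters; never re-estimate here."""
        if not self._medians:
            raise ValueError("Call estimate() before apply()")
        new_data = data.copy()    # leave the caller's data untouched
        if self.by:
            keys = list(zip(*(new_data[c] for c in self.by)))
        else:
            keys = [()] * len(new_data)
        offsets = pd.Series([self._medians.get(k) for k in keys],
                            index=new_data.index, dtype=float)
        new_data[self.name] = new_data[self.channel] - offsets
        # Groups never seen by estimate() have no parameters: drop those
        # events, then restore a clean RangeIndex.
        new_data = new_data.dropna(subset=[self.name])
        return new_data.reset_index(drop=True)
```

In use, estimate() would be called once on a control data set and apply() repeatedly on samples. Because apply() only does dictionary lookups and a subtraction, it stays cheap when it is re-run.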
New views
The base view API is very simple:
- id - a required traits.Constant containing the UID of the view
- friendly_id - a required traits.Constant containing a human-readable name
- plot() - plots an Experiment.
As I wrote more views, however, I noticed a significant amount of code
duplication, which led to bugs and lost time. So, I refactored the view code
to use a short hierarchy of classes for particular types of views. You can
take advantage of this functionality when writing a new module, or you can
simply derive your new view from traits.HasTraits
and implement the
simple API above.
The view base classes are:
- BaseView – implements a view with row, column and hue facets. After setting up the facet grid, it calls the derived class’s _grid_plot() to actually do the plotting. plot() also has parameters to set the plot style, legend, axis labels, etc.
- BaseDataView – implements a view that plots an Experiment’s data (as opposed to a statistic). Includes functionality for subsetting the data before plotting, and determining axis limits and scales.
- Base1DView – implements a 1-dimensional data view. See HistogramView for an example.
- Base2DView – implements a 2-dimensional data view. See ScatterplotView for an example.
- BaseNDView – implements an N-dimensional data view. See RadvizView for an example.
- BaseStatisticsView – implements a view that plots a statistic from an Experiment (as opposed to the underlying data). These views have a “primary” variable, and can be subset as well.
- Base1DStatisticsView – implements a view that plots one dimension of a statistic. See BarChartView for an example.
- Base2DStatisticsView – implements a view that plots two dimensions of a statistic. See Stats2DView for an example.
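The division of labour in this hierarchy is essentially the template-method pattern: the base class owns the facet loop and the derived class only draws one panel. The sketch below imitates that with plain Python and no plotting library. SketchBaseView and SketchHistogramView are hypothetical stand-ins, and the arguments passed to _grid_plot() here are assumptions, since the real signature is not described in this document.

```python
from collections import defaultdict


class SketchBaseView:
    """Stand-in base class: splits events into facets, then delegates."""

    def __init__(self, facet=None):
        self.facet = facet   # name of the condition to facet by, or None

    def plot(self, events):
        """Group events by facet value and call _grid_plot() once per panel."""
        panels = defaultdict(list)
        for event in events:
            key = event[self.facet] if self.facet else "all"
            panels[key].append(event)
        return {key: self._grid_plot(subset)
                for key, subset in sorted(panels.items())}

    def _grid_plot(self, events):
        raise NotImplementedError("Derived views must implement _grid_plot()")


class SketchHistogramView(SketchBaseView):
    """Stand-in 1D view: 'draws' a panel by counting events per bin."""

    def __init__(self, channel, bin_width, facet=None):
        super().__init__(facet)
        self.channel = channel
        self.bin_width = bin_width

    def _grid_plot(self, events):
        counts = defaultdict(int)
        for event in events:
            counts[int(event[self.channel] // self.bin_width)] += 1
        return dict(counts)
```

A derived view written this way never touches the faceting logic, which is the duplication the base classes were introduced to remove.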
New GUI operations
Wrapping an operation for the GUI sometimes feels like it requires more work than writing the operation in the first place. A new operation requires at least five things:
- A class derived from the underlying cytoflow operation. The derived operation should be placed in a module in cytoflowgui.workflow.operations, and it should:
  - Inherit from WorkflowOperation to add support for various GUI event-handling bits (as well as the underlying cytoflow class, if appropriate)
  - Override attributes in the underlying cytoflow class to add metadata that tells the GUI how to react to changes. (See the IWorkflowOperation docstring for details.)
  - Provide an implementation of get_notebook_code(), to support exporting to Jupyter notebook.
  - If the module has an estimate() method, then implement clear_estimate() to clear those parameters.
  - If the module has a default_view() method, it should be overridden to return a GUI-enabled view class (see below).
  - Optionally, override should_apply() and should_clear_estimate() to only do expensive operations when necessary.
- Serialization logic. cytoflow uses camel for sane YAML serialization; a dumper and loader for the class must save and load the operation’s parameters. These should also go in cytoflowgui.workflow.operations.
- A handler class that defines the default traits.View and provides supporting logic. This class should be derived from OpHandler and should be placed in cytoflowgui.op_plugins.
- A plugin class derived from envisage.plugin.Plugin and implementing IOperationPlugin. It should also derive from cytoflowgui.op_plugins.op_plugin_base.PluginHelpMixin, which adds support for online help.
- Tests. Because of cytoflowgui’s split between processes, testing GUI logic for modules can be kind of a synchronization nightmare. This is by design – because the same synchronization issues are present when running the software. See the cytoflowgui/tests directory for (many) examples.
- (Optionally) default view implementations. If the operation has a default view, you should wrap it as well (in the operation plugin module). See the next section for details.
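As one illustration of the get_notebook_code() requirement, the sketch below builds the snippet of Python that a notebook export might contain. It is only a guess at the shape of such a method: the idx argument, the op_N / ex_N variable naming, and the SketchThresholdOp class are all assumptions made for this example, so check the IWorkflowOperation docstring for the real contract.

```python
from textwrap import dedent


class SketchThresholdOp:
    """Hypothetical stand-in for an operation with three parameters."""

    def __init__(self, name, channel, threshold):
        self.name = name
        self.channel = channel
        self.threshold = threshold

    def __repr__(self):
        # A repr that can be pasted back into Python to rebuild the operation.
        return "SketchThresholdOp(name={!r}, channel={!r}, threshold={!r})".format(
            self.name, self.channel, self.threshold)

    def get_notebook_code(self, idx):
        """Return source code that recreates and applies this operation.

        `idx` is assumed to be this operation's position in the workflow, so
        the previous step's experiment is taken to be named ex_<idx - 1>.
        """
        return dedent("""\
            op_{idx} = {repr}

            ex_{idx} = op_{idx}.apply(ex_{prev_idx})
            """).format(repr=repr(self), idx=idx, prev_idx=idx - 1)
```

Generating the code from repr() keeps the exported notebook in step with whatever parameters the user set in the GUI, provided the repr round-trips.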
New GUI views
A new view requires at least five things:
- A class derived from the underlying cytoflow view. The derived view should be placed in cytoflowgui.workflow.views, and it should:
  - Inherit from WorkflowView or one of its children to add support for various GUI event-handling bits
  - Override attributes in the underlying cytoflow class to add metadata that tells the GUI how to react to changes. (See the IWorkflowView docstring for details.)
  - Provide an implementation of get_notebook_code(), to support exporting to Jupyter notebook.
  - Optionally, override should_plot() to only plot when necessary.
- Serialization logic. cytoflow uses camel for sane YAML serialization; a dumper and loader for the class must save and load the view’s parameters. These should also go in cytoflowgui.workflow.views.
- A handler class that defines the default traits.View and provides supporting logic. This class should be derived from ViewHandler and should be placed in cytoflowgui.view_plugins.
- A plugin class derived from envisage.plugin.Plugin and implementing IViewPlugin. It should also derive from cytoflowgui.view_plugins.view_plugin_base.PluginHelpMixin, which adds support for online help.
- Plot parameters. The parameters to a view’s plot() method are stored in an object that derives from BasePlotParams or one of its descendants. Choose data types that are appropriate for the view, and include a default view named view_params_view in the handler class. Don’t forget to write serialization code for it as well!
- Tests. Because of cytoflowgui’s split between processes, testing GUI logic for modules can be kind of a synchronization nightmare. This is by design – because the same synchronization issues are present when running the software. See the cytoflowgui/tests directory for (many) examples. In the case of a view, most of these are “smoke tests”, testing that the view doesn’t crash with various sets of parameters.
Note
Why the split between the classes in cytoflowgui.op_plugins, cytoflowgui.workflow.operations, cytoflowgui.view_plugins, and cytoflowgui.workflow.views? It’s because cytoflow runs in two processes: one handles the GUI and the other operates on the workflow. If you load a module containing UI bits, even if you don’t explicitly create a QGuiApplication, it starts an event loop. That’s why older versions of Cytoflow had two icons in the task bar when running on a Mac. You know how sometimes you go to fix a “little” bug and end up re-writing the whole program? This was one of those times….