Writing new cytoflow modules¶
Creating a new module in cytoflow ranges from easy (for simple things) to quite involved. I like to think that cytoflow follows the Perl philosophy of making the easy jobs easy and the hard jobs possible. With that in mind, let’s look at the process of creating a new module, progressing from easy to involved.
Basics¶
All the APIs (both public and internal) are built using Traits. For operations and views in the cytoflow package, basic working knowledge of traits is sufficient.
For GUI work, trait notification is used extensively.
The GUI wrappers also use TraitsUI because it makes wrapping traits with UI elements easy. Have a look at views, handlers, and of course the trait editors.
Finally, there are some principles that I expect new modules contributed to this codebase to follow:
- Check for pathological errors and fail early. I really dislike the tendency of a number of libraries to fail with cryptic errors. (I’m looking at you, pandas.) Check for obvious errors and raise a CytoflowOpError or CytoflowViewError. If the problem is non-fatal, warn with CytoflowOpWarning or CytoflowViewWarning. The GUI will also know how to handle these gracefully.
- Separate experimental data from module state. There are workflows that require estimating parameters with one data set, then applying those operations to another. Make sure your module supports them.
- Estimate slow but apply fast. The GUI re-runs modules’ apply() methods automatically when parameters change. That means that the apply() method must run very quickly.
- Write tests. I hate writing unit tests, but they are indispensable for catching bugs. Even if a view’s tests are just smoke tests (“It plots something and doesn’t crash”), that’s better than nothing.
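The first three principles can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: ThresholdSketch is a hypothetical module, events are plain dictionaries rather than an Experiment, and CytoflowOpError and CytoflowOpWarning are redefined locally as stand-ins so that the example runs without cytoflow installed.

```python
import warnings


class CytoflowOpError(RuntimeError):
    """Local stand-in so the sketch runs without cytoflow installed."""


class CytoflowOpWarning(UserWarning):
    """Local stand-in for the matching warning class."""


class ThresholdSketch:
    """Hypothetical module: flags events above a threshold learned from data."""

    def __init__(self, channel):
        self.channel = channel
        self.threshold = None  # module state; holds no experimental data

    def _check_channel(self, events):
        # Fail early with a readable message instead of a bare KeyError.
        bad = sum(1 for event in events if self.channel not in event)
        if bad:
            raise CytoflowOpError(
                f"Channel {self.channel!r} is missing from {bad} event(s)")

    def estimate(self, events):
        """Slow step: learn the threshold (the upper median) from one data set."""
        if not events:
            raise CytoflowOpError("No events to estimate the threshold from")
        self._check_channel(events)
        values = sorted(event[self.channel] for event in events)
        self.threshold = values[len(values) // 2]

    def apply(self, events):
        """Fast step: reuse the stored threshold on any data set."""
        if self.threshold is None:
            raise CytoflowOpError("Call estimate() before apply()")
        if not events:
            warnings.warn("apply() received no events", CytoflowOpWarning)
        self._check_channel(events)
        return [dict(event, above=event[self.channel] > self.threshold)
                for event in events]


controls = [{"FITC": 1.0}, {"FITC": 3.0}, {"FITC": 5.0}]
samples = [{"FITC": 2.0}, {"FITC": 4.0}]

op = ThresholdSketch("FITC")
op.estimate(controls)        # parameters come from the controls...
flagged = op.apply(samples)  # ...and are applied to a different data set
```

Because the threshold lives on the module and the events are passed in, the same estimated parameters can be reused on any number of data sets, and apply() does no work beyond a comparison per event.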
New operations¶
The base operation API is fairly simple:
- id - a required traits.Constant containing the UID of the operation
- friendly_id - a required traits.Constant containing a human-readable name
- apply() - takes an Experiment and returns a new Experiment with the operation applied. apply() should clone() the old experiment, then modify and return the clone. Don’t forget to add the operation to the new Experiment’s history. A good example of a simple operation is RatioOp.
- estimate() - You may also wish to estimate the operation’s parameters from a data set. Crucially, this may not be the data set you are eventually applying the operation to. If your operation relies on estimating parameters, implement the estimate() function. This may involve selecting a subset of the data in the Experiment, or it may involve loading in an additional FCS file. A good example of the former is KMeansOp; a good example of the latter is AutofluorescenceOp.

  You may also find that you wish to estimate different parameter sets for different sub-populations (as encoded in the Experiment’s conditions.) By convention, the conditions that you want to estimate different parameters for are passed using a trait named by, which takes a list of conditions and groups the data by unique combinations of those conditions’ values before estimating a parameter set for each. Look at KMeansOp for an example of this behavior.
- default_view() - for some operations, you may want to provide a default view. This view may just be a base view parameterized in a particular way (like the HistogramView that is the default view of BinningOp), or it may be a visualization of the parameters estimated by the estimate() function (like the default view of AutofluorescenceOp.) In many cases, the view returned by this function is linked back to the operation that produced it.
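The shape of the operation API above can be sketched without any dependencies. This is an illustration under stated assumptions, not cytoflow code: ExperimentSketch is a hypothetical stand-in for Experiment (a list of per-event dictionaries plus a history), CenterOpSketch is a made-up operation, and plain class attributes take the place of traits.Constant.

```python
import copy
from collections import defaultdict


class ExperimentSketch:
    """Hypothetical stand-in for Experiment: event rows plus a history."""

    def __init__(self, rows):
        self.rows = rows    # list of dicts, one per event
        self.history = []   # operations applied so far

    def clone(self):
        return copy.deepcopy(self)


class CenterOpSketch:
    """Hypothetical operation: subtracts a per-group mean from one channel."""

    id = "example.sketch.center"     # UID of the operation
    friendly_id = "Center (sketch)"  # human-readable name

    def __init__(self, channel, by=()):
        self.channel = channel
        self.by = list(by)  # conditions to group by before estimating
        self.means = {}     # one estimated parameter set per group

    def _group_key(self, row):
        return tuple(row[condition] for condition in self.by)

    def estimate(self, experiment):
        """Estimate one mean per unique combination of the `by` conditions."""
        groups = defaultdict(list)
        for row in experiment.rows:
            groups[self._group_key(row)].append(row[self.channel])
        self.means = {key: sum(vals) / len(vals) for key, vals in groups.items()}

    def apply(self, experiment):
        """Return a modified clone; the input experiment is left untouched."""
        if not self.means:
            raise RuntimeError("Call estimate() before apply()")
        new_experiment = experiment.clone()
        for row in new_experiment.rows:
            key = self._group_key(row)
            if key not in self.means:
                raise RuntimeError(f"No parameters were estimated for group {key}")
            row[self.channel] -= self.means[key]
        new_experiment.history.append(self)  # record the operation
        return new_experiment


controls = ExperimentSketch([
    {"Dox": 0, "FITC": 1.0}, {"Dox": 0, "FITC": 3.0},
    {"Dox": 1, "FITC": 10.0}, {"Dox": 1, "FITC": 14.0},
])
samples = ExperimentSketch([{"Dox": 0, "FITC": 5.0}, {"Dox": 1, "FITC": 15.0}])

op = CenterOpSketch("FITC", by=["Dox"])
op.estimate(controls)       # parameters estimated on one data set
result = op.apply(samples)  # and applied to another
```

Here the parameters are estimated separately for each value of the hypothetical Dox condition, and apply() works on a clone so that the caller’s experiment is never modified in place.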
New views¶
The base view API is very simple:
- id - a required traits.Constant containing the UID of the view
- friendly_id - a required traits.Constant containing a human-readable name
- plot() - plots an Experiment.
As I wrote more views, however, I noticed a significant amount of code duplication, which led to bugs and lost time. So, I refactored the view code to use a short hierarchy of classes for particular types of views. You can take advantage of this functionality when writing a new module, or you can simply derive your new view from traits.HasTraits and implement the simple API above.
The view base classes are:
- BaseView – implements a view with row, column and hue facets. After setting up the facet grid, it calls the derived class’s _grid_plot() to actually do the plotting. plot() also has parameters to set the plot style, legend, axis labels, etc.
- BaseDataView – implements a view that plots an Experiment’s data (as opposed to a statistic.) Includes functionality for subsetting the data before plotting, and determining axis limits and scales.
- Base1DView – implements a 1-dimensional data view. See HistogramView for an example.
- Base2DView – implements a 2-dimensional data view. See ScatterplotView for an example.
- BaseNDView – implements an N-dimensional data view. See RadvizView for an example.
- BaseStatisticsView – implements a view that plots a statistic from an Experiment (as opposed to the underlying data.) These views have a “primary” variable, and can be subset as well.
- Base1DStatisticsView – implements a view that plots one dimension of a statistic. See BarChartView for an example.
- Base2DStatisticsView – implements a view that plots two dimensions of a statistic. See Stats2DView for an example.
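The division of labor between plot() and _grid_plot() described for BaseView is a template-method pattern, which can be sketched roughly as follows. The classes are hypothetical stand-ins, not cytoflow’s: the grid is a dictionary keyed by (row, column) facet values, the hue facet is left out for brevity, and “plotting” a cell just returns a text bar so the example needs no plotting library.

```python
from collections import defaultdict


class BaseViewSketch:
    """Hypothetical faceting base class (a stand-in, not cytoflow's BaseView)."""

    def __init__(self, xfacet=None, yfacet=None):
        self.xfacet = xfacet  # condition that selects the grid column
        self.yfacet = yfacet  # condition that selects the grid row

    def plot(self, rows):
        """Split events into grid cells, then let the subclass draw each cell."""
        cells = defaultdict(list)
        for row in rows:
            # An unset facet yields None, so all events share that row or column.
            key = (row.get(self.yfacet), row.get(self.xfacet))
            cells[key].append(row)
        return {key: self._grid_plot(cell) for key, cell in cells.items()}

    def _grid_plot(self, cell):
        raise NotImplementedError("Derived views implement _grid_plot()")


class CountViewSketch(BaseViewSketch):
    """Hypothetical derived view: draws one '#' per event in the cell."""

    id = "example.sketch.count"
    friendly_id = "Count (sketch)"

    def _grid_plot(self, cell):
        return "#" * len(cell)


events = [
    {"Dox": 0, "Well": "A"},
    {"Dox": 0, "Well": "A"},
    {"Dox": 1, "Well": "A"},
]
panels = CountViewSketch(xfacet="Dox", yfacet="Well").plot(events)
```

The derived class only supplies the per-cell drawing; all of the faceting logic stays in one place in the base class, which is the kind of duplication the hierarchy is meant to remove.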
New GUI operations¶
Wrapping an operation for the GUI sometimes feels like it requires more work than writing the operation in the first place. A new operation requires at least five things:
- A plugin class implementing IOperationPlugin. It should also derive from PluginHelpMixin, which adds support for online help.
- A class derived from the underlying cytoflow operation. The derived operation should:
  - Inherit from PluginOpMixin to add support for various GUI event-handling bits.
  - Override attributes in the underlying cytoflow class to add metadata that tells the GUI how to react to changes. (See the PluginOpMixin docstring for details.)
  - Override the handler_factory attribute to be a callable that returns an OpHandlerMixin instance.
  - Provide an implementation of get_notebook_code(), to support exporting to Jupyter notebook.
  - If the module has an estimate() method, then implement clear_estimate() to clear those parameters.
  - If the module has a default_view() method, it should be overridden to return a GUI-enabled view class (see below.)
  - Optionally, override should_apply() and should_clear_estimate() to only do expensive operations when necessary.
- A handler class that defines the default traits.View and provides supporting logic. This class should be derived from OpHandlerMixin and traits.Controller.
- Serialization logic. cytoflow uses camel for sane YAML serialization; a dumper and loader for the class must save and load the operation’s parameters.
- Tests. Because of cytoflowgui’s split between processes, testing GUI logic for modules can be kind of a synchronization nightmare. This is by design – because the same synchronization issues are present when running the software. See the cytoflowgui/tests directory for (many) examples.
- (Optionally) default view implementations. If the operation has a default view, you should wrap it as well (in the operation plugin module.) See the next section for details.
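The dumper and loader requirement in the serialization item comes down to a round trip: reduce the operation to plain parameters, then rebuild an equivalent operation from them. The sketch below shows only that idea. It deliberately does not use camel’s API, and it substitutes the standard library’s json module for YAML so that it runs on its own; ThresholdOpSketch and both function names are hypothetical.

```python
import json


class ThresholdOpSketch:
    """Hypothetical operation with three user-set parameters."""

    def __init__(self, name="", channel="", threshold=0.0):
        self.name = name
        self.channel = channel
        self.threshold = threshold


def dump_threshold_op(op):
    """Dumper: reduce the operation to plain, serializable parameters."""
    return {"name": op.name, "channel": op.channel, "threshold": op.threshold}


def load_threshold_op(data):
    """Loader: rebuild the operation from the saved parameters."""
    return ThresholdOpSketch(**data)


original = ThresholdOpSketch(name="Positive", channel="FITC", threshold=1000.0)
saved = json.dumps(dump_threshold_op(original))  # JSON stands in for YAML here
restored = load_threshold_op(json.loads(saved))
```

Only the parameters a user sets are written out; anything derived from data would be recomputed after loading rather than stored.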
New GUI views¶
A new view requires at least five things:
- A plugin class implementing IViewPlugin. It should also derive from PluginHelpMixin, which adds support for online help.
- A class derived from the underlying cytoflow view. The derived view should:
  - Inherit from PluginViewMixin to add support for various GUI event-handling bits.
  - Override attributes in the underlying cytoflow class to add metadata that tells the GUI how to react to changes. (See the PluginViewMixin docstring for details.)
  - Override the handler_factory attribute to be a callable that returns a ViewHandlerMixin instance.
  - Provide an implementation of get_notebook_code(), to support exporting to Jupyter notebook.
  - Override the plot_params attribute with an instance of an object containing plot parameters (see below).
  - Optionally, override should_plot() to only plot when necessary.
  - Optionally, override plot_wi() to change whether plot() is called on the current WorkflowItem’s result or the previous one’s.
- A handler class that defines the default traits.View and provides supporting logic. This class should be derived from ViewHandlerMixin and traits.Controller.
- Serialization logic. cytoflow uses camel for sane YAML serialization; a dumper and loader for the class must save and load the view’s parameters.
- Plot parameters. The parameters to a view’s plot() method are stored in an object that derives from BasePlotParams or one of its descendants. Choose data types that are appropriate for the view, and include a default view. Set it as the class type for the view’s plot_params attribute. Don’t forget to write serialization code for it as well!
- Tests. Because of cytoflowgui’s split between processes, testing GUI logic for modules can be kind of a synchronization nightmare. This is by design – because the same synchronization issues are present when running the software. See the cytoflowgui/tests directory for (many) examples. In the case of a view, most of these are “smoke tests”, testing that the view doesn’t crash with various sets of parameters.
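A smoke test of the kind mentioned in the last item can be as small as the sketch below. It covers only the “doesn’t crash with various sets of parameters” part and none of the cross-process synchronization that the real tests in cytoflowgui/tests deal with. HistogramViewSketch and its parameters are hypothetical, and its plot() returns bin counts instead of drawing anything.

```python
import unittest


class HistogramViewSketch:
    """Hypothetical view whose plot() bins values and returns the bin heights."""

    def plot(self, values, num_bins=10, density=False):
        if num_bins < 1:
            raise ValueError("num_bins must be at least 1")
        lo, hi = min(values), max(values)
        width = (hi - lo) / num_bins or 1.0  # avoid zero width for equal values
        counts = [0] * num_bins
        for value in values:
            # Clamp so the maximum value lands in the last bin.
            counts[min(int((value - lo) / width), num_bins - 1)] += 1
        if density:
            return [count / len(values) for count in counts]
        return counts


class TestHistogramViewSketch(unittest.TestCase):
    def setUp(self):
        self.values = [0.5, 1.5, 1.7, 3.2, 9.9]
        self.view = HistogramViewSketch()

    def test_plot_smoke(self):
        """plot() should not raise for any of these parameter sets."""
        for params in ({}, {"num_bins": 1}, {"num_bins": 50}, {"density": True}):
            with self.subTest(**params):
                self.view.plot(self.values, **params)


# Run the suite programmatically so the sketch works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestHistogramViewSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Using subTest means that a crash with one parameter set is reported on its own while the remaining sets still run.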