Writing new cytoflow modules
Creating a new module in cytoflow ranges from easy (for simple things)
to quite involved. I like to think that cytoflow follows the Perl
philosophy of making the easy jobs easy and the hard jobs possible.
With that in mind, let’s look at the process of creating a new module, progressing from easy to involved.
Basics
All the APIs (both public and internal) are built using
Traits. For operations and views in
the cytoflow package, basic working knowledge of traits is sufficient.
For GUI work, trait notification is used extensively.
The GUI wrappers also use TraitsUI because it makes wrapping traits with UI elements easy. Have a look at the documentation for views, handlers, and of course the trait editors.
Finally, there are some principles that I expect new modules contributed to this codebase to follow:
Check for pathological errors and fail early. I really dislike the tendency of a number of libraries to fail with cryptic errors. (I’m looking at you, pandas.) Check for obvious errors and raise a CytoflowOpError or CytoflowViewError. If the problem is non-fatal, warn with CytoflowOpWarning or CytoflowViewWarning. The GUI will also know how to handle these gracefully.
Separate experimental data from module state. There are workflows that require estimating parameters with one data set, then applying those operations to another. Make sure your module supports them.
Estimate slow but apply fast. The GUI re-runs modules’ apply() methods automatically when parameters change. That means that the apply() method must run very quickly.
Write tests. I hate writing unit tests, but they are indispensable for catching bugs. Even if a view’s tests are just smoke tests (“It plots something and doesn’t crash”), that’s better than nothing.
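The first principle can be sketched in a few lines. This is only an illustration: OpError and OpWarning are hypothetical stand-ins for CytoflowOpError and CytoflowOpWarning (whose constructors may take different arguments), and check_threshold_params is a made-up helper, not part of cytoflow.

```python
import warnings


class OpError(RuntimeError):
    """Hypothetical stand-in for cytoflow's CytoflowOpError."""


class OpWarning(UserWarning):
    """Hypothetical stand-in for cytoflow's CytoflowOpWarning."""


def check_threshold_params(channels, channel, threshold):
    """Validate parameters up front, before any real work starts."""
    # Fatal problems: raise immediately, with a message naming the parameter.
    if not channel:
        raise OpError("'channel' must be set")
    if channel not in channels:
        raise OpError(f"channel {channel!r} is not in the experiment")

    # Non-fatal problems: warn and carry on.
    if threshold < 0:
        warnings.warn(f"threshold {threshold} is negative; is that intended?",
                      OpWarning)
```

Checking everything at the top of apply() or estimate() keeps the error message close to the parameter that caused it, rather than surfacing later as an unrelated exception from a numerical library.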
New operations
The base operation API is fairly simple:
id - a required traits.Constant containing the UID of the operation
friendly_id - a required traits.Constant containing a human-readable name
apply() - takes an Experiment and returns a new Experiment with the operation applied. apply() should clone() the old experiment, then modify and return the clone. Don’t forget to add the operation to the new Experiment’s history. A good example of a simple operation is RatioOp.
    Note
    Be aware of the deep parameter for clone()! It defaults to True – only set it to False if you are only adding columns to the Experiment.
    Note
    The resulting Experiment must have a pandas.RangeIndex for its index – several modules rely on this! If you add or remove events from the Experiment, make sure you call pandas.DataFrame.reset_index on Experiment.data to make the index monotonic again (pass drop=True so that the old index is not kept as a column).
estimate() - You may also wish to estimate the operation’s parameters from a data set. Crucially, this might not be the data set you are eventually applying the operation to. If your operation relies on estimating parameters, implement the estimate() function. This may involve selecting a subset of the data in the Experiment, or it may involve loading in an additional FCS file. A good example of the former is KMeansOp; a good example of the latter is AutofluorescenceOp.
    You may also find that you wish to estimate different parameter sets for different sub-populations (as encoded in the Experiment’s conditions.) By convention, the conditions that you want to estimate different parameters for are passed using a trait named by, which takes a list of conditions and groups the data by unique combinations of those conditions’ values before estimating a parameter set for each. Look at KMeansOp for an example of this behavior.
default_view() - for some operations, you may want to provide a default view. This view may just be a base view parameterized in a particular way (like the HistogramView that is the default view of BinningOp), or it may be a visualization of the parameters estimated by the estimate() function (like the default view of AutofluorescenceOp.) In many cases, the view returned by this function is linked back to the operation that produced it.
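The shape of the API above can be sketched without cytoflow installed. Everything here is an assumption made for illustration: Experiment is a hypothetical stand-in that stores one dict per event (the real class wraps a pandas.DataFrame), CenterOp is a made-up operation, and id and friendly_id are plain class attributes where real code would use traits.Constant.

```python
import copy
from statistics import median


class Experiment:
    """Hypothetical stand-in for cytoflow's Experiment: one dict per event."""

    def __init__(self, events):
        self.events = events
        self.history = []

    def clone(self):
        return copy.deepcopy(self)


class CenterOp:
    """Subtract a per-group median that was estimated on some data set."""

    id = "example.center"       # real code: traits.Constant
    friendly_id = "Center"      # real code: traits.Constant

    def __init__(self, name, channel, by=()):
        self.name = name        # name of the column that apply() adds
        self.channel = channel
        self.by = list(by)      # conditions to group by when estimating
        self._medians = {}      # module state: estimated parameters only

    def estimate(self, experiment):
        # Group events by unique combinations of the `by` conditions, then
        # estimate one parameter (a median) per group.
        groups = {}
        for event in experiment.events:
            key = tuple(event[c] for c in self.by)
            groups.setdefault(key, []).append(event[self.channel])
        self._medians = {key: median(vals) for key, vals in groups.items()}

    def apply(self, experiment):
        if not self._medians:
            raise RuntimeError("call estimate() before apply()")
        new_experiment = experiment.clone()      # never modify the input
        for event in new_experiment.events:
            key = tuple(event[c] for c in self.by)
            if key not in self._medians:
                raise RuntimeError(f"no estimate for group {key}")
            event[self.name] = event[self.channel] - self._medians[key]
        new_experiment.history.append(self)      # record the operation
        return new_experiment


# Estimate on one data set, apply to another.
controls = Experiment([{"Dox": 0, "FITC-A": 10.0}, {"Dox": 0, "FITC-A": 14.0},
                       {"Dox": 1, "FITC-A": 100.0}])
samples = Experiment([{"Dox": 0, "FITC-A": 20.0}, {"Dox": 1, "FITC-A": 150.0}])

op = CenterOp(name="FITC_centered", channel="FITC-A", by=["Dox"])
op.estimate(controls)
result = op.apply(samples)
```

The estimated medians live on the operation, not on either experiment, which is what lets the same op be estimated on controls and applied to samples.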
New views
The base view API is very simple:
id - a required traits.Constant containing the UID of the view
friendly_id - a required traits.Constant containing a human-readable name
plot() - plots an Experiment.
As I wrote more views, however, I noticed a significant amount of code
duplication, which led to bugs and lost time. So, I refactored the view code
to use a short hierarchy of classes for particular types of views. You can
take advantage of this functionality when writing a new module, or you can
simply derive your new view from traits.HasTraits and implement the
simple API above.
The view base classes are:
BaseView – implements a view with row, column and hue facets. After setting up the facet grid, it calls the derived class’s _grid_plot() to actually do the plotting. plot() also has parameters to set the plot style, legend, axis labels, etc.
    BaseDataView – implements a view that plots an Experiment’s data (as opposed to a statistic.) Includes functionality for subsetting the data before plotting, and determining axis limits and scales.
        Base1DView – implements a 1-dimensional data view. See HistogramView for an example.
        Base2DView – implements a 2-dimensional data view. See ScatterplotView for an example.
        BaseNDView – implements an N-dimensional data view. See RadvizView for an example.
    BaseStatisticsView – implements a view that plots a statistic from an Experiment (as opposed to the underlying data.) These views have a “primary” variable, and can be subset as well.
        Base1DStatisticsView – implements a view that plots one dimension of a statistic. See BarChartView for an example.
        Base2DStatisticsView – implements a view that plots two dimensions of a statistic. See Stats2DView for an example.
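The division of labour between plot() and _grid_plot() is a template-method pattern, which can be sketched as below. This is not the real BaseView: SketchBaseView and SketchHistogramView are hypothetical names, the sketch supports a single facet instead of row, column and hue, and it returns bin counts instead of drawing so that it runs without a plotting library.

```python
class SketchBaseView:
    """Hypothetical stand-in for a faceting base class."""

    def plot(self, events, facet=None, **kwargs):
        # Split the events into one group per facet value (a single group
        # if no facet is given), then let the subclass draw each group.
        groups = {}
        for event in events:
            key = event[facet] if facet else None
            groups.setdefault(key, []).append(event)
        return {key: self._grid_plot(group, **kwargs)
                for key, group in groups.items()}

    def _grid_plot(self, events, **kwargs):
        raise NotImplementedError("derived views implement _grid_plot()")


class SketchHistogramView(SketchBaseView):
    channel = "FITC-A"

    def _grid_plot(self, events, bin_width=10, **kwargs):
        # "Plot" one panel by counting events per bin; keys are left bin edges.
        counts = {}
        for event in events:
            left = int(event[self.channel] // bin_width) * bin_width
            counts[left] = counts.get(left, 0) + 1
        return counts


events = [{"Dox": 0, "FITC-A": 3.0}, {"Dox": 0, "FITC-A": 7.0},
          {"Dox": 1, "FITC-A": 25.0}]
panels = SketchHistogramView().plot(events, facet="Dox", bin_width=10)
# panels == {0: {0: 2}, 1: {20: 1}}
```

A derived view only has to say how one panel is drawn; the faceting logic is written once in the base class, which is the duplication the hierarchy was introduced to remove.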
New GUI operations
Wrapping an operation for the GUI sometimes feels like it requires more work than writing the operation in the first place. A new operation requires at least five things:
A class derived from the underlying cytoflow operation. The derived operation should be placed in a module in cytoflowgui.workflow.operations, and it should:
    Inherit from WorkflowOperation to add support for various GUI event-handling bits (as well as the underlying cytoflow class, if appropriate)
    Override attributes in the underlying cytoflow class to add metadata that tells the GUI how to react to changes. (See the IWorkflowOperation docstring for details.)
    Provide an implementation of get_notebook_code(), to support exporting to Jupyter notebook.
    If the module has an estimate() method, then implement clear_estimate() to clear those parameters.
    If the module has a default_view() method, it should be overridden to return a GUI-enabled view class (see below.)
    Optionally, override should_apply() and should_clear_estimate() to only do expensive operations when necessary.
Serialization logic. cytoflow uses camel for sane YAML serialization; a dumper and loader for the class must save and load the operation’s parameters. These should also go in cytoflowgui.workflow.operations.
A handler class that defines the default traits.View and provides supporting logic. This class should be derived from OpHandler and should be placed in cytoflowgui.op_plugins.
A plugin class derived from envisage.plugin.Plugin and implementing IOperationPlugin. It should also derive from cytoflowgui.op_plugins.op_plugin_base.PluginHelpMixin, which adds support for online help.
Tests. Because of cytoflowgui’s split between processes, testing GUI logic for modules can be kind of a synchronization nightmare. This is by design – because the same synchronization issues are present when running the software. See the cytoflowgui/tests directory for (many) examples.
(Optionally) default view implementations. If the operation has a default view, you should wrap it as well (in the operation plugin module.) See the next section for details.
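Two of the hooks listed above, clear_estimate() and get_notebook_code(), might look roughly like this. The sketch is deliberately detached from cytoflowgui: both classes are hypothetical stand-ins, the idx argument and the ex_N / op_N variable naming in the generated code are assumptions, and a real wrapper would also inherit from WorkflowOperation and declare its parameters as traits.

```python
class SketchThresholdOp:
    """Hypothetical stand-in for an underlying cytoflow operation."""

    def __init__(self, name="", channel="", threshold=None):
        self.name = name
        self.channel = channel
        self.threshold = threshold

    def estimate(self, values):
        # Pretend estimation: place the threshold at the mean of the values.
        self.threshold = sum(values) / len(values)


class SketchThresholdWorkflowOp(SketchThresholdOp):
    """GUI-side wrapper: adds the hooks the workflow needs."""

    def clear_estimate(self):
        # Reset only what estimate() filled in; keep user-set parameters.
        self.threshold = None

    def get_notebook_code(self, idx):
        # Emit Python source that rebuilds this operation and applies it
        # to the previous experiment in the exported notebook.
        return (
            f"op_{idx} = ThresholdOp(name={self.name!r}, "
            f"channel={self.channel!r}, threshold={self.threshold!r})\n"
            f"ex_{idx + 1} = op_{idx}.apply(ex_{idx})\n"
        )


op = SketchThresholdWorkflowOp(name="Pos", channel="Y2-A")
op.estimate([100.0, 300.0])
code = op.get_notebook_code(0)
op.clear_estimate()
```

Formatting each parameter with repr() keeps the generated source valid Python for strings and numbers alike; parameters with richer types would need their own formatting.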
New GUI views
A new view requires at least five things:
A class derived from the underlying cytoflow view. The derived view should be placed in cytoflowgui.workflow.views, and it should:
    Inherit from WorkflowView or one of its children to add support for various GUI event-handling bits
    Override attributes in the underlying cytoflow class to add metadata that tells the GUI how to react to changes. (See the IWorkflowView docstring for details.)
    Provide an implementation of get_notebook_code(), to support exporting to Jupyter notebook.
    Optionally, override should_plot() to only plot when necessary.
Serialization logic. cytoflow uses camel for sane YAML serialization; a dumper and loader for the class must save and load the view’s parameters. These should also go in cytoflowgui.workflow.views.
A handler class that defines the default traits.View and provides supporting logic. This class should be derived from ViewHandler and should be placed in cytoflowgui.view_plugins.
A plugin class derived from envisage.plugin.Plugin and implementing IViewPlugin. It should also derive from cytoflowgui.view_plugins.view_plugin_base.PluginHelpMixin, which adds support for online help.
Plot parameters. The parameters to a view’s plot() method are stored in an object that derives from BasePlotParams or one of its descendants. Choose data types that are appropriate for the view, and include a default view named view_params_view in the handler class. Don’t forget to write serialization code for it as well!
Tests. Because of cytoflowgui’s split between processes, testing GUI logic for modules can be kind of a synchronization nightmare. This is by design – because the same synchronization issues are present when running the software. See the cytoflowgui/tests directory for (many) examples. In the case of a view, most of these are “smoke tests”, testing that the view doesn’t crash with various sets of parameters.
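A smoke test of the kind described above can be as small as the following. SketchView is a hypothetical stand-in whose plot() returns a summary instead of drawing, and the parameter names are invented; the real tests in cytoflowgui/tests also have to deal with the two-process synchronization, which this sketch ignores entirely.

```python
import unittest


class SketchView:
    """Hypothetical stand-in for a view; plot() returns what it would draw."""

    def plot(self, values, scale="linear", bins=10):
        if scale not in ("linear", "log"):
            raise ValueError(f"unknown scale {scale!r}")
        if scale == "log":
            # A log scale cannot show non-positive events, so drop them.
            values = [v for v in values if v > 0]
        return {"n": len(values), "bins": bins, "scale": scale}


class TestSketchView(unittest.TestCase):
    def setUp(self):
        self.values = [-1.0, 0.5, 2.0, 30.0]

    def test_plot_does_not_crash(self):
        # One sub-test per parameter set, so one failure doesn't hide others.
        parameter_sets = ({}, {"scale": "log"}, {"bins": 50},
                          {"scale": "log", "bins": 1})
        for params in parameter_sets:
            with self.subTest(**params):
                SketchView().plot(self.values, **params)

    def test_bad_scale_fails_early(self):
        with self.assertRaises(ValueError):
            SketchView().plot(self.values, scale="bogus")
```

Even without checking what was drawn, running plot() across a spread of parameter combinations catches the most common regressions: a renamed parameter, a missing default, or a code path that is never exercised interactively.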
Note
Why the split between the classes in cytoflowgui.op_plugins,
cytoflowgui.workflow.operations, cytoflowgui.view_plugins,
and cytoflowgui.workflow.views? It’s because
cytoflow runs in two processes – one handles the GUI and the other
operates on the workflow. If you load a module containing UI bits, even
if you don’t explicitly create a QGuiApplication, it starts an
event loop. That’s why older versions of Cytoflow had two icons
in the task bar when running on a Mac. You know how sometimes you go
to fix a “little” bug and end up re-writing the whole program? This
was one of those times….