Hierarchical Gating

A common strategy for manual gating is hierarchical gating – a skilled cytometrist examines sets of one- or two-dimensional plots, one after another, to separate cells into “positive” and “negative” populations. While I like to think that modern cytometry has better, less biased tools to accomplish this task, it is still a necessary one in many contexts, and Cytoflow supports it.

This notebook demonstrates a hierarchical gating scheme from Saeys Y, Van Gassen S, Lambrecht BN. Computational flow cytometry: helping to make sense of high-dimensional immunology data. Nature Reviews Immunology 16:449-462 (2016). Our question here is a basic one – how many cells of each type are in each of the six tubes in the experiment? The data were downloaded from the Saeys lab GitHub and compensated using the bleedthrough matrix in the provided FlowJo workspace before being re-saved by Cytoflow – no other data preprocessing was applied.

We want to quantify NK, NK T, T, and B cells, as well as neutrophils, DCs, basophils, and macrophages. The markers (and the channels they were measured in) are in the table below (also from the Saeys lab GitHub). Per [https://flowrepository.org/experiments/833], these cells were splenocytes from wild-type C57Bl/6 mice.

Set up the notebook and import the data set

This notebook uses the interactive widgets provided by the ipympl package. If you see only blank spaces where you expect interactive plots, make sure you are using the Jupyter Lab interface instead of the Jupyter Notebook interface. For whatever reason, it seems to work more consistently.

# set up the notebook
%matplotlib widget

import cytoflow as flow

# if your figures are too big or too small, you can scale them by changing matplotlib's DPI
import matplotlib
matplotlib.rc('figure', dpi = 160)

In Cytoflow, we’d usually add additional metadata to each tube. Here, however, all we have is the tube number. We will still use the channels attribute of ImportOp to map channels to markers. (I don’t know which markers are on the AmCyan or Pacific Blue channels. And let’s be clear: I am not an immunologist, so I can’t claim to know much about any of these markers!)

import_op = flow.ImportOp(conditions = {"Tube" : "category"},
                          tubes = [flow.Tube(file='data/Saeys_11.fcs', conditions = {"Tube" : "11"}),
                                   flow.Tube(file='data/Saeys_12.fcs', conditions = {"Tube" : "12"}),
                                   flow.Tube(file='data/Saeys_13.fcs', conditions = {"Tube" : "13"}),
                                   flow.Tube(file='data/Saeys_28.fcs', conditions = {"Tube" : "28"}),
                                   flow.Tube(file='data/Saeys_30.fcs', conditions = {"Tube" : "30"}),
                                   flow.Tube(file='data/Saeys_31.fcs', conditions = {"Tube" : "31"})],
                          channels = {"FSC-A" : "FSC_A",
                                      "FSC-H" : "FSC_H",
                                      "APC-Cy7-A" : "Live_Dead",
                                      "AmCyan-A" : "AmCyan",
                                      "BV711-A" : "CD64",
                                      "PE-A" : "CD3",
                                      "PE-Cy5-A" : "CD19",
                                      "APC-A" : "CD161",
                                      "PE-Cy7-A" : "CD11c",
                                      "PerCP-Cy5-5-A" : "MHCII",
                                      "Alexa Fluor 700-A" : "Ly_6G",
                                      "BV605-A" : "CD11b",
                                      "BV786-A" : "FcERI",
                                      "Pacific Blue-A" : "Pacific_Blue"})

ex = import_op.apply()
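
As a rough illustration of what the channels mapping does, here is a minimal pure-Python sketch. It assumes that each key is a channel name as stored in the FCS file, each value is the name that channel gets in the experiment, and channels that are not listed are left out. The event values and the SSC-A channel are made up for illustration, and this is not how Cytoflow implements the import.

```python
# Subset of the mapping used above: FCS channel name -> name in the experiment.
channel_map = {"FSC-A": "FSC_A",
               "PE-A": "CD3",
               "PE-Cy5-A": "CD19"}

# One hypothetical event as it might appear in an FCS file.
# "SSC-A" is an invented extra channel that is NOT in the mapping.
fcs_event = {"FSC-A": 52000.0,
             "SSC-A": 31000.0,
             "PE-A": 1200.0,
             "PE-Cy5-A": 85.0}

# Keep only the mapped channels and rename them.
experiment_event = {new_name: fcs_event[old_name]
                    for old_name, new_name in channel_map.items()}
print(experiment_event)

# The new names are valid Python identifiers; the originals are not.
print("FSC-A".isidentifier(), "FSC_A".isidentifier())
```

Names such as FSC_A and Live_Dead are convenient because they can appear directly in expressions like the subset strings used later in this notebook.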

Gate single cells and live cells

First, gate on FSC_A and FSC_H to separate single cells from debris and clumps.

single_gate = flow.PolygonOp(name = "Single_Cell",
                             xchannel = "FSC_A",
                             ychannel = "FSC_H")

single_gate.default_view(density = True,
                         huescale = "log",
                         interactive = True).plot(ex, gridsize = 100)

ex_single = single_gate.apply(ex)
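
Conceptually, a polygon gate asks whether each event's (x, y) position falls inside the polygon you drew, and records the answer as a True/False condition named after the gate. The sketch below shows that idea with a standard ray-casting test in pure Python. The vertices and events are invented for illustration, and this is only a sketch of the technique, not Cytoflow's actual implementation.

```python
def point_in_polygon(x, y, vertices):
    """Ray casting: count how many polygon edges a ray from (x, y)
    toward +x crosses; an odd count means the point is inside."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical gate: a diagonal band in (FSC_A, FSC_H) space.
single_cell_vertices = [(20000, 15000), (200000, 150000),
                        (200000, 190000), (20000, 25000)]

# Two made-up events: one on the diagonal, one with low FSC_H for its FSC_A.
events = [{"FSC_A": 100000, "FSC_H": 90000},
          {"FSC_A": 150000, "FSC_H": 70000}]

for event in events:
    event["Single_Cell"] = point_in_polygon(event["FSC_A"],
                                            event["FSC_H"],
                                            single_cell_vertices)

labels = [event["Single_Cell"] for event in events]
print(labels)
```

The resulting boolean is what later subset expressions such as Single_Cell == True select on.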

Next, gate on FSC_A and Live_Dead to find live cells. Remember, dead cells are the positive population!

live_gate = flow.PolygonOp(name = "Live",
                           xchannel = "FSC_A",
                           ychannel = "Live_Dead",
                           yscale = "logicle")

live_gate.default_view(density = True,
                       huescale = "log",
                       subset = "Single_Cell == True",
                       interactive = True).plot(ex_single, gridsize = 100)

ex_live = live_gate.apply(ex_single)

Gate immune cells

Now, we’ll set up a single gate for each immunological population we’re interested in. Here, we are not looking at subsets defined by previous gates – instead, for each population, we’re just plotting cells that are Single_Cell == True and also Live == True.

We start with macrophages, which are CD64 high and AmCyan high.

macrophage_gate = flow.PolygonOp(name = "Macrophage",
                                 xchannel = "CD64",
                                 xscale = "logicle",
                                 ychannel = "AmCyan",
                                 yscale = "logicle")

macrophage_gate.default_view(density = True,
                             huescale = "log",
                             subset = "Single_Cell == True & Live == True",
                             interactive = True).plot(ex_live, gridsize = 100)

B cells are CD19 high and CD3 low.

bcell_gate = flow.PolygonOp(name = "B_Cell",
                            xchannel = "CD3",
                            xscale = "logicle",
                            ychannel = "CD19",
                            yscale = "logicle")

bcell_gate.default_view(density = True,
                        huescale = "log",
                        subset = "Single_Cell == True & Live == True",
                        interactive = True).plot(ex_live, gridsize = 100)

We can use CD3 and CD161 to distinguish NK, NK T, and T cells.

nk_gate = flow.PolygonOp(name = "NK",
                         xchannel = "CD3",
                         xscale = "logicle",
                         ychannel = "CD161",
                         yscale = "logicle")

nk_gate.default_view(density = True,
                     huescale = "log",
                     subset = "Single_Cell == True & Live == True",
                     interactive = True).plot(ex_live, gridsize = 100)

nkt_gate = flow.PolygonOp(name = "NK_T",
                          xchannel = "CD3",
                          xscale = "logicle",
                          ychannel = "CD161",
                          yscale = "logicle")

nkt_gate.default_view(density = True,
                      huescale = "log",
                      subset = "Single_Cell == True & Live == True",
                      interactive = True).plot(ex_live, gridsize = 100)

tcell_gate = flow.PolygonOp(name = "T_Cell",
                            xchannel = "CD3",
                            xscale = "logicle",
                            ychannel = "CD161",
                            yscale = "logicle")

tcell_gate.default_view(density = True,
                        huescale = "log",
                        subset = "Single_Cell == True & Live == True",
                        interactive = True).plot(ex_live, gridsize = 100)

DCs are CD11c high and MHCII high.

dc_gate = flow.PolygonOp(name = "DC",
                         xchannel = "CD11c",
                         xscale = "logicle",
                         ychannel = "MHCII",
                         yscale = "logicle")

dc_gate.default_view(density = True,
                     huescale = "log",
                     subset = "Single_Cell == True & Live == True",
                     interactive = True).plot(ex_live, gridsize = 100)

Neutrophils are Ly-6G high and CD11b high.

neutrophil_gate = flow.PolygonOp(name = "Neutrophil",
                                 xchannel = "Ly_6G",
                                 xscale = "logicle",
                                 ychannel = "CD11b",
                                 yscale = "logicle")

neutrophil_gate.default_view(density = True,
                             huescale = "log",
                             subset = "Single_Cell == True & Live == True",
                             interactive = True).plot(ex_live, gridsize = 100)

Finally, basophils are FcERI high. (I don’t know what marker is on the Pacific Blue channel – they seem to be high for that marker too.)

basophil_gate = flow.PolygonOp(name = "Basophil",
                               xchannel = "FcERI",
                               xscale = "logicle",
                               ychannel = "Pacific_Blue",
                               yscale = "logicle")

basophil_gate.default_view(density = True,
                           huescale = "log",
                           subset = "Single_Cell == True & Live == True",
                           interactive = True).plot(ex_live, gridsize = 100)

Apply the gates and analyze the result

Up to now, we’ve created and parameterized the various gates. Let’s apply all of them sequentially.

ex_gated = macrophage_gate.apply(ex_live)
ex_gated = bcell_gate.apply(ex_gated)
ex_gated = nk_gate.apply(ex_gated)
ex_gated = nkt_gate.apply(ex_gated)
ex_gated = tcell_gate.apply(ex_gated)
ex_gated = dc_gate.apply(ex_gated)
ex_gated = neutrophil_gate.apply(ex_gated)
ex_gated = basophil_gate.apply(ex_gated)

Now we can apply the hierarchical gating strategy. The HierarchyOp operation uses an ordered series of gates to create a new categorical condition. You parameterize it with a list of (condition, value, label) entries. If an event has the first condition equal to the first value, it gets the first label. Otherwise, if it has the second condition equal to the second value, it gets the second label. And so on. Leftover events that match none of the entries get a default label, which is Unknown by default.

ex_hierarchy = flow.HierarchyOp(name = "Cell_Type",
                                gates = [("Macrophage", True, "Macrophage"),
                                         ("B_Cell", True, "B Cell"),
                                         ("NK", True, "NK"),
                                         ("NK_T", True, "NK T"),
                                         ("T_Cell", True, "T Cell"),
                                         ("DC", True, "DC"),
                                         ("Neutrophil", True, "Neutrophil"),
                                         ("Basophil", True, "Basophil")]).apply(ex_gated)
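
The first-match-wins rule described above can be sketched in a few lines of plain Python. This is only an illustration of the logic, using made-up events stored as dictionaries; assign_label is a hypothetical helper, not part of Cytoflow.

```python
def assign_label(event, gates, default="Unknown"):
    """Return the label of the first (condition, value, label) entry
    whose condition matches the event; otherwise return the default."""
    for condition, value, label in gates:
        if event[condition] == value:
            return label
    return default

# A shortened version of the gate list, in priority order.
gates = [("Macrophage", True, "Macrophage"),
         ("B_Cell", True, "B Cell"),
         ("T_Cell", True, "T Cell")]

# Made-up events; each key is a gate's boolean condition.
events = [
    {"Macrophage": False, "B_Cell": True,  "T_Cell": False},
    {"Macrophage": True,  "B_Cell": True,  "T_Cell": False},  # in two gates
    {"Macrophage": False, "B_Cell": False, "T_Cell": False},  # in no gate
]

cell_types = [assign_label(event, gates) for event in events]
print(cell_types)
```

Note that the second event falls inside two gates and takes the label of whichever comes first in the list, so the order of the entries matters whenever gates overlap.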

Now, let’s compute a simple statistic on the FSC_A channel and just count the number of events that have each label of the Cell_Type condition, broken out by Tube.

ex_hierarchy_count = flow.ChannelStatisticOp(name = "Cell_Type",
                                             channel = "FSC_A",
                                             function = len,
                                             by = ["Cell_Type", "Tube"],
                                             subset = "Single_Cell == True & Live == True").apply(ex_hierarchy)
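
Using len as the function amounts to a grouped count: events are split into groups by their (Cell_Type, Tube) values, and the statistic for each group is the number of events in it. Here is a minimal sketch of that idea with the standard library, using invented events. It illustrates the concept only and says nothing about how Cytoflow computes statistics internally.

```python
from collections import Counter

# Made-up events, each already labeled with a cell type and a tube.
events = [
    {"Cell_Type": "B Cell", "Tube": "11", "FSC_A": 51000.0},
    {"Cell_Type": "B Cell", "Tube": "11", "FSC_A": 48000.0},
    {"Cell_Type": "T Cell", "Tube": "11", "FSC_A": 45000.0},
    {"Cell_Type": "B Cell", "Tube": "12", "FSC_A": 50500.0},
]

# Count the events in each (Cell_Type, Tube) group.
counts = Counter((event["Cell_Type"], event["Tube"]) for event in events)

for (cell_type, tube), n in sorted(counts.items()):
    print(cell_type, tube, n)
```

Because only the group sizes matter, the choice of FSC_A as the channel is arbitrary here; any channel would give the same counts.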

Cytoflow has a TableView, but it’s not great for displaying wide tables. Instead, let’s use pandas to pivot the new statistic and Jupyter’s pretty-printing to display it.

ex_hierarchy_count.statistics['Cell_Type'].reset_index().pivot(columns = "Tube",
                                                               index = "Cell_Type",
                                                               values = "FSC_A")

Tube             11       12       13       28       30       31
Cell_Type
B Cell      19353.0  16382.0  18564.0  20205.0  17178.0  19686.0
Basophil       31.0     17.0     31.0     26.0     20.0     13.0
DC            994.0    867.0    561.0    783.0    612.0    685.0
Macrophage    484.0    629.0    565.0    463.0    381.0    274.0
NK             40.0     36.0     44.0    661.0    348.0    554.0
NK T          272.0    247.0    192.0    161.0    127.0    151.0
Neutrophil    212.0    166.0    198.0    229.0    231.0    238.0
T Cell       6280.0   6046.0   6411.0   7901.0   9322.0  10866.0
Unknown      1904.0   2036.0   2918.0   1535.0   1565.0   1343.0
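
Since the tubes do not contain the same total number of events, it can help to convert the counts to fractions of each tube's total before comparing tubes. The sketch below does that arithmetic in plain Python using the counts from the table above; the variable names are my own.

```python
tubes = ["11", "12", "13", "28", "30", "31"]

# Counts copied from the pivoted table, one list entry per tube.
counts = {
    "B Cell":     [19353, 16382, 18564, 20205, 17178, 19686],
    "Basophil":   [31, 17, 31, 26, 20, 13],
    "DC":         [994, 867, 561, 783, 612, 685],
    "Macrophage": [484, 629, 565, 463, 381, 274],
    "NK":         [40, 36, 44, 661, 348, 554],
    "NK T":       [272, 247, 192, 161, 127, 151],
    "Neutrophil": [212, 166, 198, 229, 231, 238],
    "T Cell":     [6280, 6046, 6411, 7901, 9322, 10866],
    "Unknown":    [1904, 2036, 2918, 1535, 1565, 1343],
}

# Total number of labeled events in each tube (column sums).
totals = [sum(column) for column in zip(*counts.values())]

# Fraction of each tube's events that carry each label.
fractions = {label: [n / total for n, total in zip(row, totals)]
             for label, row in counts.items()}

for label, row in fractions.items():
    print(f"{label:<10}", "  ".join(f"{100 * f:5.1f}%" for f in row))
```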

Finally, everyone loves a plot instead of a table. Let’s make pie charts using MatrixView.

flow.MatrixView(statistic = "Cell_Type",
                style = "pie",
                variable = "Cell_Type",
                feature = "FSC_A",
                xfacet = "Tube").plot(ex_hierarchy_count,
                                      legendlabel = "Cell Type",
                                      linestyle = 'none')

At the end of the day, I don’t think that any of these tubes was substantially different from the rest of them.