Tutorial: Estimating Genome Size

Flow cytometry is regularly used to estimate genome size relative to a standard, often in plants. Nuclei are isolated from a plant of unknown genome size and from a plant of known genome size; the samples are mixed, stained with a fluorescent DNA-binding dye such as DAPI, and run on a flow cytometer. The relative fluorescences can be used to estimate the size of the unknown genome relative to the known one.

This tutorial demonstrates one possible approach. We use data from Kúr et al., “Cryptic invasion suggested by a cytogeographic analysis of the halophytic Puccinellia distans complex (Poaceae) in Central Europe,” Frontiers in Plant Science 14, 2023 (https://doi.org/10.3389/fpls.2023.1249292). No pre-processing was done – these are the raw files downloaded from the publication’s dataset on Zenodo. I did remove one file whose data looked pretty wonky – the investigators did the same with several of their samples.

If you’d like to follow along, you can do so by downloading one of the cytoflow-#####-examples-basic.zip files from the Cytoflow releases page on GitHub. These data are in the data/genome_size subfolder.

Set up Cytoflow and import the data

import cytoflow as flow

# if your figures are too big or too small, you can scale them by changing matplotlib's DPI
import matplotlib
matplotlib.rc('figure', dpi = 160)

The only metadata we have is the filename, so use the filename (minus its .fcs extension) as each tube’s Sample condition.

# Use glob to get the files and parse the conditions back out.

import glob, re
tubes = []

for f in glob.glob("data/genome_size/*.fcs"):
    r = re.search("data/genome_size/(.*?)\\.fcs", f)
    sample = r.group(1)

    tube = flow.Tube(file = f, conditions = {"Sample" : sample})
    tubes.append(tube)

ex = flow.ImportOp(tubes = tubes,
                   conditions = {"Sample" : "category"},
                   ignore_v = ["SSC", "FL1"]).apply()

/home/brian/src/cytoflow/fcsparser/fcsparser/api.py:300: UserWarning: There appears to be some information in the ANALYSIS segment of file data/genome_size/PU88@5+B200.fcs. However, it might not be read correctly.
/home/brian/src/cytoflow/fcsparser/fcsparser/api.py:501: UserWarning: The default channel names (defined by the $PnS parameter in the FCS file) were not unique. To avoid problems in downstream analysis, the channel names have been switched to the alternate channel names defined in the FCS file. To avoid seeing this warning message, explicitly instruct the FCS parser to use the alternate channel names by specifying the channel_naming parameter.
[... the same pair of fcsparser warnings repeats for each of the remaining FCS files ...]
/home/brian/src/cytoflow/cytoflow/operations/import_op.py:420: CytoflowOpWarning: The data range $PnR doesn't match the data bits $PnB for channel SSC, masking out 4 bits
/home/brian/src/cytoflow/cytoflow/operations/import_op.py:442: CytoflowOpWarning: Converting channel SSC from logarithmic to linear
/home/brian/src/cytoflow/cytoflow/operations/import_op.py:420: CytoflowOpWarning: The data range $PnR doesn't match the data bits $PnB for channel FL1, masking out 4 bits
/home/brian/src/cytoflow/cytoflow/operations/import_op.py:420: CytoflowOpWarning: The data range $PnR doesn't match the data bits $PnB for channel Time, masking out 4 bits
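
A side note on the filename parsing in the import loop: the regular expression hard-codes forward slashes, so it may not match on platforms where glob returns backslash-separated paths. A minimal sketch of a more portable alternative uses pathlib from the standard library. The helper name sample_name is made up for this illustration and is not part of Cytoflow.

```python
from pathlib import Path

def sample_name(path):
    """Return the file name without its directory or its final extension."""
    return Path(path).stem

# Two file names taken from the warnings above
print(sample_name("data/genome_size/PU87@10+B200.fcs"))
print(sample_name("data/genome_size/PU88@3+B200(2).fcs"))
```

The returned string could then be passed as the Sample condition when building each flow.Tube, exactly as in the loop above.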

Usually, Cytoflow requires that a channel’s voltage be the same in every FCS file it imports. With care, this lets you compare quantitative measurements across FCS files. Occasionally you need to relax this constraint, and this is one of those times – a few of these FCS files have slightly different voltages. Because the standard is spiked into each sample, and we only compare the two peaks within a tube, we can safely proceed. So, we set ignore_v to a list of the channels whose voltages we would like to ignore.

And what about all those warnings? Occasionally an instrument manufacturer’s software saves files a little strangely. Cytoflow knows about a lot of these issues and fixes them for you, but it will also warn you that the data it is analyzing is not exactly the data in the FCS files.
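
If the wall of repeated warnings gets in the way, one option is to capture them and print a tally instead of every message. The sketch below uses only Python's standard warnings module. noisy_import is a hypothetical stand-in for the flow.ImportOp(...).apply() call and is not part of Cytoflow.

```python
import warnings
from collections import Counter

def run_and_summarize(func):
    """Call func(), capture the warnings it emits, and count them by category."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = func()
    counts = Counter(w.category.__name__ for w in caught)
    return result, counts

def noisy_import():
    # Hypothetical stand-in for the real import; emits three identical warnings
    for _ in range(3):
        warnings.warn("channel names were not unique", UserWarning)
    return "experiment"

result, counts = run_and_summarize(noisy_import)
print(counts)
```

It is still worth reading each distinct warning at least once, since it describes how the imported data differs from what is in the files.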

Take a look at the data

It’s always good to start with some basic data exploration.

flow.ScatterplotView(xchannel = "SSC",
                     xscale = "log",
                     ychannel = "FL1",
                     yscale = "linear").plot(ex, s = 0.2, alpha = 0.1)

../../_images/estimating_genome_size_5_0.png

Here, there are only two channels saved (in addition to Time), so have a look at the scatter plot. Notice that there are three major clusters – the lower FL1 cluster is our internal standard, and the higher two clusters are the nuclei whose genome size we’re trying to estimate. There’s also quite a bit of noise, which might make that estimation inaccurate – we’ll think about that in a minute.

(Remember, unless you facet or subset your plot – which we’re not doing here – you’re looking at the whole data set, all of the tubes together on one plot.)

Gate out high-SSC events

There are a bunch of events piled up at the top of the SSC channel, and those will interfere with subsequent analysis. Let’s get rid of them with a Threshold Gate.

ex_gated = flow.ThresholdOp(name = "Huge_SSC",
                            channel = "SSC",
                            threshold = 500).apply(ex)
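
As this tutorial uses it, the threshold gate does not throw events away. It labels each event with a Huge_SSC condition that later steps can filter on. Here is a minimal pandas sketch of that idea, with made-up SSC values and assuming that events above the threshold are labeled True. It imitates the behavior and is not Cytoflow's code.

```python
import pandas as pd

# Made-up events for illustration; real data comes from the FCS files
events = pd.DataFrame({"SSC": [12.0, 480.0, 950.0, 1023.0],
                       "FL1": [810.0, 1305.0, 40.0, 15.0]})

# "Applying" the gate adds a boolean column named after the gate
events["Huge_SSC"] = events["SSC"] > 500

# Later steps pick the events they want with a subset expression
kept = events.query("Huge_SSC == False")
print(kept)
```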

Find the peaks with a density gate

Let’s assume, for the moment, that we don’t want to draw 52 different gates to find the two peaks in each of our 26 samples. What to do? Cytoflow includes several automated clustering algorithms, and the one that works best in this situation is a Density Gate. The user selects how much of the data they would like to retain, and this gate selects that proportion of events in the highest-density regions of a 2D plot. Most important for our use, we don’t have to use the same gate for the entire data set. Like most data-driven operations, the Density Gate lets you group the data by some condition or piece of metadata before estimating the operation’s parameters. In this case, we’ll compute a different gate for each tube.

After exploring several different parameters, let’s keep 45% of the data. We’ll gate on the SSC and FL1 channels, same as the scatterplot. Change xscale from log to logicle, though. To make a different gate for each tube, set by to ["Sample"]. And make sure we set subset to Huge_SSC == False to ignore the events with huge SSCs.

density_gate = flow.DensityGateOp(name = "Density",
                                  xchannel = "SSC",
                                  xscale = "logicle",
                                  ychannel = "FL1",
                                  yscale = "linear",
                                  keep = 0.45,
                                  by = ["Sample"])
density_gate.estimate(ex_gated, subset = "Huge_SSC == False")

/home/brian/src/cytoflow/cytoflow/utility/logicle_scale.py:320: CytoflowWarning: Channel SSC doesn't have any negative data. Try a log transform instead.
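
To build some intuition for what "keep the densest 45%" means, here is a rough NumPy sketch: bin the events in a 2D histogram, rank the bins by count, and flag the events in the most populated bins until the requested fraction is reached. This is a simplification written for this tutorial, not Cytoflow's implementation (which may smooth or scale the data differently). The function name density_mask, the bin count and the synthetic data are all assumptions.

```python
import numpy as np

def density_mask(x, y, keep=0.45, bins=50):
    """Flag the events that fall in the most populated 2D histogram bins."""
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)
    flat = counts.ravel()

    # Rank bins from most to least populated, then take just enough of the
    # top-ranked bins to hold at least `keep` of all events
    order = np.argsort(flat)[::-1]
    cumulative = np.cumsum(flat[order])
    n_kept = int(np.searchsorted(cumulative, keep * len(x))) + 1

    kept_bins = np.zeros(flat.size, dtype=bool)
    kept_bins[order[:n_kept]] = True
    kept_bins = kept_bins.reshape(counts.shape)

    # Map every event back to the bin it fell into
    xi = np.clip(np.digitize(x, xedges) - 1, 0, bins - 1)
    yi = np.clip(np.digitize(y, yedges) - 1, 0, bins - 1)
    return kept_bins[xi, yi]

# Synthetic data: one tight peak (900 events) plus uniform noise (100 events)
rng = np.random.default_rng(0)
peak = rng.normal(loc=5.0, scale=0.5, size=(900, 2))
noise = rng.uniform(0.0, 10.0, size=(100, 2))
points = np.vstack([peak, noise])

mask = density_mask(points[:, 0], points[:, 1], keep=0.45)
print(f"kept {mask.mean():.0%} of events")
```

Estimating the gate separately for each Sample, as by = ["Sample"] does above, corresponds to running this kind of procedure once per tube rather than once on the pooled data.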

Now, iterate over all of the plots and make sure that we got a gate that looks reasonable for each sample.

for plot_name in density_gate.default_view().enum_plots(ex_gated):
    density_gate.default_view().plot(ex_gated, plot_name = plot_name, title = plot_name)

/home/brian/.local/lib/anaconda/envs/cf_dev/lib/python3.12/site-packages/seaborn/axisgrid.py:453: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_open_warning). Consider using matplotlib.pyplot.close().

../../_images/estimating_genome_size_11_1.png

../../_images/estimating_genome_size_11_2.png

../../_images/estimating_genome_size_11_3.png

../../_images/estimating_genome_size_11_4.png

../../_images/estimating_genome_size_11_5.png

../../_images/estimating_genome_size_11_6.png

../../_images/estimating_genome_size_11_7.png

../../_images/estimating_genome_size_11_8.png

../../_images/estimating_genome_size_11_9.png

../../_images/estimating_genome_size_11_10.png

../../_images/estimating_genome_size_11_11.png

../../_images/estimating_genome_size_11_12.png

../../_images/estimating_genome_size_11_13.png

../../_images/estimating_genome_size_11_14.png

../../_images/estimating_genome_size_11_15.png

../../_images/estimating_genome_size_11_16.png

../../_images/estimating_genome_size_11_17.png

../../_images/estimating_genome_size_11_18.png

../../_images/estimating_genome_size_11_19.png

../../_images/estimating_genome_size_11_20.png

../../_images/estimating_genome_size_11_21.png

../../_images/estimating_genome_size_11_22.png

../../_images/estimating_genome_size_11_23.png

../../_images/estimating_genome_size_11_24.png

../../_images/estimating_genome_size_11_25.png

../../_images/estimating_genome_size_11_26.png

Looks good! And why change the X scale? This was an issue I caught as I scrolled through the gates – one gate behaved strangely around events with very small SSC values. I could have gated those events out, I suppose, or I could change the scale – logicle behaves much better around 0 than log does.

Let’s apply the gate and move on.

ex_density = density_gate.apply(ex_gated)

/home/brian/src/cytoflow/cytoflow/utility/logicle_scale.py:320: CytoflowWarning: Channel SSC doesn't have any negative data. Try a log transform instead.

Separate the control and sample peak

We’ve found the peaks, but if we want to summarize the different peaks’ means separately, we need to separate them. Let’s do so with another Threshold gate – this one I’ll name Unknown, to distinguish the spike-in standard (Unknown == False) from the events we want to measure (Unknown == True).

Note that in the view, I’ve shown only the subset for which Density == True – this hides all the events that weren’t in the peaks and cleans up the plot substantially.

threshold_op = flow.ThresholdOp(name = "Unknown",
                                channel = "FL1",
                                threshold = 1000)
threshold_op.default_view(subset = "Density == True").plot(ex_density)

../../_images/estimating_genome_size_15_0.png

ex_threshold = threshold_op.apply(ex_density)

Compute the peaks’ means

We’re almost done. Recall that in Cytoflow, if we want to summarize some flow data, we do so by creating a statistic. So let’s create one using the ChannelStatisticOp operation. We’ll take the events inside the density gate, group them by each distinct combination of Sample and Unknown, apply np.mean to the FL1 channel of each group, and save the result in a statistic named Mean.

import numpy as np
ex_mean = flow.ChannelStatisticOp(name = "Mean",
                                  channel = "FL1",
                                  function = np.mean,
                                  by = ["Sample", "Unknown"],
                                  subset = "Density == True").apply(ex_threshold)
ex_mean.statistics['Mean']

		FL1
Sample	Unknown
PU87@10+B200	False	837.876204
PU87@10+B200	True	1308.596886
PU87@11+B200	False	837.271686
PU87@11+B200	True	1320.092908
PU87@12+B200	False	825.947059
PU87@12+B200	True	1293.881644
PU87@13+B200	False	804.640094
PU87@13+B200	True	1220.929577
PU87@14+B200	False	900.000967
PU87@14+B200	True	1393.757658
PU87@2+B200	False	828.260794
PU87@2+B200	True	1364.768903
PU87@3+B200	False	775.346457
PU87@3+B200	True	1224.151099
PU87@4+B200	False	830.764826
PU87@4+B200	True	1293.239563
PU87@5+B200	False	793.494881
PU87@5+B200	True	1231.083259
PU87@6+B200	False	839.064236
PU87@6+B200	True	1282.589608
PU87@7_sil+B200	False	821.028060
PU87@7_sil+B200	True	1303.553441
PU87@8+B200	False	847.688811
PU87@8+B200	True	1350.027876
PU87@9+B200	False	797.452258
PU87@9+B200	True	1309.014458
PU88@1+B200	False	811.518106
PU88@1+B200	True	2005.262701
PU88@11+B200	False	791.330855
PU88@11+B200	True	2003.251294
PU88@12+B200	False	788.473770
PU88@12+B200	True	1977.615936
PU88@13+B200	False	816.856079
PU88@13+B200	True	2058.079467
PU88@14+B200	False	790.418095
PU88@14+B200	True	1996.551899
PU88@2+B200	False	797.595672
PU88@2+B200	True	1992.682345
PU88@3+B200(2)	False	784.629487
PU88@3+B200(2)	True	1993.387588
PU88@4+B200	False	800.509960
PU88@4+B200	True	2032.551757
PU88@5+B200	False	786.311507
PU88@5+B200	True	1916.499501
PU88@6+B200	False	808.929134
PU88@6+B200	True	1989.944548
PU88@7+B200	False	809.851351
PU88@7+B200	True	2006.164361
PU88@8+B200	False	816.540984
PU88@8+B200	True	2099.449713
PU88@9+B200	False	832.135174
PU88@9+B200	True	2119.060337
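
Conceptually, the statistic above is a grouped mean. A plain pandas sketch of the same idea, using a handful of made-up events rather than the real data, might look like the following. This mirrors what the operation computes here, not how Cytoflow computes it.

```python
import pandas as pd

# Made-up events for illustration only
events = pd.DataFrame({
    "Sample":  ["A", "A", "A", "A", "B", "B", "B"],
    "FL1":     [800.0, 820.0, 1300.0, 50.0, 790.0, 2000.0, 2010.0],
    "Density": [True, True, True, False, True, True, True],
    "Unknown": [False, False, True, False, False, True, True],
})

# Keep only events inside the density gate, then average FL1 for every
# (Sample, Unknown) combination
mean_fl1 = (events[events["Density"]]
            .groupby(["Sample", "Unknown"])["FL1"]
            .mean())
print(mean_fl1)
```

The result is indexed by Sample and Unknown, the same hierarchical layout as the table above.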

Remember, a statistic is just a pandas.DataFrame with a hierarchical index. So, if we want a table with a row for each sample and the Unknown == True and Unknown == False values in separate columns, we can reshape the data using pandas.DataFrame.unstack, like so:

ex_mean.statistics['Mean'].unstack(level = "Unknown")

	FL1
Unknown	False	True
Sample
PU87@10+B200	837.876204	1308.596886
PU87@11+B200	837.271686	1320.092908
PU87@12+B200	825.947059	1293.881644
PU87@13+B200	804.640094	1220.929577
PU87@14+B200	900.000967	1393.757658
PU87@2+B200	828.260794	1364.768903
PU87@3+B200	775.346457	1224.151099
PU87@4+B200	830.764826	1293.239563
PU87@5+B200	793.494881	1231.083259
PU87@6+B200	839.064236	1282.589608
PU87@7_sil+B200	821.028060	1303.553441
PU87@8+B200	847.688811	1350.027876
PU87@9+B200	797.452258	1309.014458
PU88@1+B200	811.518106	2005.262701
PU88@11+B200	791.330855	2003.251294
PU88@12+B200	788.473770	1977.615936
PU88@13+B200	816.856079	2058.079467
PU88@14+B200	790.418095	1996.551899
PU88@2+B200	797.595672	1992.682345
PU88@3+B200(2)	784.629487	1993.387588
PU88@4+B200	800.509960	2032.551757
PU88@5+B200	786.311507	1916.499501
PU88@6+B200	808.929134	1989.944548
PU88@7+B200	809.851351	2006.164361
PU88@8+B200	816.540984	2099.449713
PU88@9+B200	832.135174	2119.060337

Getting the ratio between the two and converting to a pg/genome estimate is left as an exercise for the reader!
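
For readers who want a head start on that exercise, here is one possible sketch. It assumes fluorescence scales linearly with DNA content, so that the unknown genome size is roughly (mean FL1 of the unknown peak / mean FL1 of the standard peak) × the standard's genome size. The three rows are copied from the table above. STANDARD_PG is a placeholder, not the real value for the standard used in the paper, so look that up before trusting any absolute number.

```python
import pandas as pd

# Three rows copied from the unstacked table above (mean FL1 per peak).
# With the real data, something like
#   ex_mean.statistics['Mean']['FL1'].unstack(level = "Unknown")
# should give a table of this shape (an assumption; check your own output).
means = pd.DataFrame(
    {False: [837.876204, 804.640094, 811.518106],
     True:  [1308.596886, 1220.929577, 2005.262701]},
    index=pd.Index(["PU87@10+B200", "PU87@13+B200", "PU88@1+B200"],
                   name="Sample"),
)

# Friendlier column names than False / True
means = means.rename(columns={False: "standard", True: "unknown"})

# Ratio of the unknown peak to the spiked-in standard, per sample
ratio = means["unknown"] / means["standard"]

# Placeholder genome size of the standard, in pg; replace with the real value
STANDARD_PG = 1.0
genome_pg = ratio * STANDARD_PG

print(pd.DataFrame({"ratio": ratio, "estimated_pg": genome_pg}))
```

Because the ratio is taken within each tube, small tube-to-tube differences in voltage or staining largely cancel out, which is the reason for spiking the standard into every sample.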