class cytoflow.utility.consensus_cluster.ConsensusCluster(cluster, L, K, H, resample_proportion=0.5)

Bases: object

Implementation of consensus clustering, following the paper https://link.springer.com/content/pdf/10.1023%2FA%3A1023949509487.pdf

Args:

  • cluster -> clustering class

  • NOTE: the class must accept an n_clusters parameter when instantiated and must provide a fit_predict method, which is called on the data.

  • L -> smallest number of clusters to try

  • K -> biggest number of clusters to try

  • H -> number of resamplings for each cluster number

  • resample_proportion -> fraction of the data to sample in each resampling (default 0.5)

  • Mk -> consensus matrices for each k (shape = (K, data.shape[0], data.shape[0]))

    (NOTE: every consensus matrix is retained, as specified in the paper)

  • Ak -> area under CDF for each number of clusters

    (see paper: section 3.3.1. Consensus distribution.)

  • deltaK -> changes in areas under CDF

    (see paper: section 3.3.1. Consensus distribution.)

  • bestK -> number of clusters that was found to be best
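The Ak and deltaK entries above can be illustrated with a short sketch. It assumes the definitions from the paper's section 3.3.1: the area is accumulated over the sorted upper-triangle consensus values, and the change for k = 2 is the area itself while for k > 2 it is the relative increase to the next k. The function names area_under_cdf and delta_k are hypothetical, and the class itself may compute these quantities differently (for example from a binned histogram).

```python
from bisect import bisect_right


def area_under_cdf(consensus):
    """Area under the empirical CDF of the upper-triangle consensus values."""
    n = len(consensus)
    # Only pairs i < j are counted, so the diagonal is ignored.
    values = sorted(consensus[i][j] for i in range(n) for j in range(i + 1, n))
    m = len(values)
    area = 0.0
    for prev, cur in zip(values, values[1:]):
        # CDF(cur) = fraction of pairwise consensus values <= cur.
        cdf_at_cur = bisect_right(values, cur) / m
        area += (cur - prev) * cdf_at_cur
    return area


def delta_k(areas, first_k=2):
    """Map k -> change in area; areas[i] is assumed to hold A(first_k + i)."""
    deltas = {}
    for i in range(len(areas) - 1):
        k = first_k + i
        if k == 2:
            deltas[k] = areas[i]
        else:
            deltas[k] = (areas[i + 1] - areas[i]) / areas[i]
    return deltas


# A perfectly stable two-cluster consensus matrix: every entry is 0 or 1.
perfect = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(area_under_cdf(perfect))
print(delta_k([0.5, 0.8, 0.84]))
```

The areas passed to delta_k in the last line are made-up numbers used only to show the call.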

fit(data, verbose=False)

Fits a consensus matrix for each number of clusters

Parameters:
  • data -> array of shape (examples, attributes)

  • verbose -> whether to print progress
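As a rough illustration of what fitting a consensus matrix for one cluster number involves, the sketch below resamples the data H times, clusters each subsample, and records how often each pair of examples lands in the same cluster relative to how often the pair was sampled together. It is a minimal sketch under stated assumptions, not the class's actual code: LargestGapClusterer is a hypothetical toy clusterer that only follows the interface described for the cluster argument (n_clusters at construction, fit_predict on data), and consensus_matrix and its seed parameter are invented for the example.

```python
import random


class LargestGapClusterer:
    """Hypothetical toy 1-D clusterer: cuts the sorted data at the largest gaps."""

    def __init__(self, n_clusters):
        self.n_clusters = n_clusters

    def fit_predict(self, data):
        # Sort example indices by their first (and only used) attribute.
        order = sorted(range(len(data)), key=lambda i: data[i][0])
        gaps = [
            (data[order[p + 1]][0] - data[order[p]][0], p)
            for p in range(len(order) - 1)
        ]
        # Positions after which a new cluster starts.
        cuts = {p for _, p in sorted(gaps, reverse=True)[: self.n_clusters - 1]}
        labels = [0] * len(data)
        label = 0
        for p, i in enumerate(order):
            labels[i] = label
            if p in cuts:
                label += 1
        return labels


def consensus_matrix(cluster, data, k, H, resample_proportion=0.5, seed=0):
    """Fraction of co-sampled runs in which each pair shared a cluster."""
    rng = random.Random(seed)
    n = len(data)
    size = int(n * resample_proportion)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(H):
        idx = rng.sample(range(n), size)  # sampling without replacement
        labels = cluster(n_clusters=k).fit_predict([data[i] for i in idx])
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Two well-separated groups of four examples, one attribute each.
data = [[0.0], [0.1], [0.2], [0.3], [10.0], [10.1], [10.2], [10.3]]
M = consensus_matrix(LargestGapClusterer, data, k=2, H=50, resample_proportion=0.75)
```

Pairs from different groups never share a cluster here, so their consensus value stays at 0, while pairs within a group approach 1. A full fit would repeat this for every cluster number being tried.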

predict()

Predicts on the consensus matrix, for best found cluster number
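How labels are derived from the consensus matrix is not spelled out here. Purely as an illustration of the idea of predicting from a consensus matrix rather than from raw data, the sketch below groups examples whose pairwise consensus exceeds a threshold, using connected components. This is a swapped-in technique and an assumption, not the class's method; labels_from_consensus and its threshold parameter are hypothetical names.

```python
def labels_from_consensus(consensus, threshold=0.5):
    """Label examples by connected components of the graph consensus > threshold."""
    n = len(consensus)
    labels = [-1] * n  # -1 marks an example that has no label yet
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and consensus[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


consensus = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
print(labels_from_consensus(consensus))  # prints [0, 0, 1, 1]
```

Unlike this sketch, the class predicts for the best cluster number it found, so the number of groups is fixed in advance rather than implied by a threshold.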

predict_data(data)

Predicts on the data, for best found cluster number

Parameters:
  • data -> array of shape (examples, attributes)