HCL: Hierarchical Clustering

Parameter Information


Tree Selection

These checkboxes are used to indicate whether to construct trees for genes, samples, or both.

Leaf Order Optimization

These checkboxes are used to indicate whether to use an algorithm that rearranges the leaves of the tree to optimize the sum of similarities between all adjacent leaves. This algorithm is memory intensive and takes longer to run than the default arbitrary ordering. Depending on the size of your data set, this option may not be available due to Java memory constraints.
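
The idea behind leaf ordering can be illustrated with a small Python sketch. This is not MeV's algorithm: it simply tries every order reachable by flipping the two children of each tree node and keeps the one with the smallest sum of distances between adjacent leaves (equivalent to the largest sum of similarities). The gene names, expression values, and function names are hypothetical, and the exhaustive search is only practical for very small trees.

```python
from itertools import product

def orderings(tree):
    """Yield every leaf order reachable by flipping children at internal nodes."""
    if not isinstance(tree, tuple):
        yield [tree]              # a single leaf has only one order
        return
    left, right = tree
    for l, r in product(list(orderings(left)), list(orderings(right))):
        yield l + r               # children in their original order
        yield r + l               # children flipped

def adjacent_cost(order, dist):
    """Sum of distances between neighbouring leaves in the displayed order."""
    return sum(dist(a, b) for a, b in zip(order, order[1:]))

def best_order(tree, dist):
    """Exhaustive search: fine for a toy tree, far too slow for real data."""
    return min(orderings(tree), key=lambda o: adjacent_cost(o, dist))

# Hypothetical one-dimensional expression values for four genes.
values = {"g1": 0.0, "g2": 1.0, "g3": 5.0, "g4": 1.5}

def dist(a, b):
    return abs(values[a] - values[b])

# The tree topology is fixed; only the left/right order of children may change.
tree = (("g1", "g2"), ("g3", "g4"))

arbitrary = ["g1", "g2", "g3", "g4"]
optimized = best_order(tree, dist)
print(adjacent_cost(arbitrary, dist))   # 8.5
print(adjacent_cost(optimized, dist))   # 5.0
```

Flipping the (g3, g4) node places g4 next to g2, its closest neighbor, without changing which genes are clustered together. The number of possible orders doubles with every internal node, which gives a sense of why an optimizing algorithm costs more time and memory than an arbitrary ordering.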

Distance Metric Selection

This area allows the selection of the metric to be used to assess gene-to-gene or sample-to-sample distances. The initial metric displayed (chosen) corresponds to the global setting in the Multiple Array Viewer's 'Metrics' menu. Changing the metric in this dialog only affects the current algorithm run. The global setting in the main 'Metrics' menu will remain unchanged.

Euclidean Distance and Pearson Correlation tend to be the most frequently used options. An appendix in the MeV manual describes the distance metrics offered in MeV.
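
As a rough illustration of how the two most common metrics differ, the following Python sketch computes both for a pair of expression profiles. The profiles and function names are hypothetical, and converting the Pearson correlation r to a distance as 1 - r is an assumption made for this example; consult the MeV manual appendix for the definitions MeV actually uses.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Pearson correlation coefficient r, between -1 and 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def pearson_distance(x, y):
    """Assumed conversion for this sketch: distance = 1 - r."""
    return 1.0 - pearson_correlation(x, y)

# Hypothetical profiles: same shape, different magnitude.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 6.0]

print(euclidean_distance(gene_a, gene_b))  # about 3.74
print(pearson_distance(gene_a, gene_b))    # 0.0
```

The two profiles rise in the same pattern, so the Pearson-based distance is zero, while the Euclidean distance is large because the magnitudes differ. Which behavior is preferable depends on whether absolute expression level or expression pattern matters for the analysis.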

Linkage Method Selection

This parameter is used to indicate the convention used for determining cluster-to-cluster distances when constructing the hierarchical tree. Just as distance metrics define gene-to-gene or sample-to-sample distances, linkage methods define cluster-to-cluster distances.

Single Linkage: The distances are measured between each member of one cluster and each member of the other cluster. The minimum of these distances is considered the cluster-to-cluster distance.

Average Linkage: The average distance of each member of one cluster to each member of the other cluster is used as a measure of cluster-to-cluster distance.

Complete Linkage: The distances are measured between each member of one cluster and each member of the other cluster. The maximum of these distances is considered the cluster-to-cluster distance.
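
The three linkage methods described above can be sketched in a few lines of Python. This is a minimal illustration, not MeV's implementation; the function names, the one-dimensional cluster members, and the absolute-difference distance are all hypothetical choices made to keep the arithmetic easy to follow.

```python
from itertools import product
from statistics import mean

def pairwise_distances(cluster_a, cluster_b, dist):
    """Distance from each member of one cluster to each member of the other."""
    return [dist(a, b) for a, b in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b, dist):
    return min(pairwise_distances(cluster_a, cluster_b, dist))

def average_linkage(cluster_a, cluster_b, dist):
    return mean(pairwise_distances(cluster_a, cluster_b, dist))

def complete_linkage(cluster_a, cluster_b, dist):
    return max(pairwise_distances(cluster_a, cluster_b, dist))

def abs_diff(a, b):
    """Hypothetical element-to-element distance for one-dimensional data."""
    return abs(a - b)

# Two hypothetical clusters; the pairwise distances are 3, 6, 2 and 5.
cluster_a = [1.0, 2.0]
cluster_b = [4.0, 7.0]

print(single_linkage(cluster_a, cluster_b, abs_diff))    # 2.0
print(average_linkage(cluster_a, cluster_b, abs_diff))   # 4.0
print(complete_linkage(cluster_a, cluster_b, abs_diff))  # 6.0
```

For any pair of clusters the single linkage distance is never larger than the average linkage distance, which in turn is never larger than the complete linkage distance, so the choice of method can change which clusters are merged first.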