DendroUPGMA: A dendrogram construction utility.

Created in March 2002.
By Santi Garcia-Vallvé (santi.garcia-vallve AT urv.net) and Pere Puigbo
Biochemistry and Biotechnology Department.
Universitat Rovira i Virgili (URV). Tarragona. Spain.

               


Use this program to create a dendrogram from (a) sets of variables, (b) a similarity matrix or (c) a distance matrix. The program calculates a similarity matrix (option a only), transforms similarity coefficients into distances, and performs the clustering using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) or the Weighted Pair Group Method with Arithmetic Mean (WPGMA) algorithm.

For Option (a) -Construct a dendrogram from a set of variables- two input formats are available:

  • Format 1: FASTA-like format. Each set takes two lines: the first must begin with the ">" character (this is the identifier line) and the second contains the variables (separated by tabs or spaces, not a combination of both).
  • Format 2: One line for each set of variables (separated by spaces). The first field is assumed to be the identifier.
For Options (b) and (c) -Construct a dendrogram from a similarity or distance matrix- the input must be a similarity or distance matrix in CSV or tab-delimited format.
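
As an illustration of the two option (a) formats described above, here is a minimal Python sketch of how such input could be read. The function names `parse_fasta_like` and `parse_one_line` are hypothetical and are not part of the server. The sketch is also more lenient than the server: `str.split()` accepts mixed tabs and spaces.

```python
def parse_fasta_like(text):
    """Format 1: a '>' identifier line followed by one line of values."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2:
        raise ValueError("expected an identifier line and a value line per set")
    records = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        if not header.startswith(">"):
            raise ValueError(f"expected '>' identifier line, got: {header!r}")
        records[header[1:].strip()] = [float(v) for v in values.split()]
    return records


def parse_one_line(text):
    """Format 2: identifier first, then the values, all on one line."""
    records = {}
    for ln in text.splitlines():
        fields = ln.split()
        if fields:  # skip blank lines
            records[fields[0]] = [float(v) for v in fields[1:]]
    return records
```

Both functions return a dictionary mapping each identifier to its list of values, so the same data entered in either format should yield the same result.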

Choose the input:
(a) sets of variables (raw data)

Choose the similarity index or distance coefficient used to compare the sets of variables (use Jaccard or Dice for binary data):

Similarity index: Pearson correlation coefficient (r) | Jaccard index (Tanimoto) | Dice coefficient
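
For reference, the three similarity indices can be sketched in Python using their textbook definitions. This is an illustrative sketch, not the server's code; the function names are hypothetical, and for Jaccard and Dice any non-zero value is treated as "present", which is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for a constant vector


def jaccard(x, y):
    """Jaccard (Tanimoto) index for binary data: shared / present in either."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either


def dice(x, y):
    """Dice coefficient for binary data: 2 * shared / (count_x + count_y)."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    return 2 * both / (sum(1 for a in x if a) + sum(1 for b in y if b))
```

Jaccard and Dice ignore positions where both sets are 0, which is why they suit presence/absence data.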

Distance coefficient: Euclidean distance | Manhattan distance (city block or taxicab distance) | Mean square deviation (MSD) | Root mean square deviation (RMSD)
Do you want to normalize (standardize) the input data? (only available for distance coefficients) No Yes
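
The distance coefficients above follow standard definitions, sketched here in Python. The `standardize` helper is an assumption: the page does not say how the server normalizes, so this sketch uses a z-score per variable (column) with the population standard deviation. All function names are hypothetical.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def msd(x, y):
    """Mean square deviation: average of the squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def rmsd(x, y):
    """Root mean square deviation: square root of the MSD."""
    return math.sqrt(msd(x, y))


def standardize(rows):
    """Z-score each variable across all sets (assumed normalization scheme)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    # A variable with zero spread carries no information; map it to 0.0.
    return [[(v - m) / s if s else 0.0 for v, m, s in zip(row, means, sds)]
            for row in rows]
```

Standardizing matters mostly when variables are measured on different scales, since otherwise the variable with the largest range dominates the distance.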

If the Pearson coefficient (r) has been chosen, choose the transformation of r to generate the distance values (d):
d = (1 - r) * 100 | d = -ln(abs(r))
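
The two transformations can be written directly in Python. The function names are hypothetical; the formulas are the ones given above.

```python
import math


def r_to_distance_linear(r):
    """d = (1 - r) * 100: ranges from 0 (r = 1) to 200 (r = -1)."""
    return (1 - r) * 100


def r_to_distance_log(r):
    """d = -ln(|r|): 0 when |r| = 1, and undefined for r = 0."""
    return -math.log(abs(r))
```

The linear form keeps the sign of r, so anti-correlated sets end up far apart. The logarithmic form uses the absolute value, so strongly anti-correlated sets are treated as close, and r = 0 raises an error in this sketch because the logarithm of zero is undefined.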

Do you want to generate 100 bootstrap replicates? No Yes
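
The page does not describe how the bootstrap replicates are built. A common approach, shown here purely as an assumption, is to resample the variables (columns) with replacement and rebuild the dendrogram from each resampled data set. The name `bootstrap_replicates` is hypothetical.

```python
import random


def bootstrap_replicates(rows, n_replicates=100, seed=None):
    """Yield resampled copies of rows, drawing columns with replacement.

    Assumed scheme: every replicate keeps all sets (rows) and has the same
    number of variables as the input, but some variables repeat and others
    are left out.
    """
    rng = random.Random(seed)
    n_vars = len(rows[0])
    for _ in range(n_replicates):
        cols = [rng.randrange(n_vars) for _ in range(n_vars)]
        yield [[row[c] for c in cols] for row in rows]
```

Under this scheme, the support for a cluster would be the fraction of replicate dendrograms in which that cluster appears.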

Do you want to omit rows with identical values for all the variables? No Yes

(b) a similarity matrix

Do you want to omit rows with a similarity of 1? No Yes

(c) a distance matrix

 

Choose the clustering method:

UPGMA (Unweighted Pair Group Method with Arithmetic Mean) | WPGMA (Weighted Pair Group Method with Arithmetic Mean)
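
The difference between the two methods lies in how the distance from a newly merged cluster to every other cluster is updated. The following Python sketch illustrates the standard algorithms and is not the server's implementation. The function name `cluster`, the nested-tuple output and the choice of half the merge distance as node height are assumptions made for illustration.

```python
def cluster(labels, dist, method="UPGMA"):
    """Agglomerative clustering on a square distance matrix.

    Returns a nested tuple (left, right, height); leaves are plain labels.
    """
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (lab, 1) for i, lab in enumerate(labels)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        new = {}
        for k in clusters:
            dak = d[min(a, k), max(a, k)]
            dbk = d[min(b, k), max(b, k)]
            if method == "UPGMA":
                # Weight by cluster size so every leaf counts equally.
                new[k] = (na * dak + nb * dbk) / (na + nb)
            else:
                # WPGMA: plain average of the two merged clusters.
                new[k] = (dak + dbk) / 2
        d = {p: v for p, v in d.items() if a not in p and b not in p}
        for k, v in new.items():
            d[(k, next_id)] = v  # next_id is always the larger id
        clusters[next_id] = ((ta, tb, dab / 2), na + nb)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

The two methods give the same tree whenever the merged clusters have equal sizes. They differ when a large cluster joins a small one: UPGMA lets each original item contribute equally, while WPGMA gives the two clusters equal weight regardless of size.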


Examples:

Download a short tutorial for the DendroUPGMA server.


If you use this server, please cite its URL and the article: S. Garcia-Vallve, J. Palau and A. Romeu (1999) Horizontal gene transfer in glycosyl hydrolases inferred from codon usage in Escherichia coli and Bacillus subtilis. Molecular Biology and Evolution 16(9):1125-1134.