OPTIMIZER: A web server utility that optimizes a DNA or protein sequence

OPTIMIZER TUTORIAL

What is OPTIMIZER?

OPTIMIZER is an on-line PHP application that optimizes the codon usage of a DNA sequence to increase its expression level. Users can supply their own reference tables for the optimization process or use pre-computed tables from more than 15 prokaryotic species under strong translational selection. Three optimization methods are available: the 'one amino acid - one codon' approach, a random approach, and an intermediate one. Several options, such as avoiding specific restriction sites, and several output formats are also available. This server can be useful for predicting and optimizing the expression level of a gene in heterologous gene expression.

How to cite: Puigbo P., Guzman E., Romeu A. and Garcia-Vallve S. OPTIMIZER: A web server for optimizing the codon usage of DNA sequences. Nucleic Acids Research. 2007 35:W126-W131.

The four steps to optimize a sequence are:

    Step 1) Paste your sequence in the text box, choose the type of the pasted sequence (DNA or Protein) and click the 'CONTINUE>' button.
    Step 2) Select option a) "OPTIMIZER data" and choose a genome, or select option b) "User data", insert your own table and choose its format. OPTIMIZER accepts two types of reference tables: a codon usage table or a table of tRNA gene copy numbers (tGCN).
    Step 3) Choose the appropriate genetic code. If you selected 'OPTIMIZER data' in Step 2, choose genetic code 11.
    Step 4) Choose the optimization method. The methods offered depend on the type of query sequence and the type of reference table:
        a) DNA sequence and a codon usage table. Three methods are available. The first method, "one amino acid - one codon", optimizes all codons of the query sequence, i.e. every codon that codes for a given amino acid is replaced by the synonymous codon used most frequently in the reference set. The second method, "customized one amino acid - one codon", lets the user choose how many of the 59 codons (those that have at least one synonymous alternative) are optimized with the 'one AA - one codon' approach. The third method, "Guided random", selects codons at random with probabilities obtained from the codon usage table.
        b) Protein sequence and a codon usage table. The "one amino acid - one codon" method is available in the following forms: Most Frequent, GC-rich, AT-rich or Robust codons.
        c) DNA or Protein sequence and a tRNA gene copy number table. The "one amino acid - one codon" method is available. It optimizes all codons of the query sequence, i.e. every codon that codes for a given amino acid is replaced by the synonymous codon that corresponds to the tRNA with the highest number of gene copies.
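
The "one amino acid - one codon" and "Guided random" methods described in Step 4 can be illustrated with a short Python sketch. This is not OPTIMIZER's own PHP code: the function names are hypothetical, and the toy table below covers only three amino acids with made-up frequencies, whereas a real reference table lists all 64 codons.

```python
import random

# Hypothetical toy codon usage table (frequency per thousand).
# The values are illustrative only, not taken from any real genome.
CODON_USAGE = {
    "F": {"TTT": 17.4, "TTC": 20.1},
    "K": {"AAA": 33.6, "AAG": 10.3},
    "L": {"CTG": 52.6, "TTA": 13.9, "CTT": 11.0},
}

# Reverse lookup (codon -> amino acid) derived from the toy table.
CODON_TO_AA = {c: aa for aa, codons in CODON_USAGE.items() for c in codons}


def split_codons(dna):
    """Split a coding DNA sequence into consecutive triplets."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3].upper() for i in range(0, len(dna), 3)]


def one_aa_one_codon(dna):
    """Replace every codon by its most frequent synonymous codon."""
    out = []
    for codon in split_codons(dna):
        usage = CODON_USAGE[CODON_TO_AA[codon]]
        out.append(max(usage, key=usage.get))
    return "".join(out)


def guided_random(dna, rng=None):
    """Draw each synonymous codon at random, weighted by its frequency."""
    rng = rng or random.Random()
    out = []
    for codon in split_codons(dna):
        usage = CODON_USAGE[CODON_TO_AA[codon]]
        codons = list(usage)
        weights = [usage[c] for c in codons]
        out.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(out)
```

With this toy table, one_aa_one_codon("TTTAAGTTA") returns "TTCAAACTG". guided_random returns a sequence that encodes the same amino acids, but its codons vary from run to run unless a seeded random.Random is passed in.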

References

   Codon Adaptation Index (CAI):
      Sharp, P.M. and Li, W.-H. (1987) The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res., 15:1281-1295.

   Effective Number of Codons (ENc):
      Wright, F. (1990) The 'effective number of codons' used in a gene. Gene, 87:23-29.

   Codon bias and heterologous protein expression:
      Gustafsson, C., Govindarajan, S. and Minshull, J. (2004) Codon bias and heterologous protein expression. Trends Biotechnol., 22:346-353.

   Robust codons:
      Archetti, M. (2004) Selection on codon usage for error minimization at the protein level. J. Mol. Evol., 59:400-415.

   Restriction enzymes:
      Roberts, R.J., Vincze, T., Posfai, J. and Macelis, D. (2005) REBASE - restriction enzymes and DNA methyltransferases. Nucleic Acids Res., 33:D230-D232.

Examples

See an example of a DNA sequence from Bacteroides thetaiotaomicron
See an example of a Protein sequence from Bacteroides thetaiotaomicron

Allowed codon data formats:

Codon Usage DB:

fields: [triplet] [frequency: per thousand] ([number])

Examples:

1) UUU 17.4(586747) UCU 15.0(507382) ....
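
As an illustration of this format, the following Python sketch reads entries of the form "triplet frequency(count)". The regular expression and the function name are assumptions made for this example and are not part of OPTIMIZER.

```python
import re

# One entry looks like "UUU 17.4(586747)": the triplet, its frequency
# per thousand, and the raw codon count in parentheses.
CUDB_ENTRY = re.compile(
    r"([ACGTU]{3})\s+(\d+(?:\.\d+)?)\s*\(\s*(\d+)\s*\)", re.IGNORECASE
)


def parse_codon_usage_db(text):
    """Return {triplet: (frequency_per_thousand, count)}."""
    table = {}
    for triplet, freq, count in CUDB_ENTRY.findall(text):
        table[triplet.upper()] = (float(freq), int(count))
    return table
```

Applied to the example line above, the function yields one entry for UUU and one for UCU; anything that does not match the pattern, such as the trailing dots, is ignored.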

OPTIMIZER (RSCU & Codon Usage):

fields: [triplet] [-> | - | :] [number] [,| ;]

Examples:

1) UUU: 17.4; UCU: 15.0 ....

2) UUU- 17.4; UCU- 15.0 ....

3) UUU-> 17.4; UCU-> 15.0 ....

4) UUU: 17.4, UCU: 15.0 ....

5) UUU- 17.4, UCU- 15.0 ....

6) UUU-> 17.4, UCU-> 15.0 ....
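
All six variants share the same structure, so a single parser can accept them. The Python sketch below is one possible reading of the field description (triplet, one of "->", "-" or ":", a number, then "," or ";"); the names are hypothetical and the code is not taken from OPTIMIZER.

```python
import re

# Triplet, then "->", "-" or ":", then a number. The two-character
# separator "->" is listed first so that it is tried before "-".
OPTIMIZER_ENTRY = re.compile(
    r"([ACGTU]{3})\s*(?:->|-|:)\s*(\d+(?:\.\d+)?)", re.IGNORECASE
)


def parse_optimizer_table(text):
    """Return {triplet: value} from a ','- or ';'-separated table."""
    table = {}
    for entry in re.split(r"[,;]", text):
        match = OPTIMIZER_ENTRY.search(entry)
        if match:
            table[match.group(1).upper()] = float(match.group(2))
    return table
```

Each of the six example lines above parses to the same table: UUU mapped to 17.4 and UCU mapped to 15.0.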

OPTIMIZER (Wi):

fields: [triplet] [-> | - | :] [relative codon weight] [,| ;]

Examples:

1) UUU-> 1.0, UCU-> 0.925 ....

2) UUU- 1.0; UCU- 0.925 ....

3) UUU-> 1.0; UCU-> 0.925 ....

4) UUU: 1.0, UCU: 0.925 ....
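
The relative codon weights (Wi) are the values from which the Codon Adaptation Index of Sharp and Li (1987) is computed: the CAI of a gene is the geometric mean of the weights of its codons. A minimal Python sketch, assuming weights keyed by RNA triplets as in the examples above. The function name and the decision to skip codons that are missing from the table or have zero weight are assumptions of this example, not a description of how OPTIMIZER computes the index.

```python
import math


def cai(dna, weights):
    """Geometric mean of the relative weights of the codons in `dna`."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete last codon
    logs = []
    for i in range(0, usable, 3):
        # The weight table uses RNA triplets, so convert T to U.
        codon = dna[i:i + 3].upper().replace("T", "U")
        w = weights.get(codon, 0.0)
        if w > 0:  # assumption: skip codons without a usable weight
            logs.append(math.log(w))
    if not logs:
        raise ValueError("no codon of the sequence has a weight")
    return math.exp(sum(logs) / len(logs))
```

For a two-codon sequence TTT TCT and the weights UUU = 1.0 and UCU = 0.925, the result is the square root of 0.925.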

tRNAs:

fields: [anticodon] ([number of tRNA copies])

Examples:

      AGC (0) GGC (2) CGC (0) TGC (3)
      ACC (0) GCC (4) CCC (1) TCC (1)
      AGG (0) GGG (1) CGG (1) TGG (1)
      AGT (0) GGT (2) CGT (1) TGT (1)
      AAC (0) GAC (2) CAC (1) TAC (5)
      AGA (0) GGA (2) CGA (1) TGA (1) ACT (0) GCT (1)
      ACG (4) GCG (0) CCG (1) TCG (0) CCT (1) TCT (1)
      AAG (0) GAG (1) CAG (4) TAG (1) CAA (1) TAA (1)
      AAA (0) GAA (2)
      ATT (0) GTT (4)
      CTT (0) TTT (6)
      ATC (0) GTC (3)
      CTC (0) TTC (4)
      ATG (0) GTG (1)
      CTG (2) TTG (2)
      AAT (0) GAT (3) TAT (0)
      CAT (8)
      ATA (0) GTA (3)
      ACA (0) GCA (1)
      CCA (1)
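
A table in this format can be read with a short Python sketch such as the one below. The function names are hypothetical. The helper anticodon_to_codon assumes exact Watson-Crick pairing, so the codon is simply the reverse complement of the anticodon; wobble pairing is ignored, and how OPTIMIZER itself assigns tRNA copy numbers to codons is not stated here.

```python
import re

# One entry looks like "GGC (2)": the anticodon and, in parentheses,
# the number of tRNA gene copies. The space before "(" is optional.
TRNA_ENTRY = re.compile(r"([ACGT]{3})\s*\(\s*(\d+)\s*\)", re.IGNORECASE)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_trna_table(text):
    """Return {anticodon: number_of_tRNA_gene_copies}."""
    return {a.upper(): int(n) for a, n in TRNA_ENTRY.findall(text)}


def anticodon_to_codon(anticodon):
    """Codon matched by exact pairing: the reverse complement."""
    return anticodon.upper().translate(COMPLEMENT)[::-1]
```

For example, the anticodon CAT, listed above with 8 copies, corresponds to the codon ATG.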

Abbreviations

   CAI: Codon Adaptation Index.
   ENc: Effective Number of Codons.
   %GC: G+C percentage
   %AT: A+T percentage
    | : Unchanged nucleotide
    * : Transversion change (Purines <-> Pyrimidines)
    # : Transition change (Purine <-> Purine / Pyrimidine <-> Pyrimidine)
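
The three alignment symbols and the %GC value can be reproduced with a few lines of Python. This is a sketch under the definitions given above, with hypothetical function names; it compares a query and an optimized sequence of equal length position by position.

```python
PURINES = {"A", "G"}


def change_marker(a, b):
    """Return '|', '#' or '*' for one pair of aligned nucleotides."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "|"  # unchanged nucleotide
    if (a in PURINES) == (b in PURINES):
        return "#"  # transition: both purines or both pyrimidines
    return "*"      # transversion: purine <-> pyrimidine


def marker_line(query, optimized):
    """Build the marker line for two aligned sequences."""
    if len(query) != len(optimized):
        raise ValueError("sequences must have the same length")
    return "".join(change_marker(a, b) for a, b in zip(query, optimized))


def gc_percent(seq):
    """G+C percentage of a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
```

For the query TTTAAG and the optimized sequence TTCAAA, marker_line returns "||#||#": both changes (T to C and G to A) are transitions.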

Contact us

   If you want to optimize a sequence using a genome that is not available in OPTIMIZER, we recommend:

       a) Sequence from a complete genome. Send us an e-mail (santi.garcia-vallve AT urv.cat). We will evaluate your suggestion and update our database as soon as possible.

       b) Sequence from an incomplete genome. Use a closely related genome to optimize your sequence.

This application has been funded by the following project:

"Plan Nacional de I+D+I (2003), Ministerio de Ciencia y Tecnologia" (Ref. BIO2003-07672).
