Nomenclature and Symbolism for Amino Acids and Peptides

3AA-21.2

3AA-21.2. The Code Symbols

The symbols are listed, in alphabetical order of amino-acid names, in Table 1. Table 5 gives them in alphabetical order of symbols.

Table 5. The One-Letter Symbols

 
 One-letter  Three-letter  Amino acid  
 symbol      symbol
   A           Ala           alanine           
   B           Asx           aspartic acid or asparagine           
   C           Cys           cysteine           
   D           Asp           aspartic acid           
   E           Glu           glutamic acid           
   F           Phe           phenylalanine           
   G           Gly           glycine           
   H           His           histidine           
   I           Ile           isoleucine           
   K           Lys           lysine           
   L           Leu           leucine           
   M           Met           methionine           
   N           Asn           asparagine           
   P           Pro           proline           
   Q           Gln           glutamine           
   R           Arg           arginine           
   S           Ser           serine           
   T           Thr           threonine           
   V           Val           valine           
   W           Trp           tryptophan           
   X **        Xaa           unknown or 'other' amino acid           
   Y           Tyr           tyrosine           
   Z           Glx           glutamic acid or glutamine (or substances such as
                             4-carboxyglutamic acid and 5-oxoproline that
                             yield glutamic acid on acid hydrolysis of 
                             peptides)                    
** See the Addendum for an alternative use of X.
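As an illustration only, Table 5 can be written as a lookup table for converting a one-letter sequence to three-letter symbols. This is a minimal sketch: the names ONE_TO_THREE and to_three_letter are hypothetical, and joining the three-letter symbols with a hyphen is an assumption made for display.

```python
# Table 5 as a dictionary: one-letter symbol -> three-letter symbol.
ONE_TO_THREE = {
    "A": "Ala", "B": "Asx", "C": "Cys", "D": "Asp", "E": "Glu",
    "F": "Phe", "G": "Gly", "H": "His", "I": "Ile", "K": "Lys",
    "L": "Leu", "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln",
    "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp",
    "X": "Xaa", "Y": "Tyr", "Z": "Glx",
}


def to_three_letter(sequence, sep="-"):
    """Convert a one-letter sequence to three-letter symbols.

    The separator is an assumption of this sketch, not part of Table 5.
    """
    symbols = []
    for position, letter in enumerate(sequence.upper(), start=1):
        if letter not in ONE_TO_THREE:
            # J, O and U (and any non-letter) have no entry in Table 5.
            raise ValueError(
                f"{letter!r} at position {position} is not a Table 5 symbol"
            )
        symbols.append(ONE_TO_THREE[letter])
    return sep.join(symbols)


print(to_three_letter("GAVL"))  # Gly-Ala-Val-Leu
```

Rejecting unlisted letters, rather than silently mapping them to Xaa, is a design choice of this sketch; X itself is still accepted because it appears in the table.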

Note on the Choice of Symbols

Initial letters of the names of the amino acids were chosen where there was no ambiguity. There are six such cases: cysteine, histidine, isoleucine, methionine, serine and valine. All the other amino acids share the initial letters A, G, L, P or T, so arbitrary assignments were made. These letters were assigned to the most frequently occurring and structurally simplest of the amino acids with these initials: alanine (A), glycine (G), leucine (L), proline (P) and threonine (T).

Other assignments were made on the basis of associations that might be helpful in remembering the code, e.g. the phonetic associations of F for phenylalanine and R for arginine. For tryptophan the double ring of the molecule is associated with the bulky letter W. The letters N and Q were assigned to asparagine and glutamine respectively; D and E to aspartic and glutamic acids respectively. K and Y were chosen for the two remaining amino acids, lysine and tyrosine, because, of the few remaining letters, they were close alphabetically to the initial letters of the names. U and O were avoided because U is easily confused with V in handwritten material, and O with G, Q, C and D in imperfect computer print-outs, and also with zero. J was avoided because it is absent from several languages.
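The statement that J, O and U are left unassigned can be checked against the 23 symbols of Table 5. The following is a small sketch; the name TABLE_5_SYMBOLS is hypothetical and simply lists the one-letter symbols from the table.

```python
from string import ascii_uppercase

# The 23 one-letter symbols listed in Table 5, in alphabetical order.
TABLE_5_SYMBOLS = "ABCDEFGHIKLMNPQRSTVWXYZ"

# Letters of the alphabet that carry no assignment in Table 5.
unused = sorted(set(ascii_uppercase) - set(TABLE_5_SYMBOLS))
print(unused)  # ['J', 'O', 'U']
```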

Two other symbols are often necessary in partly determined sequences, so B was assigned to aspartic acid or asparagine when these have not been distinguished; Z was similarly assigned to glutamic acid or glutamine. X means that the identity of an amino acid is undetermined, or that the amino acid is atypical. See the Addendum for an alternative use of X.
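The meaning of B and Z in a partly determined sequence can be illustrated by listing the fully specified sequences that such a sequence is consistent with. This is a sketch under stated assumptions: the names AMBIGUOUS and expand_ambiguous are hypothetical, X is left unexpanded because it stands for an undetermined or atypical residue, and the other substances that Table 5 groups under Z (such as 4-carboxyglutamic acid and 5-oxoproline) are ignored because they have no one-letter symbol.

```python
from itertools import product

# B and Z each stand for one of two residues that were not distinguished.
AMBIGUOUS = {"B": "DN", "Z": "EQ"}


def expand_ambiguous(sequence):
    """List every sequence obtained by resolving each B to D or N
    and each Z to E or Q. All other letters, including X, are kept as-is."""
    choices = [AMBIGUOUS.get(letter, letter) for letter in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(expand_ambiguous("ABZ"))  # ['ADE', 'ADQ', 'ANE', 'ANQ']
```

The number of candidates doubles with each B or Z, so full enumeration is only practical for sequences with a handful of ambiguous positions.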

