Nomenclature and Symbolism for Amino Acids and Peptides

3AA-3 to 3AA-5

Continued from 3AA-1 and 3AA-2

Contents of 3AA-3 to 3AA-5

3AA-3 Configuration at the α-Carbon Atom

3AA-4 Configuration at Centres other than the α-Carbon Atom

3AA-5 Optical Rotation

References for 3AA-3 to 3AA-5

Continued in 3AA-6 to 3AA-10


3AA-3. CONFIGURATION AT THE α-CARBON ATOM

3AA-3.1. Use of D and L

The absolute configuration at the α-carbon atom of the α-amino acids is designated by the prefixed small capital letter D or L to indicate a formal relationship to D- or L-serine and thus to D- or L-glyceraldehyde. The prefix ξ (Greek xi) indicates unknown configuration.

The structures of amino acids may be drawn to show configuration in several ways [13]. In the Fischer-Rosanoff convention each chiral centre is projected onto the plane of the paper in the orientation such that the central atom appears as the point of intersection of two straight lines joining the attached groups in pairs, so that one straight line (which should be vertical) joins three atoms of the principal chain. The central atom is then considered to lie in the plane of the paper, the other atoms of the principal chain behind the plane from the viewer, and the remaining two groups in front of this plane. Thus an L-α-amino acid may be represented as

The relationship between serine and glyceraldehyde may therefore be represented as:

3AA-3.2. Position of Prefix

In naming α-amino acids as derivatives of substances that have well-known trivial names, the prefix L or D is placed immediately before the trivial name of the parent amino acid and set off by a hyphen. Examples: trans-4-hydroxy-L-proline; 3,5-diiodo-L-tyrosine.

Note. Admissible exceptions to this rule are L-hydroxyproline and L-hydroxylysine, but only in general biochemical writing in a context such that the position of substitution is well understood. Note further that in the names of optically active derivatives of glycine, such as L-2-phenylglycine, the prefix must be placed before the name of the substituent as glycine itself is achiral.

In the names of salts, esters and other derivatives, including peptides, the prefix is placed immediately before the trivial name of the parent acid or its radical. Examples: L-histidine monohydrochloride monohydrate; copper(II) L-aspartate; D-lysine dihydrochloride; N-acetyl-L-tryptophan; diethyl D-glutamate; N6-methyl-L-lysine.
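The placement rule for substituted amino acids can be sketched in a few lines of Python. This is an illustration only: the function name place_prefix and its arguments are hypothetical, plain capitals stand in for the small capitals, and salts, esters and peptides are not handled.

```python
def place_prefix(substituent: str, config: str, parent: str) -> str:
    """Compose a name with the D/L prefix positioned as in 3AA-3.2.

    substituent: e.g. "trans-4-hydroxy", "3,5-diiodo", "2-phenyl" (may be empty)
    config:      configurational prefix, e.g. "D" or "L"
    parent:      trivial name of the parent amino acid, e.g. "proline"
    """
    if not substituent:
        return f"{config}-{parent}"
    if parent == "glycine":
        # Glycine itself is achiral, so the prefix precedes the substituent.
        return f"{config}-{substituent}{parent}"
    # Otherwise the prefix goes immediately before the parent trivial name.
    return f"{substituent}-{config}-{parent}"


print(place_prefix("trans-4-hydroxy", "L", "proline"))
print(place_prefix("3,5-diiodo", "L", "tyrosine"))
print(place_prefix("2-phenyl", "L", "glycine"))
```

The three calls reproduce the example names given above for hydroxyproline, diiodotyrosine and phenylglycine.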

Other semisystematic names involving α-amino-acid configurations are treated similarly. Example: S-(D-2-amino-2-carboxyethyl)-D-homocysteine, or S-(D-alanin-3-yl)-D-homocysteine (3AA-8), i.e. D-cystathionine.

3AA-3.3. Omission of Prefix

The prefix may be omitted where the amino acid is stated to be or is obviously derived from a protein source and is therefore assumed to be L. It may also be omitted where the amino acid is synthetic and not resolved and is therefore, save in exceptional cases, an equimolecular mixture of the enantiomers. Likewise it may be omitted in a general statement that is true for either enantiomer or for any mixture of these.

3AA-3.4. Subscripts to D and L

Where confusion is possible between the use of the small capital letter prefix for the configuration of the α-carbon atom in amino-acid nomenclature and for that of the highest numbered chiral carbon atom in carbohydrate nomenclature [17], a subscript (lower case Roman letter) is added to the small capital letter prefix. If the prefix is used in the amino-acid sense, the subscript is s (for serine); if the prefix is used in the carbohydrate sense, the subscript is g (for glyceraldehyde).

Examples: Ls-threonine, for which the synonym in carbohydrate nomenclature is 2-amino-2,4-dideoxy-Dg-threonic acid; Ds-threonine, for which the synonym is 2-amino-2,4-dideoxy-Lg-threonic acid; Ls-allothreonine, for which the synonym is 2-amino-2,4-dideoxy-Lg-erythronic acid; Ds-allothreonine, for which the synonym is 2-amino-2,4-dideoxy-Dg-erythronic acid.

Note that the subscripts are essential only in discussions where both amino-acid names and those of carbohydrate derivatives occur. Nevertheless, these subscripts are highly desirable if D or L is used in naming α-amino acids that possess more than one centre of chirality (see 3AA-4).
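The four threonine examples show why the subscripts matter: the same compound can carry a different letter in each convention. The small Python table below restates those examples; the data structure and function names are illustrative assumptions, and the subscripts are written inline as s and g.

```python
# (amino-acid letter, trivial name) -> (carbohydrate letter, aldonic acid stem),
# transcribed from the examples in 3AA-3.4.
SYNONYMS = {
    ("L", "threonine"): ("D", "threonic"),
    ("D", "threonine"): ("L", "threonic"),
    ("L", "allothreonine"): ("L", "erythronic"),
    ("D", "allothreonine"): ("D", "erythronic"),
}


def amino_acid_name(letter: str, name: str) -> str:
    """Name in the amino-acid sense (subscript s, for serine)."""
    return f"{letter}s-{name}"


def carbohydrate_synonym(letter: str, name: str) -> str:
    """Synonym in the carbohydrate sense (subscript g, for glyceraldehyde)."""
    g_letter, stem = SYNONYMS[(letter, name)]
    return f"2-amino-2,4-dideoxy-{g_letter}g-{stem} acid"


for (letter, name), (g_letter, _) in SYNONYMS.items():
    flag = "letters differ" if letter != g_letter else "letters agree"
    print(amino_acid_name(letter, name), "=", carbohydrate_synonym(letter, name), f"({flag})")
```

For the threonines the two letters differ, whereas for the allothreonines they agree, so an unsubscripted D or L would be ambiguous for the former.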

3AA-3.5. The RS System

A more general system of stereochemical designation, which is especially convenient when there is no simple way of relating a compound to a defined standard, is the RS system of Cahn, Ingold & Prelog [13, 18]. In this system the ligands of a chiral atom are placed in an order of preference, based largely on atomic number. If the first three ligands appear clockwise in this order when viewed from the side remote from the least-preferred (fourth) ligand, the chiral centre is R; if anticlockwise, it is S.
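The clockwise/anticlockwise test can be expressed numerically once the order of preference has been decided. The following Python sketch is an assumption-laden illustration, not part of the recommendations: the function name chirality is hypothetical, the coordinates are taken to be right-handed Cartesian, and the ligands must already be ranked (the sequence rules themselves are not implemented).

```python
def chirality(a, b, c, d):
    """Return 'R' or 'S' from the 3D positions of four ranked ligands.

    a, b, c, d are (x, y, z) tuples in decreasing order of preference,
    so d is the least-preferred (fourth) ligand.
    """
    # Vectors from the least-preferred ligand to the other three.
    u = [a[i] - d[i] for i in range(3)]
    v = [b[i] - d[i] for i in range(3)]
    w = [c[i] - d[i] for i in range(3)]
    # Scalar triple product u . (v x w), i.e. a signed volume.
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    volume = sum(u[i] * cross[i] for i in range(3))
    if volume == 0:
        raise ValueError("ligands are coplanar; no sense of chirality")
    # In right-handed coordinates, a negative volume means a -> b -> c runs
    # clockwise when viewed from the side remote from d.
    return "R" if volume < 0 else "S"


# Fourth ligand points away from a viewer on the +z axis; a, b, c sit at
# 12, 4 and 8 o'clock, i.e. clockwise as seen by that viewer.
a, b, c, d = (0, 1, 0), (0.866, -0.5, 0), (-0.866, -0.5, 0), (0, 0, -1)
print(chirality(a, b, c, d))  # clockwise arrangement
print(chirality(a, c, b, d))  # same geometry with two ligands exchanged
```

Exchanging any two ligands reverses the sign of the volume and therefore the descriptor.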

The L-configuration, possessed by the chiral α-amino acids found in proteins, nearly always corresponds to S in the RS system. The most important exceptions are L-cysteine and L-cystine (see Appendix), which are R. In most amino acids the order of preference of the groups around C-2 is NH3+, COO-, R, H, but in cysteine and cystine the group R takes precedence over carboxylate because the atomic number of sulfur attached to C-3 is higher than that of oxygen attached to C-1.
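The cysteine exception follows from comparing the atoms attached to C-3 with those attached to C-1. The Python sketch below is a simplified illustration under stated assumptions: it only handles L-α-amino acids whose side chain starts at a carbon atom (C-3), it decides the ranking from the atoms directly attached to C-3 and C-1 alone, and the function names are hypothetical. The carboxylate carbon is treated as bearing (O, O, O), the doubly bonded oxygen being counted twice as the sequence rules prescribe.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "S": 16, "Se": 34}

# Atoms attached to C-1 of the carboxylate (double bond counted twice).
CARBOXYLATE = ("O", "O", "O")


def rank_key(atoms):
    """Atomic numbers of the attached atoms, highest first."""
    return sorted((ATOMIC_NUMBER[atom] for atom in atoms), reverse=True)


def rs_of_l_amino_acid(c3_atoms):
    """Descriptor at C-2 of an L-alpha-amino acid.

    c3_atoms: the three atoms attached to C-3 other than C-2,
              e.g. ("S", "H", "H") for cysteine.
    """
    # Lists compare element by element, so the first point of difference decides.
    side_chain_outranks_carboxylate = rank_key(c3_atoms) > rank_key(CARBOXYLATE)
    # The usual order N, COO-, R, H gives S; exchanging COO- and R gives R.
    return "R" if side_chain_outranks_carboxylate else "S"


print("alanine ", rs_of_l_amino_acid(("H", "H", "H")))
print("serine  ", rs_of_l_amino_acid(("O", "H", "H")))
print("cysteine", rs_of_l_amino_acid(("S", "H", "H")))
```

Serine shows why the comparison must go beyond the first atom: its C-3 also carries oxygen, but the set (O, H, H) still ranks below (O, O, O), so L-serine remains S.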

3AA-3.6. Amino Acids Derived from Amino Sugars

Amino acids that are derived from amino sugars and contain five or more carbon atoms are named in conformity with the system of carbohydrate nomenclature [17] or with a recommended trivial name.

Examples: (1) Dg-glucosaminic acid for 2-amino-2-deoxy-Dg-gluconic acid, the α-carbon of which has the configuration of that in D-serine, and in which C-5, the highest numbered chiral centre, also has the D-configuration; (2) Dg-mannosaminic acid for 2-amino-2-deoxy-Dg-mannonic acid, the α-carbon of which has the configuration of that in L-serine, but in which C-5 has the D-configuration. The subscript g may be omitted unless confusion with the amino-acid use of the designations D and L is likely.

3AA-3.7. Use of meso

The prefix meso-, in lower case italic letters, is used to denote those amino acids or derivatives that, although they contain chiral groups, are achiral, usually because of a plane of symmetry, e.g. meso-lanthionine.

3AA-3.8. Use of DL

A mixture of equimolar amounts of D and L compounds is termed racemic and is designated by the prefix DL (no comma), e.g. DL-leucine. It may alternatively be designated by the prefix rac- (e.g. rac-leucine) or by the prefix (±)- (see 3AA-5).

3AA-4. CONFIGURATION AT CHIRAL CENTRES OTHER THAN THE α-CARBON

3AA-4.1. The Sequence Rule

The RS system (3AA-3.5) is preferred for designating configuration at centres other than α-C, e.g. (2S,3R)-threonine. To avoid using two different systems of designation in the same name, (2S,4S)-4-hydroxyproline may be used instead of (4S)-4-hydroxy-L-proline.

3AA-4.2. Carbohydrate Prefixes

The use of carbohydrate prefixes (e.g., L-erythro) cited in the 1974 version of these recommendations [6] as an alternative system for α-amino acids having two or more chiral centres is now discouraged.

3AA-4.3. Use of cis and trans

The amino acids 4-hydroxy-L-proline and 3-hydroxy-L-proline and analogous substituted prolines may also be named as follows (cf. 3AA-3.2).

The prefixes cis and trans refer to the relative positions of the hydroxyl and carboxyl groups in each compound.

Comment. The hydroxyprolines found in collagen are trans-4-hydroxyproline (predominantly) and trans-3-hydroxyproline. The prefixes may be omitted when no ambiguity arises (cf. 3AA-3.3).

3AA-4.4. Use of 'allo'

Amino acids with two chiral centres were named in the past by allotting a name to the first diastereoisomer to be discovered. The second diastereoisomer, when found or synthesized, was then assigned the same name but with the prefix allo-. This method can be used only with trivial names (see 2.1) but not with semisystematic or systematic names. It is now recommended that allo should be used only for alloisoleucine and allothreonine, as follows:

3AA-4.5. Designation of Centres with Unknown Configurations

When absolute or relative configurations at one or more centres are not known, such designations as 'isomer A' and 'isomer B' are frequently employed until the full configurational relationships are established.

If the configuration is known at one centre but not at a second, the RS system is used for the known centre, with a Greek xi (ξ), meaning 'unknown configuration', for the other, e.g. (2S,5ξ)-2-amino-5-hydroxyhexanoic acid (a single stereoisomer). If the configuration at two centres is unknown, the ξ may be used as in the example (2ξ,5ξ)-2-amino-5-hydroxyhexanoic acid. If a racemate is to be designated, this is done by reference to its optical activity (3AA-5), e.g. (±)-(2ξ,5ξ)-2-amino-5-hydroxyhexanoic acid. If the relative configuration of two centres is known, but the absolute is unknown, 'R*' and 'S*' may be used, e.g. (2R*,5S*)-2-amino-5-hydroxyhexanoic acid.
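The four cases above differ only in how the stereodescriptor prefix is assembled. A minimal Python sketch follows; the function name stereo_prefix and its parameters are hypothetical, introduced purely for illustration.

```python
def stereo_prefix(centres, relative=False, racemic=False):
    """Build a stereodescriptor prefix in the style of 3AA-4.5.

    centres:  dict mapping locant -> "R", "S", or None (configuration unknown)
    relative: True if only the relative configuration is known (adds '*')
    racemic:  True to mark a racemate with a leading (±)-
    """
    parts = []
    for locant in sorted(centres):
        descriptor = centres[locant]
        if descriptor is None:
            parts.append(f"{locant}ξ")  # xi marks an unknown configuration
        else:
            parts.append(f"{locant}{descriptor}{'*' if relative else ''}")
    prefix = "(" + ",".join(parts) + ")"
    return "(±)-" + prefix if racemic else prefix


parent = "2-amino-5-hydroxyhexanoic acid"
print(stereo_prefix({2: "S", 5: None}) + "-" + parent)
print(stereo_prefix({2: None, 5: None}) + "-" + parent)
print(stereo_prefix({2: None, 5: None}, racemic=True) + "-" + parent)
print(stereo_prefix({2: "R", 5: "S"}, relative=True) + "-" + parent)
```

The four printed names correspond, in order, to the four examples in the paragraph above.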

3AA-4.6. Other Stereochemical Features

When other stereochemical elements are encountered, such as E/Z double-bond isomers, they are described according to the provisions of Section E of the IUPAC rules for organic nomenclature [13].

3AA-5. OPTICAL ROTATION

If it is desired to indicate the direction of rotation of plane-polarized light of specified wavelength in a specified solvent, this can be done with a 'plus' or 'minus' sign in parentheses (E-4.4 of reference [13]), e.g. (+)-6-hydroxytryptophan. This may be particularly useful if the configuration at C-2 is not known, but it may also be done for emphasis, with or without a configurational symbol D or L, when this configuration is known, e.g. (+)-glutamic acid, or (+)-L-glutamic acid. A racemic amino acid (3AA-3.8) may be indicated by (±), e.g. (±)-leucine.


References

6. IUPAC Commission on the Nomenclature of Organic Chemistry (CNOC) and IUPAC-IUB Commission on Biochemical Nomenclature (CBN), Nomenclature of α-Amino Acids, Recommendations 1974, Biochem. J. 149, 1-16 (1975); Biochemistry, 14, 449-462 (1975); Eur. J. Biochem. 53, 1-14 (1975); also pp. 64-77 in [7].

7. International Union of Biochemistry (1978) Biochemical Nomenclature and Related Documents, The Biochemical Society, London.

13. IUPAC Commission on Nomenclature of Organic Chemistry (CNOC), Nomenclature of Organic Chemistry, Section E: Stereochemistry, Recommendations 1974, Pure Appl. Chem. 45, 11-30 (1976); also pp. 1-18 in [7] and pp. 473-490 in [14]. [See also Biochemical Nomenclature and Related Documents, 2nd edition, Portland Press, 1992, pages 1-18.]

14. International Union of Pure and Applied Chemistry (1979) Nomenclature of Organic Chemistry, Sections A, B, C, D, E, F and H, Pergamon Press, Oxford.

17. IUPAC Commission on the Nomenclature of Organic Chemistry (CNOC) and IUPAC-IUB Commission on Biochemical Nomenclature (CBN), Tentative Rules for Carbohydrate Nomenclature, Part 1, 1969, Biochem. J. 125, 673-695 (1971); Biochemistry 10, 3985-4004 & 4995 (1971); Biochim. Biophys. Acta, 244, 223-302 (1971); Eur. J. Biochem. 21, 455-477 (1971), corrected 25, 4 (1972); J. Biol. Chem. 247, 613-635 (1972); also pp. 174-195 in [7]. [Revised version now available.]

18. Cahn, R. S., Ingold, C. K. & Prelog, V. (1966) Angew. Chem. 78, 413-447 (in German); Angew. Chem. Int. Ed. Engl. 5, 385-415, errata 511.

