3AA-11 to 3AA-13

Nomenclature and Symbolism for Amino Acids and Peptides

Continued from 3AA-6 to 3AA-10

Contents of 3AA-11 to 3AA-13

3AA-11 Definitions of Peptides

3AA-12 Amino-Acid Residues

12.1 Definitions of residues
12.2 Ionized forms of residues

3AA-13 The Naming of Peptides

13.1 Construction of names
13.2 Use of prefixes in peptide names
13.3 Names of simple polymers of amino acids
13.4 Numbering of peptide atoms
13.5 Prefixes formed from peptide names
13.6 Conformations of polypeptide chains

References for 3AA-11 to 3AA-13

Continued in 3AA-14 to 3AA-16

Part 1, Section C: PEPTIDE NOMENCLATURE

3AA-11. DEFINITION OF PEPTIDES

A peptide is any compound produced by amide formation between a carboxyl group of one amino acid and an amino group of another. The amide bonds in peptides may be called peptide bonds. The word peptide usually applies to compounds whose amide bonds are formed between C-1 of one amino acid and N-2 of another (sometimes called eupeptide bonds), but it includes compounds with residues linked by other amide bonds (sometimes called isopeptide bonds). Peptides with fewer than about 10-20 residues may also be called oligopeptides; those with more, polypeptides. Polypeptides of specific sequence of more than about 50 residues are usually known as proteins, but authors differ greatly on where they start using this term.

3AA-12. AMINO-ACID RESIDUES

3AA-12.1. Definitions of Residues

When two or more amino acids combine to form a peptide, the elements of water are removed, and what remains of each amino acid is called an amino-acid residue. α-Amino-acid residues are therefore structures that lack a hydrogen atom of the amino group (-NH-CHR-COOH), or the hydroxyl moiety of the carboxyl group (NH₂-CHR-CO-), or both (-NH-CHR-CO-); all units of a peptide chain are therefore amino-acid residues. (Residues of amino acids that contain two amino groups or two carboxyl groups may be joined by isopeptide bonds, and so may not have the formulas shown.)

The residue in a peptide that has an amino group that is free, or at least not acylated by another amino-acid residue (it may, for example, be acetylated or formylated), is called N-terminal; it is at the N-terminus. The residue that has a free carboxyl group, or at least does not acylate another amino-acid residue (it may, for example, acylate ammonia to give -NH-CHR-CO-NH₂), is called C-terminal. If the amino group of the N-terminal residue is free, the residue may be named as an acyl group under 3AA-9.3; indeed any internal residue is an N-substituted amino-acyl group.

Residues are named from the trivial name of the amino acid, omitting the word 'acid' from aspartic acid and glutamic acid. Examples: glycine residue, lysine residue, glutamic residue.

3AA-12.2. Ionized Forms of Residues

When it is desirable to mention or emphasize the particular ionic form of a residue, this may be done as follows:

Name of residue      Protonated form            Deprotonated form
arginine residue     argininium residue         arginine (base) residue
histidine residue    histidinium residue        histidine (base) residue
lysine residue       lysinium residue           lysine (base) residue
aspartic residue     aspartic (acid) residue    aspartate residue
cysteine residue     cysteine (acid) residue    cysteinate residue
glutamic residue     glutamic (acid) residue    glutamate residue
tyrosine residue     tyrosine (acid) residue    tyrosinate residue

This system cannot easily be applied to N- or C-terminal residues.

3AA-13. THE NAMING OF PEPTIDES

3AA-13.1. Construction of Names

To name peptides, the names of acyl groups ending in 'yl' (3AA-9.3) are used. Thus if the amino acids glycine, NH₃⁺-CH₂-COO^-, and alanine, NH₃⁺-CH(CH₃)-COO^-, condense so that glycine acylates alanine, the dipeptide formed, NH₃⁺-CH₂-CO-NH-CH(CH₃)-COO^-, is named glycylalanine. If they condense in the reverse order, the product, NH₃⁺-CH(CH₃)-CO-NH-CH₂-COO^-, is named alanylglycine. Higher peptides are named similarly, e.g. alanylleucyltryptophan. Thus the name of the peptide begins with the name of the acyl group representing the N-terminal residue, and this is followed in order by the names of the acyl groups representing the internal residues. Only the C-terminal residue is represented by the name of the amino acid, and this ends the name of the peptide. Formulas should normally be written in the same order, with the N-terminal residue on the left, and the C-terminal on the right, e.g.

NH₂-CH(COOH)-[CH₂]₂-CO-NH-CH(CH₂SH)-CO-NH-CH₂-COOH
glutathione, γ-glutamylcysteinylglycine
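The construction rule of 3AA-13.1 can be illustrated with a minimal Python sketch. The function name `peptide_name` and the `ACYL` table are hypothetical names introduced here for illustration; the table contains only acyl-group names that appear in this section's examples and is not a complete list. The sketch assumes ordinary peptide (eupeptide) bonds, so it does not handle names such as γ-glutamylcysteinylglycine.

```python
# Acyl-group names ('yl' forms) taken from examples in this section only;
# this is an illustrative, incomplete table, not a full list from 3AA-9.3.
ACYL = {
    "glycine": "glycyl",
    "alanine": "alanyl",
    "leucine": "leucyl",
    "phenylalanine": "phenylalanyl",
    "serine": "seryl",
    "threonine": "threonyl",
    "cysteine": "cysteinyl",
    "aspartic acid": "aspartyl",
    "glutamic acid": "glutamyl",
}


def peptide_name(residues):
    """Build a peptide name from amino-acid names listed N-terminal first.

    Every residue except the last is given as its acyl-group name; the
    C-terminal residue keeps the name of the amino acid and ends the name.
    """
    if len(residues) < 2:
        raise ValueError("a peptide needs at least two residues")
    *acylating, c_terminal = residues
    return "".join(ACYL[name] for name in acylating) + c_terminal


print(peptide_name(["glycine", "alanine"]))                # glycylalanine
print(peptide_name(["alanine", "glycine"]))                # alanylglycine
print(peptide_name(["alanine", "leucine", "tryptophan"]))  # alanylleucyltryptophan
```

Reversing the input order changes the name, which mirrors the distinction between glycylalanine and alanylglycine drawn above.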

A multiplicative affix (p. 5 of reference [14]) placed before 'peptide' gives the total number of residues in the peptide, e.g. hexapeptide. Since the higher affixes are not well known, they may be replaced by numerals, e.g. a 22-peptide.
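As a rough illustration of the choice between a multiplicative affix and a numeral, the following Python sketch forms the size term for a peptide of n residues. The function name `size_term` is hypothetical, and the cut-off at ten residues is an assumption made for this sketch; the text says only that the higher affixes are not well known.

```python
# Standard multiplicative affixes for 2 to 10. Treating affixes above
# 'deca' as "not well known" is an assumption made for this sketch.
AFFIX = {
    2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa",
    7: "hepta", 8: "octa", 9: "nona", 10: "deca",
}


def size_term(n):
    """Return e.g. 'hexapeptide' for n = 6 or '22-peptide' for n = 22."""
    if n < 2:
        raise ValueError("a peptide has at least two residues")
    if n in AFFIX:
        return AFFIX[n] + "peptide"
    # Fall back to a numeral where the affix would be unfamiliar.
    return f"{n}-peptide"


print(size_term(6))   # hexapeptide
print(size_term(22))  # 22-peptide
```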

Higher oligopeptides and polypeptides of biological origin often have trivial names; their sequences are usually described more conveniently by symbols (3AA-14 to 3AA-19 below) than by constructing long names.

3AA-13.2. Use of Prefixes in Peptide Names

Configurational prefixes (3AA-3) are placed immediately before the trivial names of the residues they refer to. The prefixes are set off from the names before and after them with hyphens. Examples: L-alanyl-L-leucine; L-alanyl-D-leucine; glycyl-L-alanine; L-alanylglycine; L-leucyl-L-phenylalanyl-L-leucylglycine; L-alanylglycyl-L-leucine.
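The hyphenation rule can be sketched in Python as follows. The function name `prefixed_peptide_name` and the input format (a list of prefix and name-fragment pairs, N-terminal first) are assumptions made for illustration. A prefix is set off by hyphens on both sides, while fragments without a prefix are joined directly to what precedes them.

```python
def prefixed_peptide_name(parts):
    """Join (prefix, fragment) pairs into a peptide name.

    prefix is a configurational prefix such as 'L' or 'D', or '' if the
    residue carries none (e.g. glycine); fragment is the acyl-group name,
    or the amino-acid name for the C-terminal residue.
    """
    name = ""
    for prefix, fragment in parts:
        if prefix:
            if name:
                name += "-"       # hyphen between the preceding name and the prefix
            name += prefix + "-"  # hyphen between the prefix and its own residue
        name += fragment
    return name


print(prefixed_peptide_name([("L", "alanyl"), ("D", "leucine")]))
# L-alanyl-D-leucine
print(prefixed_peptide_name([("", "glycyl"), ("L", "alanine")]))
# glycyl-L-alanine
print(prefixed_peptide_name([("L", "alanyl"), ("", "glycyl"), ("L", "leucine")]))
# L-alanylglycyl-L-leucine
```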

The mixture of diastereoisomers formed by condensations between DL-amino acids will contain unspecified proportions of each pair of enantiomers. Names such as DL-alanyl-DL-leucine have been used in the past, but they are misleading because they contradict the accepted meaning of the prefix DL as signifying a racemate; here the racemate of L-alanyl-L-leucine and D-alanyl-D-leucine (which may be designated as rac-L-alanyl-L-leucine) is mixed in unspecified proportions with the racemate of L-alanyl-D-leucine and D-alanyl-L-leucine (which may similarly be designated as rac-L-alanyl-D-leucine). This is better indicated by the name ambo-alanyl-ambo-leucine (item 12c of reference [20]).

A mixture of L-alanyl-L-alanyl-L-alanine and L-alanyl-D-alanyl-L-alanine may likewise be called L-alanyl-ambo-alanyl-L-alanine. [See also 3AA-19.2]

3AA-13.3. Names of Simple Polymers of α-Amino Acids

Simple polymers of amino acids may, if preferred, be named with prefixes to indicate the number of amino-acid residues present, e.g. tetraglycine. Mixtures of polymers with varying numbers of residues may be given names like oligoglycine, polyglycine, poly(L-lysine), etc. [21].

3AA-13.4. Numbering of Peptide Atoms

The atoms of a peptide may need to be numbered as locants for substitution or isotopic replacement. Often no more numbering is required than that of atoms within a residue (see 3AA-2.2), e.g. alanyl-3-chloroalanylalanine. It may sometimes be convenient to indicate substitution of the peptide as a whole. This may be done by adding the residue number, obtained by numbering residues from the N-terminus, after the atom number, and separated from it by a point. The above compound may therefore be called 3.2-chloro(alanylalanylalanine). Thus the atom C-3.2 is C-3 of the second residue of the peptide. Example: Alanylthreonylglycylaspartylglycine 4.4-3.2-lactone for the compound that can be represented (3AA-16, -17 and -19 below) as [structural diagram not reproduced].

Such numbering is especially useful for peptides with trivial names (see 3AA-22.5), e.g. N^5.4-methyloxytocin would indicate a methyl substituent on N-5 of the glutamine residue at position 4 of oxytocin. If the peptide name that follows a substituent indicated in this way is constructed residue by residue, it must be placed in parentheses to show that the numbering applies to the peptide as a whole, rather than to the first residue.
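The whole-peptide locants of 3AA-13.4 have the form atom number, point, residue number, optionally preceded by an element symbol. A minimal Python sketch of how such a locant might be split into its parts is given below. The names `PeptideLocant` and `parse_locant` are hypothetical, and the `^` marker is simply how this plain-text version renders the superscript (as in N^5.4); it is not part of the nomenclature.

```python
import re
from typing import NamedTuple


class PeptideLocant(NamedTuple):
    element: str  # element symbol such as 'N' or 'O'; '' if none is given
    atom: int     # atom number within the residue
    residue: int  # residue number, counted from the N-terminus


# Optional element symbol followed by '^', then atom number, point, residue number.
_LOCANT = re.compile(r"^(?:([A-Z][a-z]?)\^)?(\d+)\.(\d+)$")


def parse_locant(text):
    """Split a locant such as '3.2' or 'N^5.4' into its components."""
    match = _LOCANT.match(text)
    if match is None:
        raise ValueError(f"not an atom.residue locant: {text!r}")
    element, atom, residue = match.groups()
    return PeptideLocant(element or "", int(atom), int(residue))


print(parse_locant("3.2"))    # atom 3 of residue 2
print(parse_locant("N^5.4"))  # N-5 of residue 4
```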

3AA-13.5. Prefixes Formed from Peptide Names

When it is necessary to treat a peptide as a substituent, the point of attachment is specified by the suffix 'yl' (see 3AA-8) with the appropriate locant. If the group formed from the peptide is not the acyl group derived by removing hydroxyl from C-1 of the C-terminal residue, the position at which hydrogen (or hydroxyl from a side-chain carboxyl group) is removed should be indicated by a locant before the 'yl'; if the sequence of the peptide is given in full, it should be placed in parentheses to avoid implying that the group is formed by removing H or OH from the C-terminal residue. Examples:

(1) Leukotriene D, or (7E,9E,11Z,14Z)-(5S,6R)-6-[(cysteinylglycin)-S-yl]-5-hydroxyicosa-7,9,11,14-tetraenoic acid;

(2) Leukotriene C, or (7E,9E,11Z,14Z)-(5S,6R)-6-(glutathion-S-yl)-5-hydroxyicosa-7,9,11,14-tetraenoic acid, or (7E,9E,11Z,14Z)-(5S,6R)-6-[(γ-glutamylcysteinylglycin)-S-yl]-5-hydroxyicosa-7,9,11,14-tetraenoic acid;

(3) (2S)-2-O-[(serylserylserin)-3.2-yl]lactic acid, or (2S)-2-[(serylserylserin)-O^3.2-yl]propanoic acid, or O^3.2-[(1S)-1-carboxyethyl](serylserylserine).

If the locant before 'yl' indicates the carbon of a carboxyl group, the prefix indicates the acyl group formed by removing hydroxyl from this atom. Example: 4-O-[(glutamylglutamylglutamic acid)-5.2-yl]-D-gluconic acid.

3AA-13.6. Conformation of Polypeptide Chains

Abbreviations and symbols for describing the conformation of peptide chains have been published separately [16].

References

7. International Union of Biochemistry (1978) Biochemical Nomenclature and Related Documents, The Biochemical Society, London.

14. International Union of Pure and Applied Chemistry (1979) Nomenclature of Organic Chemistry, Sections A, B, C, D, E, F and H, Pergamon Press, Oxford.

16. IUPAC-IUB Commission on Biochemical Nomenclature (CBN), Abbreviations and Symbols for the Description of the Conformation of Polypeptide Chains, 1969, Arch. Biochem. Biophys. 145, 405-421 (1971); Biochem. J. 121, 577-585 (1971); Biochemistry, 9, 3471-3479 (1970); Biochim. Biophys. Acta, 229, 1-17 (1971); Eur. J. Biochem. 17, 193-201 (1970); J. Biol. Chem. 245, 6489-6497 (1970); Mol. Biol. 7, 289-303 (1973) (in Russian); Pure Appl. Chem. 40, 291-308 (1974); also pp. 94-102 in [7].

20. IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN), Nomenclature of Tocopherols and Related Compounds, Recommendations 1981, Arch. Biochem. Biophys. 218, 347-348 (1982); Eur. J. Biochem. 123, 473-475 (1982); Pure Appl. Chem. 54, 1507-1510 (1982).

21. IUPAC-IUB Commission on Biochemical Nomenclature (CBN), Abbreviated Nomenclature of Synthetic Polypeptides (Polymerized Amino Acids), Recommendations 1971, Arch. Biochem. Biophys. 151, 597-602 (1972); Biochem. J. 127, 753-756 (1972); Biochemistry, 11, 942-944 (1972); Biochim. Biophys. Acta, 278, 211-217 (1972); Eur. J. Biochem. 26, 301-304 (1972); J. Biol. Chem. 247, 323-325 (1972); Mol. Biol. 5, 492-496 (1971) (in Russian); Pure Appl. Chem. 33, 439-444 (1973); also pp. 88-90 in [7].
