Nomenclature Notes

1. What is IUPAC Nomenclature?

No one, except a few pedants, enjoys working on chemical nomenclature. However, accurate and widely-accepted nomenclature is a vital need for communication amongst more than academic chemists. For example, politicians writing treaties and customs officers inspecting trade goods need to know exactly what materials they are dealing with. It is now generally accepted that IUPAC should be responsible for providing this kind of nomenclature for the world to use. IUPAC chemical nomenclature is widely-regarded as the world standard. When a nomenclature question arises, the first reaction is often: what is the name that IUPAC gives? Principles of Chemical Nomenclature is an attempt to show people how to find the name they require, but it also explains the misunderstandings that may arise before such a process is complete.

In the first place, there is no monolithic construction called IUPAC Nomenclature. Nomenclature is a subject that has grown and changed over the years. The first widely accepted systematic nomenclature proposals arose in France amongst Lavoisier and his colleagues in the 1780s, and they were dealing with what today we recognise as inorganic compounds. Internationally-accepted systematic nomenclature for organic compounds may be reckoned to stem from the Geneva Congress of 1892. These nomenclatures attempted, and still attempt, to be systematic, but the systems they use are different. Consequently the methodologies employed in deriving the names of inorganic and organic compounds are generally different. [See box for examples.]

Examples of changes over time are many, and some are cited here. It sometimes takes time for such changes to be assimilated!

—The nitroprusside ion should now be called systematically pentacyanidonitrosylferrate(2–).

—Ferrous, ferric, stannic, and similar forms should be replaced by iron(II), iron(III), tin(IV), and so on.

—Propylene is no longer a recommended name for C3H6, which is now simply propene.

—n-butane should now be called just butane, though isobutane is still allowed.

—Butanol is the name for an alcohol with the OH group bound to an end carbon atom of a linear saturated four-carbon chain, but isobutanol is not an allowed name when the OH group is bound to a terminal carbon of the parent isobutane. Then the molecule should be called 2-methylpropan-1-ol.

In the second place, there are more systems sheltered under the IUPAC umbrella, such as that for polymers, and also systems to deal with newer materials such as organometallic compounds, which may be regarded as falling into more than one category of compound, and may require the use of more than one system, or even a specially-modified system, to name them satisfactorily.

In the third place, studies related to chemistry, such as biochemistry, pharmacy, and cosmetics, have developed their own specific nomenclatures, which may be abbreviated and modified for commercial purposes, and these can also produce unequivocal names, but not names which necessarily convey the complete structural information usually required by IUPAC nomenclatures. However, IUPAC is usually involved jointly with other international bodies in the elaboration of these systems.

{first published as ref 2}

This blogpost provides short notes and briefs about Principles of Chemical Nomenclature – A Guide to IUPAC Recommendations. Each section addresses a specific topic such as systematic nomenclature, the constructing of names, or the use of abbreviations. Nomenclature Notes were first published in Chem Int, from March 2012 to Dec 2013 <https://iupac.org/publications/ci/indexes/nomenclature-notes.html>

The author, Jeffery Leigh, is the editor and contributing author of Principles of Chemical Nomenclature – A Guide to IUPAC Recommendations, 2011 Edition (RSC 2011, ISBN 978-1-84973-007-5) [overview given in ref 1]. He has been an active contributor to IUPAC nomenclature since 1973.

2. On the Various Nomenclature Systems

In the first “Nomenclature Notes” (March-April 2012 CI) [above], it was stated that IUPAC nomenclature is composed of a set of different nomenclatures, which do not necessarily use the same conventions and methods. IUPAC bodies such as the Interdivisional Committee on Terminology, Nomenclature and Symbols (ICTNS) are trying to remove these inconsistencies, but nevertheless, if you are trying to decipher an IUPAC name, or to construct one, it is necessary to realize which type of nomenclature is being used, which means recognizing the type of compound that is being treated. An ideal IUPAC name should be unique, but should also convey the structure of a given compound.

At the lowest level, compositional nomenclature simply lists the constituents of a compound in some prescribed order, generally alphabetical, and says nothing about the structure. It is really another way of presenting an entry in a formula index. Substitutive nomenclature is a principal IUPAC nomenclature system and the preferred method for naming organic compounds. This relies on selecting a suitable unsubstituted parent compound from which the compound at issue can be considered as ultimately derived, and then modifying the parent name in a series of formal operations. For example, a compound such as C₂H₅Cl would be considered to be derived from the parent compound, ethane, C₂H₆, which is then substituted by removing a hydrogen atom to produce C₂H₅•, an ethyl radical, and then adding a chlorine atom to yield C₂H₅Cl, called chloroethane.

This is a particularly simple case, but the principles used for naming in this way are all very similar, though as the complexity of the compound being treated increases, more and more complex parent compounds and many more rules are required.

An alternative name for chloroethane is ethyl chloride, and this is an example of binary nomenclature which developed from inorganic chemistry and its original systematization by Lavoisier and his colleagues. Binary nomenclature groups constituents of compounds into two classes, positive and negative. This works well for salts, such as sodium chloride, but more complex organic materials may require the employment of at least two classes of nomenclature, for example, substitutive to name a formal cation such as chloroethyl, and binary to name the related ester, chloroethyl acetate. Functional class nomenclature is a system used for organic compounds when substitutive nomenclature is less appropriate, for example with organic acids and anhydrides.

A more recent, though similarly historical, system is additive nomenclature, a second principal IUPAC nomenclature system, with its origins in inorganic coordination chemistry. In this approach, the name of a compound considered to be a coordination complex, such as Na₃[CoCl₆], is derived by identifying the three cations and the single anion from which it is composed. The latter is a complex anion, and it is then divided formally into its constituent central cation and its associated six chloride anions or ligands. Finally, the individual names are assembled to give the compound name trisodium hexachloridocobaltate. In this system, organic ligands would receive substitutive names, adapted for use in an additive context. Organometallic compounds almost always require such a mixture of nomenclatures. The additive system may also be used to name some compounds which would not usually be considered to be coordination complexes.
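The assembly step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `additive_name`, its arguments, and the partial prefix table are hypothetical, the caller supplies the anionic form of the central atom (here "cobaltate"), ligands are cited alphabetically, and charge or oxidation numbers are omitted, as they are in the name given in the text.

```python
# Multiplicative prefixes for simple ligands and counter-ions (partial table).
MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def additive_name(cations, ligands, central_atom_stem):
    """Assemble a name for a salt that contains one complex anion.

    cations: dict of cation name -> count, e.g. {"sodium": 3}
    ligands: dict of ligand name -> count, e.g. {"chlorido": 6}
    central_atom_stem: anionic form of the central atom, e.g. "cobaltate"
    (hypothetical interface, chosen only for this illustration)
    """
    # Cations are named first, each with its multiplicative prefix.
    cation_part = " ".join(
        MULTIPLIERS[count] + name for name, count in sorted(cations.items())
    )
    # Ligands are cited alphabetically by ligand name, ignoring the prefix.
    ligand_part = "".join(
        MULTIPLIERS[count] + name for name, count in sorted(ligands.items())
    )
    return f"{cation_part} {ligand_part}{central_atom_stem}"


print(additive_name({"sodium": 3}, {"chlorido": 6}, "cobaltate"))
```

For Na₃[CoCl₆] this reproduces the name assembled in the text. A real implementation would also need the rules for complicated ligands, bridging, and charge numbers, none of which are attempted here.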

The third major class of IUPAC nomenclature is polymer nomenclature. It is generally impossible to specify a polymer molecule exactly, as we attempt to do in the cases treated so far, because the chain lengths and end groups are rarely accurately known. One approach is to use a source-based system, because one generally knows the monomer from which the polymer was produced. This gives names such as poly(buta-1,3-diene), but it lacks structural information. A structure-based name for this material might be poly(1-vinylethane-1,2-diyl), but this represents only the units of the polymer chain, not the end groups, so it may be equally incomplete.

The 2011 edition of Principles of Chemical Nomenclature includes full descriptions of all these nomenclature systems, and of several others which are not necessarily due to IUPAC, with details of how to apply them. It also contains instructions for how to decipher IUPAC names to identify the structures that such names are intended to convey. Principles of Chemical Nomenclature also introduces what gives the promise of being a unique class of identifier applicable to a wide class of compound, the IUPAC International Chemical Identifier, or InChI.

{first published as ref 3}

3. Non-IUPAC Nomenclature Systems

In the second “Nomenclature Notes” (May-June 2012 CI, p. 28) [above] we alluded to various kinds of nomenclature that fall under the aegis of IUPAC. There are other nomenclatures in wide use, to some of which IUPAC contributes. For example, biochemical nomenclature is reviewed regularly by a committee which has joint membership from both IUPAC and IUBMB (International Union of Biochemistry and Molecular Biology). It is fair to say that biologists and biochemists are often not interested in systematic nomenclature in the way that chemists are. For them it may be sufficient that a given name is specific and defined so as to make the compound to which it refers unequivocally and universally recognized within the discipline. The IUPAC requirement that a name should convey the structure of a compound is not necessary or even desirable, as long as all practitioners accept it. For example, the compound

HOCH₂CH(OH)CH(OH)CH(OH)CHO

may receive a perfectly good IUPAC name, 2,3,4,5-tetrahydroxypentanal, but this might not be recognizable to biochemists unless the compound were named as a sugar, specifically an open-chain form of ribose. As this molecule also contains three chirality centers (asymmetric carbon atoms) to which are assigned the locants 2, 3, and 4, a more complete name would be (2R*,3R*,4R*)-2,3,4,5-tetrahydroxypentanal. Biochemists would also prefer to know the absolute configuration at carbon atom 4, representing it by small capital letters, either D or L, and they would be very comfortable with a name such as D-ribose. When even more complex structures are considered, such as amino acids, steroids, and proteins, the use of defined but semisystematic names becomes necessary for ease of communication and comprehension in biochemical and biological contexts.

A system of nomenclature which is independent of IUPAC, but which is widely used by chemists, is due to the Chemical Abstracts Service (CAS) of the American Chemical Society. This has been developed to produce names for CAS use in both running text and indexes. Although nomenclaturists of CAS have contributed substantially to IUPAC, they cannot afford the leisurely contemplation of data in which IUPAC indulges. They have to make decisions rapidly to meet publication deadlines, and CAS names are designed to be adaptable for use in indexes. Because an index compiler would wish to locate all derivatives of, say, the parent compound butane, at the same place in the alphabetical index, CAS names are sometimes termed “inverted,” because, in this specific case in an index the parent name butane would always be cited first. In its Index Guide, CAS explicitly states that, outside of a CAS index, a name should be used in its uninverted form. In contrast, IUPAC nomenclature presents a name written in a continuous and linear fashion from left to right, and which contains prefixes and suffixes in a specific, linear order. The consequence is that IUPAC names for derivatives of a given parent compound would appear at different places in an index using IUPAC names because the initial letters of the names depend upon the specific prefixes employed. Two examples follow:

IUPAC: 2-sulfanylethanol

CAS: 2-mercaptoethanol

CAS inverted: ethanol, 2-mercapto-

In a similar sequence, we have 4-(methylsulfanyl)benzoic acid (IUPAC); 4-(methylthio)benzoic acid (CAS); and benzoic acid, 4-(methylthio)- (CAS inverted).
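The relationship between the uninverted and inverted forms in these two examples can be illustrated with a short Python sketch. It is not the CAS procedure: the helper names `invert_name` and `uninvert_name` are hypothetical, the caller has to supply the parent name, and only the simple case of one block of prefixes in front of the parent is handled.

```python
def invert_name(name, parent):
    """Move the parent name to the front, as in an inverted index entry.

    Hypothetical helper: assumes `name` ends with `parent` and that
    everything before the parent is a single block of prefixes.
    """
    if not name.endswith(parent):
        raise ValueError(f"{name!r} does not end with parent {parent!r}")
    prefixes = name[: -len(parent)].rstrip(" -")
    if not prefixes:
        return parent  # the unsubstituted parent is its own index entry
    return f"{parent}, {prefixes}-"


def uninvert_name(inverted):
    """Rebuild the running-text form from 'parent, prefixes-'."""
    if ", " not in inverted:
        return inverted
    parent, prefixes = inverted.rsplit(", ", 1)
    return prefixes.rstrip("-") + parent


for name, parent in [
    ("2-mercaptoethanol", "ethanol"),
    ("4-(methylthio)benzoic acid", "benzoic acid"),
]:
    entry = invert_name(name, parent)
    print(entry, "<->", uninvert_name(entry))
```

Sorting the inverted entries alphabetically places all derivatives of one parent together, which is the indexing benefit described above.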

In addition to these nomenclatures, others are current in the literature. The International Union of Biochemistry and Molecular Biology (IUBMB) categorizes and names enzymes according to the types of reaction they catalyze. Considerations of enzyme chemical structure are secondary here. The World Health Organization (WHO) produces a list of nonproprietary names for drugs which are shaped so as to allow easy conversion to languages other than English and which indicate the action of the drug. Hence, the names of beta-blockers all carry the ending -olol. An example is propranolol, the structure of which is hardly reminiscent of that of propane, though the name was apparently suggested in part by the propane residues it may be considered to contain.

The International Organization for Standardization (ISO) lists names for pesticides, which bear a fleeting similarity to IUPAC names; for example, “trimesium” stands for “trimethylsulfonium.” There is also a list of recommended names for cosmetic ingredients, the International Nomenclature of Cosmetic Ingredients. What characterizes these latter names is that they serve as specific identifiers for commercial and industrial materials which must be quickly recognizable by users who are nonchemists, but they are usually inadequate for use by chemists. All these non-IUPAC nomenclatures are briefly described in the new edition of Principles, together with references allowing the reader to obtain more detailed information should it be required.

{first published as ref 4}

4. Systematic and Trivial Nomenclature

Nomenclaturists recognize two general classes of nomenclature, systematic and trivial. Perhaps the use of the word trivial is unfortunate, because its usual meaning in every-day English according to the Oxford English Dictionary (OED) is “of small account, little esteemed, paltry, poor, trifling, inconsiderable, unimportant, slight.” However, the OED lists several other meanings, some derived from a Latin word implying “three.” A more general common meaning listed in the OED is “such as met with anywhere, common, commonplace, ordinary, trite.” The word trivial was adopted when nomenclature was in its infancy and when its use in the latter sense was more usual, and that is why it is still used in that sense today. It is not intended to be dismissive.

The traditional names of the elements are trivial in this sense. They are non-systematic and many have been adopted from alchemy and early chemistry. For example, the term mercury was applied to many plants, persons, and things as well as the metal itself, which was also called quicksilver, for obvious reasons. An alternative name, hydrargyrum, from which the symbol Hg was derived, is a compound word from Latin and Greek meaning liquid silver. The reason for such names is very evident, but that can hardly form the basis of a systematic nomenclature for all elements. However, most element names are so deeply embedded in many languages that even IUPAC has refrained from generally systematizing them. Nevertheless, during the 1990s it became clear that many scientists needed to write and speak about elements that had yet to be prepared, and that names and symbols were required. Hence, IUPAC developed names and symbols for such elements that are immediately recognizable and based upon their atomic numbers. These names are provisional and are replaced as soon as a given element is prepared and unequivocally characterized. Perhaps unfortunately, the unambiguous systematic name is then replaced by a trivial name suggested by the scientists who first prepared the element.
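The atomic-number-based names mentioned above follow a simple digit-by-digit recipe, which can be sketched in Python. The details below are not spelled out in this Note; they summarize the IUPAC systematic naming scheme as commonly stated (one numerical root per digit, the ending "ium", and two elisions), and the function name `systematic_element` is hypothetical.

```python
# One numerical root per decimal digit of the atomic number.
ROOTS = ["nil", "un", "bi", "tri", "quad", "pent", "hex", "sept", "oct", "enn"]


def systematic_element(atomic_number):
    """Return (provisional name, three-letter symbol) for an atomic number.

    Sketch intended for atomic numbers above 100, where the scheme applies.
    """
    digits = [int(d) for d in str(atomic_number)]
    name = "".join(ROOTS[d] for d in digits) + "ium"
    # Elision 1: the final "n" of "enn" is dropped before "nil".
    name = name.replace("ennnil", "ennil")
    # Elision 2: the final "i" of "bi" or "tri" is dropped before "ium".
    name = name.replace("iium", "ium")
    # The symbol takes the initial letter of each root.
    symbol = "".join(ROOTS[d][0] for d in digits).capitalize()
    return name, symbol


for z in (104, 112, 118, 119):
    print(z, *systematic_element(z))
```

Element 118, for example, was provisionally ununoctium (Uuo) under this scheme until a permanent name was approved, which illustrates the replacement of a systematic name by a trivial one described above.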

Trivial names for compounds are used by chemists everywhere, and such names are clearly useful for much exchange of information, especially within a given lab. However, IUPAC attempts to devise a fully systematic approach to naming substances, so that the names imply unequivocally their chemical constitutions. Such names should be used when an unambiguous identification of compounds is required, as in scientific documents, international treaties, patents, and legal definitions. This is why IUPAC nomenclature can sometimes appear to be so complicated.

There are other kinds of systematic nomenclature. The Chemical Abstracts Service of the American Chemical Society has its own systematic system, similar to IUPAC’s, and with similar aims, but not identical. Other nomenclatures may be systematic, but in a manner differing from IUPAC’s. For example, the International Organization for Standardization (ISO) lists approved names for pesticides, such as afidopyropen, which should be recognizable by professionally qualified users rather than solely by chemists. Such names should be translatable into other scripts and into languages other than English. The entry for the pesticide afidopyropen (for details see <www.alanwood.net/pesticides>) also lists French and Russian versions of the name, a guide to (British) English pronunciation of the English name, the chemical structure, and full IUPAC and CAS names. The short ISO name is clearly preferable to the IUPAC systematic name for everyday commercial use, though the ISO listing also provides citations of relevant InChIKeys and InChIs.

Similarly, the World Health Organization (WHO) issues a list of international nonproprietary names for drugs (INNs), which are names devised and classified according to the pharmacological activity of the substance cited. This is also systematic, but is only loosely derived from IUPAC nomenclature. WHO lists proposed INN names together with a structural formula. Details of the listings can be found, for example, in WHO Drug Information, 23(2), 2009, which may be accessed on-line. The names are given in a form of Latin, and then in English, French, and Spanish, with an indication in each language of its claimed activity, its molecular formula, and its CAS registry number. The names listed here are proposed names, and recommended names are listed elsewhere.

These systems are a selection of those available. Many refer back to systematic IUPAC names, but are adapted to specific purposes. Principles offers a brief guide to some of these, together with suitable references for wider consultation.

{first published as ref 5}

5. InChIs and Registry Numbers

Constructing a systematic name of a chemical compound of known structure means that it is necessary for the reader to know the detailed nomenclature rules required to do this. Such a person must work within a particular system, of which IUPAC and the Chemical Abstracts Service (CAS) provide possibly the two most complete. These are both designed in the English language, but a person whose mother tongue is not English may face a further barrier to developing a name if another language, such as Russian or Japanese, should be the language of primary use. However, all trained chemists should be able to recognize a chemical structure displayed using atomic symbols and bond connections, etc., as these are independent of language, even if the basic chemical symbols, recognizable by all chemists, use the roman alphabet. IUPAC has recently developed a computer methodology for recognizing and codifying chemical structures, and, the converse, for reproducing the chemical structure from such a code. This code is called an InChI (IUPAC International Chemical Identifier), and there is a related, more abbreviated, version called an InChIKey.

Principles provides a brief introductory guide to InChIs and InChIKeys, along with CAS Registry Numbers. These are quite distinct from each other. Registry Numbers are unique numbers used to identify a specific compound in the CAS database. A number is assigned to a compound the first time that it appears in Chemical Abstracts (CA) and can be used thereafter to find all references to this compound when it appears again in CA. However, a Registry Number has no further significance, and does not contain structural information. The user of CA must be familiar with CA nomenclature. In contrast, the recently developed InChI and its related InChIKey are strings of numbers, letters, and other symbols that are derived from the structure of a compound; the InChI itself provides a complete description of that structure (see www.iupac.org/inchi).

The strings are not comprehensible to a casual reader, though they are to a computer that is equipped with the necessary programs. The InChI system can now deal with many, but as yet not all, compounds that appear in the literature, but it is still under development. In theory, InChI software will eventually provide an InChI character string from structural data for any compound. InChIs are already being used by most of the major chemistry publishers and databases. The InChI Trust website (www.inchi-trust.org) lists some of the many organizations now using it.

The InChI software represents features of the compound structure as a sequence of levels and in strict order, starting with the formula and then dealing with various features, such as atom connectivity and stereochemistry. Production of an InChI is reversible, in that with the appropriate computer program it can be derived from a drawing of the structure and it can then be used to regenerate the structure from which it is derived. This is not true of the InChIKey. An InChI may contain many tens of characters. An InChIKey is an abbreviated form which contains only 27 characters. It cannot be used to regenerate its parent structure, but it is still unique and is designed primarily for searching databases. In that sense it is like a Registry Number, but unlike the Number, it derives ultimately from a compound’s structure.
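The layered layout of an InChI and the fixed 27-character shape of an InChIKey can be made concrete with a short Python sketch. It only splits and checks strings; it does not generate or validate identifiers, which requires the InChI software itself. The helper names are hypothetical, and the example strings are the standard InChI and InChIKey commonly quoted for ethanol.

```python
import re


def split_inchi(inchi):
    """Split an InChI string into version, formula, and prefixed layers.

    Minimal sketch: assumes the usual 'InChI=version/formula/layer/...'
    layout, in which each later layer starts with a one-letter prefix
    (for example c = atom connectivity, h = hydrogen atoms).
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, formula, *layers = inchi[len("InChI="):].split("/")
    return {
        "version": version,
        "formula": formula,
        "layers": {layer[0]: layer[1:] for layer in layers},
    }


# 14 letters, hyphen, 10 letters, hyphen, 1 letter: 27 characters in all.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(key):
    """Check only the shape of the string, not whether it is a real key."""
    return len(key) == 27 and INCHIKEY_PATTERN.fullmatch(key) is not None


ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol["formula"], ethanol["layers"])
print(looks_like_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))
```

The formula comes first and the connectivity and hydrogen layers follow in fixed order, which is the "sequence of levels" described above; the key, by contrast, carries no recoverable structure.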

Of course, a primary requirement for someone to use both InChIs and InChIKeys is that they possess the programs that can construct and interpret them. Like all IUPAC products, these are freely available to the chemistry community. Principles contains enough detail and references for a beginner to obtain the programs and to start to use them.

{first published as ref 6}

6. Deciphering and Constructing Names

Someone attempting to construct a name using a specified kind of nomenclature, such as IUPAC nomenclature, or trying to discern the chemical structure implied by a name encountered in, say, an article, must first decide what kind of nomenclature is being used. Few chemists would have problems understanding a name such as sodium chloride. This is a venerable name, and the implications of the presence of the positive and negative entities of a salt are generally well understood. However, as we have seen in previous Notes, IUPAC promulgates several classes of nomenclature, sometimes specific to particular classes of compound, so that the chemist has first to decide which particular class of nomenclature should be used, or, if trying to decipher a name, which particular class has been used to construct the name under consideration.

The new edition of Principles, like the first edition, contains enormous amounts of information on how to construct names once the compound class has been recognized. This includes various classes of compound, such as organic, inorganic, organometallic, polymeric, and biochemical. However, a novel departure in the new edition is the incorporation of a new Chapter on deciphering (or deconstructing) names.

Upon encountering a new name, the chemist must first decide to which class of compound the name belongs. This is generally, but not always, straightforward. Then, for an organic compound, for example, it is recommended to decide first whether functional class or substitutive nomenclature is being used. Then, the chemist must deduce the identity of the parent hydride and hence its numbering scheme, recognize any suffixes to the parent name (there may not be any), and finally recognize the detachable prefixes and endings. These operations should enable the chemist to begin to write a structural formula. The names of biological compounds, mainly organic compounds, generally fall outside the scope of Principles, but information on the important groups of such compounds, such as sugars and nucleotides, is given, and sources of more information are noted. Often the complete systematic names of such compounds may be so complicated and large that regular use requires alternative simpler, more compact names.

To name inorganic compounds, again the class of compound must be determined. A compound may span more than one class (e.g., it may be a salt and also contain a coordination entity) so that more than one system of nomenclature may need to be employed. To decipher the name of a coordination entity, the following steps are necessary: identify the central atom(s), identify the ligands (which may be organic compounds and be named using organic methods), and identify the coordination geometry and stereochemistry. The names of organometallic compounds reveal some specific aspects that need to be appreciated, and boron compounds often bear names that are derived using a different set of conventions.

The rules for constructing the names of polymeric compounds are different yet again. In decipherment, they are often easier to recognize because they contain the term “poly” at or near the beginning of the name. Then, the specific rules of the types of polymer nomenclature should be consulted to unravel the structure.

Constructing and deciphering chemical names are certainly not easy problems for a beginner, and are sometimes not easy even for the nomenclature expert. However, Principles provides a summary of methods employed in both name construction and name decipherment, replete with many examples of both kinds of process. Although the tasks of applying nomenclature rules may appear imposing, chemists should persist. A couple of old English sayings should always be remembered. One is that “practice makes perfect,” even though the most experienced nomenclaturists are still striving for perfection. None of us is ever likely to reach the stage where “familiarity breeds contempt.”

{first published as ref 7}

7. Drawing Chemical Structures

Although the drawing of chemical structures is not strictly a nomenclature matter as usually understood, it is a way of conveying the structure of a chemical compound just as is an IUPAC systematic name, though using a visual language rather than a verbal one. Consequently, it is necessary to use certain widely-accepted conventions when drawing a chemical structure, particularly if that structure is three-dimensional and the representation of that structure as drawn on a sheet of paper is necessarily in two dimensions. IUPAC has attempted to define preferable methods for achieving this, and the new edition of Principles, unlike its predecessor, contains a summary of what is currently considered to be best practice to this end.

Certain recommendations are almost self-evident. For example, it is common, but not mandatory, to use the same font and font size in your structural diagrams as in your text. Principles uses Times New Roman, which was this editor’s choice. Some people prefer to use sans serif fonts such as Arial, though this can lead to minor confusion between symbols: compare l and 1 (Times New Roman) with l and 1 (Arial). Unusual abbreviations used in a diagram label should be defined somewhere in the article being written. It is not enough to assume that everyone will know what thf stands for, though this is probably acceptable for Ph. Bond lengths, thicknesses, and angles should be used consistently in all your diagrams. You should decide whether to represent aromatic rings as localized systems, as in (a) below, or as delocalized systems as in (b). Which you choose is not important, as long as your symbolism is clearly understood and is used consistently.

In the past, a variety of methods has been employed to portray a three-dimensional structure in two dimensions, for example, to show whether a bond which does not lie in the plane of the paper is pointing behind that plane or forwards towards the reader. In Principles, we have settled for conventional methods, as shown below in (i) for the tetrahedral molecule CH₄.

Principles also introduces adaptations of this convention, for example, in order to represent certain ring structures, as in the representation for the ferrocene molecule shown below (ii).

More complicated polyhedral shapes than tetrahedral are frequently encountered in chemistry, and these are also discussed in Principles, which introduces the most common three-dimensional structures found in coordination chemistry and also the most common projections used to represent the three-dimensional structures of organic molecules. For the beginner, these are sometimes not easy to understand.

The octahedron is a shape very often found in coordination chemistry, but the manner in which it is represented depends upon the circumstances. In the representation (iii) below of a coordination complex formally written as [Mabcdef], the bonds from the ligands (a, b, c, etc.) to the central metal [not specifically indicated in diagrams (iii)-(v)] are represented using the formalism described above in (i); in (iv) the principal plane of the octahedron is drawn, together with only the bonds to ligands e and f. Finally, in (v), only the octahedron is delineated and no bonds. The edges of the solid octahedron invisible to a viewer are represented by dashed lines. Yet, all three are acceptable representations of the molecule [Mabcdef], and should be equally comprehensible to the informed reader.

Organic chemists have related problems when representing organic structures in three dimensions, and they use a variety of projections to do this, the principal ones being named after their inventors: Fischer, Haworth, and Newman. These particular projections are usually applied to specific classes of molecule. In a Fischer projection, the bonds to the carbon at the center of the tetrahedron are not represented as in the drawing of Cabcd (vi), but in a plane as in (vii), the convention being that bonds drawn vertically are pointing behind the plane of the paper, and the horizontal ones in front. The central carbon atom is not specifically represented. This type of projection is used primarily for carbohydrates and amino acids.

A Newman projection is employed to represent more complex molecules, such as an ethane-type molecule C₂abcdef, where different conformations with respect to a selected carbon-carbon bond may be present. This is illustrated in (viii) and (ix) below. Finally, Haworth projections are often applied to compounds such as monosaccharides and polysaccharides.

The use of all these devices and more, including various ways to represent conformations, is discussed in the new volume of Principles, together with appropriate examples and literature references.

{first published as ref 8}

8Organometallic Nomenclature

The nomenclature of organometallic chemistry often poses a challenge to the chemist. For example, when constructing a systematic name for such a compound, should one employ the methods of organic chemical nomenclature, or should one try to adapt the methods of inorganic nomenclature? The answer will depend upon the compound under consideration, with the added complication that neither may be directly applicable because the compound presents a problem of assigning a structure with conventional electron-pair bonds. Principles goes some way to dealing with this poser.

Some organometallic compounds, primarily of main-group elements in Groups 13–16, are formally so similar to their carbon analogs that they may be named in a similar fashion, using an organic-type nomenclature. Hence, we can derive names such as methylalumane and trimethylsilanamine for the compounds AlH₂Me and SiMe₃NH₂ (Me = CH₃). The first name is derived from the name alumane, assigned to AlH₃, and the second from silane, assigned to SiH₄. More complex compounds, such as chain compounds, may often be named by applying the methods of substitutive nomenclature to the names of the formal hydrocarbon parents. For example, skeletal replacement nomenclature can be used to develop a name for substances such as MeSiH₂OPHOCH₂Me. This may be considered to be derived from heptane, which, of course, contains a seven-carbon chain. Hence, the suggested name would be 3,5-dioxa-4-phospha-2-silaheptane. Although the compound actually contains only three carbon atoms in the chain, the name is unequivocal.
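The way the replacement name above is assembled can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and table names are hypothetical, the chain is taken to be unbranched and saturated, only a handful of replacement prefixes are listed, and numbering is chosen simply to give the heteroatoms, taken together, the lowest locants (ties are not resolved). The full rules are given in Principles and the IUPAC recommendations.

```python
# Hypothetical helper, not an IUPAC tool: builds a skeletal replacement
# name for an unbranched, saturated chain given as a list of element symbols.

# Replacement prefixes in the order in which they are cited in the name
# (O before P before Si, as in the name quoted in the text). Partial list.
A_PREFIXES = [("O", "oxa"), ("S", "thia"), ("N", "aza"),
              ("P", "phospha"), ("Si", "sila")]

# Parent hydrocarbon names for the chain lengths handled by this sketch.
ALKANES = {4: "butane", 5: "pentane", 6: "hexane", 7: "heptane",
           8: "octane", 9: "nonane", 10: "decane"}

MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra"}


def hetero_locants(chain):
    """Return the 1-based positions of all non-carbon atoms in the chain."""
    return [i for i, atom in enumerate(chain, start=1) if atom != "C"]


def replacement_name(chain):
    """Name a chain such as ['C', 'Si', 'O', 'P', 'O', 'C', 'C']."""
    known = {"C"} | {symbol for symbol, _ in A_PREFIXES}
    unknown = set(chain) - known
    if unknown:
        raise ValueError(f"no replacement prefix listed for: {sorted(unknown)}")

    # Simplifying assumption: number from the end that gives the
    # heteroatoms, as a set, the lower locants.
    forward, backward = list(chain), list(reversed(chain))
    if hetero_locants(backward) < hetero_locants(forward):
        numbered = backward
    else:
        numbered = forward

    parts = []
    for symbol, prefix in A_PREFIXES:
        locants = [i for i, atom in enumerate(numbered, start=1) if atom == symbol]
        if locants:
            parts.append(",".join(map(str, locants)) + "-"
                         + MULTIPLIERS[len(locants)] + prefix)
    return "-".join(parts) + ALKANES[len(numbered)]


# MeSiH2OPHOCH2Me written as its chain of skeletal atoms:
print(replacement_name(["C", "Si", "O", "P", "O", "C", "C"]))
```

The hydrogen atoms do not appear in the input because, in this style of nomenclature, they are implied by the parent hydride name.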

Organometallic derivatives of Main Groups 1 and 2 may often also be conveniently named by using established additive nomenclature, originally developed to name transition-metal complexes such as [Co(NH₃)₆]Cl₃, hexaamminecobalt(III) trichloride. Organometallic derivatives of transition elements often cannot be dealt with so easily, because they exhibit not only metal-carbon single bonds but also features such as metal-carbon multiple bonds and bonds between a metal ion and unsaturated molecules and groups. Nevertheless, additive nomenclature has been adapted to name them too, though additional strategies have had to be devised.

To define the names of non-organometallic complex compounds which exhibit formally different possibilities for the donor atoms, it has been found useful to employ the so-called κ (kappa) convention, whereby the actual donors are specifically indicated. An example is shown below, illustrating how two modes of binding of a nitrite ion (NO₂⁻) to a metal ion, M, are designated.

M-NO₂ designated nitrito-κN
M-ONO designated nitrito-κO

Compounds that contain a metal-carbon single bond present no new problems, but if such a bond is formally double or triple, the name of the carbon-donor must be altered by changing the termination –yl (as in methyl) to –ylidene or –ylidyne, respectively. The κ convention may also be applied to define which of more than one available carbon atom is a donor. Sometimes, two correct systematic names may be derived, as in the example below:

(2,4-dimethylpenta-2,4-diene-1,1,5-triido-κ²C¹,C⁵)tris(triethylphosphane)iridium

(2,4-dimethylpenta-1,3-dien-1-yl-5-ylidene)tris(triethylphosphane)iridium

The carbon ligand is regarded as anionic in the first name and a radical in the second. Whether it is preferable to treat a specific carbon-donor ligand as an anion or as a radical is a matter still open for discussion, and Principles does not attempt to differentiate between the two approaches.

Sometimes contiguous atoms in a ligand act together as a donor group to a metal ion. This is particularly true of carbon-containing systems such as alkenes, alkynes, and various aromatic groups. To indicate which of the carbon atoms in a ligand are bonded to the metal ion, the η (eta) or hapto system is applied, as in the examples below.

However, some compounds, notoriously those of boron, are even more difficult to name. They are sometimes termed electron-deficient and do not always obey the more usual rules governing compounds with two-atom localized electron-pair bonds. Researchers have developed a unique approach to naming these compounds, based upon the names for simple borane polyhedra and boron hydride anions. Often the numbers of hydrido-ligands combined with the polyboron skeleton need to be stated. The practitioners in this area also employ several specialized terms. For example, they may differentiate between heteraboranes and metallaboranes, and they employ the word “subrogation” where others might prefer to use the phrase “skeletal replacement.” A whole new chapter in the new Principles is devoted to these compounds alone.

The methodologies outlined here are described in the new version of Principles, together with guidance upon where to apply them and to which type of compound, with examples of such applications and, for those who require them, references to the more detailed literature.

{first published as ref 9}

9Polymer Nomenclature

The aim of systematic IUPAC nomenclature is usually to introduce naming systems that define the structure of a molecule precisely, so that the reader can reproduce the exact structure of the molecule being discussed. The system works reasonably well for completely characterized small molecules, but cannot do so with the same level of precision for polymeric materials composed of molecular chains (macromolecules), the structures of which are based on constitutional repeating units (CRUs). These repeating units may not all be of the same type and need not repeat in a regular fashion. A given polymer may consist of more than one chain and often of mixtures of different kinds of chain. In addition, there can be regular or irregular steric variations along the length of individual macromolecules, and chains might be branched or linked to one another in diverse ways. For many polymers, the material may consist of a mixture of chain macromolecules of different lengths, and the precise structure may not be known. Nevertheless, methods for naming such materials are necessary for general communication, and polymer chemists have been obliged to develop them. It is probably true that there is not yet universal agreement among all polymer chemists as to how this should be done in every case, but there is a considerable consensus, and the new Principles presents its basic details.

A polymer is a substance composed of a collection of macromolecules of a range of molecular masses. As a consequence, it is characterized by an average molecular mass rather than a mass of a definite value, as is typical of relatively small molecules. These macromolecules may consist of single-strand regular or irregular chains, or they may be double-stranded ladder-like structures or even sheets, the limit being a three-dimensional structure, which may be considered no longer to be within the province of polymers but better treated as a three-dimensional structure such as in a ceramic or glass. Finally, the polymers may be constituted of organic, organometallic, or even inorganic groups, including those of coordination type. Polymer nomenclature must attempt to describe all these types, and no satisfactory universal methodology has yet been developed.

Two basic methods have been developed to give names which are comprehensible and broadly consistent with the apparent structure. Neither method conveys all the details of the polymer structure, but one capable of doing so would probably be too long and complicated to be easily comprehensible, even to the informed reader. A shorter form is often adequate for many purposes.

Most polymers consisting of regular, single macromolecular chains may be named using structure-based nomenclature. Example (a) shows a generalized structure of such a polymer. A, B, C, and D represent groups of atoms comprising the main chain, while E and R denote the chain end-groups and pendant groups, respectively. The CRU for this generalized structure, and the CRU and name of a real polymer, are also shown. Precise rules are necessary to govern the selection of the CRU.

Example (a)

Representative single-strand polymer structure
Constitutional repeating unit (CRU)
CRU of a polymer named poly[oxy(1-methylene)]

Methods have also been developed to name irregular single-strand polymers and polymers of other structures, and these are also mentioned in Principles. Nevertheless, a precise structure-based name may be impossible to devise for a variety of reasons, such as lack of enough structural information. By far, the most widely used and easily implemented method of naming polymers is source-based nomenclature and example (b) shows three such names.

Example (b)

polyacrylonitrile
polystyrene
poly(dimethylstannanediyl)

The first two names are based on the names of the reagents (which may be monomers) from which the polymers were synthesized, acrylonitrile and styrene in these cases, but they convey little information about structure. However, this use of a reagent name may not always be applicable. For example, the source reagent to synthesize the third polymer cannot be simply dimethylstannanediyl, even though the polymer name itself is easily comprehensible. All three names in example (b) are organic in style, but for the third, which describes an inorganic single-strand polymer, an inorganic-style name is also available:

catena-poly[dimethyltin]

These are only very simple examples, but they hint at some of the complexities involved in naming polymers, which differ from some of the methods used for naming small molecules. Principles also describes how more complicated polymer structures can be named, and also abbreviations for names that are commonly used both in academe and in industry. As usual, a list of basic references is also provided.
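As a rough illustration of how the first two names in example (b) are put together, the following Python sketch prefixes "poly" to the name of the source reagent. The function name is hypothetical, and the rule used here for adding parentheses (enclose the reagent name when it consists of more than one word, as in the widely used name poly(vinyl chloride)) is a simplifying assumption, not a statement of the full recommendations.

```python
def source_based_name(reagent_name):
    """Return a source-based polymer name for a single source reagent.

    Simplifying assumption: parentheses are added only when the reagent
    name contains more than one word.
    """
    reagent_name = reagent_name.strip()
    if " " in reagent_name:
        return f"poly({reagent_name})"
    return f"poly{reagent_name}"


for reagent in ("acrylonitrile", "styrene", "vinyl chloride"):
    print(source_based_name(reagent))
```

A routine of this kind says nothing about the structure of the resulting macromolecules, which is exactly the limitation of source-based names noted above.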

The recommendations and advice of Professor Richard G. Jones (University of Kent, Canterbury, and chair of the IUPAC Subcommittee on Polymer Terminology) during the preparation of this note are gratefully acknowledged.

{first published as ref 10}

10The Special Case of Boron Hydrides

When writing down chemical structures, chemists feel happiest if they can depict how atoms are arranged in space and join them together appropriately with lines, each of which represents a two-electron bond. Unfortunately, though this type of model is adequate for many structures and compounds, it is not adequate for all. Organic chemists have developed methods that allow for the fact that aromatic compounds are not always adequately represented by names and structures based solely upon two-center two-electron bonds, and inorganic chemists have faced similar problems with certain classes of inorganic compound, such as boron hydrides. This edition of Principles carries a completely new chapter devoted to such compounds.

Like aromatic rings, boron hydrides are often not satisfactorily represented by structures consisting solely of two-center electron-pair bonds, though Nature still aims for full shells. The simplest boron hydride, B₂H₆, contains 12 valence electrons. Formally, four pairs are localized in two-center two-electron B-H bonds, and the remaining four electrons occupy two three-center two-electron B-H-B bonds, as in example (a). The complete name specifies both the number of boron atoms and the number of hydrogen atoms. This differs from organic practice, in which the number of hydrogen atoms in the carbon analog of diborane is assumed to be obvious.

This method is extended to other boron hydrides as shown in example (b). All the apices represent B-H groups, and four three-center B-H-B bonds are designated. The name specifies the number of hydrogen atoms. In general, the polyboranes adopt the conformations of triangulated polyhedra, each of which has its own numbering convention.
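The stoichiometric part of such names, and the electron count quoted for B₂H₆, can be sketched in Python as follows. The function names are hypothetical. The sketch covers only neutral binary boranes, using the familiar pattern in which the number of boron atoms is given by a multiplying prefix and the number of hydrogen atoms by an Arabic numeral in parentheses, as in diborane(6); structural descriptors such as nido are left out.

```python
# Multiplying prefixes for the number of boron atoms (2 to 12 only).
BORON_PREFIXES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa",
                  7: "hepta", 8: "octa", 9: "nona", 10: "deca",
                  11: "undeca", 12: "dodeca"}


def borane_name(n_boron, n_hydrogen):
    """Stoichiometric name of a neutral borane, e.g. B2H6 -> diborane(6)."""
    return f"{BORON_PREFIXES[n_boron]}borane({n_hydrogen})"


def valence_electrons(n_boron, n_hydrogen):
    """Each boron atom contributes three valence electrons, each hydrogen one."""
    return 3 * n_boron + n_hydrogen


print(borane_name(2, 6), valence_electrons(2, 6))

# Bookkeeping for B2H6 as described in the text: four terminal B-H bonds
# and two three-center B-H-B bonds, each holding two electrons.
assert 4 * 2 + 2 * 2 == valence_electrons(2, 6)
```

The same routine gives pentaborane(9) for B₅H₉ and decaborane(14) for B₁₀H₁₄, which shows why the hydrogen count has to be stated: it cannot be inferred from the number of boron atoms alone.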

These polyboranes may formally lose hydrons to yield anions, of which example (c) is typical.

Note that in this particular case, there are no bridging hydrogen atoms, and the number of hydrogen atoms is specified in the name.

The variations chemists can produce in these materials generate a wide range of different structures. A neutral borane can lose one or two boron hydride units to yield so-called nido and arachno structures, as depicted in example (d).

In addition to these variations, hydrogen atoms may be substituted. The names of resultant products must contain locants to specify at which skeletal positions substitutions have occurred.

Skeletal boron atoms may also be replaced by other atoms, operations termed subrogations by many boron chemists, yielding materials of which example (e) is an instance. There is an exo hydrogen atom, not shown, attached to each apical atom, carbon as well as boron.

Finally, the boranes may be considered to be similar to electronically delocalized aromatic systems, and like benzene and related derivatives they can also form sandwich compounds, as shown in example (f).

Sometimes all these variations may occur in the same structure, so that considerable care is required in determining the appropriate parent borane and the number of hydrogen atoms. Example (f) is derived from a parent dodecaborane, although there are only nine boron atoms in the actual structure, the three other apical positions being occupied by an iron atom and two carbon atoms. Consequently, an accurate structural diagram and the corresponding name can be rather large and complex. Principles summarizes all these structural types and the appropriate methods for naming them, together with references to the original literature.

{first published as ref 11}

11Use of Abbreviations, Enclosing Marks, and Line-Breaks

The use of abbreviations sometimes causes more problems for a reader than is strictly necessary. The current free use of acronyms in texting has exaggerated these problems. In chemistry, care must always be taken to write in terms that are as clear as possible for any potential reader, and certain rules should always be followed in an attempt to achieve this. For the purposes of this post, the words abbreviation and acronym may be used interchangeably.

IUPAC has suggested a set of guidelines for the employment of abbreviations in chemistry texts (“The Use of Abbreviations in the Chemical Literature, Recommendations 1979,” PAC, 1980, 52(9), 2229–2232). These recommendations suggest that “there are great advantages in defining all abbreviations . . . in a single conspicuous place in each paper . . . preferably near the beginning in a single list.” Included in these recommendations is a suggestion that no abbreviations should be used in titles or abstracts. The use of abbreviations in formulae is often preferable to the use of recommended names, but in such cases an accompanying definition may be absolutely necessary.

In English texts, there are certain abbreviations that are generally understood by all chemists, though thought should be given as to whether this will be true for speakers of other languages. Abbreviations such as thf (for tetrahydrofuran) may be self-evident to an English speaker but not to, say, a German or Hungarian speaker. Abbreviations for more complex organic groups should generally be defined. Generally accepted English abbreviations include those for organic substituent groups such as Ph, Me, Et, Pr, and Bu, though whether specific variants of qualified versions, such as t-Bu or Buᵗ, are preferred may be a matter of editorial style. Care should be exercised, because it is sometimes not evident whether an abbreviation such as Bz is meant to indicate benzyl or benzoyl or even benzene.

Inorganic chemists also generally have problems with abbreviations, especially for the names of ligands in the formulae of coordination complexes, because specific rules for producing abbreviations from systematic names are not generally available and lists of recommended abbreviations cannot be complete and comprehensive. The new Principles goes some way to deal with this by providing a long list containing the names of some of the commonest ligands, their recommended abbreviations, and the names from which the abbreviation was derived. For example, the abbreviation acac, derived from the non-standard name acetylacetonate, may be widely understood, though the current recommended IUPAC systematic name is 2,4-dioxopentan-3-ide. Some general principles for developing suitable abbreviations are also presented.

Polymers also have names that are often abbreviated, especially when the use is to define unequivocally a given material rather than to convey a detailed chemical structure. This is especially true in industry and commerce, names such as PTFE and PVC being common examples. Whereas IUPAC nomenclature methods can be used unequivocally to name specific polymers, the accepted abbreviations are often not based upon systematic names but upon trivial names, and many of the users of the abbreviations may not be chemists anyhow. The new Principles contains a discussion on polymer nomenclatures, including a list of the most widely used names and abbreviations. In addition, the subject of abbreviation is still a matter for discussion in particular areas, as demonstrated recently by Brimble et al. in “Rules for Abbreviation of Protecting Groups (IUPAC Technical Report),” PAC, 2013, 85(1), 307–313.

The IUPAC names of natural products are often rather long and complicated. For example, most people can identify what is meant by an acronym such as DNA, though each person probably understands its significance only in as much detail as is needed. Certainly the IUPAC name would only confuse most people, as well as consuming much time and space in presentation. In addition, IUPAC is not the only international body concerned with the nomenclature of materials such as DNA. Biochemical nomenclature is often based upon trivial names, and bodies such as the International Union of Biochemistry and Molecular Biology (IUBMB) are involved in developing and publishing recommendations, the latest IUBMB recommendations dating from 1992. There is a joint IUPAC/IUBMB committee that considers matters of interest to both Unions, including nomenclature problems. Amino acids, carbohydrates, and peptides, as well as nucleic acids, have generated their own specific nomenclatures, and all are dealt with in the new Principles, which provides references for those seeking more information.

Enclosing marks and line breaks are in common use throughout chemical literature. However, though their use may be defined quite clearly by nomenclaturists, their employment is often not consistent. Correct use is important when a sequence of enclosures is needed, because these marks are employed in nomenclature as a hierarchy that dictates which set of marks encloses which. The principal enclosing marks are parentheses, ( ), sometimes simply called brackets or round brackets; curly brackets, { }, also often called braces in U.S. texts; and square brackets, [ ]. Some others may be found in specialized literature, but these three are the ones generally used by chemists. However, the order in which they are used depends upon the specific context.

In organic nomenclature generally and in inorganic names (but not formulae) the sequence to be employed is {[({[( )]})]} or ( ), [( )], {[( )]}, ({[( )]}), [({[( )]})], {[({[( )]})]}. It will rarely be necessary to use a longer set than this. However, in formulae, and perhaps unfortunately, a different sequence is employed. One reason for this is the universal practice of enclosing the formulae of coordination entities, whether positively or negatively charged, or neutral, in square brackets. The sequence thus becomes [ ], [( )], [{( )}], [({( )})], [({({ })})], [({({( )})})], etc. This sequence, as printed, raises another question often posed when writing long names and formulae: Is the break at the end of the line in the fifth member of the last sequence simply an accident arising from the particular line and word length, or is it an intended break, so that the item is meant to read [({({ })})]? From the context, it is clearly the latter.

It is not possible here to describe all types of use of enclosing marks, and there may be some more specialized instances when minor variations to the above sequences are used. For example, polymer chemists employ an abbreviated hierarchy, which suffices for most general presentations of polymer formulae, namely {[( )]}, and there are other occasions when enclosing marks can help, even when their use is not mandatory. For example, simple parentheses may be used to distinguish terms such as trioxido, O₃²⁻, from tri(oxido), (O²⁻)₃. Such uses often amount to common sense. Clearly, the writer of names and formulae must be aware of the precise context in which the enclosing marks are being used, and select the appropriate sequence. All these matters are dealt with in Principles.
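The nesting order used in names can be sketched as a short Python routine. The function name is hypothetical, and the sketch covers only the sequence for names, in which parentheses are innermost and square and curly brackets follow in turn as more levels are needed. The sequence for formulae follows a different order and is not modelled here.

```python
# Enclosing marks for names, from the innermost level outwards;
# the cycle repeats when more than three levels are needed.
NAME_MARKS = [("(", ")"), ("[", "]"), ("{", "}")]


def nest_for_names(depth):
    """Return the pattern of enclosing marks for `depth` nested levels."""
    pattern = " "
    for level in range(depth):
        opener, closer = NAME_MARKS[level % len(NAME_MARKS)]
        pattern = opener + pattern + closer
    return pattern


for depth in range(1, 7):
    print(nest_for_names(depth))
```

Running the loop prints the six members of the sequence for names, ending with {[({[( )]})]}.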

Principles also uses a specific device to deal with the problems sometimes posed by line-breaks. This device is not part of any IUPAC recommendation, but this writer has found it very useful and recommends it for consideration by the community as a whole. Many of the names, systematic or otherwise, employed by chemists contain hyphens to isolate and indicate distinct parts of the name. This is particularly common in names for organic compounds. They are often very long, and as written or printed contain a line-break, because it is not always possible or convenient to write a given name entirely on a single line. Since such names often contain hyphens anyhow, it may not be clear whether the hyphen at the end of a printed line is part of the name or simply indicates a line-break.

Take, for example, the following name: (1R*,3R*,5R*)-[(1S)-sec-Butoxy]-3-chloro-5-nitro-
cyclohexane

Is the hyphen at the line end part of the name or should the final part read: nitrocyclohexane?

An inorganic example would be undecahydro-7,8-
dicarba-nido-undecaborate(2-).

The hyphen at the end of the line poses a similar question.

In Principles these names would appear as follows, (1R*,3R*,5R*)-[(1S)-sec-Butoxy]-3-chloro-5-nitro-►
cyclohexane and, in the inorganic example, undecahydro-7,8-►
dicarba-nido-undecaborate(2-)

The symbol ► used as a line-break makes it clear that the hyphens are indeed part of the name and not imposed by typographical considerations. Principles contains many examples of the use of this device, and consideration of its adoption is recommended to the English-speaking chemical community. Whereas experienced chemists may not feel the need for such a device, the same will not be true for students, which is why it was employed in Principles. The use and value of such a device may vary from language to language and, as Bernardo Herold showed in CI, 2013, 35(3), 12–15 (ref 13), translations of chemistry texts and formulae between different languages raise all sorts of problems, for some of which this kind of device might also be useful.
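The device can be sketched in Python as follows. This is an illustration only: the function name, the choice to break only after a hyphen, and the simple greedy line filling are assumptions made here, not part of Principles or of any IUPAC recommendation. The minus sign inside a charge such as (2-) is deliberately not treated as a break point.

```python
import re

BREAK_MARK = "►"


def break_name(name, width):
    """Split a hyphenated name into lines, marking each break with BREAK_MARK.

    Breaks are made only directly after a hyphen that belongs to the name.
    `width` limits the characters of the name on each line; the mark itself
    is not counted, and a single segment longer than `width` is left whole.
    """
    # Split after every hyphen, except one followed by a closing
    # parenthesis, as in the charge "(2-)".
    segments = [s for s in re.split(r"(?<=-)(?!\))", name) if s]
    lines, current = [], ""
    for segment in segments:
        if current and len(current) + len(segment) > width:
            lines.append(current + BREAK_MARK)
            current = segment
        else:
            current += segment
    if current:
        lines.append(current)
    return lines


def join_name(lines):
    """Recover the original name by dropping the marks and joining the lines."""
    return "".join(line.rstrip(BREAK_MARK) for line in lines)


name = "undecahydro-7,8-dicarba-nido-undecaborate(2-)"
for line in break_name(name, 20):
    print(line)
```

Because every line that ends with the mark also keeps its hyphen, the reader can rebuild the name by deleting the marks and closing up the lines, which is what join_name does.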

{first published as ref 12}

References

1. Nomenclature Notes, as published in Chem Int, from March 2012 to Dec 2013 - https://iupac.org/publications/ci/2012/3401/bw.html
2. What is IUPAC Nomenclature? Chem. Int. Vol. 34, No. 2, Mar-Apr 2012 - https://iupac.org/publications/ci/2012/3402/nn.html
3. On the Various Nomenclature Systems. Chem. Int. Vol. 34, No. 3, May-Jun 2012 - https://iupac.org/publications/ci/2012/3403/nn.html
4. Non-IUPAC Nomenclature Systems. Chem. Int. Vol. 34, No. 4, Jul-Aug 2012 - https://iupac.org/publications/ci/2012/3404/nn.html
5. Systematic and Trivial Nomenclature. Chem. Int. Vol. 34, No. 5, Sep-Oct 2012 - https://iupac.org/publications/ci/2012/3405/NN.html
6. InChIs and Registry Numbers. Chem. Int. Vol. 34, No. 6, Nov-Dec 2012 - https://iupac.org/publications/ci/2012/3406/nn.html
7. Deciphering and Constructing Names. Chem. Int. Vol. 35, No. 1, Jan-Feb 2013 - https://iupac.org/publications/ci/2013/3501/nn.html
8. Drawing Chemical Structures. Chem. Int. Vol. 35, No. 2, Mar-Apr 2013 - https://iupac.org/publications/ci/2013/3502/nn.html
9. Organometallic Nomenclature. Chem. Int. Vol. 35, No. 3, May-Jun 2013 - https://iupac.org/publications/ci/2013/3503/nn.html
10. Polymer Nomenclature. Chem. Int. Vol. 35, No. 4, Jul-Aug 2013 - https://iupac.org/publications/ci/2013/3504/NN.html
11. The Special Case of Boron Hydrides. Chem. Int. Vol. 35, No. 5, Sep-Oct 2013 - https://iupac.org/publications/ci/2013/3505/index.html
12. Use of Abbreviations, Enclosing Marks, and Line-Breaks. Chem. Int. Vol. 35, No. 6, Nov-Dec 2013 - https://iupac.org/publications/ci/2013/3506/index.html
13. Lost in Nomenclature Translation, Bernardo Herold, Chem. Int. Vol. 35, No. 3, May-Jun 2013, 12–15 - https://iupac.org/publications/ci/2013/3503/5_herold.html

Citation

Leigh, J. (25 Sep 2018) "Nomenclature Notes" IUPAC 100 Stories. Retrieved from https://iupac.org/100/stories/nomenclature-notes/. (Accessed: day month year)
