Project Details

Enhanced recognition and encoding of stereoconfiguration by InChI tools

Project No.: 2019-017-2-800
Start Date: 09 September 2019
End Date:

Objective

InChI identifiers are widely used to identify substances in various sources of chemical information. However, current support for stereochemical information is limited to tetrahedral, double-bond and short allene stereoisomerism. Unsupported stereo types include atropisomers and some special cases, such as centers with more than four ligands. A further significant problem is the incomplete recognition of configurations in the very common Haworth and chair representations of carbohydrates. The absence of support for MOLFile V3000 enhanced stereo, which is used to represent relative and racemic configurations, is another significant limitation.
Updated procedures will allow InChI to support additional stereochemical cases and to avoid mistakes in the designation of stereoisomers.

> Access InChI SubCommittee

Description

InChI and InChIKey are already very useful for identifying and searching for substances in various sources of chemical information. While InChI tools resolve most aspects of constitutional isomerism, especially for organic substances, some types of stereoisomerism and some representations of stereoconfiguration are not recognized. The most significant examples are atropisomerism and the very common representation of carbohydrates in Haworth and chair forms. Unsupported stereoisomerism types make it impossible to distinguish specific isomers, and the incorrect treatment of some representations results in incomplete or even wrong InChI identifiers.
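To illustrate where the currently supported stereo information appears in an identifier, the following minimal sketch splits an InChI string into its layers and keeps only the stereo-related ones (/b for double bonds, /t, /m and /s for tetrahedral centers). The function names are hypothetical and are not part of the InChI software. The sketch assumes a simple identifier that has a formula layer and in which each layer prefix occurs only once, so it ignores the repeated stereo layers that can follow the isotopic and fixed-hydrogen layers.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")


def split_inchi_layers(inchi):
    """Map each prefixed InChI layer (for example 'c', 'h', 't') to its content."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    # The first two '/'-separated fields are the version and the formula;
    # every later layer starts with a one-letter prefix.
    version, formula, *rest = inchi[len(prefix):].split("/")
    layers = {"version": version, "formula": formula}
    for layer in rest:
        # Simplification: a repeated prefix overwrites the earlier layer.
        layers[layer[0]] = layer[1:]
    return layers


def stereo_layers(inchi):
    """Return only the stereo-related layers of an InChI string."""
    layers = split_inchi_layers(inchi)
    return {key: value for key, value in layers.items() if key in STEREO_PREFIXES}


# Standard InChI of L-alanine, which has one tetrahedral stereocenter.
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
print(stereo_layers(L_ALANINE))  # {'t': '2-', 'm': '0', 's': '1'}
```

A stereo type that InChI does not support, such as an atropisomeric axis, has no layer of its own, so two such isomers would yield the same identifier.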

Another distinct problem is the unsupported MOLFile V3000 enhanced stereo information that is currently used to represent mixtures of stereoisomers for industrial chemicals. Such structures are currently incorrectly interpreted by InChI tools as representing single stereoisomers.
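As a rough illustration of the information that is lost, the sketch below reads the enhanced stereo collections (MDLV30/STEABS, MDLV30/STERACn and MDLV30/STERELn) from the collection block of a V3000 MOLFile. It is not part of the InChI software: the function name, the sample fragment and the atom indices are made up for this example, and the parser assumes that each collection fits on a single line (no continuation lines).

```python
import re

# One collection line looks like: "M  V30 MDLV30/STERAC1 ATOMS=(2 4 5)".
# The first number inside the parentheses is the count of atom indices that follow.
_PATTERN = re.compile(
    r"M\s+V30\s+MDLV30/(STEABS|STERAC|STEREL)(\d*)\s+ATOMS=\(([\d\s]+)\)"
)

KIND = {
    "STEABS": "absolute",
    "STERAC": "racemic (AND)",
    "STEREL": "relative (OR)",
}


def enhanced_stereo_groups(molblock):
    """Return (kind, group number, atom indices) for each enhanced stereo collection."""
    groups = []
    for line in molblock.splitlines():
        match = _PATTERN.match(line.strip())
        if match is None:
            continue
        tag, number, atom_list = match.groups()
        count, *atoms = (int(token) for token in atom_list.split())
        if count != len(atoms):
            raise ValueError("atom count does not match list: " + line)
        groups.append((KIND[tag], int(number) if number else None, atoms))
    return groups


# Hypothetical collection block; the atom indices are illustrative only.
SAMPLE = """\
M  V30 BEGIN COLLECTION
M  V30 MDLV30/STEABS ATOMS=(1 2)
M  V30 MDLV30/STERAC1 ATOMS=(2 4 5)
M  V30 MDLV30/STEREL2 ATOMS=(1 7)
M  V30 END COLLECTION
"""

for group in enhanced_stereo_groups(SAMPLE):
    print(group)
```

A reader that ignores these collections treats every marked center as a defined absolute configuration, which is how a racemic or relative structure ends up encoded as a single stereoisomer.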

Other problems that the project intends to address include nontetrahedral stereoconfigurations, which are often encountered in coordination compounds but also exist in organic compounds, and unrecognized configurations in several specific cases, including pyramidal arrangements and cumulenes with more than three cumulated bonds. The development of principles for the recognition and encoding of configurations in coordination structures will take into account the results of the InChI for organometallics project 2009-040-2-800.

See FAQs on www.inchi-trust.org

Progress

Project announcement published in Chem Int Jan 2020, p. 30

Sep 2021 update – The project is progressing gradually. The following tasks have already been investigated, and documents with possible solutions have been prepared for further consideration by the InChI Subcommittee and InChI developers:
1. Enhanced stereo marks;
2. Atropisomers;
3. Support of longer allenes; and
4. Additional tetrahedral cases, including the special case of stereo at spiro atoms (the last part is already implemented in InChI 1.06).
According to the preliminary project plan, the task group (TG) is expected to produce recommendations this year for resolving problems with the recognition of configurations of carbohydrates represented in Haworth and chair forms.
After the discussion at the TG meeting on 21 April 2021, communication with the InChI developers (Igor Pletnev and Dmitrii Tchekhovskoi) and discussions with InChI Subcommittee members, the following decisions have been made:
– Support for atropisomers and longer allenes should be implemented in the ‘core InChI procedures’, although atropisomers need further details before a decision about implementation can be made.
– Correct recognition of configurations for Haworth and chair representations should be implemented via an external module that modifies the input structures so that the standard InChI procedures recognize the configurations correctly. Such a preliminary ‘preprocessing module’ can help to resolve other issues related to the representation of stereo configurations and probably other input problems.
– The most reasonable way of encoding enhanced stereo marks remains unclear and needs further consideration. There are some arguments for encoding them via InChI AuxInfo, with the implementation outside the core InChI procedures.

The tasks ahead include:
1. Clarify the details necessary for implementing the recognition of atropisomers.
2. Prepare proposals for the development of the external ‘preprocessing module’ to resolve problems with specific representations of stereo configurations.
3. Organize further discussions of possible ways to encode enhanced stereo marks with InChI tools.

 

June 2022 update – Since the previous report, the task group sadly lost the key InChI developer and valued group member Igor Pletnev. At the same time, we welcomed five new members who are helping us to make progress in all areas of this project. The task group had two virtual meetings in 2022, on March 21 and June 7.

We have already formulated decisions for most of our tasks; the further activities involve preparing technical specifications for the implementation of new procedures in the InChI code. More discussions are expected on the recognition and encoding of polyhedral stereo for coordination structures and on the consideration of possible new classes, including compounds with a chirality plane.
The next meeting is planned alongside the InChI technical sessions that will be held on 17-19 June 2022 in Cambridge, UK. https://www.inchi-trust.org/june-2022-technical-sessions/

Page last updated 8 June 2022

Partners