Project Details

Database on Molecular Compositions of Natural Organic Matter and Humic Substances as Measured by High Resolution Mass Spectrometry

Project No.:
Start Date: 01 November 2016
End Date: 31 August 2020
Division Name: Chemistry and the Environment Division
Division No.:


The long-term goal of the project is to lay the groundwork for a molecular systematics of non-living organic matter, using a comprehensive database of the molecular constituents of natural organic matter (NOM) and humic substances (HS) as identified by Fourier transform ion cyclotron resonance mass spectrometry (FTICR MS).

The immediate goal of the project is to compile initial data on the molecular compositions of NOM and HS from diverse environments (soil, river, marine, permafrost, peat, coal, and others) into a user-friendly, publicly available database. This database can become an important tool for elaborating criteria to evaluate data on the molecular constituents of NOM/HS, the volume of which has grown exponentially over the last decade.


Investigation of non-living organic matter at the molecular level is a focus of modern environmental chemistry. Fourier transform ion cyclotron resonance mass spectrometry (FTICR MS) is a method of choice owing to its unprecedented resolving power (Hertkorn et al. 2007). It resolves millions of carbon atoms with distinct chemical environments in both terrestrial and extraterrestrial organic matter. Data on the molecular diversity of non-living organic matter, coupled with the numerous data on biomarkers, can be used to draw conclusions about various biogeochemical processes in the present and in the past. Further progress in this field requires a publicly accessible electronic database containing information about the detected compounds and their concentrations in non-living organic matter, defined as natural organic matter (NOM) in waters and humic substances (HS) in solid systems (e.g. soils, coals, peats, bottom sediments).

Specific objectives of the proposed work are:
– developing a database architecture for storage, visualization, and classification of the big data on molecular constituents of non-living organic matter;
– compiling initial data on the molecular compositions of HS and NOM from different environments, with fully annotated protocols for their isolation, FTICR MS data acquisition, and data treatment, and uploading them into the developed database.

This will enable elaboration of criteria for further evaluation of data on molecular composition of NOM and HS.
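The kind of annotated record described in the objectives can be illustrated with a minimal sketch. It is not the project's actual database schema: all class names, field names, and sample values below are hypothetical and are chosen only to show how a sample's assigned formulae might be stored together with its isolation, acquisition, and data-treatment protocols.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AssignedFormula:
    """One assigned peak from a mass list (hypothetical field names)."""
    mz: float          # measured m/z
    intensity: float   # peak intensity, arbitrary units
    c: int             # number of carbon atoms
    h: int             # number of hydrogen atoms
    o: int             # number of oxygen atoms
    n: int = 0         # number of nitrogen atoms
    s: int = 0         # number of sulfur atoms

    def label(self) -> str:
        """Return a Hill-order formula string (C, H, then alphabetical)."""
        parts = []
        for symbol, count in (("C", self.c), ("H", self.h),
                              ("N", self.n), ("O", self.o), ("S", self.s)):
            if count:
                parts.append(symbol if count == 1 else f"{symbol}{count}")
        return "".join(parts)


@dataclass
class SampleRecord:
    """A sample with its protocols and assigned formulae (assumed layout)."""
    sample_id: str
    material: str              # "HS" or "NOM"
    environment: str           # e.g. "soil", "river", "marine", "peat", "coal"
    isolation_protocol: str
    acquisition_protocol: str
    data_treatment: str
    formulae: List[AssignedFormula] = field(default_factory=list)

    def is_fully_annotated(self) -> bool:
        """True only if all three protocol descriptions are non-empty."""
        protocols = (self.isolation_protocol,
                     self.acquisition_protocol,
                     self.data_treatment)
        return all(text.strip() for text in protocols)


# Illustrative values only; they are not taken from any real measurement.
example = SampleRecord(
    sample_id="example-001",
    material="HS",
    environment="peat",
    isolation_protocol="placeholder isolation description",
    acquisition_protocol="placeholder FTICR MS settings",
    data_treatment="placeholder formula-assignment description",
    formulae=[AssignedFormula(mz=283.0459, intensity=1.0e6, c=12, h=12, o=8)],
)
```

A record whose protocol fields are empty would fail `is_fully_annotated()`, which is one simple way an upload step could enforce the "fully annotated" requirement before data enter the database.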

To accomplish these tasks, the project brings together researchers with long-term experience in FTICR MS data acquisition and treatment who work with HS and NOM of various origins (water, soil, coal, peat, bottom sediments) in different geographic regions, using different isolation techniques and data treatment approaches. In addition, the task group includes professionals in the fields of geostatistics, information technology, and big-data treatment.

The combination of this diverse expertise and the rich FTICR MS data pools available in the working groups of the task group members is a key prerequisite for compiling initial data on the molecular constituents of NOM/HS into a user-friendly database. We will use the existing Biogeochemistry Database architecture developed by the task group members from the Lomonosov MSU, adapt it to the project needs, and distribute upload tasks among the other task group members in accordance with their specific expertise (freshwater NOM, marine DOM, terrestrial HS from different sources), so as to arrive at a demonstration database by the end of the project.


May 2017 – Project announcement published in Chem. Int. Apr 2017, p. 21.

March 2019 update – The project is close to completion. The task group has held three working meetings; the last one took place in April 2018 at the European FTMS conference in Freising (Germany). The meeting was devoted to refining the protocol for intercalibration studies on FTICR MS measurements of six samples of humic substances from soil, water, and coal. The samples were distributed to five participating labs (2 in Germany, 1 in USA, 1 in Korea, 1 in Russia). Measurement guidelines were prepared and sent along with the intercalibration set, together with metadata guidelines and guidelines for data collection. The data obtained were collected by the task group leader and treated by the Lomonosov MSU team using multidimensional statistics and machine learning approaches. The first project outcomes with regard to data treatment approaches have been published by the task group leader in PAC (Perminova, 2019, AOP 2019-02-14). The results of the intercalibration study are summed up in a draft manuscript for Pure and Applied Chemistry.


August 2020 update – A Technical Report by Zherebker, A., Kim, S., et al. entitled "Interlaboratory comparison of humic substances compositional space as measured by Fourier transform ion cyclotron resonance mass spectrometry (IUPAC Technical Report)" has been published in Pure and Applied Chemistry (published AOP 18 Aug 2020). Supporting Information for this paper includes a table describing the participating laboratories. The files with raw mass lists and assigned formulae, together with the source code (Jupyter notebooks) used for data analysis, are available in an online repository, which also contains a file with the detailed description.

Sept 2020 update – The Technical Report is in print, published in the September 2020 issue of PAC, 92(9), 1447-1467, 2020.

Page last updated 23 August 2020