Project Details
Backup, Maintenance, and Redevelopment of the IUPAC Gold Book website

Project No.:
2016-046-1-024
Start Date:
01 January 2017
End Date:
Division Name:
Committee on Publications and Cheminformatics Data Standards
Division No.:
024

Objective

There are four primary goals of this project:
1) Create a stable, modern version of the current Gold Book website (http://goldbook.iupac.org/)
2) Create a downloadable vocabulary of Gold Book terms
3) Create a simple website to administer updates to Gold Book terms
4) Create a simple Application Programming Interface (API) to access the Gold Book terms

These activities are intended to stabilize the Gold Book website and prepare it for future development via subsequent projects. In addition, the existing Digital Object Identifiers (DOIs) assigned to the Gold Book terms will be leveraged to make the terms machine-readable.

Description

The IUPAC Gold Book website (http://goldbook.iupac.org) is an important product of IUPAC. It is a visible face of the significant investment that the members of the IUPAC Divisions have made over the years to formally define many important chemical terms. It is also a valuable resource that needs to be tapped for the representation of chemical concepts in the digital realm, and this project is a significant step toward that goal.

Background
Digital representation of the IUPAC Compendium of Chemical Terminology – the “Gold Book” – started in 2002 in project 2002-022-1-024, and was continued by project 2007-016-1-024. The most recent project (2013-052-1-024) focused on building a management system for the terms in the Gold Book. A database and web portal were designed that could be used to develop a dynamic website. However, with the recent move of the main iupac.org website to WordPress (different from the platform used for old.iupac.org), that project is on hold so that issues of stability and integration into the new website can be addressed first.

Status
Currently, the Gold Book is at version 2.3.3 (as of February 2014) and is in need of significant technical maintenance. In the last few months, pages from the website have ‘disappeared’ (the files have become empty) for no apparent reason. As an emergency measure, the website was moved to a new hosting platform and over 8,000 files were recovered from the Internet Archive. The new hosting platform can be administered through a web interface, and this will allow maintenance of the existing pages while the activities of this project are undertaken.

This Project
The previous project raised a number of issues related to the data currently used in the Gold Book website. The complexity of the XML encoding of the website made it difficult to ‘pull apart’ features of the site from the text. Reliance on Python scripts to generate maps (from the XML) and ‘Goldify’ the pages further complicated maintenance. These issues, in addition to the stability of the current website and the ongoing need to update terms in the Gold Book, demand a better solution moving forward. Note: No modifications to the current terms will be made during this website redevelopment project unless requested by IUPAC Division representatives. Pending the creation of a new, stable Gold Book website, IUPAC Divisions will be requested to update/add entries as they deem necessary.

There are four primary goals of this project:

1) Create a stable, modern version of the current Gold Book website. The activities are as follows:
a) Conversion of the current XML-based webpages to dynamically generated webpages. The XML webpages will be deconstructed and imported into a MySQL database. Subsequently, a web scripting framework (CakePHP) will be used to re-create identical versions of the Gold Book pages, reusing the existing image files, MathML files and formatting. This will be developed and deployed at a URL separate from that of the current Gold Book so that the Task Group members can test and evaluate the pages prior to replacement of the existing pages. Once the new version of the site is deemed acceptable for community access, it will be made available as a beta version of the current site for three months. This will allow us to gather user feedback on any additional changes that are needed. After that period the new site will replace the existing one.
b) Website functionality (search, etc.) currently implemented in Python will be converted to CakePHP.
c) Gold Book pages will be augmented with Google Analytics code to allow monitoring of the usage of the Gold Book terms. This will generate usage statistics, demographics, and referring page information (useful for identification of users).
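
As an illustration of step (a), the deconstruction of an XML page into database fields can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the XML layout, the element names, the `terms` table and the placeholder DOI are all hypothetical, and SQLite (from the Python standard library) stands in for MySQL so that the example is self-contained. The project itself plans to use CakePHP on top of MySQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout for one term; the real Gold Book files
# are likely more complex than this placeholder.
SAMPLE_XML = """<entry id="T00001">
  <title>example term</title>
  <definition>Placeholder definition text.</definition>
  <doi>10.0000/example.T00001</doi>
</entry>"""


def parse_entry(xml_text):
    """Extract the fields of one term entry into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "code": root.get("id"),
        "title": root.findtext("title", default="").strip(),
        "definition": root.findtext("definition", default="").strip(),
        "doi": root.findtext("doi", default="").strip(),
    }


def create_schema(conn):
    """Create the (assumed) table that holds one row per term."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS terms ("
        "code TEXT PRIMARY KEY, title TEXT NOT NULL, "
        "definition TEXT NOT NULL, doi TEXT NOT NULL)"
    )


def import_entry(conn, entry):
    """Insert a parsed entry, replacing any existing row with the same code."""
    conn.execute(
        "INSERT OR REPLACE INTO terms (code, title, definition, doi) "
        "VALUES (:code, :title, :definition, :doi)",
        entry,
    )
    conn.commit()


# In-memory SQLite database used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
create_schema(conn)
import_entry(conn, parse_entry(SAMPLE_XML))
```

Keeping the parsing step separate from the database step means the same parsed dictionary could later feed the page templates, the vocabulary downloads and the API.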

2) Create a downloadable vocabulary of Gold Book terms – the Gold Book terms and DOIs will be made available in different formats (CSV, XML, JSON) to make it easier for developers of science-related sites to use the terms on the Gold Book website. Documentation of the vocabularies and guidelines for appropriate usage will also be made available.
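
A minimal sketch of how one list of terms could be serialized to the three formats named above. The field names (`code`, `title`, `doi`), the XML element names and the sample records are assumptions made for illustration only; they are not the actual Gold Book vocabulary or its published schema.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Placeholder records; real terms and DOIs would come from the database.
TERMS = [
    {"code": "T00001", "title": "example term", "doi": "10.0000/example.T00001"},
    {"code": "T00002", "title": "another term", "doi": "10.0000/example.T00002"},
]
FIELDS = ["code", "title", "doi"]


def to_csv(terms):
    """Serialize the terms as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(terms)
    return buffer.getvalue()


def to_json(terms):
    """Serialize the terms as a JSON object with a single 'terms' list."""
    return json.dumps({"terms": terms}, indent=2, ensure_ascii=False)


def to_xml(terms):
    """Serialize the terms as a flat XML document (hypothetical element names)."""
    root = ET.Element("vocabulary")
    for term in terms:
        node = ET.SubElement(root, "term", code=term["code"])
        ET.SubElement(node, "title").text = term["title"]
        ET.SubElement(node, "doi").text = term["doi"]
    return ET.tostring(root, encoding="unicode")
```

Generating all three files from the same in-memory list keeps the downloads consistent with one another.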

3) Create a simple website to administer updates to Gold Book terms – once the webpage content (Gold Book terms and related information) is in the MySQL database a simple administrative interface will be created to update the text of the terms. This will be tested by Dr. Chalk based on updates submitted from IUPAC Division representatives.

4) Create a simple Application Programming Interface (API) to access the Gold Book terms – as a forward-thinking activity, a small API will be developed to demonstrate the potential for the website to provide information about the Gold Book terms programmatically via a JSON-encoded representation of the terms. This proof-of-concept will inform use cases for future project proposals.
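
The kind of JSON response such an API might return can be sketched as below. Everything specific here is an assumption for illustration: the `/terms/<code>` path, the field names, the status codes chosen and the placeholder record. The only external fact relied on is that a DOI can be turned into a resolvable link by prefixing it with `https://doi.org/`.

```python
import json

# Placeholder data keyed by a hypothetical term code.
TERMS = {
    "T00001": {
        "title": "example term",
        "definition": "Placeholder definition text.",
        "doi": "10.0000/example.T00001",
    },
}


def get_term(path):
    """Resolve a request path such as /terms/T00001 to (status, JSON body)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "terms":
        return 400, json.dumps({"error": "expected /terms/<code>"})
    term = TERMS.get(parts[1])
    if term is None:
        return 404, json.dumps({"error": "term not found"})
    # Add the code and a resolvable DOI link to the stored fields.
    payload = dict(term, code=parts[1], url="https://doi.org/" + term["doi"])
    return 200, json.dumps(payload)


status, body = get_term("/terms/T00001")
print(status, body)
```

The function is deliberately independent of any web framework, so the same lookup logic could sit behind whichever server layer the project adopts.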

Project code will be backed up on an ongoing basis to a GitHub repository, which will also provide versioning and a history of the code development. This, plus documentation within the code, will make the code easier to maintain and to hand over to future developers.
Additionally, a log of time spent on the project will be kept and used to develop an estimate of the ongoing maintenance needs of the Gold Book website.

Progress

This project is complementary to project 2013-052-1-024.

Page last updated 6 Jan 2017