(710g) Open Chemistry, Avogadro and Jupyter: User Friendly Frontends

Hanwell, M. D., Kitware
Open, interactive interfaces employing data-centric workflows have been developed, reusing best-of-breed open source software to deliver an integrated platform for knowledge discovery. The Open Chemistry project serves as an umbrella, Avogadro offers a desktop molecular editor, and the Jupyter project offers an electronic notebook interface, with server-side software kernels executing Python code. The JupyterLab frontend provides a web-based interface with interactive cells where code can be edited and panels where data can be visualized. Coupling these interfaces with a data server capable of triggering simulations, analyses, and other workflows makes it possible to seamlessly execute codes from a pre-configured environment, store data, and share the results. The use of software containers improves both modularity and reproducibility.

Extending the Python software kernels and web interface with chemistry-specific capabilities results in a software environment tailored to chemistry. The development of frontends is important to foster wider adoption, and the use of reproducible electronic notebooks offers an environment where simulation steps can be shared accurately and robustly. Collaborative editing capabilities enable peer-to-peer review of the complete process, from initial chemical coordinates to final results. A data-centric approach, with Python scripting, web programming interfaces, and modern web visualization widgets, creates an environment that can be used for teaching, research, and wider dissemination.

The development of an open ecosystem of tools will be described, with open programming interfaces that feature working reference implementations. The best way to make molecular simulation more mainstream is to embrace these modern, data-centric user interfaces, use modern, documented file formats for exchange, and make all steps of the process transparent. Software containers are used to package computational codes and to run them on cloud or high-performance computing resources. Improved standardization of file formats will make it easier to add more codes, and open systems enable extension by simulation code developers, researchers, developers, and instructors.
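
As a rough illustration of what a documented exchange format can look like in a Python notebook cell, the sketch below stores a water molecule as JSON, reads it back, and checks that the arrays are consistent. The field names are loosely modelled on the Chemical JSON layout and are assumptions rather than a specification, the coordinates are illustrative values only, and the helper names `validate` and `round_trip` are hypothetical.

```python
import json

# Hypothetical minimal record for a water molecule. Field names are assumed
# for illustration; coordinates are approximate and in angstroms.
water = {
    "name": "water",
    "atoms": {
        "elements": {"number": [8, 1, 1]},
        # Flat array: x, y, z for each atom in turn.
        "coords": {"3d": [0.0, 0.0, 0.1173,
                          0.0, 0.7572, -0.4692,
                          0.0, -0.7572, -0.4692]},
    },
    "bonds": {
        # Flat array: pairs of atom indices, one pair per bond.
        "connections": {"index": [0, 1, 0, 2]},
        "order": [1, 1],
    },
}


def validate(molecule):
    """Check that the flat arrays agree with the atom count; return the count."""
    n_atoms = len(molecule["atoms"]["elements"]["number"])
    coords = molecule["atoms"]["coords"]["3d"]
    if len(coords) != 3 * n_atoms:
        raise ValueError("expected three coordinates per atom")
    pairs = molecule["bonds"]["connections"]["index"]
    orders = molecule["bonds"]["order"]
    if len(pairs) != 2 * len(orders):
        raise ValueError("expected two atom indices per bond")
    if any(i < 0 or i >= n_atoms for i in pairs):
        raise ValueError("bond refers to a missing atom")
    return n_atoms


def round_trip(molecule):
    """Serialize to JSON text and parse it back, as an exchange would."""
    text = json.dumps(molecule, indent=2)
    return json.loads(text)


restored = round_trip(water)
atom_count = validate(restored)
```

Because the record is plain JSON, the same text could be written by a simulation code, stored on a data server, and read in a notebook without a custom parser, which is the kind of transparency the abstract argues for.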