(729g) Recent Developments in the Signac Data Management Framework
AIChE Annual Meeting
2019
Computational Molecular Science and Engineering Forum
Making Molecular Simulation a Mainstream Chemical Engineering Tool
Thursday, November 14, 2019 - 5:20pm to 5:40pm
Computational resources for high-throughput data generation offer considerable potential for accelerating scientific discovery, especially when used in conjunction with well-managed computational workflows. The open-source signac data management framework enables researchers to maintain well-formed, reusable data spaces from early exploration through production runs on supercomputers. This is achieved through a transparent data and workflow model and a simple, unobtrusive programmatic interface. The framework is application-agnostic and has been applied to molecular simulations, quantum chemistry, photonics, computational fluid dynamics, machine learning, graph mining, and the organization of experimental data. Recently, the framework has been significantly extended. New data management features include HDF5 integration for rapid access to large numerical data arrays; tools to import and export data spaces for long-term archival and publication of data sets; and closer integration with the scientific Python ecosystem, supporting export to pandas data frames and visualization in Jupyter notebooks. Furthermore, workflow automation has been expanded to support a larger set of supercomputers and to allow more complex operations. We show examples of recent scientific applications that demonstrate the efficacy and versatility of signac across a range of research domains.
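
The notion of a data space that is addressed by its parameters can be illustrated with a toy sketch in plain Python. This is not signac's API or its implementation: the helper names (`job_id`, `init_job`, `find_jobs`), the `state_point.json` file name, and the choice of hash are all assumptions made for illustration. The sketch only shows the general idea of mapping each parameter set to its own directory and searching those directories by parameter value.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def job_id(state_point: dict) -> str:
    """Derive a stable identifier from the parameters (hypothetical helper)."""
    # Sorting keys makes the identifier independent of dictionary ordering.
    blob = json.dumps(state_point, sort_keys=True).encode("utf-8")
    return hashlib.md5(blob).hexdigest()


def init_job(workspace: Path, state_point: dict) -> Path:
    """Create (or reuse) the directory that holds data for these parameters."""
    job_dir = workspace / job_id(state_point)
    job_dir.mkdir(parents=True, exist_ok=True)
    # Store the parameters next to the data so the directory is self-describing.
    (job_dir / "state_point.json").write_text(
        json.dumps(state_point, sort_keys=True)
    )
    return job_dir


def find_jobs(workspace: Path, **query) -> list:
    """Return the stored parameter sets that match every key/value in the query."""
    matches = []
    for path in sorted(workspace.glob("*/state_point.json")):
        state_point = json.loads(path.read_text())
        if all(state_point.get(key) == value for key, value in query.items()):
            matches.append(state_point)
    return matches


# Build a small data space in a temporary directory (illustrative parameters).
workspace = Path(tempfile.mkdtemp())
for temperature in (1.0, 1.5, 2.0):
    init_job(workspace, {"temperature": temperature, "pressure": 1.0})

# Initializing the same parameters again maps to the same directory.
same = init_job(workspace, {"pressure": 1.0, "temperature": 1.5})
```

In this sketch, `find_jobs(workspace, temperature=2.0)` returns the single matching parameter set, and re-initializing an existing parameter set creates no new directory. The actual framework provides its own interface for these tasks, which should be taken from its documentation rather than from this example.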