(39i) Signac: Data Management and Workflows for the Molecular Sciences


Butler, B. - Presenter, University of Michigan
Glotzer, S. C., University of Michigan
The signac data management framework (https://signac.io) helps researchers execute reproducible computational studies, scaling from laptops to supercomputers and emphasizing portability and fast prototyping. With signac, users can track, search, and archive data and metadata for file-based workflows and automate workflow submission on high-performance computing (HPC) clusters. We will discuss recent improvements to the software’s feature set, scalability, scientific applications, usability, and community. Newly implemented synced data structures, workflow subgraph execution, and performance optimizations will be covered, as well as recent research using the framework and the project’s efforts to improve documentation, contributor onboarding, and governance. Computational molecular scientists need heterogeneous data processing with bespoke parallel execution patterns on widely varying hardware. We will demonstrate how these new capabilities meet those needs, enhancing signac’s integrated approach to molecular simulations and providing the high level of flexibility this domain of research requires.