(169g) A Platform Facilitating Workflow Management of Multi-Task, Multi-Scale Simulations for Distributed Computing Environments

Authors: 
Preisig, H. A., Norwegian University of Science and Technology (NTNU)
Rusche, H., Wikki
Karolius, S., Norwegian University of Science and Technology
Thombre, M., Birla Institute of Technology and Science

The MoDeNa project [4] is developing a framework for performing simulations of multi-scale systems. The motivation arises from the need to design and control the material structure of polyurethane foam by performing detailed simulations that connect models across all levels of scale, i.e. from molecular dynamics, through CFD, to the final product. Moreover, the project focuses on reducing computational cost by employing inexpensive surrogate models that approximate the behaviour of the corresponding complex models on the lower scales.

The practical challenges, from a software perspective, are threefold: connecting models written in different programming languages, facilitating communication between the models, and dynamically modifying the computational workflow during the simulation. The management of the computational workflow is particularly important because there is no way to know, a priori, which models will have to be executed, when, and in what order. The solution to this problem is to employ an orchestrator, i.e. a program that manages the computational jobs: where, when and in what order they are executed.

The most basic orchestrator is a queue, where jobs are executed in sequence and new jobs (N) are staged at the back.
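
The queue behaviour described above can be illustrated with a minimal Python sketch. It is not part of MoDeNa or FireWorks; the function `run_queue` and the job names A, B, C and N are hypothetical and chosen only for illustration. Each job is a callable that may return new jobs, which are appended at the back.

```python
from collections import deque


def run_queue(jobs):
    """Run (name, action) jobs strictly in arrival order.

    Hypothetical helper for illustration: an action may return a list of
    new (name, action) jobs, which are staged at the back of the queue.
    Returns the job names in the order they were executed.
    """
    queue = deque(jobs)
    executed = []
    while queue:
        name, action = queue.popleft()   # always take the oldest job
        new_jobs = action() or []
        executed.append(name)
        queue.extend(new_jobs)           # new jobs go to the back
    return executed


def job_a():
    # Job A stages a new job N while it runs.
    return [("N", lambda: None)]


order = run_queue([("A", job_a), ("B", lambda: None), ("C", lambda: None)])
print(order)  # ['A', 'B', 'C', 'N']
```

Job N is requested by A but runs only after B and C, which is acceptable only when nothing ahead of N in the queue depends on its result.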

This is sufficient for systems where the sequence is known a priori, e.g. for a simple process simulator, but MoDeNa requires a more flexible orchestrator.

The reason is the multi-scale connection of the models, where every model requires information from models on a lower scale before it can be executed. Specifically, the framework must accommodate the functional dependency of models across the scales. The framework also allows one dependency to trigger a chain of new dependencies, some of which must be executed in sequence while others can be run in parallel. Moreover, in order to ensure the range of validity and accuracy of the surrogate models, the framework also allows design of experiments and parameter fitting procedures to be started by any model, at any time.
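
As a rough illustration of how a surrogate model can start a design of experiments and a parameter fit on its own, consider the following Python sketch. It is an assumption-laden toy, not the MoDeNa implementation: the class `Surrogate`, the helper `fit_line`, the linear surrogate form, the evenly spaced sampling and the stand-in detailed model are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


class Surrogate:
    """Hypothetical linear surrogate with an explicit range of validity."""

    def __init__(self, detailed_model, lo, hi, n_samples=5):
        self.detailed_model = detailed_model  # expensive lower-scale model
        self.n_samples = n_samples
        self.fits = 0                         # number of fitting runs so far
        self._fit(lo, hi)

    def _fit(self, lo, hi):
        # Design of experiments (assumed): evenly spaced samples on [lo, hi].
        step = (hi - lo) / (self.n_samples - 1)
        xs = [lo + i * step for i in range(self.n_samples)]
        ys = [self.detailed_model(x) for x in xs]
        # Parameter fitting: least-squares line through the samples.
        self.slope, self.intercept = fit_line(xs, ys)
        self.lo, self.hi = lo, hi
        self.fits += 1

    def evaluate(self, x):
        # A request outside the validated range triggers a new fit first.
        if not (self.lo <= x <= self.hi):
            self._fit(min(self.lo, x), max(self.hi, x))
        return self.slope * x + self.intercept


# Stand-in for a detailed model; exactly linear so the surrogate is exact.
surrogate = Surrogate(lambda x: 2.0 * x + 1.0, lo=0.0, hi=1.0)
inside = surrogate.evaluate(0.5)    # within range: no new fit
outside = surrogate.evaluate(3.0)   # out of range: range extended, refit
print(inside, outside, surrogate.fits)
```

In a real workflow the refit would be submitted as new jobs to the orchestrator rather than executed inline as it is here.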

The following example shows how the orchestrator allows the workflow to grow with arbitrary complexity when the new job (N) and all its dependencies are added to the workflow.
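
A minimal Python sketch of this behaviour is given here under stated assumptions. It does not use the FireWorks API; the function `orchestrate`, the `catalogue` dictionary and the job names are hypothetical. Dependencies are discovered only when a job is reached, and the missing ones are inserted in front of it. The sketch assumes the dependencies contain no cycles and runs everything sequentially, although independent dependencies could run in parallel.

```python
from collections import deque


def orchestrate(initial, catalogue):
    """Execute jobs, inserting dependencies on demand.

    initial:   job names in the originally planned order.
    catalogue: hypothetical mapping of job name -> list of job names whose
               results are needed first (assumed to be acyclic).
    Returns the job names in the order they were executed.
    """
    pending = deque(initial)
    done = []
    while pending:
        name = pending.popleft()
        if name in done:
            continue                      # result already available
        missing = [dep for dep in catalogue[name] if dep not in done]
        if missing:
            # Detour: run the missing dependencies first, then retry the job.
            pending.extendleft(reversed(missing + [name]))
            continue
        done.append(name)                 # stand-in for running the job
    return done


catalogue = {
    "A": [],
    "B": [],
    "N": ["D1", "D2"],   # the new job N needs two lower-scale results
    "D1": ["D3"],        # one dependency triggers a further dependency
    "D2": [],
    "D3": [],
}

order = orchestrate(["A", "N", "B"], catalogue)
print(order)  # ['A', 'D3', 'D1', 'D2', 'N', 'B']
```

Unlike the plain queue, the workflow here grows in the middle: N pulls in D1 and D2, and D1 in turn pulls in D3, before B is reached.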

The orchestrator used for MoDeNa is the Python-based program FireWorks [2], originally developed for the Materials Project [1]. In addition to providing all the flexibility needed for dynamic management of the MoDeNa workflow, it also supports staging of jobs on arbitrary computing resources. The use of a workflow manager, such as FireWorks, makes it easier to connect models written in different programming languages or built with different simulation tools. By adding a framework that facilitates communication and connection through the same backend that FireWorks uses (MongoDB [3]), MoDeNa makes it easier to cooperate across disciplines to create models that easily incorporate details that would otherwise be assumptions in a simplified model.

References

[1] Anubhav Jain, Shyue Ping Ong, Geoffroy Hautier, Wei Chen, William Davidson Richards, Stephen Dacek, Shreyas Cholia, Dan Gunter, David Skinner, Gerbrand Ceder, and Kristin A. Persson. The Materials Project: A materials genome approach to accelerating materials innovation. APL Materials, 1(1):011002, 2013.
[2] FireWorks: workflow software.
[3] MongoDB: open-source, document database.
[4] MoDeNa: The MoDeNa project.