(506a) Open Source Data Science Education Materials for Chemical Engineers | AIChE

Beck, D. - Presenter, University of Washington
Pfaendtner, J., University of Washington
Wolf, C., University of Washington
Curtis, C. D., University of Washington
All areas of science and engineering have benefited from a data deluge driven by new instruments, scalable computing architectures and robotics. Chemical Engineering (ChemE) is at the forefront of this wave as a result of the commoditization of tools such as robotic instrumentation, high-throughput experimental methods including gene sequencers, a proliferation of inexpensive, networked sensors, and mixed-modality data generators. To be competitive in the changing landscape of industry and academia, the next generation of ChemEs will need to know how to use data science tools, such as scalable data management, statistical and machine learning, and visualization techniques, to communicate their findings effectively to stakeholders and peers.

The University of Washington (UW) has embraced data science in a campus-wide initiative, and the ChemE department at UW has led the effort to integrate highly contextualized molecular data science instruction at the graduate level, for both MS and PhD students. ChemE @ UW offers two transcriptable options to graduate students: Advanced Data Science and Data Science. Students who participate in these options receive a graduate degree in Chemical Engineering with transcripted recognition of their participation in the program, e.g. a PhD in ChemE with Advanced Data Science Option, or an MS in ChemE with Data Science Option. The Advanced Data Science option is designed for chemical engineers who want to design and create new data science algorithms and tools, e.g. new machine learning methods for sparse molecular descriptor spaces. The Data Science option is intended for data science tool users who want to be confident in their selection of methods and application of best practices, e.g. choosing an effective neural network architecture for a predictive model of blood-brain barrier permeability.

This presentation will review the structure of ChemE @ UW’s graduate data science options and professional programs, from their inception as a result of two NSF training grants, in the context of the UW campus-wide investment in data science education, research and training. It will highlight the NSF NRT Data Intensive Research Enabling Clean Technology (DIRECT) program, which has been the primary vehicle for sustainable graduate molecular data science education in the department. The program includes 1) new graduate coursework, including active-learning courses that teach data science skills in a manner contextualized to molecules, and 2) a capstone program that uses project-based learning to practice and apply data science skills to challenging real-world problems, supplied by internal and external partners, in a team-based setting. The presentation will close with outcomes from early cohorts that participated in the transcriptable options.