(141b) An Intelligent Machine for Document Preparation
AIChE Annual Meeting
Monday, October 29, 2018 - 12:55pm to 1:20pm
This talk will describe a proof-of-concept project to develop an artificial intelligence-based solution for rapidly assembling an End-of-Phase II (EOPII) briefing document for molecule A. Both structured and unstructured data derived from electronic lab notebooks and technical report repositories were accessed via a data integration layer and analyzed using annotation-free natural language processing tools. A corpus of 70,000 PDF files from five molecules, including molecule A, was ingested and used as the design corpus for training. Prior EOPII documents from the four molecules other than molecule A were used to generate a set of training questions and answers. The design corpus was then used to help assemble the EOPII document, given the key questions that needed to be asked. The prototype tool featured natural language search for retrieving key textual phrases, a table builder, an image/plot finder, and a free-text addition capability, allowing users to collaboratively create an HTML or DOCX file as output. The tool was compared against the current internal search tool, and comparison metrics were generated.