(84d) Generating Insights Using Text Mining Methodologies from AIChE Survey and Industrial Documents | AIChE


Zink, A. - Presenter, The Dow Chemical Company
Dessauer, M. - Presenter, Dow Chemical
Webb, M. - Presenter, Dow Chemical Company
Information extraction (IE)[1] is an important sub-area of text mining that aims to locate specific data in natural language documents and identify the relationships among them. Because noun phrases (NPs) usually carry the key information of a text corpus, even when taken out of context, NP extraction is widely applied among IE approaches[2]. In the chemical industry, NP extraction can be used to interpret survey data, incident reports, and work orders. In this paper, we show an example of analyzing a satisfaction survey dataset using NP extraction together with the Net Promoter Score (NPS). Overall attendee satisfaction is estimated by calculating the NPS from structured survey data, e.g., ratings of satisfaction level. NPS condenses satisfaction into a single score that is popular among B-to-C companies for assessing improvement opportunities in their products and/or services, and it provides a convenient benchmark scale. Additionally, the unstructured data, such as comments from the attendees, are analyzed using NP extraction. Key concerns can thus be extracted and actions defined to address them in future events.
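
As an illustration of the NPS calculation mentioned above, the following minimal sketch uses the standard NPS definition: ratings on a 0-10 scale, with promoters scoring 9-10 and detractors scoring 0-6, and NPS equal to the percentage of promoters minus the percentage of detractors. The abstract does not state the rating scale of the actual survey, so the 0-10 scale, the function name, and the sample ratings are assumptions made only for illustration.

```python
def net_promoter_score(ratings):
    """Return the NPS (between -100 and 100) for ratings on a 0-10 scale.

    Assumes the standard NPS convention: promoters rate 9-10,
    detractors rate 0-6, and passives (7-8) count only in the total.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical ratings, not data from the survey discussed in the abstract.
sample_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
sample_nps = net_promoter_score(sample_ratings)
print(f"NPS = {sample_nps:.1f}")
```

With these sample ratings there are five promoters and three detractors among ten responses, so the score is 100 * (5 - 3) / 10 = 20.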

Further, we illustrate the application of text analytics to the challenge of optimizing and concentrating similar sessions at AIChE meetings. Common approaches, such as term frequency–inverse document frequency (TF-IDF)[3] weighting followed by similarity and clustering techniques, as well as topic extraction, allow similar session descriptions to be identified.
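
The TF-IDF and similarity step can be sketched with the Python standard library alone. This is a minimal illustration, not the pipeline used in the work: the tokenizer, the smoothed IDF variant (one common convention), the helper names, and the three session descriptions are all assumptions. Clustering and topic extraction are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical session descriptions, invented for this example.
sessions = [
    "Text mining of incident reports for process safety",
    "Mining incident reports with text analytics for process safety",
    "Catalyst design for polymer reactors",
]
vectors = tfidf_vectors(sessions)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"sessions 0 and 1: {sim_related:.2f}")
print(f"sessions 0 and 2: {sim_unrelated:.2f}")
```

In this toy example the first two descriptions share most of their terms and therefore score higher than the unrelated pair, which is the signal a clustering step would use to group candidate sessions.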

Finally, we expand the scope to text documents relevant to chemical manufacturing, such as incident reports, reliability documents, and work orders, and examine various embedding approaches that provide useful insights for categorizing technical challenges.


  1. Mooney, R. J.; Bunescu, R., Mining knowledge from text using information extraction. SIGKDD Explor. Newsl. 2005, 7 (1), 3-10.
  2. Handler, A.; Denny, M.; Wallach, H.; O’Connor, B., Bag of what? Simple noun phrase extraction for text analysis. In Proceedings of the First Workshop on NLP and Computational Social Science, 2016; pp 114-124.
  3. Sarkar, D., Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data. Apress: 2016; p 385.