(560cl) Exploratory Textual Data Analysis for Understanding the Research Development of Oxygen Reduction Reaction
AIChE Annual Meeting
2019
Catalysis and Reaction Engineering Division
Poster Session: Catalysis and Reaction Engineering (CRE) Division
Wednesday, November 13, 2019 - 3:30pm to 5:00pm
In this work, we demonstrate a web-based search engine for exploring the large body of literature on the oxygen reduction reaction. Its core component is a set of machine learning algorithms built on task-specific natural language processing (NLP) techniques, which are key to the search engine's success. The search engine starts with an HTML web scraper that systematically extracts unstructured textual data from available websites. A text normalization step then filters out non-informative components of the articles, such as HTML syntax, punctuation, and stop words. Next, a series of pre-trained surrogate models (i.e., categorical topic classifiers, a named-entity recognizer, and a sentiment analyzer) processes each article and highlights the keywords that best describe it, such as the type of catalytic material, the research scope, and the primary techniques. Document-level information, e.g., publication time and impact factor, is also extracted for further analysis. Finally, we perform extensive exploratory analysis of the extracted keywords and their associated documents to draw insights. This analysis includes visualizing the histogram distribution of the investigated catalytic materials and ranking research topics by year, among others. The analytical summary produced by our textual search engine has the potential to guide future research studies and strategic decisions.
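The text normalization and material-count steps described above could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the stop-word list, the material vocabulary, the function names, and the sample documents are all hypothetical, and a simple vocabulary lookup stands in for the pre-trained named-entity recognizer mentioned in the abstract.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list (assumption, not from the abstract).
STOP_WORDS = {"a", "an", "and", "as", "for", "in", "is", "of", "on",
              "the", "to", "was", "with"}

# Hypothetical material vocabulary; stands in for a trained entity recognizer.
MATERIAL_TERMS = {"platinum", "palladium", "graphene", "cobalt"}


class _TextExtractor(HTMLParser):
    """Collects text nodes, dropping tags and script/style content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def normalize(html):
    """Strip HTML syntax, punctuation, and stop words; return lowercase tokens."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Keeping only alphanumeric runs discards punctuation.
    tokens = re.findall(r"[a-z0-9]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def material_counts(documents):
    """Count how many documents mention each material term at least once."""
    counts = Counter()
    for html in documents:
        counts.update(set(normalize(html)) & MATERIAL_TERMS)
    return counts


# Made-up sample documents for illustration.
docs = [
    "<p>Platinum nanoparticles on graphene for the oxygen reduction reaction.</p>",
    "<div>A cobalt catalyst was compared with <b>platinum</b>.</div>",
]
print(normalize(docs[0]))
print(dict(material_counts(docs)))
```

The per-material document counts returned by `material_counts` are the kind of data a histogram of investigated catalytic materials would be drawn from; grouping the same counts by publication year would give a topic-versus-year ranking.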