(4ec) Graphical Model Framework for Automated Annotation of Cell Identities in Dense Cellular Images

Authors 

Chaudhary, S. - Presenter, Georgia Institute of Technology
Lee, S. A., Georgia Institute of Technology
Li, Y., Georgia Institute of Technology
Lu, H., Georgia Institute of Technology
C. elegans is an extremely useful model organism for understanding the neural basis of behavior because of several advantageous features: the nematode has a small, fully mapped nervous system, displays complex behaviors, and is amenable to fluorescence imaging. Recent advances in microscopy have enabled simultaneous recording of the activity of the entire brain of freely behaving C. elegans (whole-brain imaging), producing a wealth of rich datasets. However, to compare recorded neuron activities across animals and experimental conditions, the biological names of neurons must be identified in the videos. Each neuron of C. elegans has been assigned a biological name, and a large body of literature describing the properties of individual neurons is available. Incorporating this literature into whole-brain imaging experiments is not possible without resolving cell identities.

Although some annotation methods exist, they were either not designed for dense images [1-2], and thus face lower accuracy requirements, or rely on registration [3-4], which performs poorly against the noise common in such data. The lack of an accurate automated method is increasingly the bottleneck in analyzing the large numbers of datasets produced by multi-cell and whole-brain functional imaging [5].
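
As a concrete point of reference for what a position-only matching baseline looks like, the toy sketch below matches detected cells to atlas positions purely by distance; the atlas positions, noise level, and dropped cell are synthetic assumptions for illustration and this is not the code of any cited method.

    # Toy baseline: match detected cells to atlas positions purely by distance
    # (Hungarian assignment). Atlas positions, noise level, and the dropped cell
    # are synthetic assumptions for illustration, not data from any cited method.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(0)
    atlas = rng.uniform(0.0, 1.0, size=(10, 3))                 # hypothetical atlas positions (x, y, z)
    observed = atlas + rng.normal(0.0, 0.05, size=atlas.shape)  # same cells with positional noise
    observed = np.delete(observed, 3, axis=0)                   # simulate one missing cell

    # Cost = Euclidean distance between every detected cell and every atlas neuron.
    cost = np.linalg.norm(observed[:, None, :] - atlas[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)                    # minimum-cost one-to-one matching
    print(list(zip(rows.tolist(), cols.tolist())))              # detected index -> atlas index

Even modest positional noise can swap labels between nearby cells in such distance-only matching, and a missing cell silently changes which atlas entry goes unassigned; the matching has no mechanism to penalize these inconsistencies.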

To address these challenges, we developed a probabilistic graphical model framework, CRF_Cell_ID [6]. The algorithm, based on Conditional Random Fields [7], annotates cell identities by maximizing the intrinsic similarity between the geometric relationships of cells in the image and those in the atlas. We quantitatively show that our method achieves higher accuracy and is more robust than popular methods against two types of noise prominent in the data: position variability and missing cells. CRF_Cell_ID further boosts accuracy by building new data-driven atlases in a far more computationally efficient manner than previous methods. Moreover, data-driven atlases can be built from partially annotated datasets or from strains with non-overlapping cell sets, a task not possible with previous registration-based methods.
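
As a rough, self-contained illustration of the idea (a minimal sketch, not the CRF_Cell_ID implementation), a CRF-style labeling scores each candidate assignment with unary terms for positional proximity plus pairwise terms that reward preserving the atlas's relative ordering, and selects the highest-scoring assignment. The neuron names below are real, but their 1-D positions and the potential functions are hypothetical, and brute-force enumeration stands in for proper inference at this toy scale.

    # Minimal CRF-style sketch (not the CRF_Cell_ID code): score candidate
    # labelings with unary terms (positional proximity to the atlas) plus
    # pairwise terms (preserve the atlas's anterior-posterior ordering), then
    # take the highest-scoring labeling. Neuron names are real; their 1-D
    # positions and the potentials are hypothetical.
    import itertools
    import numpy as np

    atlas = {"AVAL": 0.10, "RIML": 0.25, "AIYL": 0.40, "SMDVL": 0.55, "RIBL": 0.70}
    detected = np.array([0.12, 0.27, 0.52, 0.71])   # detected cells; one atlas cell is unobserved

    names = list(atlas)
    atlas_pos = np.array([atlas[n] for n in names])

    def unary(i, lab):
        """Higher when detected cell i lies close to the atlas position of label lab."""
        return -abs(detected[i] - atlas_pos[lab])

    def pairwise(i, j, lab_i, lab_j):
        """Reward label pairs whose atlas ordering matches the observed ordering."""
        same_order = (detected[i] < detected[j]) == (atlas_pos[lab_i] < atlas_pos[lab_j])
        return 1.0 if same_order else -1.0

    def score(assignment):
        s = sum(unary(i, lab) for i, lab in enumerate(assignment))
        s += sum(pairwise(i, j, assignment[i], assignment[j])
                 for i in range(len(assignment)) for j in range(i + 1, len(assignment)))
        return s

    # Each detected cell receives a distinct label; brute-force enumeration of
    # injective assignments stands in for CRF inference at this toy scale.
    best = max(itertools.permutations(range(len(names)), len(detected)), key=score)
    print({f"cell_{i}": names[lab] for i, lab in enumerate(best)})

Because the pairwise terms depend only on relative arrangement, this style of formulation is less sensitive to global position shifts and missing cells than distance-only matching.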

We demonstrate broad applicability by identifying cells in gene expression localization, multi-cell calcium imaging, and whole-brain imaging experiments in C. elegans. We also demonstrate generalizability across strains and imaging conditions by 1) extending the framework to incorporate information available in different strains, such as landmark cells or chromatic codes [4], and 2) showing that it handles images with different animal orientations. Finally, we demonstrate the power of the framework in a real whole-brain imaging use case, where automatic annotation enabled us to identify two distinct groups of cells in the recordings: one encoding responses to food and one controlling spontaneous locomotion.

Our framework will enable fast and unbiased cell identification in whole-brain videos, generating large annotated datasets. This will allow newer computational techniques that were previously limited by the lack of cell identities, such as Tensor Component Analysis [8] or deep-learning-based methods, to be applied to the data, and will advance our understanding of how global brain dynamics generate and coordinate behavior.
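
To indicate what such downstream analysis can look like once identities are resolved, below is a minimal NumPy-only sketch of Tensor Component Analysis, i.e., a CP decomposition fit by alternating least squares, applied to a synthetic neurons × time × trials tensor. The random data and bare-bones solver are stand-ins for illustration, not the implementation used in the cited work.

    # Minimal NumPy-only sketch of Tensor Component Analysis (rank-3 CP
    # decomposition fit by alternating least squares) on a synthetic
    # neurons x time x trials tensor. The random data and bare-bones solver
    # are stand-ins for illustration, not the cited implementation.
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 200, 12))    # neurons x time x trials (synthetic)
    rank = 3

    def unfold(T, mode):
        """Matricize T along `mode` (mode indexes rows, remaining axes flattened)."""
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def khatri_rao(A, B):
        """Column-wise Kronecker product of A (I x R) and B (J x R) -> (I*J x R)."""
        return np.einsum("ir,jr->ijr", A, B).reshape(-1, A.shape[1])

    factors = [rng.normal(size=(dim, rank)) for dim in X.shape]
    for _ in range(50):                   # ALS sweeps
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])   # matches unfold's column order
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(kr.T @ kr)

    # factors[0]: neuron loadings, factors[1]: temporal components, factors[2]: trial weights
    print([f.shape for f in factors])

With annotated recordings, the neuron-loading factor can be interpreted against known cell identities and compared across animals, which is what automated cell identification makes possible.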

Research Interests – Overall, my research interest is to elucidate how brain dynamics (global neural activity in the brain) generate flexible and long-lasting behavior; how brain dynamics are generated, sustained, and flexibly reconfigured across cognitive and motor tasks; how brain dynamics are affected by internal states (such as hunger and sleep); and how brain dynamics are altered in disease. Experimentally, I use C. elegans (a genetic model organism in neuroscience), microfluidics, and fluorescence microscopy. To analyze data, I develop computational methods for image processing. My current research is aimed at building computer vision and machine learning tools that enable biologists and neuroscientists to process 3D + time neuron-activity image datasets in an automated and high-throughput manner, freeing them to focus on the biological questions of interest. Research interest keywords: Neurobiology, Neuroethology, Machine learning/Deep learning, Image processing, Microfluidics, C. elegans.

Teaching Interest – Machine learning and deep learning are increasingly integrated into all areas of science and research. I believe that training chemical engineers in these skills can further empower them to apply quantitative data analysis in their own research areas as well as to explore diverse fields such as Computational Protein Design and Drug Discovery. Thus, I would love to teach an introductory course on Machine Learning and Deep Learning that covers the fundamentals of techniques such as Supervised Learning (Linear/Non-linear Regression, Classification, Neural Networks, the Perceptron, Convolutional Neural Networks, Recurrent Neural Networks), Unsupervised Methods (Dimensionality Reduction, Manifold Learning), Graphical Models, Self-supervised Learning, etc. I would also supplement these topics with practical implementation in Python using TensorFlow.

Second, microscopic imaging is pervasive in almost all areas of science, and in most cases extracting and analyzing information from microscopy data requires quantitative methods. The development of machine learning methods has propelled a new era of quantitative image analysis. Thus, I would love to teach a course on Microscopy Techniques and Image Analysis, covering topics such as Fluorescence Microscopy Techniques, Image Processing, Image Segmentation, Object Detection, Object Tracking, Registration, Graph Matching, Deep Learning Methods, etc.

Among the core chemical engineering courses, I would like to teach Fluid Mechanics and Engineering Thermodynamics. As a PhD student, I have served three times as a Teaching Assistant for two courses – Microfluidics and Chemical Engineering Thermodynamics. In these courses I graded quizzes, homework, and exams. For both courses I held problem-solving sessions in which I designed problems based on the topics covered in lectures to give students more clarity, and I also held office hours.

References

  1. Aerni, S. J., Liu, X., Do, C. B., Gross, S. S., Nguyen, A., Guo, S. D., Long, F., Peng, H., Kim, S. S., & Batzoglou, S. (2013). Automated cellular annotation for high-resolution images of adult Caenorhabditis elegans. Bioinformatics, 29(13). https://doi.org/10.1093/bioinformatics/btt223
  2. Long, F., Peng, H., Liu, X., Kim, S. K., & Myers, E. (2009). A 3D digital atlas of C. elegans and its application to single-cell analyses. Nature Methods, 6(9), 667–672. https://doi.org/10.1038/nmeth.1366
  3. Toyoshima, Y., Wu, S., Kanamori, M., Sato, H., Jang, M. S., Oe, S., Murakami, Y., Teramoto, T., Park, C., Iwasaki, Y., Ishihara, T., Yoshida, R., & Iino, Y. (2019). An annotation dataset facilitates automatic annotation of whole-brain activity imaging of C. elegans. BioRxiv, 698241. https://doi.org/10.1101/698241
  4. Yemini, E., Lin, A., Nejatbakhsh, A., Varol, E., Sun, R., Mena, G. E., Samuel, A. D. T., Paninski, L., Venkatachalam, V., & Hobert, O. (2021). NeuroPAL: A Multicolor Atlas for Whole-Brain Neuronal Identification in C. elegans. Cell, 184(1), 272-288.e11. https://doi.org/10.1016/j.cell.2020.12.012
  5. Kato, S., Kaplan, H. S., Schrödel, T., Skora, S., Lindsay, T. H., Yemini, E., Lockery, S., & Zimmer, M. (2015). Global Brain Dynamics Embed the Motor Command Sequence of Caenorhabditis elegans. Cell, 163(3), 656–669. https://doi.org/10.1016/j.cell.2015.09.034
  6. Chaudhary, S., Lee, S. A., Li, Y., Patel, D. S., & Lu, H. (2021). Graphical-model framework for automated annotation of cell identities in dense cellular images. eLife, 10, e60321. https://doi.org/10.7554/eLife.60321
  7. Lafferty, J., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01), 282–289.
  8. Williams, A. H., Kim, T. H., Wang, F., Vyas, S., Ryu, S. I., Shenoy, K. V., Schnitzer, M., Kolda, T. G., & Ganguli, S. (2018). Unsupervised Discovery of Demixed, Low-Dimensional Neural Dynamics across Multiple Timescales through Tensor Component Analysis. Neuron, 98(6), 1099-1115.e8. https://doi.org/10.1016/j.neuron.2018.05.015