(394g) An Approach Towards Method Development for Untargeted Urinary and Serum Metabolite Profiling in Metabolomics Research, As a Tool in Exposome Studies, Using UPLC-Q-TOF/MS

Authors: 
Sarigiannis, D., Aristotle University of Thessaloniki
Karakitsios, S., Aristotle University of Thessaloniki
Gabriel, A., Aristotle University of Thessaloniki
Papaioannou, N., Aristotle University of Thessaloniki
The exposome refers to the totality of exposures from conception onwards, including exogenous and endogenous exposures as well as the modifiable risk factors that predispose to and predict disease. Accordingly, exposome science aims to capture the mechanistic processes that describe the source-to-dose continuum. To this end, several methodological tools related to environmental and human biomonitoring are employed. Biological perturbations at different levels of biological organization are interpreted with multi-omics (including genomics, transcriptomics, proteomics, and metabolomics) and post-omics technologies, including epigenomics.

Furthermore, identifying these perturbations leads to the establishment of putative pathways of toxicity. These pathways are verified by targeted multi-omics and functional assays. Advanced bioinformatics tools, such as support vector machines, clustering algorithms and systems biology models, allow us to identify the functional links between the data derived from high-throughput testing platforms and disease phenotypes, thus providing phenotypic anchoring of the mechanistic hypotheses made earlier.

In this study we focus on urinary and serum metabolite profiling in metabolomics research, as a tool in exposome studies, using UPLC-Q-TOF/MS.

Global metabolic profiling (metabolomics/metabonomics) using LC-MS as an analytical platform has attracted a great deal of interest in recent years in toxicological and pharmaceutical research, as well as in disease biomarker discovery. Although progress has been made in recent years in a) generating more or less comprehensive metabolite profiles, b) data analysis, and c) biomarker identification, the actual methodology for untargeted profiling using LC-MS is still under continuous development. The most difficult, and at the same time the most interesting, task in untargeted metabolite profiling is to optimize the different experimental steps for studying thousands of unknown metabolites in biological samples. Both the analytical and the biological variability must be evaluated.

In this study we aimed to evaluate various sample pre-treatment and LC conditions in order to optimize a method for untargeted urinary and serum metabolite profiling that would yield high-quality and reproducible data using UPLC-Q-TOF/MS. We tested the effects of varied experimental procedures on total ion chromatograms, the total number of features and the repeatability of selected endogenous metabolites. We also investigated different approaches to evaluate and monitor the performance of analytical platforms used in generic metabolite profiling, and tested the stability of biofluids under various sample handling and storage conditions. Of major importance was the use of QC samples in the metabolic profiling analyses. QC samples are derived from a pooled sample of the matrix to be analyzed, prepared by taking small aliquots of each of the study samples and mixing them to form a single “Pool”. The pooled sample, and therefore the QC samples, contain a mean concentration of all the components present in the samples under investigation. The novelty of this method is that internal standards were added to all QC samples in order to strengthen their use. The QC samples were placed at regular intervals throughout the analytical run in order to monitor different factors through the analysis. When combined with other measures, such an approach can rapidly determine whether an analytical run has performed as it should and the data are reliable enough to take forward into more detailed multivariate statistical analysis or, alternatively, whether there has been a problem that requires investigation and resolution. In effect, this approach adapts established procedures from bioanalytical method validation in order to characterize, as far as possible, the reliability of the results obtained. The next steps are data preprocessing and data processing.
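The placement of QC injections at regular intervals can be illustrated with a short sketch. This is not the authors' acquisition sequence: the function name `build_run_order`, the QC spacing (`qc_every`) and the number of leading QC injections (`n_lead_qc`) are hypothetical values chosen only for illustration.

```python
def build_run_order(study_samples, qc_every=5, n_lead_qc=3):
    """Interleave pooled-QC injections with study samples.

    Assumptions (not from the abstract): a few QC injections open the
    run, one QC follows every `qc_every` study samples, and the run
    always ends with a QC injection.
    """
    if qc_every < 1:
        raise ValueError("qc_every must be at least 1")

    order = []
    qc_count = 0
    last_is_qc = False

    def add_qc():
        nonlocal qc_count, last_is_qc
        qc_count += 1
        order.append(f"QC_{qc_count:02d}")
        last_is_qc = True

    # Leading QC injections at the start of the run.
    for _ in range(n_lead_qc):
        add_qc()

    # Study samples, with one QC after every `qc_every` of them.
    for i, sample in enumerate(study_samples, start=1):
        order.append(sample)
        last_is_qc = False
        if i % qc_every == 0:
            add_qc()

    # Close the run with a QC unless one was just injected.
    if not last_is_qc:
        add_qc()
    return order


if __name__ == "__main__":
    samples = [f"S{i}" for i in range(1, 8)]
    print(build_run_order(samples, qc_every=3, n_lead_qc=2))
```

Because every QC injection is an aliquot of the same pool, any trend in signal along this sequence reflects the analytical platform rather than biology.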

The main goals of spectral processing are to correctly arrange the huge amount of raw data generated in the MS files into a 2D matrix, to improve signal quality, and to reduce the possible analytical and biological biases present in the raw data. Several open-source software tools are available, providing different methodological options for spectral processing. The most important criteria for selecting a suitable tool are 1) the biological sample analyzed, 2) the analytical technology used, and 3) the tool chosen for further data analysis or pathway analysis, since the generated feature matrix should be in a format compatible with that tool. For the LC-MS data preprocessing, a combination of R packages, including xcms and limma, was used.

Regarding the LC-MS data analysis, the .d files generated from negative and positive ionization were treated as two different experiments. The parameters for noise removal, mass detection, deconvolution, data transformation, data reduction, etc. were set based on the behavior of the QC samples because, as mentioned before, the QC samples provide an average of all the metabolomes analyzed in the study. The asymmetric baseline corrector was used as the baseline correction method, and the centroid algorithm was used for peak detection, since the data were already centroided. To define the mass error for the chromatogram builder, the spectra of internal standards, such as caffeine or reserpine, were used as a reference. This is why reference samples were analyzed at the beginning, in the middle and at the end of the analysis. The local minimum algorithm was selected for deconvolution because of the low noise level. After deisotoping, alignment and gap-filling using the peak finder algorithm, a file containing the m/z value, retention time and peak area of each detected peak was exported in CSV format. In addition to the careful design of the sample acquisition process and the careful cleaning and maintenance of the equipment before a batch analysis, the resulting matrix was further reduced by the 80% rule in order to obtain consistent variables. The 80% rule should be applied to the study samples rather than to the QC samples, to avoid losing important metabolites that reflect the biological status of the system. For the cohorts presented here, the instrument and overall process variability were determined by calculating the median RSD across all the endogenous metabolites.
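The two data-reduction and quality measures described above, the 80% rule and the median RSD, can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the common form of the 80% rule (a feature is kept if it is detected in at least 80% of the samples of at least one study group), treats a peak area of 0 or None as "not detected", and computes the RSD from QC injections. All function names and numbers are hypothetical.

```python
from statistics import mean, median, stdev


def passes_80_rule(areas_by_group, min_fraction=0.8):
    """Return True if the feature is detected in >= min_fraction of the
    samples in at least one study group (QC samples are not passed in).

    areas_by_group maps a group name to a list of peak areas, where
    0 or None means the feature was not detected in that sample.
    """
    for areas in areas_by_group.values():
        if not areas:
            continue
        detected = sum(1 for a in areas if a)  # 0 and None count as missing
        if detected / len(areas) >= min_fraction:
            return True
    return False


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


def median_qc_rsd(qc_areas_by_feature):
    """Median RSD across features, computed from QC injections only."""
    return median([rsd_percent(v) for v in qc_areas_by_feature.values()])


if __name__ == "__main__":
    # Hypothetical peak areas for one feature in two study groups.
    feature = {"cases": [10, 12, 0, 11, 9], "controls": [0, 0, 5, 0, 0]}
    print(passes_80_rule(feature))

    # Hypothetical QC peak areas for three features.
    qc = {"f1": [90, 100, 110], "f2": [100, 100, 100], "f3": [80, 100, 120]}
    print(median_qc_rsd(qc))
```

Keeping the QC samples out of the 80% filter follows the recommendation in the text; the QC injections are used only afterwards, to summarise analytical variability.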

The next step in the downstream bioinformatics analysis was pathway mapping, which revealed the roles that metabolites play in relation to each other and in biological aberrations, followed by EWAS analysis to draw the links between in utero exposure to metals, metabolic pathway deregulation, and clinically observed phenotypes of neurodevelopmental disorders. In particular, logistic regression and FDR correction were carried out using the ‘X-Wide Association Analyses’ package, following the EWAS framework.
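The FDR step of such an association analysis can be sketched as below. The abstract does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment shown here is an assumption, and the p-values are hypothetical stand-ins for those obtained from one logistic regression per exposure or metabolite. The sketch does not reproduce the ‘X-Wide Association Analyses’ package.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used for FDR control; the source only says "FDR".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values, one per tested association.
    p = [0.01, 0.04, 0.03, 0.005]
    q = benjamini_hochberg(p)
    significant = [i for i, value in enumerate(q) if value <= 0.05]
    print(q, significant)
```

Associations whose adjusted value falls below the chosen FDR level (0.05 in this illustration) would be carried forward for interpretation.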

This method was successfully applied in two large cohort studies, Repro PL and PHIME, within the framework of the HEALS (Health and Environment-wide associations via Large population Studies) project.