The Samuel Roberts Noble Foundation, Inc.

Sumner Group: Metabolomics

Primary and secondary metabolites represent the end products of genetic expression, and the comprehensive analysis of large numbers of metabolites has been termed metabolomics. These qualitative and quantitative analyses provide a holistic view of the biochemical status or biochemical phenotype of an organism. The correlations of biochemical information with genetic and molecular data are very useful in providing better insight into the functions of unknown genes or system responses to external stimuli. Metabolomic studies also offer unique opportunities to study regulation and signaling under the control of small molecules (i.e., metabolites). Quite often, such signaling and regulation are not apparent at the transcriptome and/or proteome level. Finally, metabolomics offers the unbiased ability to differentiate organisms or cell states based on metabolite levels that may or may not produce visible phenotypes/genotypes. Although metabolomics is quite promising, several challenges still exist that influence the implementation of a metabolomic approach, including chemical complexity, analytical and biological variance, and dynamic range. Our group is employing selective extraction and parallel technologies to address these challenges and provide a comprehensive view of the metabolome. This approach is described in the following paragraphs.

Figure 1
Figure 1. An integrated functional genomic approach monitors quantitative and qualitative differences in the transcriptome, proteome, and metabolome as a means to study gene function and cellular responses to external stimuli. The large amount of information contained within the profile data is deposited into relational databases where it can be correlated, compared, and interrogated by bioinformatic tools to yield a better understanding of biology. Profile data are also emerging as unique means of annotating genome data. Confirmation of putative proteins through proteome analysis is one example.

Sequential Extraction and Parallel Analysis
Our approach to addressing the chemical complexity and dynamic range of the metabolome employs sequential extraction followed by parallel analyses. This approach is outlined (Fig. 2), and is designed to segregate the metabolome into more manageable subclasses with similar chemical properties. The subclasses are subjected to parallel analytical profiling techniques to record metabolite profile information. Segregation of the subclasses helps minimize chemical interferences, while parallel analyses help visualize a greater portion of the metabolome. Our technological approaches to multidimensional parallel profiling center around mass spectrometry due to its enhanced sensitivity and specificity. Specific methods such as GC/MS, LC/MS, and CE are matched with the target subclass to achieve the best performance.

Metabolic profiling of plant tissues is also complicated by changes in metabolite composition and concentration that can occur due to the presence of a large variety of enzymes. This necessitates rapid harvesting of plant materials to minimize degradation. Tissues being profiled are harvested, immediately frozen in liquid nitrogen, lyophilized, and stored at -80°C until extracted or processed. Dry tissue can then be sequentially extracted based on solvent properties such as dielectric constant. Combined solvents can also be used for extraction such that metabolites selectively partition between solvents. Extracts can be further fractionated using solid phase extraction (SPE) if needed. Additional fractionation moves away from the objective of global profiling, but it may be necessary to obtain a more detailed view of the metabolome.

Figure 2
Figure 2. Our approach to surmounting the metabolome obstacles of chemical complexity and dynamic range employs sequential extraction followed by parallel analyses. Segregation of the metabolome into subclasses helps minimize chemical interferences, while parallel analyses help to visualize a greater portion of the metabolome.

Analytical and Biological Variance
Analytical variance is defined as the coefficient of variation, or relative standard deviation, that is directly related to the experimental approach. This variance differs with the technology platform being used and is indeterminate in origin. Biological variance is also indeterminate in origin and arises from quantitative variations in metabolite levels between plants of the same species grown under identical, or as nearly identical as possible, conditions. Biological variations typically exceed analytical variations. Recently, Roessner and coworkers reported that the biological variability exceeded the analytical variability of GC/MS by a factor of ten.[34] These large biological variations are major limitations on the "resolution" of the metabolomics approach. One way to reduce biological variance is to pool tissues. This tactic helps minimize random variations through statistical averaging; however, many variations in metabolite levels have biological significance and result from functional differentiation of tissues. Pooling tissue can also result in undesirable dilution of site- or tissue-specific up/down-regulated metabolites. An alternative is to start with homogeneous tissue such as cell cultures, but this has obvious restrictions on the ability to study intact plants. The bottom line is that sampling is important and strategies need to be incorporated to minimize variations.
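As a rough illustration of the relative standard deviation used above, the following Python sketch compares repeated injections of one extract with measurements from separate plants. The peak areas are invented for illustration only and are not data from our experiments; the function name is likewise hypothetical.

```python
from statistics import mean, stdev


def relative_std_dev(values):
    """Return the relative standard deviation (coefficient of variation) in percent."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return 100.0 * stdev(values) / m


# Hypothetical peak areas for a single metabolite (arbitrary units).
analytical_replicates = [100.0, 102.0, 98.0, 101.0, 99.0]  # one extract, five injections
biological_replicates = [100.0, 130.0, 75.0, 115.0, 80.0]  # five separate plants

analytical_rsd = relative_std_dev(analytical_replicates)
biological_rsd = relative_std_dev(biological_replicates)

print(f"analytical RSD: {analytical_rsd:.1f}%")
print(f"biological RSD: {biological_rsd:.1f}%")
```

With values like these, the spread between plants dominates the spread between injections, which is the situation described in the text.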

Gas Chromatography
Gas chromatography coupled to mass spectrometry (GC/MS) is emerging as a powerful tool for profiling large numbers of primary metabolites,[12,34,39,40] and we are incorporating this approach into our program. The favorable attributes of GC/MS include high reproducibility (low analytical variance), standardized technique, and high separation efficiencies. High separation efficiencies allow for the separation of complex mixtures, and are achieved with long (30 to 60 m) capillary columns (internal diameters 75 to 320 µm). GC/MS is commonly used in conjunction with electron ionization and requires the analyte to be volatile, thermally stable, and energetically stable. Many important biological analytes are polar and nonvolatile; therefore, they must first be chemically modified, or derivatized, prior to GC/MS analysis.

Figure 3
Figure 3. GC/MS metabolic profile of a polar M. truncatula root extract illustrating the elution regions of various metabolite classes. The (a) normalized chromatogram is dominated by several peaks, but the (b) expanded view of the same root profile reveals a substantial amount of information not apparent at first glance.

We are using GC/MS for profiling primary metabolites in M. truncatula. This approach allows for the simultaneous profiling of approximately 300 to 500 components, including amino acids, organic acids, monosaccharides, disaccharides, alcohols, and aromatic amines.[12,31] A typical GC/MS profile of an M. truncatula root extract is shown (Fig. 3). The figure illustrates the naturally occurring relative abundances of metabolites visualized by GC/MS while also providing an expanded view of the profiles revealing the large amount of information contained within the data.

A large number of primary metabolites can be readily identified because most of these compounds are commercially available. Standard compounds are derivatized and chromatographed, and the resulting data are deposited into databases. Unknown metabolites are identified by matching chromatographic retention times and mass spectra to those of known compounds in the databases.[41,42] Mass spectral identification is performed by matching target spectra with commercial libraries such as The National Institute of Standards and Technology (NIST) library or custom libraries constructed in-house using authentic standards. There are several computer algorithms that automate the process of database searching and identification.[43-45] An example is the Automated Mass Spectral Deconvolution and Identification Software (AMDIS) provided with many Hewlett Packard GC/MS instruments.[46] We exploit both custom and commercial libraries for metabolite identifications. Using this approach we have identified a large number (∼130 currently) of primary metabolites in M. truncatula (Fig. 4). This method has also been used to compare the profiles of various M. truncatula tissues.
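The idea of matching an unknown against library entries by retention time and mass spectrum can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by AMDIS or the NIST search software: it assumes a plain cosine similarity between spectra, a fixed retention time window, and a made-up three-entry library. All names, tolerances, and intensities are hypothetical.

```python
from math import sqrt


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = sqrt(sum(i * i for i in spec_a.values()))
    norm_b = sqrt(sum(i * i for i in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(unknown_rt, unknown_spec, library, rt_tolerance=0.1, min_score=0.8):
    """Return (name, score) for the best library hit inside the RT window, or None."""
    best = None
    for name, (rt, spec) in library.items():
        if abs(rt - unknown_rt) > rt_tolerance:
            continue  # retention time does not match; skip the spectral comparison
        score = cosine_similarity(unknown_spec, spec)
        if score >= min_score and (best is None or score > best[1]):
            best = (name, score)
    return best


# Hypothetical library: name -> (retention time in minutes, spectrum).
library = {
    "standard_A": (10.50, {73: 100.0, 147: 40.0, 218: 60.0}),
    "standard_B": (10.55, {73: 100.0, 116: 80.0, 132: 20.0}),
    "standard_C": (15.20, {73: 100.0, 147: 40.0, 218: 60.0}),
}

unknown_rt = 10.52
unknown_spec = {73: 95.0, 147: 42.0, 218: 58.0}

hit = identify(unknown_rt, unknown_spec, library)
print(hit)
```

Here standard_C has the same spectrum as standard_A but is rejected on retention time, which shows why both criteria are used together.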

Figure 4
Figure 4. GC/MS metabolic profile of a polar M. truncatula root extract that provides the identification for many of the root components. Individual components are identified by matching their mass spectra to those in databases or by comparison with authentic samples. Using this approach we have identified a large number (>130 currently) of primary metabolites in M. truncatula.

The primary limitation associated with GC/MS is the need for derivatization. Derivatization introduces additional complexity to the system and is not 100% efficient. Inefficient reactions result in the presence of multiple derivatized forms of the same compound. For example, we can detect three different derivatization products of the amino acid asparagine (mw = 132) in M. truncatula roots (Fig. 4). These include asparagine, N,O-TMS (mw = 276), asparagine, N,N,O-TMS (mw = 348), and asparagine, N,N,N,O-TMS (mw = 420). Inefficiency of the derivatization reactions also limits the lower concentration range of analytes that can be profiled. Finally, derivatization is not capable of achieving volatility for all compounds, such as many of the flavonoid glycosides. If derivatization is successful and the analyte is volatilized, it must still remain energetically stable enough to be detected. If the compound is not stable, it will fragment and molecular weight information may be lost, thereby complicating identification.
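The molecular weights quoted for the asparagine derivatives follow from simple arithmetic: each trimethylsilyl (TMS) group, Si(CH3)3 with a nominal mass of 73, replaces one hydrogen (nominal mass 1), for a net gain of 72 per group. A short Python sketch using nominal masses (the function and constant names are ours, for illustration only):

```python
# Nominal mass of Si(CH3)3 is 73; it replaces one H (nominal mass 1).
TMS_NET_GAIN = 73 - 1


def tms_derivative_mass(parent_mass, n_tms):
    """Nominal mass of a compound after n_tms hydrogens are replaced by TMS groups."""
    return parent_mass + n_tms * TMS_NET_GAIN


ASPARAGINE_MW = 132  # nominal molecular weight given in the text

derivative_masses = {n: tms_derivative_mass(ASPARAGINE_MW, n) for n in (2, 3, 4)}
for n_tms, mass in derivative_masses.items():
    print(f"asparagine with {n_tms} TMS groups: mw = {mass}")
```

The three values reproduce the 276, 348, and 420 listed above for the di-, tri-, and tetra-TMS products.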

High Performance Liquid Chromatography
HPLC is a universal separation technique that is capable of separating both volatiles and non-volatiles without the need for derivatization. We are developing methods that employ both on-line photodiode array (PDA) detection and mass selective detection, HPLC/PDA/MS. This approach also utilizes an ion-trap mass spectrometer that is capable of normal and tandem mass spectrometry.[47,48] Tandem mass spectrometry allows the isolation of compounds in the gas phase followed by controlled fragmentation to yield structural information.[49,50] The combination of these technologies, i.e., HPLC/PDA/MS, yields a powerful tool for profiling and structural determinations.[51,52]

Figure 5
Figure 5. Three-dimensional display of the photodiode array absorbance data obtained by HPLC/PDA/MS for a M. truncatula extract. The first dimension is HPLC retention time, second is wavelength and third is absorbance. The data can be rapidly previewed for specific absorbance regions characteristic of functional groups.

On-line photodiode array detection is most useful for the analysis of compounds containing chromophores, such as phenolic compounds including flavonoids, isoflavonoids, coumarins, and pterocarpans. An illustrative three-dimensional photodiode array display for a Medicago truncatula phenolic extract is provided (Fig. 5). The three-dimensional data consist of UV absorption spectra from 190 to 500 nm for each point along the chromatogram. The data can be rapidly previewed for unique absorption regions correlating to specific compounds or functional groups. Independent chromatograms can also be constructed for each wavelength to increase the selectivity of the data. The UV data are complemented by the mass selective data. Illustrations of both types of data are provided (Fig. 6). The chromatogram is generated from the ion abundances and mass spectra recorded in the negative-ion electrospray ionization mode. The mass and UV spectra for the peak eluting at approximately 45 minutes are provided in the inserts and identify the eluting compound as medicarpin, known to have a λmax at 287 nm and a molecular weight of 270. A negative ion is observed for the deprotonated molecule at m/z 269, with a deprotonated dimer ion observed at m/z 539, confirming the molecular weight as 270.
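The ion assignments for medicarpin can be checked with nominal-mass arithmetic: in negative-ion mode the deprotonated molecule appears at M - 1 and the deprotonated dimer at 2M - 1. A minimal Python sketch (the function name is ours, and nominal integer masses are assumed rather than exact masses):

```python
PROTON_NOMINAL_MASS = 1


def deprotonated_mz(molecular_weight, n_molecules=1):
    """Nominal m/z of a singly charged [nM - H]- ion."""
    return n_molecules * molecular_weight - PROTON_NOMINAL_MASS


MEDICARPIN_MW = 270  # molecular weight given in the text

monomer_mz = deprotonated_mz(MEDICARPIN_MW)                # [M - H]-
dimer_mz = deprotonated_mz(MEDICARPIN_MW, n_molecules=2)   # [2M - H]-

print(f"[M - H]-  at m/z {monomer_mz}")
print(f"[2M - H]- at m/z {dimer_mz}")
```

Observing both ions, at m/z 269 and 539, is what makes the assignment of 270 as the molecular weight self-consistent.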

Figure 6
Figure 6. Multidimensional data obtained by HPLC/PDA/MS analysis of an alfalfa root extract. The HPLC retention time, the UV absorbance spectrum, and the mass spectrum readily identify the peak eluting at 45 minutes as medicarpin, a known phytoalexin in alfalfa.

The utility of mass selective detection is greatest when analyzing compounds that do not contain chromophores or when structural information is needed for chemical identification. Triterpene saponins contain very weak chromophores and have long been associated with a variety of biological activities including allelopathy,[53] poor digestibility in ruminants,[54] deterrence to insect foraging,[55] and beneficial antifungal properties.[56] Saponins also possess anti-inflammatory, cholesterol lowering, and anticancer properties.[57-59] Saponins isolated from the legume Acacia victoriae have been reported to trigger apoptosis in cancer cells.[60]

Saponins consist of triterpenoid or steroidal aglycones that are substituted with a varying number of sugar side chains. Unsubstituted, nonpolar aglycones are classified as sapogenins and two representative structures are included (Fig. 7). Because glycoside conjugates are labile and nonvolatile, they must be ionized using a lower energy technique such as electrospray ionization to retain their integrity during mass analysis. We have been profiling saponins in alfalfa (Medicago sativa) and M. truncatula by using HPLC coupled to an ESI ion-trap mass spectrometer to acquire normal and tandem mass spectra during profiling.[61] The mass spectra have also been used for structural characterization of Medicago saponins. A profile of an alfalfa extract is provided (Fig. 7) that illustrates both the enhanced sensitivity and selectivity of mass selective detection compared to UV detection at 206 nm.

Figure 7
Figure 7. HPLC/PDA/MS data for a saponin extract from alfalfa root. Comparison of the UV chromatogram and the total ion chromatogram (TIC) from the mass data illustrates the increased sensitivity of mass selective detection for saponins that possess only weak chromophores. Mass spectra and aglycone structures of two common saponins found in alfalfa and M. truncatula are provided for a) soyasaponin I and b) 3-glucose-medicagenic acid. The increased selectivity of MS is achieved through molecular weight and fragment information.

HPLC/PDA/MS has also been used to compare the saponin profiles in multiple cultivars of alfalfa and M. truncatula. Comparative profiles are provided (Fig. 8). It is interesting that these closely related legumes yielded different saponin profiles. The saponin profile of M. truncatula is more complex than that of alfalfa and may provide a richer source for mining putative pharmaceuticals.

Figure 8
Figure 8. Comparative saponin profiles for two cultivars of alfalfa and one cultivar of M. truncatula obtained by reverse-phase HPLC/PDA/MS using electrospray ionization and an ion trap mass spectrometer. The profiles illustrate the increased complexity of saponins in M. truncatula, which may offer a richer source for bio-prospecting of natural products.

Tandem Mass Spectrometry (MS/MS)
The analysis of many natural products, including saponins, is further complicated by the lack of commercial standards. This complicates the identification process and requires more powerful tools for structural elucidation. Tandem mass spectrometry (MS/MS) is useful in structural determinations and can be visualized as multiple mass spectrometers placed in tandem as illustrated (Fig. 9). This technique performs gas-phase purification of a specified m/z value using the first mass spectrometer. This is achieved by allowing only the ion of interest to be transmitted while simultaneously discriminating against (rejecting) all other ions. The transmitted ion is then fragmented through unimolecular or collisionally induced dissociation to yield product or fragment ions from the precursor species. These ions can then be rationalized to a structure. An MS/MS spectrum is provided for 3-glucose-28-glucose medicagenic acid (Fig. 10). The spectrum shows ion peaks corresponding to the loss of two hexoses, the deprotonated aglycone, and a characteristic fragment ion of medicagenic acid.
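The successive hexose losses in such a spectrum can be anticipated with nominal-mass arithmetic. The Python sketch below rests on two assumptions that are not stated in the text: a nominal molecular weight of 502 for the medicagenic acid aglycone (C30H46O6) and a neutral loss of 162 (C6H10O5) for each hexose residue. It lists the expected negative-ion m/z values from the intact deprotonated diglucoside down to the deprotonated aglycone; the characteristic aglycone fragment ion is not modeled.

```python
HEXOSE_RESIDUE_LOSS = 162  # assumed nominal neutral loss per hexose (C6H10O5)
MEDICAGENIC_ACID_MW = 502  # assumed nominal aglycone mass (C30H46O6); not from the text


def hexose_loss_ladder(aglycone_mw, n_hexoses):
    """Nominal m/z values of [M - H]- and of each successive hexose loss."""
    precursor = aglycone_mw + n_hexoses * HEXOSE_RESIDUE_LOSS - 1
    return [precursor - k * HEXOSE_RESIDUE_LOSS for k in range(n_hexoses + 1)]


ladder = hexose_loss_ladder(MEDICAGENIC_ACID_MW, n_hexoses=2)
labels = ["[M - H]-", "[M - H - hexose]-", "[aglycone - H]-"]
for label, mz in zip(labels, ladder):
    print(f"{label:>20s}  m/z {mz}")
```

A pair of peaks separated by 162 that ends at the deprotonated aglycone is the pattern that points to a diglycoside of a known sapogenin.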

Figure 9
Figure 9. Conceptual view of tandem mass spectrometry with a tandem-in-space triple quadrupole mass analyzer. The first mass analyzer (Q1) selects the precursor ion of interest by allowing only it to pass while discriminating against all others. The precursor ion is then fragmented, usually by energetic collisions, in the second quadrupole (q2), which is operated in transmissive mode so that all fragment ions are collimated and passed into the third quadrupole (Q3). Q3 performs mass analysis on the product ions, which compose the tandem mass spectrum and are rationalized to a structure.

Multiple mass analyzers exist that can perform tandem mass spectrometry. Some use a tandem-in-space configuration, such as the triple quadrupole mass analyzers illustrated (Fig. 9). Others use a tandem-in-time configuration and include instruments such as ion-traps (ITMS) and Fourier transform ion cyclotron resonance mass spectrometers (FTICRMS or FTMS). A triple quadrupole mass spectrometer can only perform the tandem process once for an isolated precursor ion (i.e., MS/MS), but trapping or tandem-in-time instruments can perform repetitive tandem mass spectrometry (MSn), thus adding n degrees of structural characterization and elucidation. When an ion-trap is combined with HPLC and photodiode array detection, the net result is a powerful tool for both metabolite profiling and metabolite identification.

Figure 10
Figure 10. HPLC/MS/MS tandem mass spectrum of 3-Glc-28-Glc medicagenic acid obtained using an ion-trap mass analyzer. The spectrum illustrates the successive loss of two hexoses, the deprotonated aglycone, and a characteristic fragment ion associated with medicagenic acid saponins.

Chemical Identification using MSn
ESI/MS4 tandem mass spectra of an unknown cell wall phenolic that has been identified as acetosyringone (3',5'-dimethoxy-4'-hydroxyacetophenone) and of an authentic acetosyringone standard. Levels of the unknown product were observed to increase in tobacco cell cultures harboring a constitutively expressed cinnamate-4-hydroxylase (C4H) transgene in response to yeast elicitation and relative to a control.
