Metabometrix - Pathfinders in Metabonomics

	Metabometrix Ltd
	RSM, Prince Consort Road, London SW7 2BP


Biofluid
Chemometrics
Functional Genomics
Genomics
Magic-Angle-Spinning (MAS)
Metabolome
Metabonomics and Metabolomics
NMR Spectroscopy
Pattern Recognition (PR) Methods
Phenotype
Principal Components Analysis (PCA)
Proteomics
Supervised Chemometrics Methods
Transcriptomics
Unsupervised Chemometrics Methods

Biofluid: A fluid sample obtained from a living system. The donor might typically be a human or an animal. Fluids can be excreted (such as urine, sweat), expressed or secreted (such as milk, bile), obtained by intervention (such as blood plasma, serum or cerebrospinal fluid), develop as a result of a pathological process (such as blister or cyst fluid), or be applied and collected (such as dialysis fluid).

Chemometrics: The application of multivariate statistical, pattern recognition and informatics methods to chemically-based data.

Functional Genomics: Now that the human genome has been catalogued, the next task is to determine the function of each gene and to understand the mechanisms that control it. The roles that genotype and environment play in determining the phenotype must also be elucidated. To understand gene function, researchers need to apply high-throughput technologies to the study of functional networks and pathways. With enough data and appropriate chemometrics tools this should be possible, allowing the optimisation of drug target selection and the development of safer, more effective therapeutics. Metabonomics promises to be a lead technology in this process.

Genomics: The study of gene sequences, of the differences in those sequences between species and individuals, and of the variation of gene sequences in health and disease. Sequencing is a complex, lengthy and expensive procedure, and relatively few organisms have been sequenced so far. Knowing the gene sequence per se does not necessarily give insight into deep biological function, but as understanding of the functional variability in gene sequences increases, it is expected to lead to the discovery of many new drug targets.

Magic-Angle-Spinning (MAS): A technique used in NMR spectroscopy to obtain high-quality data on inhomogeneous systems such as tissues. The sample is rotated about an axis inclined at 54.7° to the magnetic field axis, at rates of around 4-10 kHz, using special rotors and gas bearings.
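
The inclination quoted above is the so-called magic angle, the angle at which the factor 3cos²θ − 1 (which scales the orientation-dependent line-broadening interactions) falls to zero. As a minimal sketch of that arithmetic, with a purely illustrative function name:

```python
import math

def magic_angle_degrees() -> float:
    """Angle (in degrees) at which 3*cos(theta)**2 - 1 vanishes."""
    return math.degrees(math.acos(1.0 / math.sqrt(3.0)))

theta = magic_angle_degrees()
# Residual of the orientation factor at that angle; zero up to rounding error.
residual = 3.0 * math.cos(math.radians(theta)) ** 2 - 1.0
print(f"magic angle = {theta:.2f} degrees")
```

The value rounds to the 54.7 given in the definition.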

Metabolome: The quantitative complement of all the low molecular weight molecules present in cells in a particular physiological or developmental state.

Metabonomics and Metabolomics: These very similar terms arose at about the same time in different areas of bioscience research, mainly animal biochemistry and microbial/plant biochemistry respectively. Although both involve the multiparametric measurement of metabolites, they are not philosophically identical: metabonomics deals with integrated, multicellular biological systems, including communicating extracellular environments, whereas metabolomics deals with simple cell systems and, at least in terms of published data, mainly intracellular metabolite concentrations.

NMR Spectroscopy: Some atomic nuclei possess a non-zero magnetic moment. This property is quantised and leads to discrete energy states in a magnetic field. Nuclei such as 1H, 13C, 15N, 19F and 31P can undergo transitions between these states when radiofrequency pulses of appropriate energy are applied. The exact frequency of a transition depends on the type of nucleus and on its electronic environment in a molecule. For example, 1H nuclei in a molecule give NMR peaks at frequencies (chemical shifts) characteristic of their chemical environment. NMR spectroscopy is extensively used as a structural tool, and information on isomers and molecular conformations can be obtained by interpreting the chemical shifts as well as the splitting patterns due to indirect nuclear interactions (J couplings). In metabonomics, what is interpreted is the pattern that occurs when many different biochemical entities are detected simultaneously in a mixture using 1H NMR.
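
Chemical shifts are conventionally reported in parts per million (ppm): the frequency offset of a peak from a reference signal, divided by the reference frequency. This makes the value independent of the magnet strength. A minimal sketch of the conversion follows; the function name and the example frequencies are hypothetical and not taken from any particular instrument.

```python
def chemical_shift_ppm(offset_hz: float, reference_hz: float) -> float:
    """Chemical shift in ppm of a peak lying offset_hz from the reference signal."""
    return offset_hz / reference_hz * 1e6

# Hypothetical 1H peak: 1200 Hz from the reference on a 600 MHz spectrometer,
# and the same peak on a 400 MHz spectrometer, where the offset is 800 Hz.
delta_600 = chemical_shift_ppm(1200.0, 600.0e6)
delta_400 = chemical_shift_ppm(800.0, 400.0e6)
print(f"{delta_600:.2f} ppm at 600 MHz, {delta_400:.2f} ppm at 400 MHz")
```

Both calls give the same shift, which is why spectra recorded at different field strengths can be compared on a common ppm axis.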

Pattern Recognition (PR) Methods: PR and related multivariate statistical approaches can be used to discern significant patterns in complex data sets and are particularly appropriate where there are more variables than samples in the data set. The general aim of PR is to classify objects (in this case 1H NMR spectra) or to predict the origin of objects based on the identification of inherent patterns in a set of indirect measurements. PR methods can reduce the dimensionality of complex data sets via two- or three-dimensional mapping procedures, thereby facilitating the visualisation of inherent patterns in the data.

Phenotype: The observable traits or characteristics of an organism, for example hair colour, or the presence or absence of a disease. Phenotypic traits are not necessarily genetic.

Principal Components Analysis (PCA): This is a data dimension reduction method. It is termed an unsupervised technique in that no a priori knowledge of the class of the samples is required, and the analysis is based on the calculation of latent variables. Principal components are linear combinations of the original data variables such that the first component explains as much as possible of the variance in the data set, and each subsequent component is orthogonal to those before it and explains a decreasing proportion of the variance. Use of PCA enables the "best" representation of the data set, in terms of biochemical variation, to be displayed in two or three dimensions.
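
The calculation described above can be sketched with a singular value decomposition of the mean-centred data table. This is only one common way to compute principal components, shown here with NumPy on synthetic random numbers standing in for spectra. The function name, array sizes and added "signal" are assumptions made for illustration.

```python
import numpy as np

def pca(X: np.ndarray, n_components: int = 2):
    """Return (scores, loadings, explained_variance_ratio) for the rows of X."""
    Xc = X - X.mean(axis=0)                          # mean-centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates on each component
    loadings = Vt[:n_components]                     # orthonormal combinations of variables
    explained = s**2 / np.sum(s**2)                  # fraction of variance per component
    return scores, loadings, explained[:n_components]

# Synthetic stand-in for a data table: 10 samples x 50 variables,
# i.e. more variables than samples, as is typical for spectra.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))
X[:5, :10] += 3.0  # the first five samples carry an extra, made-up signal

scores, loadings, explained = pca(X)
print(scores.shape, loadings.shape, explained)
```

Plotting the two columns of `scores` against each other gives the two-dimensional map of the samples; no class labels are used at any point.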

Proteomics: The measurement of “all” cellular protein production and levels, the structural characterisation of those proteins and the understanding of their functions. This science is also heavily dependent on advanced analytical methodologies, including for example 2D gel electrophoresis combined with nanospray mass spectrometry for the separation and identification of proteins. Interestingly, in humans there may be only about 30,000 genes, but there are thought to be many more cellular proteins than there are genes, including all the possible post-translational modifications. This poses an immediate theoretical problem when correlations between gene expression and the proteome are sought, because there is a level of cellular control above the genome that resides in the protein complement itself. Also, changes in gene expression, which may or may not result in changes in cellular protein synthesis, necessarily occur at different times in the cell, and different gene regulation events occurring at the same time may take different lengths of time to affect the proteome. From an analytical viewpoint, so far it has only been possible to separate and identify a small fraction of the possible cellular proteins.

Supervised Chemometrics Methods: Multiparametric data can be modelled so that the class of a sample from an independent data set can be predicted from a series of mathematical models derived from the original data, or ‘training set’. These methods are known as supervised methods, and they utilise class information in order to maximise the separation between classes. Supervised methods such as soft independent modelling of class analogy (SIMCA), partial least squares (PLS) analysis and PLS discriminant analysis (PLS-DA) can be used to predict the class of objects that are unknown to the system.
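
As a rough illustration of the training-set idea, the sketch below fits a PLS model with a single latent variable to labelled synthetic data and then predicts the class of samples it has not seen. It is a deliberately simplified stand-in for PLS-DA, not the implementation used by any particular chemometrics package. All function names, the 0.5 decision threshold and the made-up data are assumptions.

```python
import numpy as np

def fit_pls1(X: np.ndarray, y: np.ndarray):
    """Fit a one-component PLS model to X (samples x variables) and 0/1 labels y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc
    w /= np.linalg.norm(w)       # unit weight vector: direction of maximum covariance with y
    t = Xc @ w                   # latent-variable scores of the training samples
    q = (t @ yc) / (t @ t)       # regression of the class label on the scores
    return x_mean, y_mean, w, q

def predict_class(model, X_new: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Predict class 0 or 1 by thresholding the fitted response."""
    x_mean, y_mean, w, q = model
    y_hat = y_mean + ((X_new - x_mean) @ w) * q
    return (y_hat > threshold).astype(int)

rng = np.random.default_rng(1)

def make_samples(n: int, label: int) -> np.ndarray:
    """Synthetic 'spectra': 40 noisy variables, 5 of which shift with the class."""
    X = rng.normal(size=(n, 40))
    X[:, :5] += 4.0 * label
    return X

# Training set: 16 labelled samples, fewer samples than variables.
X_train = np.vstack([make_samples(8, 0), make_samples(8, 1)])
y_train = np.array([0] * 8 + [1] * 8, dtype=float)
model = fit_pls1(X_train, y_train)

# Independent test set: classes are predicted without using their labels.
X_test = np.vstack([make_samples(5, 0), make_samples(5, 1)])
predicted = predict_class(model, X_test)
print(predicted)
```

In contrast to PCA, the class labels in `y_train` steer the choice of latent variable, which is what makes the method supervised.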

Transcriptomics: The quantitative measurement of gene expression in a cell or tissue. Generally this involves the measurement of mRNA levels by various methods, the most popular currently being proprietary gene chips. The problems here include the fact that chips are very expensive, that many genes or sequences have no known function, and that the relationships between quantitative variation or patterns in expression and their influence on cell or pathway function are, at best, poorly understood. Moreover, it is widely appreciated that mRNAs are not chemically stable, so steps must be taken to ensure the quantitative reliability of the chip measurements. A less well considered problem is that quite large samples of tissue are generally required to make an extensive set of gene expression measurements on one sample (up to 1 g in the case of human tissues). In such a sample, even in a relatively “homogeneous” tissue such as liver, there may be dozens of cell types in different topographical locations performing different functions and therefore having different levels of genetic activity. The gene chip measures an average of these activities, the meaning of which is unclear.

Unsupervised Chemometrics Methods: See Principal Components Analysis.