HMDB loader

Introduction to HMDB

HMDB is a detailed database of small molecules found in Homo sapiens. Each small molecule entry has extensive information on properties, structure, and biology. Each small molecule can have one or more associated enzymes and transporters. Below is a quick definition list to get you started.

Metabolite
A naturally occurring molecule, typically with a molecular weight under 1000 Da.
Enzyme
Any protein which catalyzes chemical reactions involving the small molecule.
Transporter
A membrane-bound protein that shuttles ions, small molecules, or macromolecules across membranes, into cells or out of cells.
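The relationship between the three terms above (one metabolite, linked to one or more enzymes and transporters) could be modelled by a loader roughly as follows. This is only a minimal sketch: the class names, field names, and the `role` values are assumptions made for illustration, not part of HMDB or of any particular loader.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protein:
    """An enzyme or transporter linked to a metabolite (hypothetical model)."""
    name: str
    role: str  # either "enzyme" or "transporter"


@dataclass
class Metabolite:
    """A small-molecule entry with its associated proteins (hypothetical model)."""
    hmdb_id: str
    name: str
    proteins: List[Protein] = field(default_factory=list)

    def enzymes(self) -> List[Protein]:
        # Proteins that catalyze reactions involving this metabolite.
        return [p for p in self.proteins if p.role == "enzyme"]

    def transporters(self) -> List[Protein]:
        # Proteins that shuttle this metabolite across membranes.
        return [p for p in self.proteins if p.role == "transporter"]


# Placeholder values only; they do not describe a real HMDB entry.
entry = Metabolite(hmdb_id="HMDB00000", name="example metabolite")
entry.proteins.append(Protein(name="example enzyme", role="enzyme"))
entry.proteins.append(Protein(name="example transporter", role="transporter"))
```

Keeping enzymes and transporters in one list with a role tag is just one possible design; two separate lists would work equally well.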

Field Documentation and Sources

The Sources column indicates where the information in each field comes from. A listed source does not mean that every source was used for that field in every metabolite entry. Individual metabolite reference information can be found in the "General References" section of each metabolite.

Field Description Sources
Creation Date Date/time the entry was created
  • Automatic
Update Date Date/time the entry was last updated
  • Automatic
HMDB ID (Primary Accession Number) Unique HMDB accession number consisting of a 4-letter prefix (HMDB) and a 5-digit suffix. This ID is used to access the metabolite entry via the URL. If an entry is deleted, its HMDB ID will not be reused.
  • Automatic
Name Standard name of metabolite
Description Description of the metabolite describing general facts.
Kingdom First level of hierarchical classification: Organic or Inorganic.
Super Class Second level of hierarchical classification. Metabolites with the same super class are considered structurally similar.
Class Third level of hierarchical classification. Metabolite classes form the major component of the classification system. Metabolites with the same class are considered structurally similar.
Sub Class Fourth level of hierarchical classification. Metabolites with the same class are considered structurally similar.
Substituents Fifth level of hierarchical classification. Metabolite functional groups and substructures.
Direct Parent A direct parent in the taxonomy is the most descriptive chemical class a chemical entity can be attributed to. It is generated by taking into account the largest substructure (which characterizes a given chemical class) of the biomolecule and the most descriptive attributes.
Synonyms Alternate names of the metabolite
Chemical IUPAC Name IUPAC or standard chemical name for the metabolite
Traditional IUPAC Name Traditional IUPAC or standard chemical name for the metabolite
Chemical Formula Chemical formula describing atomic or elemental composition
Formal Charge Molecular formal charge
Average Molecular Weight Molecular weight in g/mol, determined from molecular formula or sequence
Monoisotopic Molecular Weight The sum of the masses of the atoms in a molecule using the unbound, ground-state, rest mass of the principal (most abundant) isotope for each element instead of the isotopic average mass.
Structure The 2D chemical structure including links to download and view the structure in various formats.
SMILES Isomeric SMILES string corresponding to metabolite structure
InChI Standard InChI identifier
InChI Key Standard InChI key
CAS Registry Number Chemical Abstract Service identification number
KEGG Compound ID Kyoto Encyclopedia of Genes and Genomes compound identification number (if molecule is in KEGG)
PubChem Compound ID NCBI's PubChem database compound identification number
ChemSpider ID ChemSpider identification number
ChEBI ID EBI's Chemical Entities of Biological Interest identification number (if metabolite is in ChEBI)
Wikipedia Link Link to Wikipedia entry for the given metabolite (if it exists)
Phenol Explorer ID Phenol Explorer identification number
DrugBank ID DrugBank identification number
KNApSAcK ID KNApSAcK identification number
OMIM ID OMIM identification number
Metagene ID Metagene identification number
State Physical state (solid, liquid, gas)
Melting Point Melting point (if solid) or boiling point (if liquid) in degrees Celsius
Experimental Water Solubility Water solubility in mg/mL or g/L
Predicted Water Solubility Predicted water solubility in mg/mL
Experimental LogP/Hydrophobicity Water/octanol partition coefficient (if small molecule) or hydrophobicity score (GRAVY score) if protein/peptide
Predicted LogP/Hydrophobicity Predicted water/octanol partition coefficient
Predicted LogS Predicted LogS (water solubility)
Experimental LogS Experimental LogS (water solubility)
pKa Dissociation constant (pKa)
Experimental QqQ MS/MS Spectrum
Experimental 1H NMR Spectrum
Experimental 13C NMR Spectrum Image of experimental 13C NMR
Experimental 13C HSQC Spectrum Image of experimental 13C HSQC spectrum
Experimental 2D TOCSY Spectrum Image of experimental 2D TOCSY spectrum
Predicted 1H NMR Spectrum Image of predicted 1H NMR Spectrum
Predicted 13C NMR Spectrum Image of predicted 13C NMR spectrum
Biospecimen Location
Tissue Location
Normal Metabolite Concentration (Urine) Metabolite concentration found in urine
Normal Metabolite Concentration (Plasma) Metabolite concentration found in blood plasma
Normal Metabolite Concentration (CSF) Metabolite concentration found in cerebrospinal fluid
Normal Metabolite Concentration (Cellular) Metabolite concentration found in cells
Normal Metabolite Concentration (Others) Metabolite concentration found in other biofluids
Associated Disorders and Diseases
Abnormal Metabolite Concentration (Urine) Metabolite concentration found in urine
Abnormal Metabolite Concentration (Plasma) Metabolite concentration found in blood plasma
Abnormal Metabolite Concentration (CSF) Metabolite concentration found in cerebrospinal fluid
Abnormal Metabolite Concentration (Cellular) Metabolite concentration found in cells
Abnormal Metabolite Concentration (Others) Metabolite concentration found in other biofluids
Pathways Metabolic Pathways
Cellular Locations Metabolite subcellular locations
References General on-line reference to other details about the metabolite
Synthesis Reference Reference describing the compound synthesis
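The HMDB ID format described in the table above (the prefix HMDB followed by a five-digit suffix) lends itself to a simple check before a loader tries to look up an entry. The sketch below is one possible way to do that; the function names are hypothetical, and the IDs used are illustrative strings that merely fit or break the pattern.

```python
import re
from typing import Iterable, List, Tuple

# Pattern taken from the field description: "HMDB" plus exactly five digits.
HMDB_ID_PATTERN = re.compile(r"HMDB[0-9]{5}")


def is_valid_hmdb_id(value: str) -> bool:
    """Return True if the whole string matches the documented ID format."""
    return HMDB_ID_PATTERN.fullmatch(value) is not None


def split_ids(values: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate well-formed IDs from malformed ones (hypothetical helper)."""
    valid: List[str] = []
    invalid: List[str] = []
    for value in values:
        (valid if is_valid_hmdb_id(value) else invalid).append(value)
    return valid, invalid


# Illustrative input: the second and third strings break the format.
good, bad = split_ids(["HMDB00001", "HMDB0001", "hmdb00001"])
```

Using `fullmatch` rather than `match` rejects strings with trailing characters, such as an ID followed by a newline.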

Enzyme, Cofactor and Transporter Documentation

Field Description Sources
Name Name of the protein or macromolecule (or other small molecule)
Gene Name Gene name
Synonyms Alternate names (protein names, abbreviations, etc.)
Protein Sequence Amino acid sequence
Number of Residues Number of amino acids in the protein sequence
  • Automatic
Molecular Weight (Daltons) Molecular weight given in Daltons or g/mol
  • Automatic
Theoretical pI Theoretical isoelectric point
  • Automatic
GO Classification Gene ontology classification including function, cellular process and location
General Function Short 3-4 word summary of the primary functions
Specific Function Detailed 30-40 word summary of the specific functions
Pathways Key pathways or processes (from SMPDB) that the given molecule is involved in
Reaction Reaction(s) that the given molecule participates in
Pfam Domain Function Names and ID numbers of PFAM domains
Signals Location of signal peptide or other localization signals in the sequence
Transmembrane Regions Number and location of the transmembrane helices
GenBank ID Protein GenBank protein ID (if it exists)
UniProt ID/Name UniProt ID (if it exists)
PDB ID PDB ID (if it exists)
Cellular Location Location of the given protein or macromolecule inside or around the cell (cytoplasm, nucleus, membrane, etc.)
Gene Sequence DNA sequence (from cDNA) of the given molecule
GenBank ID Gene GenBank database gene identifier and link
Chromosome Location Chromosome on which the gene encoding the given molecule is located
Locus More detailed location of the chromosomal position of the gene
References PubMed references
  • PubMed
  • Automatic
  • Manual Search
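Fields such as Number of Residues and Molecular Weight are marked "Automatic" in the table above, meaning they are derived from the protein sequence. The source does not say how HMDB computes them, so the following is only a sketch of one common approach: count the residues, then sum average residue masses and add one water molecule for the free termini. The function names and the hand-entered mass table are assumptions for illustration.

```python
# Average residue masses in daltons (standard reference values, entered
# by hand for illustration; not taken from HMDB).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal groups


def clean_sequence(raw: str) -> str:
    """Drop FASTA header lines and whitespace; return upper-case residues."""
    lines = [line.strip() for line in raw.splitlines()]
    body = "".join(line for line in lines if not line.startswith(">"))
    return body.replace(" ", "").upper()


def number_of_residues(sequence: str) -> int:
    """Number of amino acids in an already cleaned sequence."""
    return len(sequence)


def molecular_weight(sequence: str) -> float:
    """Approximate average molecular weight in daltons."""
    unknown = set(sequence) - RESIDUE_MASS.keys()
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


# Made-up four-residue sequence, not a real protein.
sequence = clean_sequence(">hypothetical\nGA\nga\n")
```

Ambiguous or non-standard residue codes (for example X or U) raise an error here; a real loader would need to decide how to treat them.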