Although Escherichia coli (E. coli) is the most studied prokaryotic organism in the history of the life sciences, many molecular mechanisms and gene functions encoded in its genome remain to be discovered. This work aims to quantify how well the scientific literature illuminates the E. coli gene function space and how close we are to the goal of a complete list of E. coli gene functions.
The human proteins TMTC1, TMTC2, TMTC3 and TMTC4 have been experimentally shown to be components of a new O-mannosylation pathway. Their own mannosyl-transferase activity has been suspected, but their actual enzymatic potential has not yet been demonstrated. So far, sequence analysis of TMTCs has been compromised by evolutionary sequence divergence within their membrane-embedded N-terminal region, sequence inaccuracies in the protein databases and the difficulty of interpreting the large functional variety of known homologous proteins (mostly sugar transferases, some with known 3D structure).
The transamidase complex is a molecular machine in the endoplasmic reticulum of eukaryotes that attaches a glycosylphosphatidylinositol (GPI) lipid anchor to substrate proteins after cleaving a C-terminal propeptide with a defined sequence signal. Its five subunits are very hydrophobic; thus, solubility, heterologous expression and complex reconstruction are difficult.
Whether due to simplicity or hypocrisy, the question of access to patient data for biomedical research is widely seen in public discourse only from the angle of patient privacy. At the same time, the desire to live, and to live without disability, is of much higher value to patients. This goal can only be achieved by extracting research insight from patient data in addition to working on model organisms, something that is well understood by many patients.
BACKGROUND: Phomafungin is a recently reported broad-spectrum antifungal compound, but its biosynthetic pathway is unknown. We combed publicly available Phoma genomes but failed to find any putative biosynthetic gene cluster that could account for its biosynthesis.
Mentions of gene names in the body of the scientific literature from 1901 to 2017, counted fractionally, were used as a proxy to assess the level of biological function discovery. We define a literature score of one as a full publication equivalent (FPE), the amount of literature necessary to achieve one publication solely dedicated to a gene. We find that fewer than 5000 human genes each have at least 100 FPEs in the available literature corpus.
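As a rough illustration of fractional counting, the sketch below assumes that each publication distributes a total weight of one equally among the distinct genes it mentions, so that a paper dedicated to a single gene contributes exactly one FPE to that gene. The equal-split rule, the function names and the toy corpus are assumptions made for illustration only; they are not taken from the study and may differ from its actual weighting scheme.

```python
from collections import defaultdict


def fractional_gene_scores(papers):
    """Return a literature score per gene from a list of publications.

    `papers` is an iterable in which each item lists the gene symbols
    mentioned in one publication. ASSUMPTION: each publication carries a
    total weight of 1.0, split equally among its distinct genes.
    """
    scores = defaultdict(float)
    for genes in papers:
        unique = set(genes)          # count each gene once per publication
        if not unique:
            continue                 # publications without gene mentions add nothing
        share = 1.0 / len(unique)    # equal split of one publication
        for gene in unique:
            scores[gene] += share
    return dict(scores)


def genes_with_at_least(scores, threshold):
    """Return the sorted gene symbols whose score reaches `threshold`."""
    return sorted(g for g, s in scores.items() if s >= threshold)


# Hypothetical toy corpus: three publications and the genes each mentions.
papers = [
    ["TP53"],
    ["TP53", "BRCA1"],
    ["TP53", "BRCA1", "TMTC1", "TMTC2"],
]
scores = fractional_gene_scores(papers)
well_covered = genes_with_at_least(scores, 1.0)
```

Under this assumed rule, the scores over all genes sum to the number of publications that mention at least one gene, which makes a score directly comparable to a count of dedicated papers.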
Distant homology relationships among proteins with many transmembrane regions (TMs) are difficult to detect as they are clouded by the TMs' hydrophobic compositional bias and mutational divergence in connecting loops. In the case of several GPI lipid anchor biosynthesis pathway components, the hidden evolutionary signal can be revealed with dissectHMMER, a sequence similarity search tool focusing on fold-critical, high-complexity sequence segments.