Jensen LJ

About the dark corners in the gene function space of Escherichia coli remaining without illumination by scientific literature

Although Escherichia coli (E. coli) is the most studied prokaryotic organism in the history of the life sciences, many molecular mechanisms and gene functions encoded in its genome remain to be discovered. This work aims to quantify how well the scientific literature illuminates the E. coli gene function space and how close we are to a complete list of E. coli gene functions.

Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000

Mentions of gene names in the scientific literature from 1901 to 2017, counted fractionally, were used as a proxy for the level of biological function discovery. We define a literature score of one as a full publication equivalent (FPE), the amount of literature equivalent to one publication dedicated solely to a gene. We find that fewer than 5000 human genes each have at least 100 FPEs in the available literature corpus.
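As a rough illustration of fractional counting, the sketch below assumes that each publication carries a total weight of one, split among the genes it mentions in proportion to their mention counts. The abstract does not spell out the exact weighting, so this rule, the function name `fpe_scores`, and the toy input are all hypothetical.

```python
from collections import defaultdict


def fpe_scores(papers):
    """Sum per-gene fractional credit over a corpus.

    `papers` is an iterable of dicts mapping gene name -> number of
    mentions in one publication. Assumption (not from the source):
    each publication contributes a total weight of 1, shared among
    its genes in proportion to their mention counts.
    """
    scores = defaultdict(float)
    for mentions in papers:
        total = sum(mentions.values())
        if total == 0:
            continue  # a paper with no gene mentions contributes nothing
        for gene, count in mentions.items():
            scores[gene] += count / total
    return dict(scores)


# Toy corpus: one paper dedicated to TP53, one shared paper, one with no genes.
example = [
    {"TP53": 4},
    {"TP53": 1, "BRCA1": 3},
    {},
]
scores = fpe_scores(example)
print(scores)
```

Under this assumption, a paper that mentions only one gene adds exactly one FPE to that gene, and the scores of all genes sum to the number of papers that mention at least one gene.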
