TY - JOUR
T1 - Escherichia coli Data-Driven Strain Design Using Aggregated Adaptive Laboratory Evolution Mutational Data
AU - Phaneuf, Patrick V.
AU - Zielinski, Daniel C.
AU - Yurkovich, James T.
AU - Johnsen, Josefin
AU - Szubin, Richard
AU - Yang, Lei
AU - Kim, Se Hyeuk
AU - Schulz, Sebastian
AU - Wu, Muyao
AU - Dalldorf, Christopher
AU - Ozdemir, Emre
AU - Lennen, Rebecca M.
AU - Palsson, Bernhard O.
AU - Feist, Adam M.
PY - 2021/11/11
Y1 - 2021/11/11
N2 - Microbes are being engineered for an increasingly large and diverse set of applications. However, the designing of microbial genomes remains challenging due to the general complexity of biological systems. Adaptive Laboratory Evolution (ALE) leverages nature's problem-solving processes to generate optimized genotypes currently inaccessible to rational methods. The large amount of public ALE data now represents a new opportunity for data-driven strain design. This study describes how novel strain designs, or genome sequences not yet observed in ALE experiments or published designs, can be extracted from aggregated ALE data and demonstrates this by designing, building, and testing three novel Escherichia coli strains with fitnesses comparable to ALE mutants. These designs were achieved through a meta-analysis of aggregated ALE mutations data (63 Escherichia coli K-12 MG1655 based ALE experiments, described by 93 unique environmental conditions, 357 independent evolutions, and 13,957 observed mutations), which additionally revealed global ALE mutation trends that inform on ALE-derived strain design principles. Such informative trends anticipate ALE-derived strain designs as largely gene-centric, as opposed to noncoding, and composed of a relatively small number of beneficial variants (approximately 6). These results demonstrate how strain design efforts can be enhanced by the meta-analysis of aggregated ALE data.
AB - Microbes are being engineered for an increasingly large and diverse set of applications. However, the designing of microbial genomes remains challenging due to the general complexity of biological systems. Adaptive Laboratory Evolution (ALE) leverages nature's problem-solving processes to generate optimized genotypes currently inaccessible to rational methods. The large amount of public ALE data now represents a new opportunity for data-driven strain design. This study describes how novel strain designs, or genome sequences not yet observed in ALE experiments or published designs, can be extracted from aggregated ALE data and demonstrates this by designing, building, and testing three novel Escherichia coli strains with fitnesses comparable to ALE mutants. These designs were achieved through a meta-analysis of aggregated ALE mutations data (63 Escherichia coli K-12 MG1655 based ALE experiments, described by 93 unique environmental conditions, 357 independent evolutions, and 13,957 observed mutations), which additionally revealed global ALE mutation trends that inform on ALE-derived strain design principles. Such informative trends anticipate ALE-derived strain designs as largely gene-centric, as opposed to noncoding, and composed of a relatively small number of beneficial variants (approximately 6). These results demonstrate how strain design efforts can be enhanced by the meta-analysis of aggregated ALE data.
KW - adaptive laboratory evolution
KW - data-driven strain design
KW - genome design variables
KW - meta-analysis
KW - mutation functional analysis
KW - structural biology
UR - https://www.scopus.com/pages/publications/85119590092
U2 - 10.1021/acssynbio.1c00337
DO - 10.1021/acssynbio.1c00337
M3 - Article
SN - 2161-5063
JO - ACS Synthetic Biology
JF - ACS Synthetic Biology
ER -