Friday, May 22, 2020 - 3:00pm

Abstract:

As genomic repositories grow with data from a multitude of organisms, extracting and interpreting that data becomes increasingly difficult. Protein annotation and structure prediction have improved in recent years; however, the variety and sheer amount of data require approaches drawn from multiple disciplines. Bioinformatics yields important functional sequence information and classification. Molecular dynamics (MD) simulation allows for the interrogation of biochemical systems at the atomistic level. Combined with machine learning, these disciplines can be used to investigate the complex functions and relationships of proteins within the current abundant genomic landscape.

The objective of this dissertation is to outline complementary methodologies from various fields - bioinformatics, molecular dynamics simulation, and machine learning - that, together, can be used to investigate vast genomic repositories and functional protein data.

Aim 1: The development of the bioinformatics and in silico maturation pipeline consists of gene annotation, MD simulation to equilibrate predicted proteins, and statistical methods adopted from graph theory in collaboration with the Butts lab. Proteins can be represented in graph-theoretic terms, allowing for the exploration of diverse protein structural features.

Aim 2: Molecular dynamics simulation provides atomic-level detail of complex systems. The protein systems explored - HIV Rev, short intrinsically disordered peptides, STXPB4, YAP-1 WW domain - are intrinsically disordered. MD simulations were used to investigate the complexities and difficulties encountered within these proteins, as well as within plant metabolic proteins.

Aim 3: Following the bioinformatics pipeline and in silico MD-based maturation of predicted proteins, methods were developed to extract useful atomistic information from coarse protein structure networks (PSNs). A multi-layer perceptron was used to upscale coarse PSNs into atomistic models. This new technique permits the simulation of coarse PSNs using exponential-family random graph models (ERGMs) and the exploration of complex protein structural conformations.

Speaker: 

Vy Duong

Institution: 

Martin Group

Location: 

Zoom