
Patrick Zhao, Ph.D.


Current Research

The Problem

Problem 1: Advancing Bioinformatics for Plant Genotype-Phenotype (G2P) Association Discovery
Plants have evolved highly sophisticated mechanisms and efficient phenotypes (i.e., traits) to capture light energy, survive harsh environmental conditions, including biotic and abiotic stresses, pathogens and pest attacks, and complete their life cycles in competition with other organisms. Plant genotypes are translated into phenotypes through cell signal transduction; gene expression regulated by transcription factors, small RNAs and epigenetic modifications; metabolic reactions; and other developmental pathways. Understanding such genotype-phenotype (G2P) relationships is a fundamental grand challenge in biology, and advancing bioinformatics to relate genomes to phenomes is a central part of that challenge. One long-term research goal of the Zhao group is to advance bioinformatics for plant G2P association discovery via integrative analyses of genome-scale biological networks and genome-wide association studies (GWAS).

Problem 2: Bioinformatics, Machine Learning and Big Biodata Analytics
Modern molecular biology often requires systems-level analysis to understand how cells, tissues, and organs develop and function. With the rise of next-generation, high-throughput 'omics' technologies, such as next-generation sequencing (NGS), data acquisition is no longer a barrier. The new challenge lies in processing heterogeneous data sets of unprecedented scale (often referred to as "big data") into information that can be applied in data-driven scientific discovery.

The Approach

Approach 1: We have three integrative strategies: 1) to advance bioinformatics to decipher genome-scale signal transduction, metabolism, and gene regulation networks (focusing on transcription factors, small RNAs and epigenetic modifications) in model and crop plants; 2) to develop innovative models and algorithms that enable high-accuracy genome-wide plant marker and trait linkage mapping analysis using advanced mixed models that incorporate gene-by-gene (G×G) epistatic effects and genotype-by-environment (G×E) interaction effects; and 3) to develop an integrative knowledge discovery platform to facilitate the integration, deciphering, and mapping of genotype-phenotype associations. The resulting biological model-based algorithms, tools, web services, and data resources will facilitate the reverse engineering of 'omics' data into complex plant biological and genetic network models to decipher plant phenotypes from genotypes.
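
For readers unfamiliar with the mixed models named in strategy 2, one generic textbook form is shown here. The symbols are illustrative assumptions, not the group's published notation or model.

```latex
\[
\mathbf{y} = \mathbf{X}\boldsymbol{\beta}
  + \mathbf{Z}_{G}\mathbf{u}_{G}
  + \mathbf{Z}_{GG}\mathbf{u}_{GG}
  + \mathbf{Z}_{GE}\mathbf{u}_{GE}
  + \boldsymbol{\varepsilon},
\qquad
\mathbf{u}_{k} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_{k}^{2}\mathbf{K}_{k}\right),
\quad
\boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_{\varepsilon}^{2}\mathbf{I}\right)
\]
```

In this sketch, y is the vector of observed trait values and X\beta collects fixed effects such as the tested marker and the environment. The random effects u_G, u_GG and u_GE are the additive genetic, G×G epistatic and G×E interaction effects, each with its own design matrix Z_k and covariance (e.g., kinship-derived) matrix K_k. The residual is \varepsilon.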

Approach 2: Combining domain knowledge in bioinformatics, computational biology, data science and plant biology, we have been developing, and will continue to develop, large-scale bioinformatics and data resources (e.g., bioinformatics web servers and biological databases); novel machine learning-based sequence analysis, 'omics' data integration and data mining methodologies; and an innovative graph search-empowered integrative knowledge discovery platform. Together these extract biological insights from 'big data' to strengthen basic plant science, plant genomics and translational genomics research. Big data also brings technical challenges, notably huge workloads and limited memory. To cope with these challenges, we rely on 1) next-generation data analysis, data mining and knowledge discovery technologies such as artificial intelligence/machine learning-based data analytics; 2) high-performance computing clusters consisting of CPUs and GPUs; and 3) data manipulation tactics, namely a) parallel computing and b) data dividing/assembly methods.
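
The data dividing/assembly tactic mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the group's implementation: the k-mer counting task, the function names (`split_into_chunks`, `count_kmers`, `parallel_kmer_counts`) and the chunk size are all hypothetical choices made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Iterator, List


def split_into_chunks(items: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Divide step: yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def count_kmers(sequences: Iterable[str], k: int) -> Counter:
    """Process step: count every length-k substring in one chunk."""
    counts: Counter = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def parallel_kmer_counts(
    sequences: List[str], k: int, chunk_size: int = 2, workers: int = 4
) -> Counter:
    """Assemble step: merge the per-chunk counts into one table."""
    chunks = list(split_into_chunks(sequences, chunk_size))
    total: Counter = Counter()
    # Threads keep this sketch portable; for CPU-bound work on a multi-core
    # node, a ProcessPoolExecutor would be the more usual choice.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(lambda chunk: count_kmers(chunk, k), chunks):
            total.update(partial)
    return total


if __name__ == "__main__":
    # Hypothetical toy reads, used only to exercise the functions.
    reads = ["ACGTAC", "GTACGT", "TTACGA"]
    print(parallel_kmer_counts(reads, k=3).most_common(3))
```

Because each chunk is processed independently and only the small per-chunk summaries are merged, no worker needs the full data set in memory at once, which is the point of dividing before assembling.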

Current Projects

  • Development and Curation of the Alfalfa Breeder's Toolbox
  • Advancing bioinformatics for plant genotype-phenotype (G2P) association discovery
  • Development of methods, tools and databases for genome-scale biological network analysis
  • Development of methods and tools for analyzing traits through omics-wide association studies
  • Development of methods and tools for analyzing genetic variant HapMap data for accession identification and genotyping
  • Advancing bioinformatics for genome-wide analysis of small signaling peptides in Medicago truncatula with an emphasis on macronutrient regulation of root and nodule development
  • Development of MtSSPdb: an Integrative Database of Medicago truncatula Small Signaling Peptides
  • Development of an integrative platform to study gene function and genome evolution in legumes
  • Development of novel methods and bioinformatics tools for the understanding of plant small RNA:mRNA interactions or protein-DNA interactions for fast gene discovery and large-scale trait genotyping through the use of genetic screens and crop genetic engineering

Education

  • Doctor of Philosophy in Communication and Information Systems, Shanghai Jiao Tong University, China, 2000

Project Title: Collaborative Research: ABI Innovation: Plant Genotype-Phenotype (G2P) Association Discovery via Integrative Genome-scale Biological Network & Genome-wide Association Analysis
Source: National Science Foundation Advances in Biological Informatics (ABI)
Term: July 1, 2015 – June 30, 2018 (Estimated)

Project Title: Genome-wide Analysis of Small Signaling Peptides in Medicago truncatula with an Emphasis on Macro-nutrient Regulation of Root and Nodule Development
Source: National Science Foundation Plant Genome Research Project
Term: August 1, 2015 – July 31, 2019 (Estimated)

Project Title: MRI: Acquisition of a UPLC/MS/SPE/NMR for plant metabolomics
Source: National Science Foundation
Term: September 1, 2011 – August 31, 2015

Project Title: ABI: Systems bioinformatics approaches to modeling and deciphering plant transcriptional regulatory networks
Source: National Science Foundation Advances in Biological Informatics (ABI)
Term: July 1, 2010 – September 30, 2014 (Estimated)

Project Title: The Association of Independent Plant Research Institutes (AIPI) Plant Genome Annotation Group
Source: AIPI Collaborative Grant Award
Term: January 1, 2014 – December 31, 2014

Project Title: Advancing bioinformatics to understand mechanisms of plant non-coding small RNA-target interactions
Source: Oklahoma Center for the Advancement of Science & Technology (OCAST)
Term: March 1, 2011 – February 28, 2013

Project Title: Development of graph-based models to simulate and decipher plant gene regulatory networks
Source: Oklahoma Center for the Advancement of Science & Technology (OCAST)
Term: August 1, 2009 – July 31, 2011

Project Title: Development of genetic resources to dissect the regulatory networks governing nodule development and differentiation in Medicago truncatula
Source: National Science Foundation Plant Genome
Term: September 15, 2007 – August 31, 2011

Project Title: Development and application of genomic tools for drought tolerance enhancement in alfalfa (Medicago sativa L.)
Source: Oklahoma Bioenergy Center
Term: December 15, 2010 – December 14, 2011

Project Title: Development and application of genomic tools for drought tolerance enhancement in Alfalfa (Medicago sativa L.)
Source: Oklahoma Department of Energy – Oklahoma Bioenergy Center
Term: January 1, 2008 – December 31, 2010

Project Title: Comparative genomics of secretory trichomes-Biofactories for production of plant secondary metabolites
Source: National Science Foundation Plant Genome
Term: September 27, 2006 – September 30, 2010