RESEARCH

Our research is at the intersection of

biology + computer science + statistics + applied mathematics


Broadly, we develop and apply computational approaches to explain how our genome relates to various aspects of health and disease. These approaches draw on ideas from statistics, machine learning, and applied mathematics. They are designed to analyze large-scale data and build genome-scale predictive models of the molecular basis of biological and biomedical phenomena.

We work on a wide range of research projects in this area.

INTEGRATIVE ANALYSIS OF LARGE-SCALE DATA

We develop methods for integrating thousands of publicly available –omics datasets to build genome-scale models of how genes interact with each other in various biological contexts including health and disease.


EXTRACTING KNOWLEDGE FROM NATURAL LANGUAGE

Much genetic and molecular knowledge is buried within millions of published papers. We develop methods to automatically extract links between biological entities from the literature, complementing manual curation.


Few related papers

[ Coming soon! ]

AGE- AND SEX-SPECIFIC GENOMIC CONTEXTS

Most diseases vary in prevalence and impact between females/males and across life stages. We are developing methods to delineate the genomic basis of differences in physiology and disease between sexes and across ages.


Few related papers

[ Coming soon! ]

SINGLE CELL GENOMICS

Single-cell omics gives unprecedented resolution into gene expression regulatory programs. We are developing methods to use these data to study cell-type and developmental specificity of complex diseases.


Few related papers

[ Coming soon! ]

BIOLOGY-INFORMED MACHINE LEARNING MODELS

We develop new ML approaches to predict novel genes related to functions and diseases, using prior knowledge of various types and confidence levels to constrain model structure and train model parameters.


DATA IMPUTATION AND PREDICTION

Massive public omics datasets contain several "holes": values that are not just "missing" but systematically unmeasured. We are developing statistical methods to impute these values to enable integrative analyses.


Few related papers

  1. A flexible, interpretable, and accurate approach for imputing the expression of unmeasured genes. 2020 Nucleic Acids Res.

[ More coming soon. ]

HETEROGENEITY OF COMPLEX DISEASES

A complex disease is rarely a single well-defined condition. We are building integrative approaches that can unravel subtypes of complex disorders defined by shared functional or mechanistic dysregulation.


DRUG REPURPOSING

Given the prohibitive costs and timelines for developing new drugs, we are working on machine learning approaches to repurposing approved drugs and drug combinations for complex and infectious diseases.


Few related papers

  1. Reconciling multiple connectivity scores for drug repurposing.  2021 Brief. Bioinformatics.

[ More coming soon. ]

SOFTWARE DEVELOPMENT

Our group is committed to making all our computational methods available to the broader research community in the form of software and online tools to promote open science, i.e., transparency, reproducibility, and reuse.


OMICS SAMPLE ANNOTATION

Nearly 2 million omics samples are publicly available, but they remain acutely underused. We are developing methods to systematically annotate these samples so that researchers can find and reuse them.


Few related papers

  1. Systematic tissue annotations of –omics samples by modeling unstructured metadata. 2021 bioRxiv.

[ More coming soon. ]

CROSS-SPECIES MODELS OF HUMAN TRAITS & DISEASES

Choosing the right in vivo model system to study human biology is hard. We are working on methods to translate omics data and knowledge between humans and model organisms for studying specific facets of complex traits and diseases.


Few related papers

[ Coming soon! ]

SCIENCE AND MEDICAL UNDERSTANDING & COMMUNICATION

Effectively communicating science and medicine is as critical as making new discoveries. We are developing new methods to quantify how understandable language is, along with tools for science and medical communication and education.


Few related papers

[ Coming soon! ]

 

Collaborators

Ingo Braasch, Michigan State U. | Keith English, Michigan State U. | Julia Ganz, Michigan State U. | Santhosh Girirajan, Penn State U. | Kelly Klump, Michigan State U. | Rick Leach, Michigan State U. | Adam Moeser, Michigan State U. | Andy Pereira, U. Arkansas | Dhandapany Perundurai, OHSU | Aaditya Rangan, New York U.

 

Funding

Current Funding

Dept. Biochemistry & Molecular Biology at MSU

Past Funding

thorek.png
BEACON-Logo.jpg