RESEARCH

Our research is at the intersection of

biology + computer science + statistics + applied mathematics + engineering

We develop computational approaches that can leverage massive public data collections to unravel the molecular basis of complex traits and diseases.
We focus on three facets of heterogeneity in complex traits and diseases.

The following are active research projects.

INTEGRATIVE ANALYSIS OF LARGE-SCALE DATA

We develop methods for integrating thousands of publicly available omics datasets to build genome-scale models of how genes interact with each other in various biological contexts, including health and disease.

EXTRACTING KNOWLEDGE FROM NATURAL LANGUAGE

Much genetic/molecular knowledge is buried within millions of published papers. We develop methods to automatically extract links between biological entities from literature while complementing manual curation.

A few related papers

[ Coming soon! ]

AGE- AND SEX-SPECIFIC GENOMIC CONTEXTS

Most diseases vary in prevalence and impact between females/males and across life stages. We are developing methods to delineate the genomic basis of differences in physiology and disease between sexes and across ages.

A few related papers

[ Coming soon! ]

SINGLE CELL GENOMICS

Single-cell omics gives unprecedented resolution into gene expression regulatory programs. We are developing methods to use these data to study cell-type and developmental specificity of complex diseases.

A few related papers

[ Coming soon! ]

BIOLOGY-INFORMED MACHINE LEARNING MODELS

We develop new ML approaches to predict novel genes related to functions and diseases, using prior knowledge of various types and confidence levels to constrain model structure and train model parameters.

DATA IMPUTATION AND PREDICTION

Massive public omics datasets contain many "holes": values that are not just "missing" but systematically unmeasured. We are developing statistical methods to impute these values to enable integrative analyses.

A few related papers

  1. A flexible, interpretable, and accurate approach for imputing the expression of unmeasured genes. 2020 Nucleic Acids Res.

[ More coming soon. ]

HETEROGENEITY OF COMPLEX DISEASES

A complex disease is rarely a single well-defined condition. We are building integrative approaches to uncover subtypes of complex disorders defined by shared functional/mechanistic dysregulation.

DRUG REPURPOSING

Given the prohibitive costs and timelines of developing new drugs, we are working on machine learning approaches for repurposing approved drugs and drug combinations for complex and infectious diseases.

A few related papers

  1. Reconciling multiple connectivity scores for drug repurposing. 2021 Brief. Bioinform.

[ More coming soon. ]

SOFTWARE DEVELOPMENT

Our group is committed to making all our computational methods available to the broader research community in the form of software and online tools to promote open science, i.e., transparency, reproducibility, and reuse.

OMICS SAMPLE ANNOTATION

There are nearly 2 million publicly available omics samples, but they remain acutely underused. We are developing methods to systematically annotate these samples so that researchers can find and reuse them.

A few related papers

  1. Systematic tissue annotations of –omics samples by modeling unstructured metadata. 2021 bioRxiv.

[ More coming soon. ]

CROSS-SPECIES MODELS OF HUMAN TRAITS & DISEASES

Choosing the right in vivo model system to study human biology is hard. We are working on methods to translate omics data and knowledge between humans and model organisms for studying specific facets of complex traits and diseases.

A few related papers

[ Coming soon! ]

SCIENCE AND MEDICAL UNDERSTANDING & COMMUNICATION

Communicating science and medicine effectively is as critical as making new discoveries. We are developing new methods to quantify language understandability and tools for science/medical communication and education.

A few related papers

[ Coming soon! ]

Funding

NIGMS
NIAID
NSF

Past funding

BEACON