RESEARCH
Our research is at the intersection of
biology + computer science + statistics + applied mathematics + engineering
We develop computational approaches that can leverage massive public data collections to unravel the molecular basis of complex traits and diseases.
We are focused on addressing three facets of heterogeneity of complex traits and diseases
The following are active research projects.
INTEGRATIVE ANALYSIS OF LARGE-SCALE DATA
We develop methods for integrating thousands of publicly available omics datasets to build genome-scale models of how genes interact with each other in various biological contexts, including health and disease.
Few related papers
EXTRACTING KNOWLEDGE FROM NATURAL LANGUAGE
Much genetic/molecular knowledge is buried within millions of published papers. We develop methods to automatically extract links between biological entities from literature while complementing manual curation.
Few related papers
[ Coming soon! ]
AGE- AND SEX-SPECIFIC GENOMIC CONTEXTS
Most diseases vary in prevalence and impact between females/males and across life stages. We are developing methods to delineate the genomic basis of differences in physiology and disease between sexes and across ages.
Few related papers
[ Coming soon! ]
SINGLE CELL GENOMICS
Single-cell omics gives unprecedented resolution into gene expression regulatory programs. We are developing methods to use these data to study cell-type and developmental specificity of complex diseases.
Few related papers
[ Coming soon! ]
BIOLOGY-INFORMED MACHINE LEARNING MODELS
We develop new ML approaches to predict novel genes related to functions and diseases by using prior knowledge of various types and varied confidence levels to constrain model structure and train model parameters.
Few related papers
DATA IMPUTATION AND PREDICTION
Massive public omics datasets contain several "holes": values that are not just "missing" but systematically unmeasured. We are developing statistical methods to impute these values to enable integrative analyses.
Few related papers
- A flexible, interpretable, and accurate approach for imputing the expression of unmeasured genes. 2020 Nuc. Acids Res.
[ More coming soon. ]
HETEROGENEITY OF COMPLEX DISEASES
A complex disease is often not a single well-defined condition. We are working on building integrative approaches that can unravel subtypes of complex disorders defined by common functional/mechanistic deregulations.
Few related papers
- Pervasive epistasis in cell proliferation pathways modulates neurodevelopmental defects of autism-associated 16p11.2 deletion. 2018 Nat. Comm.
[ More coming soon! ]
DRUG REPURPOSING
Given the prohibitive costs and timelines for developing new drugs, we are working on machine learning approaches to repurposing approved drugs and drug combinations for complex and infectious diseases.
Few related papers
- Reconciling multiple connectivity scores for drug repurposing. 2021 Brief. Bioinformatics.
[ More coming soon. ]
SOFTWARE DEVELOPMENT
Our group is committed to making all our computational methods available to the broader research community in the form of software and online tools to promote open science, i.e., transparency, reproducibility, and reuse.
Few related tools
OMICS SAMPLE ANNOTATION
There are nearly 2 million publicly available omics samples but they remain acutely underused. We are developing methods to systematically annotate these samples to enable researchers to find and reuse them.
Few related papers
[ More coming soon. ]
CROSS-SPECIES MODELS OF HUMAN TRAITS & DISEASES
Choosing the right in vivo model system to study human biology is hard. We are working on methods to translate omics data and knowledge between humans and model systems for studying specific facets of complex traits and diseases.
Few related papers
[ Coming soon! ]
SCIENCE AND MEDICAL UNDERSTANDING & COMMUNICATION
Effectively communicating science and medicine is as critical as making new discoveries. We are developing new methods to quantify language understandability and tools for science/medical communication and education.
Few related papers
[ Coming soon! ]