Berk Alpay

I studied computer science and math, and I'm now doing a PhD in the Systems, Synthetic, and Quantitative Biology program at Harvard in Michael Desai's lab. I like working on a wide range of problems, but my main interests are protein fitness landscapes, clinical data analysis, and Bayesian statistics. [CV]

Publications

Drug Discov. Today

Evaluating molecular fingerprint-based models of drug side effects against a statistical control [link]

The AUC and AUPR are metrics commonly used to evaluate models that predict the side effects of drugs using their molecular features. These models predict a matrix of drugs' associated side effects. However, the baseline AUC and AUPR can be high depending on the statistical properties of this matrix. We analyze this dependence, take it into account, and ask: to what degree do models actually benefit from molecular fingerprints?
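To give a feel for the baseline effect, here is a minimal simulated sketch. It is not the analysis from the paper: the matrix sizes, the side-effect prevalences, and the helper names `auc` and `simulate_matrix` are all assumptions made up for illustration. Every drug gets the same scores, namely how often each side effect appeared among the training drugs, so no molecular information is used at all.

```python
import random


def auc(scores, labels):
    """ROC AUC via the rank-sum formula; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate_matrix(n_drugs, prevalences, rng):
    """Binary drug-by-side-effect matrix with independent entries (an assumption)."""
    return [[1 if rng.random() < p else 0 for p in prevalences]
            for _ in range(n_drugs)]


rng = random.Random(0)
# Hypothetical prevalences: a few common side effects and many rare ones.
prevalences = [0.6] * 5 + [0.2] * 15 + [0.02] * 80
train = simulate_matrix(200, prevalences, rng)
test = simulate_matrix(100, prevalences, rng)

# Baseline: score each (drug, side effect) pair by the side effect's
# frequency among training drugs, ignoring the drug entirely.
freq = [sum(row[j] for row in train) / len(train)
        for j in range(len(prevalences))]
scores = [freq[j] for _ in test for j in range(len(freq))]
labels = [y for row in test for y in row]

baseline_auc = auc(scores, labels)
print(f"Baseline AUC without molecular features: {baseline_auc:.3f}")
```

In this simulated setting the pooled AUC lands well above 0.5, purely because side effects differ in how common they are. That gap is the kind of baseline a fingerprint-based model would need to beat.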

Bioinformatics

Combinatorial and statistical prediction of gene expression from haplotype sequence [link]

Studies have genotyped and measured the gene expression levels of many people. Using this data, one can investigate how genotype influences gene expression, which could be useful for understanding the genetic basis of complex traits such as disease. We attempt to model gene expression, accounting for interactions among genetic markers. By doing so, we more accurately predict the expression of a large subset of genes.

Forecasting

Dynamic modeling of power outages caused by thunderstorms [link]

Thunderstorms are complex phenomena that can cause substantial power outages within a short period. These outages are hard to predict with models that summarize the weather over the entire course of the storm. Instead, we develop a framework that lets models learn the dynamics of thunderstorm-caused outages directly from hourly weather forecasts.