Berk Ata Alpay

I studied computer science and math as an undergraduate at the University of Connecticut. I'm now a PhD student in the Systems, Synthetic, and Quantitative Biology program at Harvard, working in Michael Desai's lab. I like working on a wide range of problems, but my main interests are protein fitness landscapes, antibodies, and Bayesian statistics.


Drug Discov. Today

Evaluating molecular fingerprint-based models of drug side effects against a statistical control [link]

The area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR) are commonly used to evaluate models that predict drug side effects from molecular features. These models predict a matrix of associations between drugs and side effects. However, the baseline AUC and AUPR can be high depending on the statistical properties of this matrix. We analyze this dependence, take it into account, and ask: to what degree do models actually benefit from molecular fingerprints?
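To make the baseline issue concrete, here is a minimal sketch on simulated data. It is not the method or data from the paper. It assumes a hypothetical binary drug-by-side-effect matrix in which each side effect has its own frequency and associations are drawn independently of any drug property. A "baseline" that scores every test drug with the training-set frequency of each side effect, using no molecular information at all, is compared against purely random scores. The matrix sizes, the Beta(0.5, 5) frequency distribution, and the helper name `auc` are illustrative assumptions.

```python
import numpy as np


def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    labels = np.asarray(labels).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    # Average rank of each tie group: mean of its first and last 1-based rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)
    ranks = (ends - (counts - 1) / 2.0)[inverse]
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


rng = np.random.default_rng(0)
n_drugs, n_effects = 200, 100  # hypothetical sizes

# Assumption: side-effect frequencies are skewed (a few common, many rare).
effect_freq = rng.beta(0.5, 5.0, size=n_effects)

# Each association depends only on the side effect, never on the drug.
Y = rng.random((n_drugs, n_effects)) < effect_freq

train, test = Y[:100], Y[100:]

# Baseline: score every test drug by the training frequency of each side effect.
baseline_scores = np.tile(train.mean(axis=0), (test.shape[0], 1))
random_scores = rng.random(test.shape)

auc_baseline = auc(test, baseline_scores)
auc_random = auc(test, random_scores)
print(f"frequency-only baseline AUC: {auc_baseline:.3f}")
print(f"random-score AUC:            {auc_random:.3f}")
```

In this simulation the frequency-only baseline scores well above the random predictor even though it knows nothing about the drugs, which is the kind of effect a statistical control is meant to account for.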


Combinatorial and statistical prediction of gene expression from haplotype sequence [link]

Computational gene expression prediction can combine the statistical power and biological insights of transcriptome-wide association studies with the genetic signals discovered by genome-wide association studies. However, current methods are not accurate for many genes. Our models relax the assumption that genetic markers are independent, which lets them predict a large subset of genes more accurately.


Dynamic modeling of power outages caused by thunderstorms [link]

Thunderstorms are complex phenomena that cause substantial power outages in a short period of time. Predicting these outages is challenging with eventwise models, which summarize the weather dynamics over the entire course of a storm. We developed a framework that lets models learn the dynamics of thunderstorm-caused outages directly from hourly weather forecasts.