PhD, University of Cambridge
At A Glance
Changes in gene regulation often underlie the mechanism of genetic disorders and cancer. These changes can arise from variation in genomic DNA sequence. They can also come from alterations in epigenomic properties, such as DNA methylation, chromatin packaging, histone modifications, or 3D chromosome conformation. New sequencing technology reveals a forest of genomic and epigenomic variation, but we are hindered by insufficient understanding of the variation's consequences. As a result, we can apply these data to diagnosis or personalized drug therapy only in limited cases.
Our research program addresses this gap in knowledge to understand interactions between genome, epigenome, and phenotype in human cancers. We apply a systematic framework to create and validate predictive models of (1) how genetic variants cause epigenomic changes, and (2) the effect of epigenomic changes on gene regulation and phenotype. First, we start with data from collaborators or public resources, using cancer cell lines and cancer patient primary tissue. Second, we develop machine learning models of how a genomic or epigenomic input leads to an epigenomic or phenotypic output. Third, we perturb input data and predict changes in output. Fourth, we validate predictions with targeted experiments.
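As a rough illustration of the framework's second and third steps, the toy sketch below fits a small linear model and then perturbs one input to predict the change in output. Everything in it is an assumption made for illustration: the feature names (promoter methylation, enhancer accessibility), the made-up training values, and the choice of a linear model fit by gradient descent. It is not the lab's actual modeling approach, and the fourth step (experimental validation) has no counterpart in code.

```python
# Toy illustration only: features, data, and model are hypothetical.

def predict(w, b, x):
    """Linear prediction: dot(w, x) + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_linear(xs, ys, lr=0.5, epochs=2000):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n, n_features = len(xs), len(xs[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(xs, ys):
            err = predict(w, b, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def perturbation_effect(w, b, x, feature, new_value):
    """Predicted change in output when one input feature is set to new_value."""
    perturbed = list(x)
    perturbed[feature] = new_value
    return predict(w, b, perturbed) - predict(w, b, x)


# Step 1 (data): made-up samples of [promoter_methylation, enhancer_accessibility]
# with a made-up expression value generated from a known linear rule.
grid = [0.0, 0.5, 1.0]
xs = [[m, a] for m in grid for a in grid]
ys = [2.0 - 1.5 * m + 1.0 * a for m, a in xs]

# Step 2 (model): learn the input-to-output mapping.
w, b = fit_linear(xs, ys)

# Step 3 (perturbation): remove methylation in one sample and predict the change.
sample = [0.8, 0.2]
delta = perturbation_effect(w, b, sample, feature=0, new_value=0.0)
print(f"predicted change in expression: {delta:+.3f}")
```

In practice, a prediction like `delta` would be the quantity that a targeted experiment then confirms or refutes.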
Michael Hoffman creates predictive computational models to understand interactions between genome, epigenome, and phenotype in human cancers. He implemented the genome annotation method Segway, which simplifies interpretation of large multivariate genomic datasets, and was a linchpin of the NIH ENCODE Project analysis. He is a principal investigator at Princess Margaret Cancer Centre and Associate Professor in the Departments of Medical Biophysics and Computer Science, University of Toronto. He was named a CIHR New Investigator and has received several awards for his academic work, including the NIH K99/R00 Pathway to Independence Award and the Ontario Early Researcher Award.
Advances in research. I transformed epigenomic analysis by creating the genome annotation method Segway. Segway analyzes multiple epigenomic datasets, integrates them, and categorizes each base in a genome (e.g. transcription start, enhancer, insulator, repressed). Segway enables simple interpretation and visualization of large multivariate genomic datasets. I led an effort to annotate the human genome using Segway—a linchpin of the ENCODE analysis, which shifted thinking about the biomedical importance of noncoding DNA.
Segway's global impact is demonstrated by the many scientists who run the software or use our annotations on human cell types. These annotations are displayed by both the Ensembl (50,000 unique users/week) and UCSC (38,000 unique users/week) genome browsers. Segway annotations also form a building block for widely used noncoding interpretation tools like CADD and the Ensembl Regulatory Build. In other work, I created Sunflower, a theoretical framework to predict effects of genetic variation on transcription factor (TF) binding, originating a widespread "motif-breaker" approach.
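The general idea behind a "motif-breaker" analysis can be sketched in a few lines: score the reference and alternate alleles against a TF motif and compare. The sketch below is a simplified, generic version of that idea under stated assumptions. The 4-bp motif, its frequencies, the uniform background, and the example sequence are all hypothetical, and the code does not reproduce Sunflower's actual model.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif resembling "TGAC".
CONSENSUS = "TGAC"
PFM = [
    {base: (0.85 if base == target else 0.05) for base in "ACGT"}
    for target in CONSENSUS
]


def log_odds(pfm, background=0.25):
    """Convert base frequencies to log2 odds against a uniform background."""
    return [{base: math.log2(p / background) for base, p in col.items()}
            for col in pfm]


def best_score(seq, pwm):
    """Highest motif score over all windows in seq (len(seq) >= len(pwm))."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def variant_effect(ref_seq, pos, alt, pwm):
    """Change in best motif score (alt minus ref) among windows overlapping pos."""
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    width = len(pwm)
    lo = max(0, pos - width + 1)
    hi = min(len(ref_seq), pos + width)
    return best_score(alt_seq[lo:hi], pwm) - best_score(ref_seq[lo:hi], pwm)


pwm = log_odds(PFM)
ref = "AATGACAA"  # hypothetical reference sequence containing the motif

# A G>T substitution inside the motif lowers the best score (motif "broken").
breaking = variant_effect(ref, pos=3, alt="T", pwm=pwm)

# An A>C substitution in the flank leaves the overlapping windows equally poor.
neutral = variant_effect(ref, pos=0, alt="C", pwm=pwm)

print(f"motif-disrupting variant: {breaking:+.2f} bits")
print(f"flanking variant:         {neutral:+.2f} bits")
```

A strongly negative score difference flags a variant as a candidate disruptor of binding. Restricting the comparison to windows that overlap the variant keeps distant motif matches from masking the effect.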
Training a new generation. My past students are in PhD programs at Princeton, University of Toronto, University of Washington, and University of Maryland. They have received the NSF Graduate Research Fellowship, Canadian Graduate Scholarship, and Ontario Graduate Scholarships.
- Karimzadeh M, Arlidge C, Rostami A, Lupien M, Bratman SV, Hoffman MM. “Viral integration transforms chromatin to drive oncogenesis.” 2020. Preprint: https://doi.org/10.1101/2020.02.12.942755
- Denisko D, Viner C, Hoffman MM. “Motif elucidation in ChIP-seq datasets with a knockout control.” 2019. Preprint: https://doi.org/10.1101/721720
- Chicco D, Bi HS, Reimand J, Hoffman MM. “BEHST: genomic set enrichment analysis enhanced through integration of chromatin long-range interactions.” 2019. Preprint: https://doi.org/10.1101/168427
- Karimzadeh M, Hoffman MM. “Virtual ChIP-seq: Predicting transcription factor binding by learning from the transcriptome.” 2019. Preprint: https://doi.org/10.1101/168419
- Chan RCW*, Libbrecht MW*, Roberts EG, Bilmes JA, Noble WS, Hoffman MM. “Segway 2.0: Gaussian mixture models and minibatch training.” Bioinformatics. 2018; 34:669–71.
- Viner C, Johnson J, Walker N, Shi H, Sjöberg M, Adams DJ, Ferguson-Smith AC, Bailey TL, Hoffman MM. “Modeling methyl-sensitive transcription factor motifs with an expanded epigenetic alphabet.” 2016. Preprint: https://doi.org/10.1101/043794