12–17 Jul 2026
University of Graz
Europe/Vienna timezone

Estimating the Distribution of Methylation Entropy in Genomic Data

16 Jul 2026, 18:30
2h
University of Graz

Poster
Numerical, Computational, and Data-Driven Methods
Poster Presentations

Speaker

Monika Kurpas (Department of Systems Biology and Engineering, Silesian University of Technology, Gliwice, Poland)

Description

DNA methylation is an epigenetic modification in which a methyl group is added to cytosine within CpG dinucleotides. CpG-rich regions are often found in gene promoters and other regulatory elements. Methylation at these loci is a stable, heritable mark that influences transcription, chromatin state, and genome stability [1]. Its dysregulation has been linked to several diseases, including cancer [2].
At the level of a single DNA molecule, each CpG site is either methylated or unmethylated. Mean methylation levels describe the overall tendency of a locus, but they cannot distinguish between different distributions of molecule-level methylation patterns that share the same mean.
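A small worked example may make this concrete. It assumes a locus with two CpG sites and entropy measured in bits (base-2 logarithm). The symbols $p_k$, $c_k$, $n$, and $\bar{m}$ are notation introduced here for illustration only and are not taken from the abstract.

```latex
% Assumed notation: p_k = probability of methylation pattern k,
% c_k = number of methylated sites in pattern k, n = number of CpG sites.
\[
  \bar{m} = \sum_k p_k \, \frac{c_k}{n},
  \qquad
  H = -\sum_k p_k \log_2 p_k
\]
% Two sites (n = 2), patterns ordered as 00, 01, 10, 11:
\begin{align*}
  \text{A: } p &= (\tfrac12, 0, 0, \tfrac12) & \bar{m} &= \tfrac12 & H &= 1 \text{ bit} \\
  \text{B: } p &= (\tfrac14, \tfrac14, \tfrac14, \tfrac14) & \bar{m} &= \tfrac12 & H &= 2 \text{ bits} \\
  \text{C: } p &= (0, 1, 0, 0) & \bar{m} &= \tfrac12 & H &= 0 \text{ bits}
\end{align*}
```

All three distributions have the same mean methylation of 0.5, yet their entropies range from 0 to the two-site maximum of 2 bits.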
We use Shannon entropy to quantify methylation heterogeneity and develop a simulation framework to explore the entropy landscape [3]. Probability vectors are sampled from a symmetric Dirichlet prior on the simplex and conditioned on a fixed mean methylation level; entropy is then computed from the accepted distributions. By simulating approximately 10 million reads per setting and comparing the results with whole-genome nanopore calls from human blood, we find that entropy is lowest at extreme mean values and peaks near 0.5. Increasing the number of CpG sites expands the space of possible methylation patterns and increases entropy, while the Dirichlet concentration parameter controls dispersion.
Overall, Shannon entropy is a sensitive measure of epigenetic heterogeneity, and simple Dirichlet-based models reproduce features observed in empirical data.
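The sampling scheme described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the probability vector ranges over all 2^n binary methylation patterns of n CpG sites, that conditioning on the mean is done by rejection within a tolerance `tol`, and that entropy is measured in bits. It works directly with pattern probabilities and does not simulate reads. The function names and default values are hypothetical.

```python
import itertools

import numpy as np


def pattern_methylation_fractions(n_sites):
    """Fraction of methylated sites for each of the 2**n_sites binary patterns."""
    patterns = np.array(list(itertools.product([0, 1], repeat=n_sites)))
    return patterns.mean(axis=1)


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability patterns contribute nothing."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def simulate_entropies(n_sites, alpha, target_mean, tol=0.02,
                       n_draws=20000, seed=0):
    """Entropies of Dirichlet draws whose mean methylation is near target_mean.

    Assumption: conditioning is approximated by rejection sampling, keeping
    draws with |mean - target_mean| <= tol.
    """
    rng = np.random.default_rng(seed)
    frac = pattern_methylation_fractions(n_sites)
    # Symmetric Dirichlet: the same concentration alpha for every pattern.
    draws = rng.dirichlet(np.full(frac.size, alpha), size=n_draws)
    # Mean methylation of each draw is the expected methylated fraction.
    means = draws @ frac
    accepted = draws[np.abs(means - target_mean) <= tol]
    return np.array([shannon_entropy(p) for p in accepted])


if __name__ == "__main__":
    for target in (0.25, 0.5):
        ent = simulate_entropies(n_sites=3, alpha=1.0, target_mean=target)
        if ent.size:
            print(f"mean={target}: accepted={ent.size}, "
                  f"average entropy={ent.mean():.3f} bits")
        else:
            print(f"mean={target}: no draws accepted")
```

Rejection becomes inefficient for mean values far from 0.5 or for many CpG sites, since few draws fall inside the tolerance band; a larger `n_draws` or a wider `tol` would be needed in those settings.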

Bibliography

@article{bird_dna_2002,
title = {{DNA} methylation patterns and epigenetic memory},
volume = {16},
issn = {0890-9369, 1549-5477},
url = {http://genesdev.cshlp.org/lookup/doi/10.1101/gad.947102},
doi = {10.1101/gad.947102},
language = {en},
number = {1},
urldate = {2026-04-22},
journal = {Genes \& Development},
author = {Bird, Adrian},
month = jan,
year = {2002},
pages = {6--21},
}

@article{baylin_epigenetic_2016,
title = {Epigenetic {Determinants} of {Cancer}},
volume = {8},
issn = {1943-0264},
url = {http://cshperspectives.cshlp.org/lookup/doi/10.1101/cshperspect.a019505},
doi = {10.1101/cshperspect.a019505},
language = {en},
number = {9},
urldate = {2026-04-22},
journal = {Cold Spring Harbor Perspectives in Biology},
author = {Baylin, Stephen B. and Jones, Peter A.},
month = sep,
year = {2016},
pages = {a019505},
}

@phdthesis{asante_estimation_2026,
type = {{PhD} {Thesis}},
title = {Estimation and {Applications} of {Some} {Models} for {Dependent} {Data} in {DNA} {Methylation} and {Genomic} {Mutation} {Data}},
school = {Rice University},
author = {Asante, Emmanuel},
year = {2026},
}

Authors

Emmanuel Asante (Department of Statistics, Rice University, Houston, TX, USA)
Monika Kurpas (Department of Systems Biology and Engineering, Silesian University of Technology, Gliwice, Poland)

Co-authors

Meng Li (Department of Statistics and Ken Kennedy Institute, Rice University, Houston, TX, USA)
Marek Kimmel (Department of Statistics and Ken Kennedy Institute, Rice University, Houston, TX, USA; Department of Systems Biology and Engineering, Silesian University of Technology, Gliwice, Poland)
Peter Van Loo (Department of Genetics, Division of Discovery Science and Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA)
Huw Ogilvie (The University of Texas MD Anderson Cancer Center, Houston, TX, USA)
