Description
Cells encode environmental information in the nuclear localisation dynamics of transcription factors (TFs) - stochastic time series governed by physical parameters: mean expression (μ), coefficient of variation (CV), and autocorrelation time (T_ac). Labelling such trajectories is costly, motivating a foundational self-supervised model that generalises across TF localisations and biological contexts \cite{zhang2022tfc, yue2022ts2vec}.
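For concreteness, all three parameters can be estimated from a single trajectory with standard estimators. A minimal sketch in Python; the 1/e-crossing rule for T_ac is one common convention, assumed here for illustration rather than taken from the talk:

```python
import numpy as np

def summary_stats(x, dt=1.0):
    """Estimate (mu, CV, T_ac) from a stationary trajectory x
    sampled at a uniform interval dt."""
    mu = x.mean()
    cv = x.std(ddof=1) / mu                      # CV = sigma / mu
    # Sample autocorrelation function, normalised to ACF(0) = 1.
    xc = x - mu
    acf = np.correlate(xc, xc, mode="full")[len(x) - 1:]
    acf = acf / acf[0]
    # T_ac: first lag at which the ACF drops below 1/e (one common estimator).
    below = np.where(acf < 1.0 / np.e)[0]
    t_ac = below[0] * dt if below.size else np.nan
    return mu, cv, t_ac
```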
Benchmarks reveal that no single approach generalises: raw SVMs achieve 86% on full-length trajectories by exploiting transient bursts, but collapse to chance (49%) on truncated series. Catch22 \cite{lubba2019catch22} - a set of 22 interpretable time-series features - scores 67% on truncated series but only 61% on full-length data. Dataset design governs apparent performance.
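This comparison can be reproduced in outline with scikit-learn and the pycatch22 package (whose `catch22_all` return format is assumed below); the toy two-class data is a hypothetical stand-in for TF trajectories, so the accuracies will not match the figures quoted above:

```python
import numpy as np
import pycatch22                                  # pip install pycatch22
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Toy stand-in for two classes of trajectories (illustrative only).
X_raw = np.vstack([rng.normal(0.0, 1.0, size=(50, 200)),
                   rng.normal(0.0, 2.0, size=(50, 200))])
y = np.repeat([0, 1], 50)

def catch22_features(trajs):
    # pycatch22.catch22_all returns {'names': [...], 'values': [...]}.
    return np.array([pycatch22.catch22_all(list(t))["values"] for t in trajs])

acc_raw = cross_val_score(SVC(), X_raw, y, cv=5).mean()                 # raw-trajectory SVM
acc_c22 = cross_val_score(SVC(), catch22_features(X_raw), y, cv=5).mean()  # Catch22 SVM
print(f"raw SVM: {acc_raw:.2f}  Catch22 SVM: {acc_c22:.2f}")
```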
To learn dataset-agnostic representations, we train a SimCLR-style \cite{chen2020simclr} contrastive Transformer \cite{vaswani2017attention} on Gillespie-simulated trajectories with an InfoNCE loss. The Transformer acts as a feature extractor whose embeddings feed a downstream SVM classifier.
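A minimal PyTorch sketch of the InfoNCE (NT-Xent) objective used in SimCLR-style training, where z1 and z2 are the encoder embeddings of two augmented views of the same trajectory; the temperature value is an illustrative assumption, not the setting used in this work:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE / NT-Xent loss for a batch of positive pairs (z1[i], z2[i])."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                    # (2N, d)
    sim = z @ z.t() / temperature                     # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))        # exclude self-similarity
    # The positive for index i is its counterpart in the other view.
    targets = torch.cat([torch.arange(n, 2 * n),
                         torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```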
We uncover a fundamental paradox: normalisation is essential for stable contrastive training, yet it destroys absolute scale. Consequently, μ - trivially recoverable by a raw SVM - collapses to chance (~50%) across all SSL variants, while CV (87%) and T_ac (63%) remain well encoded. This exposes a hard incompatibility between SSL training stability and scale preservation in stochastic biological time series.
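The mechanism is easy to demonstrate: per-trajectory z-normalisation maps every series to mean 0 and standard deviation 1, so trajectories differing only in μ become indistinguishable in their first two moments, while dynamics-based quantities such as the autocorrelation (invariant to affine rescaling) survive. A toy illustration with hypothetical Poisson trajectories:

```python
import numpy as np

rng = np.random.default_rng(1)
low  = rng.poisson(lam=20,  size=500).astype(float)   # low mean expression
high = rng.poisson(lam=200, size=500).astype(float)   # high mean expression

znorm = lambda x: (x - x.mean()) / x.std()
# After per-trajectory z-normalisation, both series have mean ~0 and std ~1:
# the absolute-scale signal a classifier would need to recover mu is gone.
print(znorm(low).mean(),  znorm(low).std())    # ~0.0, 1.0
print(znorm(high).mean(), znorm(high).std())   # ~0.0, 1.0
```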
Bibliography
@inproceedings{chen2020simclr,
title = {A Simple Framework for Contrastive Learning of Visual Representations},
author = {Chen, Ting and Kornblith, Simon and Norouzi, Mohammad and Hinton, Geoffrey},
booktitle = {Proceedings of the 37th International Conference on Machine Learning (ICML)},
year = {2020}
}
@article{lubba2019catch22,
title = {catch22: CAnonical Time-series CHaracteristics},
author = {Lubba, Carl H and Sethi, Sarab S and Knaute, Philip and Schultz, Simon R and Fulcher, Ben D and Jones, Nick S},
journal = {Data Mining and Knowledge Discovery},
volume = {33},
pages = {1821--1852},
year = {2019}
}
@inproceedings{vaswani2017attention,
title = {Attention Is All You Need},
author = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, Lukasz and Polosukhin, Illia},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2017}
}
@inproceedings{yue2022ts2vec,
title = {TS2Vec: Towards Universal Representation of Time Series},
author = {Yue, Zhihan and Wang, Yujing and Duan, Juanyong and Yang, Tianmeng and Huang, Congrui and Tong, Yunhai and Xu, Bixiong},
booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence},
year = {2022}
}
@inproceedings{zhang2022tfc,
title = {Self-Supervised Contrastive Pre-Training for Time Series via Time-Frequency Consistency},
author = {Zhang, Xiang and Zhao, Ziyuan and Tsiligkaridis, Theodoros and Zitnik, Marinka},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2022}
}