12–17 Jul 2026
University of Graz
Europe/Vienna timezone

Conditions on Interaction Networks for Achieving Specific Reinforcement under Global Dopamine Signals

15 Jul 2026, 11:30
20m
11.34 - SR (University of Graz)

Contributed Talk | Systems Biology and Biochemical Networks | Contributed Talks

Speaker

Yonatan Savir (Technion – Israel Institute of Technology)

Description

Reward-based learning relies on a system’s ability to distinguish between environmental signals across space or time. However, biological reinforcement systems do not always operate under such idealized conditions. In particular, dopamine reward signals are often broadly distributed rather than spatially localized, while environmental stimuli frequently overlap in time in realistic settings. This raises the question of how learning can remain robust to interference from competing signals \cite{schultz_neural_1997,schultz_dopamine_2024}.

In this work, we investigate how robust learning can emerge when reinforcement signals lack temporal or spatial specificity. Building on a theoretical framework for global homeostasis in networks \cite{steuer_robust_2011,savir_achieving_2017}, we derive conditions on the network structure and feedback topology that guarantee the stability of dopamine-mediated reinforcement signals in the presence of interfering inputs. We show that robustness is achieved only when dopamine secretion is downregulated by the overall level of global stimulation. Under this topology, global and non-specific regulation gives rise to selective responses by suppressing interference between competing inputs. This result does not rely on parameter tuning or detailed biological assumptions and reveals a nonlinear feedback principle applicable to a broad class of systems that must achieve specificity using control mechanisms that are not signal-specific.
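The global-downregulation topology described above can be caricatured with a small divisive-normalization toy (a hedged sketch, not the model presented in the talk; the function `responses` and the parameters `d0` and `k` are illustrative assumptions):

```python
def responses(stimuli, d0=1.0, k=1.0):
    """Toy sketch of a global, non-specific feedback: a single
    dopamine-like signal d is downregulated by the TOTAL stimulation
    and then gates every input channel equally. d0 and k are
    hypothetical parameters, not values from the work."""
    total = sum(stimuli)
    d = d0 / (1.0 + k * total)      # global signal falls with total input
    return [s * d for s in stimuli]

weak = responses([1.0, 0.1])                # target plus a weak distractor
crowded = responses([1.0, 0.1, 5.0, 5.0])   # strong competitors added

# The preference between the first two channels is preserved (~10:1)
# in both conditions, even though the feedback is not channel-specific,
# and the summed output stays bounded by d0/k however many inputs compete.
```

In this caricature, suppressing dopamine release in proportion to overall stimulation keeps relative responses selective while capping the total drive, which is the flavor of interference suppression the abstract describes.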

Bibliography

@article{schultz_dopamine_2024,
title = {A dopamine mechanism for reward maximization},
volume = {121},
url = {https://www.pnas.org/doi/abs/10.1073/pnas.2316658121},
doi = {10.1073/pnas.2316658121},
abstract = {Individual survival and evolutionary selection require biological organisms to maximize reward. Economic choice theories define the necessary and sufficient conditions, and neuronal signals of decision variables provide mechanistic explanations. Reinforcement learning (RL) formalisms use predictions, actions, and policies to maximize reward. Midbrain dopamine neurons code reward prediction errors (RPE) of subjective reward value suitable for RL. Electrical and optogenetic self-stimulation experiments demonstrate that monkeys and rodents repeat behaviors that result in dopamine excitation. Dopamine excitations reflect positive RPEs that increase reward predictions via RL; against increasing predictions, obtaining similar dopamine RPE signals again requires better rewards than before. The positive RPEs drive predictions higher again and thus advance a recursive reward-RPE-prediction iteration toward better and better rewards. Agents also avoid dopamine inhibitions that lower reward prediction via RL, which allows smaller rewards than before to elicit positive dopamine RPE signals and resume the iteration toward better rewards. In this way, dopamine RPE signals serve a causal mechanism that attracts agents via RL to the best rewards. The mechanism improves daily life and benefits evolutionary selection but may also induce restlessness and greed.},
number = {20},
urldate = {2026-03-15},
journal = {Proceedings of the National Academy of Sciences},
publisher = {Proceedings of the National Academy of Sciences},
author = {Schultz, Wolfram},
month = may,
year = {2024},
pages = {e2316658121},
}

@article{schultz_neural_1997,
address = {New York, N.Y.},
title = {A neural substrate of prediction and reward},
volume = {275},
issn = {0036-8075},
doi = {10.1126/science.275.5306.1593},
abstract = {The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events. Taken together, these findings can be understood through quantitative theories of adaptive optimizing control.},
language = {eng},
number = {5306},
journal = {Science},
author = {Schultz, W. and Dayan, P. and Montague, P. R.},
month = mar,
year = {1997},
keywords = {Algorithms, Animals, Computer Simulation, Conditioning, Psychological, Cues, Dopamine, Learning, Mesencephalon, Models, Neurological, Neurons, Rats, Reward},
pages = {1593--1599},
}

@article{steuer_robust_2011,
title = {Robust signal processing in living cells},
volume = {7},
issn = {1553-7358},
url = {https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002218},
doi = {10.1371/journal.pcbi.1002218},
abstract = {Cellular signaling networks have evolved an astonishing ability to function reliably and with high fidelity in uncertain environments. A crucial prerequisite for the high precision exhibited by many signaling circuits is their ability to keep the concentrations of active signaling compounds within tightly defined bounds, despite strong stochastic fluctuations in copy numbers and other detrimental influences. Based on a simple mathematical formalism, we identify topological organizing principles that facilitate such robust control of intracellular concentrations in the face of multifarious perturbations. Our framework allows us to judge whether a multiple-input-multiple-output reaction network is robust against large perturbations of network parameters and enables the predictive design of perfectly robust synthetic network architectures. Utilizing the Escherichia coli chemotaxis pathway as a hallmark example, we provide experimental evidence that our framework indeed allows us to unravel the topological organization of robust signaling. We demonstrate that the specific organization of the pathway allows the system to maintain global concentration robustness of the diffusible response regulator CheY with respect to several dominant perturbations. Our framework provides a counterpoint to the hypothesis that cellular function relies on an extensive machinery to fine-tune or control intracellular parameters. Rather, we suggest that for a large class of perturbations, there exists an appropriate topology that renders the network output invariant to the respective perturbations.},
language = {en},
number = {11},
urldate = {2026-03-15},
journal = {PLOS Computational Biology},
publisher = {Public Library of Science},
author = {Steuer, Ralf and Waldherr, Steffen and Sourjik, Victor and Kollmann, Markus},
month = nov,
year = {2011},
keywords = {Chemotaxis, Operons, Phosphorylation, Signal processing, Signaling networks, Soil perturbation, Topology, Vector spaces},
pages = {e1002218},
}

@article{savir_achieving_2017,
title = {Achieving global perfect homeostasis through transporter regulation},
volume = {13},
issn = {1553-7358},
url = {https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005458},
doi = {10.1371/journal.pcbi.1005458},
abstract = {Nutrient homeostasis—the maintenance of relatively constant internal nutrient concentrations in fluctuating external environments—is essential to the survival of most organisms. Transcriptional regulation of plasma membrane transporters by internal nutrient concentrations is typically assumed to be the main mechanism by which homeostasis is achieved. While this mechanism is homeostatic we show that it does not achieve global perfect homeostasis—a condition where internal nutrient concentrations are completely independent of external nutrient concentrations for all external nutrient concentrations. We show that the criterion for global perfect homeostasis is that transporter levels must be inversely proportional to net nutrient flux into the cell and that downregulation of active transporters (activity-dependent regulation) is a simple and biologically plausible mechanism that meets this criterion. Activity-dependent transporter regulation creates a trade-off between robustness and efficiency, i.e., the system's ability to withstand perturbation in external nutrients and the transporter production rate needed to maintain homeostasis. Additionally, we show that a system that utilizes both activity-dependent transporter downregulation and regulation of transporter synthesis by internal nutrient levels can create a system that mitigates the shortcomings of each of the individual mechanisms. This analysis highlights the utility of activity-dependent regulation in achieving homeostasis and calls for a re-examination of the mechanisms of regulation of other homeostatic systems.},
language = {en},
number = {4},
urldate = {2026-03-15},
journal = {PLOS Computational Biology},
publisher = {Public Library of Science},
author = {Savir, Yonatan and Martynov, Alexander and Springer, Michael},
month = apr,
year = {2017},
keywords = {Cell membranes, Chemical synthesis, Feedback regulation, Homeostasis, Homeostatic mechanisms, Intracellular membranes, Molecular sensors, Transcriptional control},
pages = {e1005458},
}

Authors

Or Ben Yaakov (Technion – Israel Institute of Technology), Yonatan Savir (Technion – Israel Institute of Technology)

Presentation materials

There are no materials yet.