
model-based reinforcement learning

https://www.readbyqxmd.com/read/28226424/reward-gain-model-describes-cortical-use-dependent-plasticity
#1
Firas Mawase, Nicholas Wymbs, Shintaro Uehara, Pablo Celnik
Consistent repetitions of an action lead to plastic change in the motor cortex and cause a shift in the direction of future movements. This process is known as use-dependent plasticity (UDP), one of the basic forms of motor memory. We have recently demonstrated in a physiological study that success-related reinforcement signals could modulate the strength of UDP. We tested this idea by developing a computational approach that modeled the shift in the direction of future action as a change in preferred direction of population activity of neurons in the primary motor cortex...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
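A minimal sketch of the modeled idea: repetition pulls each neuron's preferred direction toward the trained direction, and a success-related reward signal scales the pull. The update rule and all parameters here are hypothetical illustrations, not the authors' fitted model.

```python
import numpy as np

def udp_shift(pref_dir, trained_dir, reward, eta=0.05, reward_gain=2.0):
    """One hypothetical UDP update: repetition pulls preferred directions
    toward the trained direction; reward scales the size of the pull."""
    gain = 1.0 + reward_gain * reward                    # reward in [0, 1]
    # shortest angular difference, wrapped to [-pi, pi]
    delta = (trained_dir - pref_dir + np.pi) % (2 * np.pi) - np.pi
    return pref_dir + eta * gain * delta

rng = np.random.default_rng(0)
pds = rng.uniform(-np.pi, np.pi, size=100)   # population preferred directions
for _ in range(200):                          # repeated training movements
    pds = udp_shift(pds, trained_dir=0.0, reward=1.0)
# the population vector now points near the trained direction
pop_dir = np.angle(np.mean(np.exp(1j * pds)))
```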
https://www.readbyqxmd.com/read/28225034/simulating-future-value-in-intertemporal-choice
#2
Alec Solway, Terry Lohrenz, P Read Montague
The laboratory study of how humans and other animals trade off value and time has a long and storied history and is the subject of a vast literature. Despite this, there is no agreed-upon mechanistic explanation of how intertemporal choice preferences arise. Several theorists have recently proposed model-based reinforcement learning as a candidate framework. This framework describes a suite of algorithms by which a model of the environment, in the form of a state transition function and reward function, can be converted on-line into a decision...
February 22, 2017: Scientific Reports
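The model-based framework referred to here converts a state transition function and a reward function into decisions. A minimal generic illustration of that idea is value iteration over a known model; the toy two-state MDP below is hypothetical and is not the authors' algorithm.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Generic model-based planning: turn a known transition model P[a][s, s']
    and reward table R[s, a] into a policy via value iteration."""
    n_states, n_actions = R.shape
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_s' P[a][s, s'] * V[s']
        Q = R + gamma * np.stack([P[a] @ V for a in range(n_actions)], axis=1)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

# toy two-state, two-action model
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # transitions under action 0
     np.array([[0.1, 0.9], [0.9, 0.1]])]   # transitions under action 1
R = np.array([[0.0, 1.0],                  # reward for (state, action)
              [1.0, 0.0]])
V, policy = value_iteration(P, R)
```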
https://www.readbyqxmd.com/read/28183973/solving-the-quantum-many-body-problem-with-artificial-neural-networks
#3
Giuseppe Carleo, Matthias Troyer
The challenge posed by the many-body problem in quantum physics originates from the difficulty of describing the nontrivial correlations encoded in the exponential complexity of the many-body wave function. Here we demonstrate that systematic machine learning of the wave function can reduce this complexity to a tractable computational form for some notable cases of physical interest. We introduce a variational representation of quantum states based on artificial neural networks with a variable number of hidden neurons...
February 10, 2017: Science
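The variational representation described here uses a restricted Boltzmann machine as the wave function. A minimal sketch of an unnormalized RBM amplitude for a spin configuration, with the hidden units summed out analytically; the parameters below are random and purely illustrative.

```python
import numpy as np

def rbm_amplitude(spins, a, b, W):
    """Unnormalized RBM wave-function amplitude psi(s) for a spin
    configuration s in {-1, +1}^N:
    psi(s) = exp(a . s) * prod_j 2*cosh(b_j + sum_i W_ji s_i)."""
    theta = b + W @ spins
    return np.exp(a @ spins) * np.prod(2.0 * np.cosh(theta))

rng = np.random.default_rng(1)
N, M = 4, 8                          # visible spins, hidden units
a = rng.normal(scale=0.1, size=N)    # visible biases
b = rng.normal(scale=0.1, size=M)    # hidden biases
W = rng.normal(scale=0.1, size=(M, N))
s = np.array([1, -1, 1, 1])
amp = rbm_amplitude(s, a, b, W)
```

With complex-valued parameters the same form can represent amplitudes with arbitrary phases; real parameters, as here, give positive amplitudes only.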
https://www.readbyqxmd.com/read/28170057/animal-models-of-drug-addiction
#4
María Pilar García Pardo, Concepción Roger Sánchez, José Enrique De la Rubia Ortí, María Asunción Aguilar Calpe
The development of animal models of drug reward and addiction is an essential factor for progress in understanding the biological basis of this disorder and for the identification of new therapeutic targets. Depending on the component of reward to be studied, one type of animal model or another may be used. There are models of reinforcement based on the primary hedonic effect produced by the consumption of the addictive substance, such as the self-administration (SA) and intracranial self-stimulation (ICSS) paradigms, and there are models based on the component of reward related to associative learning and cognitive ability to make predictions about obtaining reward in the future, such as the conditioned place preference (CPP) paradigm...
January 12, 2017: Adicciones
https://www.readbyqxmd.com/read/28166826/socio-ecological-dynamics-and-challenges-to-the-governance-of-neglected-tropical-disease-control
#5
REVIEW
Edwin Michael, Shirin Madon
The current global attempts to control the so-called "Neglected Tropical Diseases (NTDs)" have the potential to significantly reduce the morbidity suffered by some of the world's poorest communities. However, the governance of these control programmes is driven by a managerial rationality that assumes predictability of proposed interventions, and which thus primarily seeks to improve the cost-effectiveness of implementation by measuring performance in terms of pre-determined outputs. Here, we argue that this approach has reinforced the narrow normal-science model for controlling parasitic diseases, and in doing so fails to address the complex dynamics, uncertainty and socio-ecological context-specificity that invariably underlie parasite transmission...
February 6, 2017: Infectious Diseases of Poverty
https://www.readbyqxmd.com/read/28159314/fuzzy-lyapunov-reinforcement-learning-for-non-linear-systems
#6
Abhishek Kumar, Rajneesh Sharma
We propose a fuzzy reinforcement learning (RL) based controller that generates a stable control action by Lyapunov-constraining fuzzy linguistic rules. In particular, we attempt at Lyapunov-constraining the consequent part of fuzzy rules in a fuzzy RL setup. Ours is a first attempt at designing a linguistic RL controller with Lyapunov-constrained fuzzy consequents to progressively learn a stable optimal policy. The proposed controller does not need a system model or desired response and can effectively handle disturbances in continuous state-action space problems...
January 31, 2017: ISA Transactions
https://www.readbyqxmd.com/read/28158309/learning-by-stimulation-avoidance-a-principle-to-control-spiking-neural-networks-dynamics
#7
Lana Sinapayen, Atsushi Masumori, Takashi Ikegami
Learning based on networks of real neurons, and learning based on biologically inspired models of neural networks, have yet to find general learning rules leading to widespread applications. In this paper, we argue for the existence of a principle that allows steering the dynamics of a biologically inspired neural network. Using carefully timed external stimulation, the network can be driven towards a desired dynamical state. We term this principle "Learning by Stimulation Avoidance" (LSA). We demonstrate through simulation that the minimal sufficient conditions leading to LSA in artificial networks are also sufficient to reproduce learning results similar to those obtained in biological neurons by Shahaf and Marom, and in addition explain synaptic pruning...
2017: PloS One
https://www.readbyqxmd.com/read/28152753/data-driven-quality-improvement-in-an-oncology-patient-centered-medical-home
#8
Maureen Lowry, Brian Flounders, Susan Higman Tofani
Background: In a 2012 abstract, "Data-driven transformation for an Oncology Patient-Centered Medical Home," Consultants in Medical Oncology (CMOH) demonstrated that standardized processes and enhanced IT capabilities (the IRIS software app) provided a rapid learning system for the practice. IRIS-aggregated data became the basis for Quality Improvement Projects (QIPs), allowing CMOH to continue to improve in quality and cost measures. Deviation from performance trend is readily identifiable, providing operational direction...
March 2016: Journal of Clinical Oncology: Official Journal of the American Society of Clinical Oncology
https://www.readbyqxmd.com/read/28113995/optimal-output-feedback-control-of-unknown-continuous-time-linear-systems-using-off-policy-reinforcement-learning
#9
Hamidreza Modares, Frank L Lewis, Zhong-Ping Jiang
A model-free off-policy reinforcement learning algorithm is developed to learn the optimal output-feedback (OPFB) solution for linear continuous-time systems. The proposed algorithm has the important feature of being applicable to the design of optimal OPFB controllers for both regulation and tracking problems. To provide a unified framework for both optimal regulation and tracking, a discounted performance function is employed and a discounted algebraic Riccati equation (ARE) is derived which gives the solution to the problem...
September 22, 2016: IEEE Transactions on Cybernetics
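The discounted algebraic Riccati equation mentioned here can be illustrated with a model-based solution (the paper's contribution is solving it model-free from data). Assuming the standard discounted-LQR form, the discounted ARE reduces to an ordinary ARE with a shifted state matrix; the system matrices below are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Discounted continuous-time LQR: minimize the integral of
# exp(-gamma*t) * (x'Qx + u'Ru) dt. The discounted ARE
#   A'P + PA - gamma*P + Q - P B R^{-1} B' P = 0
# is a standard ARE in the shifted matrix A - (gamma/2) I.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 0.1

P = solve_continuous_are(A - 0.5 * gamma * np.eye(2), B, Q, R)
K = np.linalg.solve(R, B.T @ P)   # optimal feedback gain, u = -K x
```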
https://www.readbyqxmd.com/read/28112207/placebo-intervention-enhances-reward-learning-in-healthy-individuals
#10
Zsolt Turi, Matthias Mittner, Walter Paulus, Andrea Antal
According to the placebo-reward hypothesis, placebo is a reward-anticipation process that increases midbrain dopamine (DA) levels. Reward-based learning processes, such as reinforcement learning, involve a large part of the DA-ergic network that is also activated by the placebo intervention. Given the neurochemical overlap between placebo and reward learning, we investigated whether verbal instructions in conjunction with a placebo intervention are capable of enhancing reward learning in healthy individuals by using a monetary reward-based reinforcement-learning task...
January 23, 2017: Scientific Reports
https://www.readbyqxmd.com/read/28106849/a-reinforcement-learning-model-equipped-with-sensors-for-generating-perception-patterns-implementation-of-a-simulated-air-navigation-system-using-ads-b-automatic-dependent-surveillance-broadcast-technology
#11
Santiago Álvarez de Toledo, Aurea Anguera, José M Barreiro, Juan A Lara, David Lizcano
Over the last few decades, a number of reinforcement learning techniques have emerged, and different reinforcement learning-based applications have proliferated. However, such techniques tend to specialize in a particular field. This is an obstacle to their generalization and extrapolation to other areas. Moreover, neither the reward-punishment (r-p) learning process nor the convergence of results is fast and efficient enough. To address these obstacles, this research proposes a general reinforcement learning model...
January 19, 2017: Sensors
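The reward-punishment (r-p) learning process referred to here can be sketched generically as tabular Q-learning with positive and negative reinforcement signals. This is a standard textbook algorithm on a hypothetical toy environment, not the proposed general model.

```python
import numpy as np

def q_learning(step, n_states, n_actions, episodes=500,
               alpha=0.2, gamma=0.9, eps=0.1, rng=None):
    """Generic tabular Q-learning driven by reward (+) and punishment (-)
    signals; `step(s, a)` returns (reward, next_state, done)."""
    rng = rng or np.random.default_rng()
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
            r, s_next, done = step(s, a)
            target = r if done else r + gamma * Q[s_next].max()
            Q[s, a] += alpha * (target - Q[s, a])   # the r-p signal drives the update
            s = s_next
    return Q

# toy 3-state corridor: moving right (action 1) is rewarded,
# action 0 is punished and leaves the agent in place
def step(s, a):
    if a == 0:
        return -1.0, s, False
    return (1.0, s + 1, s + 1 == 2) if s < 2 else (1.0, s, True)

Q = q_learning(step, n_states=3, n_actions=2, rng=np.random.default_rng(3))
```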
https://www.readbyqxmd.com/read/28095201/multisensory-bayesian-inference-depends-on-synapse-maturation-during-training-theoretical-analysis-and-neural-modeling-implementation
#12
Mauro Ursino, Cristiano Cuppini, Elisa Magosso
Recent theoretical and experimental studies suggest that in multisensory conditions, the brain performs a near-optimal Bayesian estimate of external events, giving more weight to the more reliable stimuli. However, the neural mechanisms responsible for this behavior, and its progressive maturation in a multisensory environment, are still insufficiently understood. The aim of this letter is to analyze this problem with a neural network model of audiovisual integration, based on probabilistic population coding-the idea that a population of neurons can encode probability functions to perform Bayesian inference...
March 2017: Neural Computation
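The near-optimal Bayesian estimate described here, for two Gaussian cues, weights each cue by its reliability (inverse variance). A minimal sketch of that computation, with hypothetical numbers:

```python
def fuse(x_a, var_a, x_v, var_v):
    """Optimal Bayesian combination of two Gaussian cues: each cue is
    weighted by its reliability (inverse variance), and the fused estimate
    is more precise than either cue alone."""
    w_a, w_v = 1.0 / var_a, 1.0 / var_v
    x = (w_a * x_a + w_v * x_v) / (w_a + w_v)
    var = 1.0 / (w_a + w_v)
    return x, var

# a reliable visual cue dominates a noisy auditory one
x, var = fuse(x_a=10.0, var_a=4.0, x_v=0.0, var_v=1.0)   # x = 2.0, var = 0.8
```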
https://www.readbyqxmd.com/read/28091572/a-computational-psychiatry-approach-identifies-how-alpha-2a-noradrenergic-agonist-guanfacine-affects-feature-based-reinforcement-learning-in-the-macaque
#13
S A Hassani, M Oemisch, M Balcarras, S Westendorff, S Ardid, M A van der Meer, P Tiesinga, T Womelsdorf
Noradrenaline is believed to support cognitive flexibility through the alpha 2A noradrenergic receptor (a2A-NAR) acting in prefrontal cortex. Enhanced flexibility has been inferred from improved working memory with the a2A-NA agonist Guanfacine. But it has been unclear whether Guanfacine improves specific attention and learning mechanisms beyond working memory, and whether the drug effects can be formalized computationally to allow single subject predictions. We tested and confirmed these suggestions in a case study with a healthy nonhuman primate performing a feature-based reversal learning task, evaluating performance using Bayesian and reinforcement learning models...
January 16, 2017: Scientific Reports
https://www.readbyqxmd.com/read/28077716/the-attraction-effect-modulates-reward-prediction-errors-and-intertemporal-choices
#14
Sebastian Gluth, Jared M Hotaling, Jörg Rieskamp
Classical economic theory contends that the utility of a choice option should be independent of other options. This view is challenged by the attraction effect, in which the relative preference between two options is altered by the addition of a third, asymmetrically dominated option. Here, we leveraged the attraction effect in the context of intertemporal choices to test whether both decisions and reward prediction errors (RPE) in the absence of choice violate the independence of irrelevant alternatives principle...
January 11, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
https://www.readbyqxmd.com/read/28065344/abai-s-moc-assessment-of-knowledge-program-matures-adding-value-with-continuous-learning-and-assessment
#15
REVIEW
David I Bernstein, Stephen I Wasserman, William P Thompson, Theodore M Freeman
Rapid changes in modern medicine, along with advances in the science of learning and memory, have necessitated a shift in the way physician knowledge is assessed. Physician recertification beyond initial certification has historically consisted of retaining large amounts of knowledge over a long time span. Adult learning theory has shown that maintaining and improving one's knowledge base is more effective when new concepts are encountered at regular intervals throughout one's career and reinforced on an ongoing basis...
January 2017: Journal of Allergy and Clinical Immunology in Practice
https://www.readbyqxmd.com/read/28047608/su-d-brb-05-quantum-learning-for-knowledge-based-response-adaptive-radiotherapy
#16
I El Naqa, R Ten
PURPOSE: There is tremendous excitement in radiotherapy about applying data-driven methods to develop personalized clinical decisions for real-time response-based adaptation. However, classical statistical learning methods lack efficiency and the ability to predict outcomes under conditions of uncertainty and incomplete information. Therefore, we are investigating physics-inspired machine learning approaches by utilizing quantum principles for developing a robust framework to dynamically adapt treatments to individual patient's characteristics and optimize outcomes...
June 2016: Medical Physics
https://www.readbyqxmd.com/read/28018206/the-role-of-multiple-neuromodulators-in-reinforcement-learning-that-is-based-on-competition-between-eligibility-traces
#17
Marco A Huertas, Sarah E Schwettmann, Harel Z Shouval
The ability to maximize reward and avoid punishment is essential for animal survival. Reinforcement learning (RL) refers to the algorithms used by biological or artificial systems to learn how to maximize reward or avoid negative outcomes based on past experiences. While RL is also important in machine learning, the types of mechanistic constraints encountered by biological machinery might be different than those for artificial systems. Two major problems encountered by RL are how to relate a stimulus with a reinforcing signal that is delayed in time (temporal credit assignment), and how to stop learning once the target behaviors are attained (stopping rule)...
2016: Frontiers in Synaptic Neuroscience
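The temporal credit assignment problem raised here (relating a stimulus to a delayed reinforcement signal) is classically handled with eligibility traces. A minimal tabular TD(lambda) sketch on a hypothetical three-state chain; this is the standard textbook algorithm, not the authors' neuromodulator model.

```python
import numpy as np

def td_lambda(episodes, n_states, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda): an eligibility trace bridges the gap between a
    state visit and a delayed reinforcement signal."""
    V = np.zeros(n_states)
    for episode in episodes:                 # episode: [(s, r, s_next), ...]
        e = np.zeros(n_states)               # eligibility traces
        for s, r, s_next in episode:
            delta = r + gamma * V[s_next] - V[s]   # TD error (the global signal)
            e[s] += 1.0                            # mark the visited state
            V += alpha * delta * e                 # credit all eligible states
            e *= gamma * lam                       # traces decay over time
    return V

# chain 0 -> 1 -> 2, reward only on the final transition
episode = [(0, 0.0, 1), (1, 1.0, 2)]
V = td_lambda([episode] * 200, n_states=3)
# V converges toward [0.9, 1.0, 0.0]: the trace propagates the delayed
# reward back to the earlier state
```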
https://www.readbyqxmd.com/read/28018203/computational-properties-of-the-hippocampus-increase-the-efficiency-of-goal-directed-foraging-through-hierarchical-reinforcement-learning
#18
Eric Chalmers, Artur Luczak, Aaron J Gruber
The mammalian brain is thought to use a version of Model-based Reinforcement Learning (MBRL) to guide "goal-directed" behavior, wherein animals consider goals and make plans to acquire desired outcomes. However, conventional MBRL algorithms do not fully explain animals' ability to rapidly adapt to environmental changes, or learn multiple complex tasks. They also require extensive computation, suggesting that goal-directed behavior is cognitively expensive. We propose here that key features of processing in the hippocampus support a flexible MBRL mechanism for spatial navigation that is computationally efficient and can adapt quickly to change...
2016: Frontiers in Computational Neuroscience
https://www.readbyqxmd.com/read/27966103/the-drift-diffusion-model-as-the-choice-rule-in-reinforcement-learning
#19
Mads Lund Pedersen, Michael J Frank, Guido Biele
Current reinforcement-learning models often assume simplified decision processes that do not fully reflect the dynamic complexities of choice processes. Conversely, sequential-sampling models of decision making account for both choice accuracy and response time, but assume that decisions are based on static decision values. To combine these two computational models of decision making and learning, we implemented reinforcement-learning models in which the drift diffusion model describes the choice process, thereby capturing both within- and across-trial dynamics...
December 13, 2016: Psychonomic Bulletin & Review
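A minimal sketch of the combination described here: a delta-rule learner whose value difference sets the drift rate of a drift-diffusion choice process, so the model produces both choices and response times. The parameterization is illustrative, not Pedersen et al.'s exact model.

```python
import numpy as np

def ddm_choice(q_a, q_b, scale=2.0, threshold=1.0, dt=0.001, rng=None):
    """Drift-diffusion choice: drift is proportional to the learned value
    difference, so the better-valued option is chosen faster and more often."""
    rng = rng or np.random.default_rng()
    drift = scale * (q_a - q_b)
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        x += drift * dt + rng.normal(0.0, np.sqrt(dt))
        t += dt
    return (0 if x > 0 else 1), t            # (choice, response time)

rng = np.random.default_rng(2)
q = np.array([0.0, 0.0])
alpha, rewards = 0.1, (0.8, 0.2)             # learning rate, reward probabilities
for _ in range(300):
    choice, rt = ddm_choice(q[0], q[1], rng=rng)
    r = float(rng.random() < rewards[choice])
    q[choice] += alpha * (r - q[choice])     # delta-rule update feeds the next drift
```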
https://www.readbyqxmd.com/read/27916841/market-model-for-resource-allocation-in-emerging-sensor-networks-with-reinforcement-learning
#20
Yue Zhang, Bin Song, Ying Zhang, Xiaojiang Du, Mohsen Guizani
Emerging sensor networks (ESNs) are an inevitable trend with the development of the Internet of Things (IoT), and are intended to connect almost every intelligent device. Therefore, it is critical to study resource allocation in such an environment, due to the concern of efficiency, especially when resources are limited. By viewing ESNs as multi-agent environments, we model them with an agent-based modelling (ABM) method and deal with resource allocation problems with market models, after describing users' patterns...
November 29, 2016: Sensors

Search Tips

Use Boolean operators: AND/OR

diabetic AND foot
diabetes OR diabetic

Exclude a word using the 'minus' sign

Virchow -triad

Use Parentheses

water AND (cup OR glass)

Add an asterisk (*) at the end of a word to include word stems

Neuro* will search for Neurology, Neuroscientist, Neurological, and so on

Use quotes to search for an exact phrase

"primary prevention of cancer"
(heart or cardiac or cardio*) AND arrest -"American Heart Association"
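The operators above can be sketched as a toy matcher over article titles. This is purely illustrative (word-level matching only, with names like `matches` invented here), not QxMD's actual search engine.

```python
import re

def matches(title, terms_all=(), terms_any=(), exclude=(), phrases=()):
    """Toy evaluator for the search syntax above: AND terms, OR terms,
    minus-excluded terms, and quoted exact phrases. A trailing asterisk
    (e.g. neuro*) becomes a prefix match on word stems."""
    text = title.lower()
    words = set(re.findall(r"\w+", text))

    def hit(term):
        term = term.lower()
        if term.endswith("*"):                      # stem search
            return any(w.startswith(term[:-1]) for w in words)
        return term in words

    return (all(hit(t) for t in terms_all)
            and (not terms_any or any(hit(t) for t in terms_any))
            and not any(hit(t) for t in exclude)
            and all(p.lower() in text for p in phrases))

# water AND (cup OR glass)
matches("A glass of water", terms_all=["water"], terms_any=["cup", "glass"])  # True
# Neuro* stem search
matches("Advances in Neurology", terms_all=["neuro*"])                        # True
# Virchow -triad
matches("Virchow triad", terms_all=["virchow"], exclude=["triad"])            # False
```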