model based model free reinforcement learning

https://www.readbyqxmd.com/read/28113995/optimal-output-feedback-control-of-unknown-continuous-time-linear-systems-using-off-policy-reinforcement-learning
#1
Hamidreza Modares, Frank L Lewis, Zhong-Ping Jiang
A model-free off-policy reinforcement learning algorithm is developed to learn the optimal output-feedback (OPFB) solution for linear continuous-time systems. The proposed algorithm has the important feature of being applicable to the design of optimal OPFB controllers for both regulation and tracking problems. To provide a unified framework for both optimal regulation and tracking, a discounted performance function is employed and a discounted algebraic Riccati equation (ARE) is derived, which gives the solution to the problem...
September 22, 2016: IEEE Transactions on Cybernetics
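For orientation, the discounted setup the abstract refers to typically looks like the following for a linear system $\dot{x} = Ax + Bu$ (a textbook sketch of the discounted LQR case; the paper's own derivation additionally handles output feedback and tracking):

$$V(x(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)} \left( x^{\top} Q x + u^{\top} R u \right) d\tau$$

$$A^{\top} P + P A - \gamma P + Q - P B R^{-1} B^{\top} P = 0, \qquad u^{*} = -R^{-1} B^{\top} P x$$

Here $\gamma > 0$ is the discount rate, and the second equation is the discounted ARE whose solution $P$ yields the optimal feedback gain.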
https://www.readbyqxmd.com/read/28112207/placebo-intervention-enhances-reward-learning-in-healthy-individuals
#2
Zsolt Turi, Matthias Mittner, Walter Paulus, Andrea Antal
According to the placebo-reward hypothesis, placebo is a reward-anticipation process that increases midbrain dopamine (DA) levels. Reward-based learning processes, such as reinforcement learning, involve a large part of the DA-ergic network that is also activated by the placebo intervention. Given the neurochemical overlap between placebo and reward learning, we investigated whether verbal instructions in conjunction with a placebo intervention are capable of enhancing reward learning in healthy individuals by using a monetary reward-based reinforcement-learning task...
January 23, 2017: Scientific Reports
https://www.readbyqxmd.com/read/28077716/the-attraction-effect-modulates-reward-prediction-errors-and-intertemporal-choices
#3
Sebastian Gluth, Jared M Hotaling, Jörg Rieskamp
Classical economic theory contends that the utility of a choice option should be independent of other options. This view is challenged by the attraction effect, in which the relative preference between two options is altered by the addition of a third, asymmetrically dominated option. Here, we leveraged the attraction effect in the context of intertemporal choices to test whether both decisions and reward prediction errors (RPE), in the absence of choice, violate the independence of irrelevant alternatives principle...
January 11, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
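For readers unfamiliar with the principle being tested: independence of irrelevant alternatives (IIA) requires that the preference ordering between two options not change when a third option is added to the choice set, i.e. if $A \succ B$ in $\{A, B\}$, then $A \succ B$ in $\{A, B, C\}$. In the attraction effect, the decoy is dominated by one option (the target) but not by the other, and its addition shifts preference toward the target, violating IIA.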
https://www.readbyqxmd.com/read/28047608/su-d-brb-05-quantum-learning-for-knowledge-based-response-adaptive-radiotherapy
#4
I El Naqa, R Ten
PURPOSE: There is tremendous excitement in radiotherapy about applying data-driven methods to develop personalized clinical decisions for real-time response-based adaptation. However, classical statistical learning methods fall short in efficiency and in their ability to predict outcomes under conditions of uncertainty and incomplete information. Therefore, we are investigating physics-inspired machine learning approaches that utilize quantum principles to develop a robust framework for dynamically adapting treatments to individual patients' characteristics and optimizing outcomes...
June 2016: Medical Physics
https://www.readbyqxmd.com/read/27909102/the-attraction-effect-modulates-reward-prediction-errors-and-intertemporal-choices
#5
Sebastian Gluth, Jared M Hotaling, Jörg Rieskamp
Classical economic theory contends that the utility of a choice option should be independent of other options. This view is challenged by the attraction effect, in which the relative preference between two options is altered by the addition of a third, asymmetrically dominated option. Here, we leveraged the attraction effect in the context of intertemporal choices to test whether both decisions and reward prediction errors (RPE), in the absence of choice, violate the independence of irrelevant alternatives principle...
December 1, 2016: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
https://www.readbyqxmd.com/read/27825732/cognitive-components-underpinning-the-development-of-model-based-learning
#6
Tracey C S Potter, Nessa V Bryce, Catherine A Hartley
Reinforcement learning theory distinguishes "model-free" learning, which fosters reflexive repetition of previously rewarded actions, from "model-based" learning, which recruits a mental model of the environment to flexibly select goal-directed actions. Whereas model-free learning is evident across development, recruitment of model-based learning appears to increase with age. However, the cognitive processes underlying the development of model-based learning remain poorly characterized. Here, we examined whether age-related differences in cognitive processes underlying the construction and flexible recruitment of mental models predict developmental increases in model-based choice...
October 29, 2016: Developmental Cognitive Neuroscience
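Work in this literature commonly quantifies the balance between the two systems by fitting a weighted mixture of value estimates, along the lines of (an illustrative convention from two-step-task analyses, not necessarily this paper's exact model):

$$Q_{\text{net}}(s, a) = w \, Q_{\text{MB}}(s, a) + (1 - w) \, Q_{\text{MF}}(s, a), \qquad 0 \le w \le 1$$

where the fitted weight $w$ indexes an individual's reliance on model-based control, and the developmental question becomes how $w$ changes with age.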
https://www.readbyqxmd.com/read/27793098/-proactive-use-of-cue-context-congruence-for-building-reinforcement-learning-s-reward-function
#7
Judit Zsuga, Klara Biro, Gabor Tajti, Magdolna Emma Szilasi, Csaba Papp, Bela Juhasz, Rudolf Gesztelyi
BACKGROUND: Reinforcement learning is a fundamental form of learning that may be formalized using the Bellman equation. Accordingly, an agent determines the value of a state as the sum of the immediate reward and the discounted value of future states. Thus, the value of a state is determined by agent-related attributes (action set, policy, discount factor) and by the agent's knowledge of the environment, embodied by the reward function and by hidden environmental factors given by the transition probability...
October 28, 2016: BMC Neuroscience
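The Bellman equation the abstract formalizes this with is, in its standard state-value form for a policy $\pi$ (textbook notation, supplied here for reference):

$$V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma V^{\pi}(s') \right]$$

where the action set and policy $\pi$, together with the discount factor $\gamma$, are the agent-related attributes, while the reward function $R$ and transition probabilities $P$ encode the agent's knowledge of the environment.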
https://www.readbyqxmd.com/read/27713407/striatal-prediction-errors-support-dynamic-control-of-declarative-memory-decisions
#8
Jason M Scimeca, Perri L Katzman, David Badre
Adaptive memory requires context-dependent control over how information is retrieved, evaluated, and used to guide action, yet the signals that drive adjustments to memory decisions remain unknown. Here we show that prediction errors (PEs) coded by the striatum support control over memory decisions. Human participants completed a recognition memory test that incorporated biased feedback to influence participants' recognition criterion. Using model-based fMRI, we find that PEs, the deviation between the outcome and the expected value of a memory decision, correlate with striatal activity and predict individuals' final criterion...
October 7, 2016: Nature Communications
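The PE signal described here is conventionally written as the mismatch between outcome and expectation, with an incremental update of the expectation (a generic Rescorla-Wagner-style sketch, not the authors' full model):

$$\delta_t = r_t - V_t, \qquad V_{t+1} = V_t + \alpha \, \delta_t$$

where $r_t$ is the obtained outcome, $V_t$ the expected value of the memory decision, and $\alpha$ a learning rate.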
https://www.readbyqxmd.com/read/27687119/learning-reward-and-decision-making
#9
John P O'Doherty, Jeffrey Cockburn, Wolfgang M Pauli
In this review, we summarize findings supporting the existence of multiple behavioral strategies for controlling reward-related behavior, including a dichotomy between the goal-directed or model-based system and the habitual or model-free system in the domain of instrumental conditioning and a similar dichotomy in the realm of Pavlovian conditioning. We evaluate evidence from neuroscience supporting the existence of at least partly distinct neuronal substrates contributing to the key computations necessary for the function of these different control systems...
January 3, 2017: Annual Review of Psychology
https://www.readbyqxmd.com/read/27564094/when-does-model-based-control-pay-off
#10
Wouter Kool, Fiery A Cushman, Samuel J Gershman
Many accounts of decision making and reinforcement learning posit the existence of two distinct systems that control choice: a fast, automatic system and a slow, deliberative system. Recent research formalizes this distinction by mapping these systems to "model-free" and "model-based" strategies in reinforcement learning. Model-free strategies are computationally cheap because action values can be read from a look-up table constructed through trial and error, but they are sometimes inaccurate. In contrast, model-based strategies compute action values through planning in a causal model of the environment, which is more accurate but also more cognitively demanding...
August 2016: PLoS Computational Biology
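The cheap-but-inaccurate versus costly-but-accurate contrast can be made concrete with a toy sketch: the model-free agent reads cached values from a table it updates by trial and error, while the model-based agent computes the same values by planning over an explicit transition model. The Python below is purely illustrative; the toy MDP and all names are invented for this sketch, not taken from the paper.

    import numpy as np

    # Toy MDP (hypothetical): 2 states, 2 actions
    n_states, n_actions, gamma, alpha = 2, 2, 0.9, 0.1
    T = np.array([[[0.8, 0.2], [0.2, 0.8]],   # T[s, a, s'] transition probabilities
                  [[0.5, 0.5], [0.9, 0.1]]])
    R = np.array([[1.0, 0.0], [0.0, 2.0]])    # R[s, a] expected immediate reward

    # Model-free: cached Q-table updated by trial and error (Q-learning);
    # choosing an action is a cheap table lookup, but values can be stale.
    Q_mf = np.zeros((n_states, n_actions))
    def model_free_update(s, a, r, s_next):
        td_error = r + gamma * Q_mf[s_next].max() - Q_mf[s, a]
        Q_mf[s, a] += alpha * td_error

    # Model-based: compute action values by planning in the causal model
    # (value iteration); accurate but requires repeated sweeps over T.
    def model_based_values(n_sweeps=100):
        Q = np.zeros((n_states, n_actions))
        for _ in range(n_sweeps):
            V = Q.max(axis=1)                 # greedy state values
            Q = R + gamma * T @ V             # Bellman backup over T[s, a, s']
        return Q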
https://www.readbyqxmd.com/read/27511383/gaze-data-reveal-distinct-choice-processes-underlying-model-based-and-model-free-reinforcement-learning
#11
Arkady Konovalov, Ian Krajbich
Organisms appear to learn and make decisions using different strategies known as model-free and model-based learning; the former is mere reinforcement of previously rewarded actions, while the latter is a forward-looking strategy that involves evaluation of action-state transition probabilities. Prior work has used neural data to argue that both model-based and model-free learners implement a value comparison process at trial onset, but that model-based learners assign more weight to forward-looking computations. Here, using eye-tracking, we report evidence for a different interpretation of prior results: model-based subjects make their choices prior to trial onset...
2016: Nature Communications
https://www.readbyqxmd.com/read/27482099/learning-to-soar-in-turbulent-environments
#12
Gautam Reddy, Antonio Celani, Terrence J Sejnowski, Massimo Vergassola
Birds and gliders exploit warm, rising atmospheric currents (thermals) to reach heights comparable to low-lying clouds with a reduced expenditure of energy. This strategy of flight (thermal soaring) is frequently used by migratory birds. Soaring provides a remarkable instance of complex decision making in biology and requires a long-term strategy to effectively use the ascending thermals. Furthermore, the problem is technologically relevant to extend the flying range of autonomous gliders. Thermal soaring is commonly observed in the atmospheric convective boundary layer on warm, sunny days...
August 16, 2016: Proceedings of the National Academy of Sciences of the United States of America
https://www.readbyqxmd.com/read/27445895/toward-a-unified-sub-symbolic-computational-theory-of-cognition
#13
Martin V Butz
This paper proposes how various disciplinary theories of cognition may be combined into a unifying, sub-symbolic, computational theory of cognition. The following theories are considered for integration: psychological theories, including the theory of event coding, event segmentation theory, the theory of anticipatory behavioral control, and concept development; artificial intelligence and machine learning theories, including reinforcement learning and generative artificial neural networks; and theories from theoretical and computational neuroscience, including predictive coding and free energy-based inference...
2016: Frontiers in Psychology
https://www.readbyqxmd.com/read/27441367/model-free-machine-learning-in-biomedicine-feasibility-study-in-type-1-diabetes
#14
Elena Daskalaki, Peter Diem, Stavroula G Mougiakakou
Although reinforcement learning (RL) is suitable for highly uncertain systems, the applicability of this class of algorithms to medical treatment may be limited by patient variability, which dictates individualised tuning of these algorithms' usually numerous parameters. This study explores the feasibility of RL in the framework of artificial pancreas development for type 1 diabetes (T1D). In this approach, an Actor-Critic (AC) learning algorithm is designed and developed for the optimisation of insulin infusion for personalised glucose regulation...
2016: PloS One
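For readers unfamiliar with the Actor-Critic architecture, a minimal single-state sketch follows (a generic TD actor-critic with a softmax policy; the toy reward function, parameter names, and action set are illustrative assumptions, not the authors' insulin controller, where actions would correspond to infusion-rate adjustments and the reward to a glucose-regulation objective).

    import numpy as np

    rng = np.random.default_rng(0)
    n_actions, alpha_actor, alpha_critic, gamma = 3, 0.05, 0.1, 0.95
    V = 0.0                        # critic: value estimate of the single state
    prefs = np.zeros(n_actions)    # actor: action preferences

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def run(reward_fn, n_trials=1000):
        global V, prefs
        for _ in range(n_trials):
            probs = softmax(prefs)
            a = rng.choice(n_actions, p=probs)
            r = reward_fn(a)
            td_error = r + gamma * V - V       # critic's TD error
            V += alpha_critic * td_error       # critic learns the value
            grad = -probs                      # softmax policy gradient
            grad[a] += 1.0
            # actor shifts preferences toward actions the critic favours
            prefs += alpha_actor * td_error * grad

    # Example usage: action 1 is best on average
    run(lambda a: rng.normal([0.0, 1.0, 0.5][a], 0.1))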
https://www.readbyqxmd.com/read/27234192/study-of-positive-and-negative-feedback-sensitivity-in-psychosis-using-the-wisconsin-card-sorting-test
#15
Aida Farreny, Ángel Del Rey-Mejías, Gemma Escartin, Judith Usall, Núria Tous, Josep Maria Haro, Susana Ochoa
BACKGROUND: Schizophrenia involves marked motivational and learning deficits that may reflect abnormalities in reward processing. The purpose of this study was to examine positive and negative feedback sensitivity in schizophrenia using computational modeling derived from the Wisconsin Card Sorting Test (WCST). We also aimed to explore feedback sensitivity in a sample with bipolar disorder. METHODS: Eighty-three individuals with schizophrenia and 27 with bipolar disorder were included...
July 2016: Comprehensive Psychiatry
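Feedback-sensitivity analyses of this kind often fit separate learning rates for positive and negative outcomes; a generic form of such a model (an illustrative convention, not necessarily the authors' exact WCST model) is:

$$Q_{t+1} = Q_t + \begin{cases} \alpha^{+} \, \delta_t & \text{if } \delta_t > 0 \\ \alpha^{-} \, \delta_t & \text{if } \delta_t \le 0 \end{cases}, \qquad \delta_t = r_t - Q_t$$

where the fitted ratio of $\alpha^{+}$ to $\alpha^{-}$ indexes relative sensitivity to positive versus negative feedback.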
https://www.readbyqxmd.com/read/27175984/reduced-model-based-decision-making-in-schizophrenia
#16
Adam J Culbreth, Andrew Westbrook, Nathaniel D Daw, Matthew Botvinick, Deanna M Barch
Individuals with schizophrenia have a diminished ability to use reward history to adaptively guide behavior. However, tasks traditionally used to assess such deficits often rely on multiple cognitive and neural processes, leaving etiology unresolved. In the current study, we adopted recent computational formalisms of reinforcement learning to distinguish between model-based and model-free decision-making in hopes of specifying mechanisms associated with reinforcement-learning dysfunction in schizophrenia. Under this framework, decision-making is model-free to the extent that it relies solely on prior reward history, and model-based if it relies on prospective information such as motivational state, future consequences, and the likelihood of obtaining various outcomes...
2016: Journal of Abnormal Psychology
https://www.readbyqxmd.com/read/27084852/from-creatures-of-habit-to-goal-directed-learners-tracking-the-developmental-emergence-of-model-based-reinforcement-learning
#17
Johannes H Decker, A Ross Otto, Nathaniel D Daw, Catherine A Hartley
Theoretical models distinguish two decision-making strategies that have been formalized in reinforcement-learning theory. A model-based strategy leverages a cognitive model of potential actions and their consequences to make goal-directed choices, whereas a model-free strategy evaluates actions based solely on their reward history. Research in adults has begun to elucidate the psychological mechanisms and neural substrates underlying these learning processes and factors that influence their relative recruitment...
June 2016: Psychological Science
https://www.readbyqxmd.com/read/27064794/a-flexible-mechanism-of-rule-selection-enables-rapid-feature-based-reinforcement-learning
#18
Matthew Balcarras, Thilo Womelsdorf
Learning in a new environment is influenced by prior learning and experience. Correctly applying a rule that maps a context to stimuli, actions, and outcomes enables faster learning and better outcomes compared to relying on strategies for learning that are ignorant of task structure. However, it is often difficult to know when and how to apply learned rules in new contexts. In our study we explored how subjects employ different strategies for learning the relationship between stimulus features and positive outcomes in a probabilistic task context...
2016: Frontiers in Neuroscience
https://www.readbyqxmd.com/read/27052578/the-involvement-of-model-based-but-not-model-free-learning-signals-during-observational-reward-learning-in-the-absence-of-choice
#19
Simon Dunne, Arun D'Souza, John P O'Doherty
A major open question is whether computational strategies thought to be used during experiential learning, specifically model-based and model-free reinforcement learning, also support observational learning. Furthermore, the question of how observational learning occurs when observers must learn about the value of options from observing outcomes in the absence of choice has not been addressed. In the present study we used a multi-armed bandit task that encouraged human participants to employ both experiential and observational learning while they underwent functional magnetic resonance imaging (fMRI)...
June 1, 2016: Journal of Neurophysiology
https://www.readbyqxmd.com/read/26961942/human-choice-strategy-varies-with-anatomical-projections-from-ventromedial-prefrontal-cortex-to-medial-striatum
#20
Payam Piray, Ivan Toni, Roshan Cools
Two distinct systems, goal-directed and habitual, support decision making. It has recently been hypothesized that this distinction may arise from two computational mechanisms, model-based and model-free reinforcement learning, neuronally implemented in frontostriatal circuits involved in learning and behavioral control. Here, we test whether the relative strength of anatomical connectivity within frontostriatal circuits accounts for variation in human individuals' reliance on model-based and model-free control...
March 9, 2016: Journal of Neuroscience: the Official Journal of the Society for Neuroscience

Search Tips

Use Boolean operators: AND/OR

diabetic AND foot
diabetes OR diabetic

Exclude a word using the 'minus' sign

Virchow -triad

Use parentheses

water AND (cup OR glass)

Add an asterisk (*) at the end of a word to include word stems

Neuro* will search for Neurology, Neuroscientist, Neurological, and so on

Use quotes to search for an exact phrase

"primary prevention of cancer"
(heart or cardiac or cardio*) AND arrest -"American Heart Association"