Read by QxMD

Reinforcement Learning

Shared collection · 24 papers · 0 to 25 followers
By Abraham Nunes, a psychiatry resident interested in computational neuroscience, forensic psychiatry, and neuropsychiatry.
https://www.readbyqxmd.com/read/25122479/optimal-behavioral-hierarchy
#1: Optimal Behavioral Hierarchy
Alec Solway, Carlos Diuk, Natalia Córdova, Debbie Yee, Andrew G Barto, Yael Niv, Matthew M Botvinick
Human behavior has long been recognized to display hierarchical structure: actions fit together into subtasks, which cohere into extended goal-directed activities. Arranging actions hierarchically has well established benefits, allowing behaviors to be represented efficiently by the brain, and allowing solutions to new tasks to be discovered easily. However, these payoffs depend on the particular way in which actions are organized into a hierarchy, the specific way in which tasks are carved up into subtasks...
August 2014: PLoS Computational Biology
https://www.readbyqxmd.com/read/27589489/learning-reward-uncertainty-in-the-basal-ganglia
#2: Learning Reward Uncertainty in the Basal Ganglia
John G Mikhael, Rafal Bogacz
Learning the reliability of different sources of rewards is critical for making optimal choices. However, despite the existence of detailed theory describing how the expected reward is learned in the basal ganglia, it is not known how reward uncertainty is estimated in these circuits. This paper presents a class of models that encode both the mean reward and the spread of the rewards, the former in the difference between the synaptic weights of D1 and D2 neurons, and the latter in their sum. In the models, the tendency to seek (or avoid) options with variable reward can be controlled by increasing (or decreasing) the tonic level of dopamine...
September 2016: PLoS Computational Biology
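The mean-and-spread coding scheme this abstract describes can be illustrated with a toy update rule (a simplified sketch under my own assumptions, not the paper's actual model): two weights stand in for D1 ("Go") and D2 ("NoGo") pathways, each learning from prediction errors of opposite sign, so that their difference tracks the mean reward and their sum grows with reward variability.

```python
import random

def learn_d1_d2(rewards, alpha=0.05, epsilon=0.02, n_steps=5000, seed=0):
    """Toy sketch: G (D1-like) learns from positive prediction errors,
    N (D2-like) from negative ones; a small decay (epsilon) keeps the
    weights bounded.  Mean reward ~ G - N, reward spread ~ G + N."""
    rng = random.Random(seed)
    G = N = 0.0
    for _ in range(n_steps):
        r = rng.choice(rewards)
        delta = r - (G - N)                        # error vs. expected value
        G += alpha * max(delta, 0) - epsilon * G   # D1: positive errors only
        N += alpha * max(-delta, 0) - epsilon * N  # D2: negative errors only
    return G - N, G + N   # (mean estimate, spread-related quantity)

# A certain option (always 1.0) vs. a risky one (0.0 or 2.0): similar means,
# but the risky option should yield a larger G + N.
mean_safe, spread_safe = learn_d1_d2([1.0])
mean_risky, spread_risky = learn_d1_d2([0.0, 2.0])
```

Scaling either weight's contribution at choice time (the abstract's tonic-dopamine knob) would then bias the agent toward or away from the variable option.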
https://www.readbyqxmd.com/read/25267822/model-based-hierarchical-reinforcement-learning-and-human-action-control
#3: Model-Based Hierarchical Reinforcement Learning and Human Action Control
Matthew Botvinick, Ari Weinstein
Recent work has reawakened interest in goal-directed or 'model-based' choice, where decisions are based on prospective evaluation of potential action outcomes. Concurrently, there has been growing attention to the role of hierarchy in decision-making and action control. We focus here on the intersection between these two areas of interest, considering the topic of hierarchical model-based control. To characterize this form of action control, we draw on the computational framework of hierarchical reinforcement learning, using this to interpret recent empirical findings...
November 5, 2014: Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences
https://www.readbyqxmd.com/read/26851575/measuring-wanting-and-liking-from-animals-to-humans-a-systematic-review
#4: Measuring Wanting and Liking From Animals to Humans: A Systematic Review
REVIEW
Eva Pool, Vanessa Sennwald, Sylvain Delplanque, Tobias Brosch, David Sander
Animal research has shown that it is possible to want a reward that is not liked once obtained. Although these findings have elicited interest, human experiments have produced contradictory results, raising doubts about the existence of separate wanting and liking influences in human reward processing. This discrepancy could be due to inconsistencies in the operationalization of these concepts. We systematically reviewed the methodologies used to assess human wanting and/or liking and found that most studies operationalized these concepts in a manner congruent with the animal literature...
April 2016: Neuroscience and Biobehavioral Reviews
https://www.readbyqxmd.com/read/25734662/a-spiking-neural-network-model-of-model-free-reinforcement-learning-with-high-dimensional-sensory-input-and-perceptual-ambiguity
#5: A Spiking Neural Network Model of Model-Free Reinforcement Learning With High-Dimensional Sensory Input and Perceptual Ambiguity
Takashi Nakano, Makoto Otsuka, Junichiro Yoshimoto, Kenji Doya
A theoretical framework of reinforcement learning plays an important role in understanding action selection in animals. Spiking neural networks provide a theoretically grounded means to test computational hypotheses on neurally plausible algorithms of reinforcement learning through numerical simulation. However, most of these models cannot handle observations that are noisy or that occurred in the past, even though such observations are inevitable and constraining features of learning in real environments. These problems are formally known as partially observable reinforcement learning (PORL) problems...
2015: PLoS ONE
https://www.readbyqxmd.com/read/26394333/convergence-of-eeg-and-fmri-measures-of-reward-anticipation
#6: Convergence of EEG and fMRI Measures of Reward Anticipation
Stephanie M Gorka, K Luan Phan, Stewart A Shankman
Deficits in reward anticipation are putative mechanisms for multiple psychopathologies. Research indicates that these deficits are characterized by reduced left (relative to right) frontal electroencephalogram (EEG) activity and blood oxygenation level-dependent (BOLD) signal abnormalities in mesolimbic and prefrontal neural regions during reward anticipation. Although it is often assumed that these two measures capture similar mechanisms, no study to our knowledge has directly examined the convergence between frontal EEG alpha asymmetry and functional magnetic resonance imaging (fMRI) during reward anticipation in the same sample...
December 2015: Biological Psychology
https://www.readbyqxmd.com/read/26095906/computing-reward-prediction-error-an-integrated-account-of-cortical-timing-and-basal-ganglia-pathways-for-appetitive-and-aversive-learning
#7: Computing Reward Prediction Error: An Integrated Account of Cortical Timing and Basal Ganglia Pathways for Appetitive and Aversive Learning
Kenji Morita, Yasuo Kawaguchi
There are two prevailing notions regarding the involvement of the corticobasal ganglia system in value-based learning: (i) the direct and indirect pathways of the basal ganglia are crucial for appetitive and aversive learning, respectively, and (ii) the activity of midbrain dopamine neurons represents reward-prediction error. Although (ii) constitutes a critical assumption of (i), it remains elusive how (ii) holds given (i), with the basal-ganglia influence on the dopamine neurons. Here we present a computational neural-circuit model that potentially resolves this issue...
August 2015: European Journal of Neuroscience
https://www.readbyqxmd.com/read/26220740/differential-contributions-of-the-globus-pallidus-and-ventral-thalamus-to-stimulus-response-learning-in-humans
#8: Differential Contributions of the Globus Pallidus and Ventral Thalamus to Stimulus-Response Learning in Humans
Henning Schroll, Andreas Horn, Christine Gröschel, Christof Brücke, Götz Lütjens, Gerd-Helge Schneider, Joachim K Krauss, Andrea A Kühn, Fred H Hamker
The ability to learn associations between stimuli, responses and rewards is a prerequisite for survival. Models of reinforcement learning suggest that the striatum, a basal ganglia input nucleus, vitally contributes to these learning processes. Our recently presented computational model predicts, first, that not only the striatum, but also the globus pallidus contributes to the learning (i.e., exploration) of stimulus-response associations based on rewards. Secondly, it predicts that the stable execution (i...
November 15, 2015: NeuroImage
https://www.readbyqxmd.com/read/26379600/reduction-in-ventral-striatal-activity-when-anticipating-a-reward-in-depression-and-schizophrenia-a-replicated-cross-diagnostic-finding
#9: Reduction in Ventral Striatal Activity When Anticipating a Reward in Depression and Schizophrenia: A Replicated Cross-Diagnostic Finding
Gonzalo Arrondo, Nuria Segarra, Antonio Metastasio, Hisham Ziauddeen, Jennifer Spencer, Niels R Reinders, Robert B Dudas, Trevor W Robbins, Paul C Fletcher, Graham K Murray
In the research domain framework (RDoC), dysfunctional reward expectation has been proposed to be a cross-diagnostic domain in psychiatry, which may contribute to symptoms common to various neuropsychiatric conditions, such as anhedonia or apathy/avolition. We used a modified version of the Monetary Incentive Delay (MID) paradigm to obtain functional MRI images from 22 patients with schizophrenia, 24 with depression and 21 controls. Anhedonia and other symptoms of depression, and overall positive and negative symptomatology were also measured...
2015: Frontiers in Psychology
https://www.readbyqxmd.com/read/26379239/model-based-reasoning-in-humans-becomes-automatic-with-training
#10: Model-Based Reasoning in Humans Becomes Automatic With Training
Marcos Economides, Zeb Kurth-Nelson, Annika Lübbert, Marc Guitart-Masip, Raymond J Dolan
Model-based and model-free reinforcement learning (RL) have been suggested as algorithmic realizations of goal-directed and habitual action strategies. Model-based RL is more flexible than model-free but requires sophisticated calculations using a learnt model of the world. This has led model-based RL to be identified with slow, deliberative processing, and model-free RL with fast, automatic processing. In support of this distinction, it has recently been shown that model-based reasoning is impaired by placing subjects under cognitive load--a hallmark of non-automaticity...
September 2015: PLoS Computational Biology
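The model-based/model-free distinction drawn in this abstract can be made concrete with a minimal two-arm example (an illustrative sketch with invented numbers, not the paper's task): a model-free learner caches values incrementally, while a model-based learner re-plans from a learned model, so only the latter adapts immediately when contingencies reverse.

```python
alpha = 0.1
rewards = {0: 1.0, 1: 0.0}        # hypothetical two-arm task: arm -> reward
Q = {0: 0.0, 1: 0.0}              # model-free: cached action values
R = {}                            # model-based: learned reward model

def observe(a, r):
    Q[a] += alpha * (r - Q[a])    # slow incremental (model-free) update
    R[a] = r                      # the model stores the outcome directly

for _ in range(50):               # training: both arms sampled often
    for a in (0, 1):
        observe(a, rewards[a])

rewards = {0: 0.0, 1: 1.0}        # contingencies reverse...
for a in (0, 1):                  # ...and each arm is observed once
    observe(a, rewards[a])

mb_choice = max(R, key=R.get)     # model-based re-plans and switches to arm 1
mf_choice = max(Q, key=Q.get)     # model-free cache still favors arm 0
```

The flexibility of `mb_choice` comes at the cost of consulting the model on every decision, which is the computational expense the abstract identifies with deliberative processing.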
https://www.readbyqxmd.com/read/26379518/modeling-choice-and-reaction-time-during-arbitrary-visuomotor-learning-through-the-coordination-of-adaptive-working-memory-and-reinforcement-learning
#11: Modeling Choice and Reaction Time During Arbitrary Visuomotor Learning Through the Coordination of Adaptive Working Memory and Reinforcement Learning
Guillaume Viejo, Mehdi Khamassi, Andrea Brovelli, Benoît Girard
Current learning theory provides a comprehensive description of how humans and other animals learn, and places behavioral flexibility and automaticity at the heart of adaptive behavior. However, the computations supporting the interactions between goal-directed and habitual decision-making systems are still poorly understood. Previous functional magnetic resonance imaging (fMRI) results suggest that the brain hosts complementary computations that may differentially support goal-directed and habitual processes in the form of a dynamical interplay rather than a serial recruitment of strategies...
2015: Frontiers in Behavioral Neuroscience
https://www.readbyqxmd.com/read/26339919/a-new-computational-account-of-cognitive-control-over-reinforcement-based-decision-making-modeling-of-a-probabilistic-learning-task
#12: A New Computational Account of Cognitive Control Over Reinforcement-Based Decision Making: Modeling of a Probabilistic Learning Task
Sareh Zendehrouh
Recent work in the decision-making field offers a dual-system account of the decision-making process. This theory holds that the process is governed by two main controllers: a goal-directed system and a habitual system. In the reinforcement learning (RL) domain, habitual behaviors are associated with model-free methods, in which appropriate actions are learned through trial-and-error experience. Goal-directed behaviors, in contrast, are associated with model-based methods of RL, in which actions are selected using a model of the environment...
November 2015: Neural Networks: the Official Journal of the International Neural Network Society
https://www.readbyqxmd.com/read/26361052/psychology-of-habit
#13: Psychology of Habit
REVIEW
Wendy Wood, Dennis Rünger
As the proverbial creatures of habit, people tend to repeat the same behaviors in recurring contexts. This review characterizes habits in terms of their cognitive, motivational, and neurobiological properties. In so doing, we identify three ways that habits interface with deliberate goal pursuit: First, habits form as people pursue goals by repeating the same responses in a given context. Second, as outlined in computational models, habits and deliberate goal pursuit guide actions synergistically, although habits are the efficient, default mode of response...
2016: Annual Review of Psychology
https://www.readbyqxmd.com/read/26322583/arithmetic-and-local-circuitry-underlying-dopamine-prediction-errors
#14: Arithmetic and Local Circuitry Underlying Dopamine Prediction Errors
Neir Eshel, Michael Bukwich, Vinod Rao, Vivian Hemmelder, Ju Tian, Naoshige Uchida
Dopamine neurons are thought to facilitate learning by comparing actual and expected reward. Despite two decades of investigation, little is known about how this comparison is made. To determine how dopamine neurons calculate prediction error, we combined optogenetic manipulations with extracellular recordings in the ventral tegmental area while mice engaged in classical conditioning. Here we demonstrate, by manipulating the temporal expectation of reward, that dopamine neurons perform subtraction, a computation that is ideal for reinforcement learning but rarely observed in the brain...
September 10, 2015: Nature
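The subtractive computation reported here is exactly the prediction-error term used in standard reinforcement-learning models. A minimal Rescorla-Wagner-style sketch (my illustration, not the paper's analysis) shows why subtraction is so useful: the error signal vanishes once the reward is fully predicted.

```python
# delta = actual reward minus expected reward: a subtractive prediction error.
alpha, V, r = 0.2, 0.0, 1.0   # learning rate, initial expectation, reward
deltas = []
for _ in range(40):           # repeated reward-predicting trials
    delta = r - V             # dopamine-like signal: subtraction, not division
    V += alpha * delta        # the expectation absorbs the error
    deltas.append(delta)
# Early trials produce a large error; as the reward becomes predicted,
# the error shrinks toward zero.
```

A divisive comparison (r / V) would not behave this way: it converges to 1 rather than 0 for a predicted reward, which is one reason subtraction is considered ideal for this kind of learning.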
https://www.readbyqxmd.com/read/26321934/anticipatory-pleasure-predicts-effective-connectivity-in-the-mesolimbic-system
#15: Anticipatory Pleasure Predicts Effective Connectivity in the Mesolimbic System
Zhi Li, Chao Yan, Wei-Zhen Xie, Ke Li, Ya-Wei Zeng, Zhen Jin, Eric F C Cheung, Raymond C K Chan
Convergent evidence suggests the important role of the mesolimbic pathway in anticipating monetary rewards. However, the underlying mechanism of how the sub-regions interact with each other is still not clearly understood. Using dynamic causal modeling, we constructed a reward-related network for anticipating monetary reward using the Monetary Incentive Delay Task. Twenty-six healthy adolescents (Female/Male = 11/15; age = 18.69 ± 1.35 years; education = 12 ± 1.58 years) participated in the present study...
2015: Frontiers in Behavioral Neuroscience
https://www.readbyqxmd.com/read/26317249/exploration-exploitation-a-cognitive-dilemma-still-unresolved
#16: Exploration-Exploitation: A Cognitive Dilemma Still Unresolved
COMMENT
Russell N James
The solution to the exploration-exploitation dilemma presented essentially subsumes exploitation into an information-maximizing model. Such a single-maximization model is shown to be (1) more tractable than the initial dual-maximization dilemma, (2) useful in modeling information-maximizing subsystems, and (3) profitably applied in artificial simulations where exploration is costless. However, the model fails to resolve the dilemma in ethological or practical circumstances with objective outcomes, such as inclusive fitness, rather than information outcomes, such as lack of surprise...
2015: Cognitive Neuroscience
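The dual-maximization dilemma at issue here is the classic reinforcement-learning trade-off between gathering information and harvesting reward. A standard (if crude) resolution is epsilon-greedy action selection; the sketch below uses made-up reward probabilities and is only meant to make the two objectives concrete.

```python
import random

def eps_greedy_bandit(probs, eps=0.1, n=2000, alpha=0.1, seed=1):
    """Two-armed bandit: with probability eps the agent explores
    (random arm); otherwise it exploits its current value estimates."""
    rng = random.Random(seed)
    Q = [0.0] * len(probs)
    total = 0.0
    for _ in range(n):
        if rng.random() < eps:
            a = rng.randrange(len(probs))                   # explore
        else:
            a = max(range(len(probs)), key=lambda i: Q[i])  # exploit
        r = 1.0 if rng.random() < probs[a] else 0.0
        Q[a] += alpha * (r - Q[a])
        total += r
    return Q, total / n

Q, avg_reward = eps_greedy_bandit([0.2, 0.8])   # hypothetical arm payoffs
```

Note that epsilon-greedy simply interleaves the two objectives rather than unifying them, which is precisely the kind of unresolved compromise the comment highlights.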
https://www.readbyqxmd.com/read/26276036/neurophysiology-of-reward-guided-behavior-correlates-related-to-predictions-value-motivation-errors-attention-and-action
#17: Neurophysiology of Reward-Guided Behavior: Correlates Related to Predictions, Value, Motivation, Errors, Attention, and Action
REVIEW
Gregory B Bissonette, Matthew R Roesch
Many brain areas are activated by the possibility and receipt of reward. Are all of these brain areas reporting the same information about reward? Or are these signals related to other functions that accompany reward-guided learning and decision-making? Through carefully controlled behavioral studies, it has been shown that reward-related activity can represent reward expectations related to future outcomes, errors in those expectations, motivation, and signals related to goal- and habit-driven behaviors. These dissociations have been accomplished by manipulating the predictability of positively and negatively valued events...
2016: Current Topics in Behavioral Neurosciences
https://www.readbyqxmd.com/read/26257618/avoidance-learning-a-review-of-theoretical-models-and-recent-developments
#18: Avoidance Learning: A Review of Theoretical Models and Recent Developments
REVIEW
Angelos-Miltiadis Krypotos, Marieke Effting, Merel Kindt, Tom Beckers
Avoidance is a key characteristic of adaptive and maladaptive fear. Here, we review past and contemporary theories of avoidance learning. Based on the theories, experimental findings and clinical observations reviewed, we distill key principles of how adaptive and maladaptive avoidance behavior is acquired and maintained. We highlight clinical implications of avoidance learning theories and describe intervention strategies that could reduce maladaptive avoidance and prevent its return. We end with a brief overview of recent developments and avenues for further research...
2015: Frontiers in Behavioral Neuroscience
https://www.readbyqxmd.com/read/26237363/instrumental-learning-of-traits-versus-rewards-dissociable-neural-correlates-and-effects-on-choice
#19: Instrumental Learning of Traits Versus Rewards: Dissociable Neural Correlates and Effects on Choice
COMPARATIVE STUDY
Leor M Hackel, Bradley B Doll, David M Amodio
Humans learn about people and objects through positive and negative experiences, yet they can also look beyond the immediate reward of an interaction to encode trait-level attributes. We found that perceivers encoded both reward and trait-level information through feedback in an instrumental learning task, but relied more heavily on trait representations in cross-context decisions. Both learning types implicated ventral striatum, but trait learning also recruited a network associated with social impression formation...
September 2015: Nature Neuroscience
https://www.readbyqxmd.com/read/26192748/automatic-integration-of-confidence-in-the-brain-valuation-signal
#20: Automatic Integration of Confidence in the Brain Valuation Signal
Maël Lebreton, Raphaëlle Abitbol, Jean Daunizeau, Mathias Pessiglione
A key process in decision-making is estimating the value of possible outcomes. Growing evidence suggests that different types of values are automatically encoded in the ventromedial prefrontal cortex (VMPFC). Here we extend this idea by suggesting that any overt judgment is accompanied by a second-order valuation (a confidence estimate), which is also automatically incorporated in VMPFC activity. In accordance with the predictions of our normative model of rating tasks, two behavioral experiments showed that confidence levels were quadratically related to first-order judgments (age, value or probability ratings)...
August 2015: Nature Neuroscience