
Reward based learning

https://www.readbyqxmd.com/read/28227165/reward-value-is-encoded-in-primary-somatosensory-cortex-and-can-be-decoded-from-neural-activity-during-performance-of-a-psychophysical-task
#1
David B McNiel, John S Choi, John P Hessburg, Joseph T Francis
Encoding of reward valence has been shown in various brain regions, including deep structures such as the substantia nigra as well as cortical structures such as the orbitofrontal cortex. While the correlation between these signals and reward valence has been shown in aggregated data comprising many trials, little work has been done investigating the feasibility of decoding reward valence on a single-trial basis. Towards this goal, one non-human primate (Macaca radiata) was trained to grip and hold a target level of force in order to earn zero, one, two, or three juice rewards...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28226424/reward-gain-model-describes-cortical-use-dependent-plasticity
#2
Firas Mawase, Nicholas Wymbs, Shintaro Uehara, Pablo Celnik
Consistent repetitions of an action lead to plastic changes in the motor cortex and cause a shift in the direction of future movements. This process is known as use-dependent plasticity (UDP), one of the basic forms of motor memory. We have recently demonstrated in a physiological study that success-related reinforcement signals could modulate the strength of UDP. We tested this idea by developing a computational approach that modeled the shift in the direction of future action as a change in preferred direction of population activity of neurons in the primary motor cortex...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28225034/simulating-future-value-in-intertemporal-choice
#3
Alec Solway, Terry Lohrenz, P Read Montague
The laboratory study of how humans and other animals trade off value and time has a long and storied history, and is the subject of a vast literature. However, despite a long history of study, there is no agreed-upon mechanistic explanation of how intertemporal choice preferences arise. Several theorists have recently proposed model-based reinforcement learning as a candidate framework. This framework describes a suite of algorithms by which a model of the environment, in the form of a state transition function and reward function, can be converted on-line into a decision...
February 22, 2017: Scientific Reports
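The framework named in the abstract above (a state transition function plus a reward function, converted into a decision) can be illustrated with a minimal sketch. This is not the authors' model: the states, actions, reward sizes, and discount factors below are all hypothetical, chosen only to show how simulating future value with a known model produces a "smaller sooner" or "larger later" choice.

```python
# Hypothetical toy model: every state, action, and reward here is made up.
# Transition function: (state, action) -> next state.
TRANSITIONS = {
    ("choice", "sooner"): "done",
    ("choice", "later"): "wait1",
    ("wait1", "wait"): "wait2",
    ("wait2", "wait"): "done",
}
# Reward function: (state, action) -> reward; missing pairs pay 0.
REWARDS = {
    ("choice", "sooner"): 2.0,  # small reward, delivered immediately
    ("wait2", "wait"): 5.0,     # larger reward, delivered after a delay
}


def actions(state):
    """Actions available in a state according to the model."""
    return [a for (s, a) in TRANSITIONS if s == state]


def q_value(state, action, gamma):
    """Discounted value of taking an action, found by simulating the model."""
    next_state = TRANSITIONS[(state, action)]
    reward = REWARDS.get((state, action), 0.0)
    return reward + gamma * state_value(next_state, gamma)


def state_value(state, gamma):
    """Value of a state under the best available action (0 if terminal)."""
    available = actions(state)
    if not available:
        return 0.0
    return max(q_value(state, a, gamma) for a in available)


def decide(state, gamma):
    """Pick the action with the highest simulated discounted value."""
    return max(actions(state), key=lambda a: q_value(state, a, gamma))


patient = decide("choice", gamma=0.9)    # weak discounting
impulsive = decide("choice", gamma=0.5)  # steep discounting
```

In this toy setup the only thing that changes between the two calls is the discount factor, so any preference reversal comes from how strongly delayed reward is devalued during simulation.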
https://www.readbyqxmd.com/read/28221836/mimicry-among-unequally-defended-prey-should-be-mutualistic-when-predators-sample-optimally
#4
Thomas G Aubier, Mathieu Joron, Thomas N Sherratt
Understanding the conditions under which moderately defended prey evolve to resemble better-defended prey and whether this mimicry is parasitic (quasi-Batesian) or mutualistic (Müllerian) is central to our understanding of warning signals. Models of predator learning generally predict quasi-Batesian relationships. However, predators' attack decisions are based not only on learning but also on the potential future rewards. We identify the optimal sampling strategy of predators capable of classifying prey into different profitability categories and contrast the implications of these rules for mimicry evolution with a classical Pavlovian model based on conditioning...
March 2017: American Naturalist
https://www.readbyqxmd.com/read/28220089/emotion-regulation-therapy-a-mechanism-targeted-treatment-for-disorders-of-distress
#5
Megan E Renna, Jean M Quintero, David M Fresco, Douglas S Mennin
"Distress disorders," which include generalized anxiety disorder and major depression, are often highly comorbid with each other and appear to be characterized by common temperamental features that reflect heightened sensitivity to underlying motivational systems related to threat/safety and reward/loss. Further, individuals with distress disorders tend to utilize self-referential processes (e.g., worry, rumination, self-criticism) in a maladaptive attempt to respond to motivationally relevant distress, often resulting in suboptimal contextual learning...
2017: Frontiers in Psychology
https://www.readbyqxmd.com/read/28212422/probability-matching-in-perceptrons-effects-of-conditional-dependence-and-linear-nonseparability
#6
Michael R W Dawson, Maya Gupta
Probability matching occurs when the behavior of an agent matches the likelihood of occurrence of events in the agent's environment. For instance, when artificial neural networks match probability, the activity in their output unit equals the past probability of reward in the presence of a stimulus. Our previous research demonstrated that simple artificial neural networks (perceptrons, which consist of a set of input units directly connected to a single output unit) learn to match probability when presented different cues in isolation...
2017: PloS One
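Probability matching as described in the abstract above (output activity equal to the past probability of reward for a cue) can be illustrated with a minimal sketch. It is not the authors' simulation: the logistic output unit, the error-driven update, the learning rate, the trial count, and the two reward probabilities are all assumptions made for illustration.

```python
import math
import random


def sigmoid(net):
    """Logistic activation of the single output unit."""
    return 1.0 / (1.0 + math.exp(-net))


def train_perceptron(reward_probs, trials=40000, lr=0.01, seed=0):
    """Train one output unit on cues presented in isolation.

    reward_probs[i] is the (assumed) probability that cue i is rewarded.
    Each cue has its own input unit and weight; there is no bias term.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(reward_probs)
    for _ in range(trials):
        cue = rng.randrange(len(reward_probs))  # one cue active per trial
        rewarded = 1.0 if rng.random() < reward_probs[cue] else 0.0
        output = sigmoid(weights[cue])
        # Error-driven update; only the active cue's weight changes
        # because every other input is 0 on this trial.
        weights[cue] += lr * (rewarded - output)
    return [sigmoid(w) for w in weights]


outputs = train_perceptron([0.2, 0.8])
```

The update averages to zero only when the output equals the cue's reward probability, so after training each output fluctuates around that probability rather than settling on 0 or 1.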
https://www.readbyqxmd.com/read/28210998/on-t%C3%AF-he-distinction-between-value-driven-attention-and-selection-history-evidence-from-individuals-with-depressive-symptoms
#7
Brian A Anderson, Michelle Chiu, Michelle M DiBartolo, Stephanie L Leal
When predictive of extrinsic reward as targets, stimuli rapidly acquire the ability to automatically capture attention. Attentional biases for former targets of visual search also can develop without reward feedback but typically require much longer training. These learned biases towards former targets often are conceptualized within a single framework and might differ merely in degree. That is, both are the result of the reinforcement of selection history, with extrinsic reward for correct report of the target providing greater reinforcement than correct report alone...
February 16, 2017: Psychonomic Bulletin & Review
https://www.readbyqxmd.com/read/28207733/beyond-negative-valence-2-week-administration-of-a-serotonergic-antidepressant-enhances-both-reward-and-effort-learning-signals
#8
Jacqueline Scholl, Nils Kolling, Natalie Nelissen, Michael Browning, Matthew F S Rushworth, Catherine J Harmer
To make good decisions, humans need to learn about and integrate different sources of appetitive and aversive information. While serotonin has been linked to value-based decision-making, its role in learning is less clear, with acute manipulations often producing inconsistent results. Here, we show that when the effects of a selective serotonin reuptake inhibitor (SSRI, citalopram) are studied over longer timescales, learning is robustly improved. We measured brain activity with functional magnetic resonance imaging (fMRI) in volunteers as they performed a concurrent appetitive (money) and aversive (effort) learning task...
February 2017: PLoS Biology
https://www.readbyqxmd.com/read/28196108/post-traumatic-stress-disorder-symptom-burden-and-gender-each-affect-generalization-in-a-reward-and-punishment-learning-task
#9
Milen L Radell, Kevin D Beck, Mark W Gilbertson, Catherine E Myers
Post-traumatic stress disorder (PTSD) can develop following exposure to a traumatic event. Re-experiencing, which includes intrusive memories or flashbacks of the trauma, is a core symptom cluster of PTSD. From an associative learning perspective, this cluster may be attributed to cues associated with the trauma, which have come to elicit symptoms in a variety of situations encountered in daily life due to a tendency to overgeneralize. Consistent with this, prior studies have indicated that both individuals clinically diagnosed with PTSD, and those with self-reported symptoms who may not meet full diagnostic criteria, show changes in generalization...
2017: PloS One
https://www.readbyqxmd.com/read/28191003/want-more-learn-less-motivation-affects-adolescents-learning-from-negative-feedback
#10
Yun Zhuang, Wenfeng Feng, Yu Liao
The primary goal of the present study was to investigate how positive and negative feedback may differently facilitate learning throughout development. In addition, the role of motivation as a modulating factor was examined. Participants (children, adolescents, and adults) completed two forms of the guess and application task (GAT). Feedback from the Cool-GAT task has low motivational salience because there are no consequences, while feedback from the Hot-GAT task has high motivational salience as it pertains to receiving a reward...
2017: Frontiers in Psychology
https://www.readbyqxmd.com/read/28185881/the-association-between-endogenous-testosterone-level-and-behavioral-flexibility-in-young-men-evidence-from-stimulus-outcome-reversal-learning
#11
Esther K Diekhof, Susanne Kraft
The capacity to flexibly adapt responding to unexpected changes in the environment is crucial for survival. Several neurotransmitters have been implicated in stimulus-outcome reversal learning. Yet, it remains an open question whether inter-individual differences in the neuroactive hormone testosterone may also be related to this type of behavioral flexibility. In this study we assessed the association between endogenous testosterone level and reversal learning in young healthy men. We used an observer reversal learning task, in which subjects viewed computer-based decisions between two stimuli, of which one was currently rewarded while the other one was punished...
February 20, 2017: Hormones and Behavior
https://www.readbyqxmd.com/read/28176215/on-the-value-dependence-of-value-driven-attentional-capture
#12
Brian A Anderson, Madeline Halpern
Findings from an increasingly large number of studies have been used to argue that attentional capture can be dependent on the learned value of a stimulus, or value-driven. However, under certain circumstances attention can be biased to select stimuli that previously served as targets, independent of reward history. Value-driven attentional capture, as studied using the training phase-test phase design introduced by Anderson and colleagues, is widely presumed to reflect the combined influence of learned value and selection history...
February 7, 2017: Attention, Perception & Psychophysics
https://www.readbyqxmd.com/read/28174533/reward-dependent-invigoration-relates-to-theta-oscillations-and-is-predicted-by-dopaminergic-midbrain-integrity-in-healthy-elderly
#13
Tineke K Steiger, Nico Bunzeck
Motivation can have invigorating effects on behavior via dopaminergic neuromodulation. While this relationship has mainly been established in theoretical models and studies in younger subjects, the impact of structural declines of the dopaminergic system during healthy aging remains unclear. To investigate this issue, we used electroencephalography (EEG) in healthy young and elderly humans in a reward-learning paradigm. Specifically, scene images were initially encoded by combining them with cues predicting monetary reward (high vs...
2017: Frontiers in Aging Neuroscience
https://www.readbyqxmd.com/read/28170057/animal-models-of-drug-addiction
#14
María Pilar García Pardo, Concepción Roger Sánchez, José Enrique De la Rubia Ortí, María Asunción Aguilar Calpe
The development of animal models of drug reward and addiction is an essential factor for progress in understanding the biological basis of this disorder and for the identification of new therapeutic targets. Depending on the component of reward to be studied, one type of animal model or another may be used. There are models of reinforcement based on the primary hedonic effect produced by the consumption of the addictive substance, such as the self-administration (SA) and intracranial self-stimulation (ICSS) paradigms, and there are models based on the component of reward related to associative learning and cognitive ability to make predictions about obtaining reward in the future, such as the conditioned place preference (CPP) paradigm...
January 12, 2017: Adicciones
https://www.readbyqxmd.com/read/28167910/adaptive-baseline-enhances-em-based-policy-search-validation-in-a-view-based-positioning-task-of-a-smartphone-balancer
#15
Jiexin Wang, Eiji Uchibe, Kenji Doya
EM-based policy search methods estimate a lower bound of the expected return from the histories of episodes and iteratively update the policy parameters using the maximum of a lower bound of expected return, which makes gradient calculation and learning rate tuning unnecessary. Previous algorithms like Policy learning by Weighting Exploration with the Returns, Fitness Expectation Maximization, and EM-based Policy Hyperparameter Exploration implemented the mechanisms to discard useless low-return episodes either implicitly or using a fixed baseline determined by the experimenter...
2017: Frontiers in Neurorobotics
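The idea in the abstract above of updating policy parameters without gradients or a learning rate can be illustrated with a generic return-weighted averaging sketch. It is not the algorithm from this paper and it omits the baseline that discards low-return episodes: the one-dimensional parameter, the return function, the fixed exploration noise, and all numeric settings are hypothetical.

```python
import math
import random


def episode_return(theta):
    """Hypothetical task: the return is always positive and peaks at theta = 3."""
    return math.exp(-(theta - 3.0) ** 2)


def em_policy_search(mean=0.0, sigma=1.0, iterations=30, episodes=200, seed=0):
    """Return-weighted update of the mean of a Gaussian over one parameter."""
    rng = random.Random(seed)
    for _ in range(iterations):
        # Sample one parameter value per episode around the current mean.
        thetas = [rng.gauss(mean, sigma) for _ in range(episodes)]
        returns = [episode_return(t) for t in thetas]
        # New mean is the average of sampled parameters weighted by return.
        # No gradient and no learning rate appear in the update.
        mean = sum(r * t for r, t in zip(returns, thetas)) / sum(returns)
    return mean


learned_mean = em_policy_search()
```

Because every episode contributes in proportion to its return, low-return episodes still pull on the average; a baseline of the kind the abstract discusses is one way to reduce that influence.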
https://www.readbyqxmd.com/read/28158309/learning-by-stimulation-avoidance-a-principle-to-control-spiking-neural-networks-dynamics
#16
Lana Sinapayen, Atsushi Masumori, Takashi Ikegami
Learning based on networks of real neurons, and learning based on biologically inspired models of neural networks, have yet to find general learning rules leading to widespread applications. In this paper, we argue for the existence of a principle that allows the dynamics of a biologically inspired neural network to be steered. Using carefully timed external stimulation, the network can be driven towards a desired dynamical state. We term this principle "Learning by Stimulation Avoidance" (LSA). We demonstrate through simulation that the minimal sufficient conditions leading to LSA in artificial networks are also sufficient to reproduce learning results similar to those obtained in biological neurons by Shahaf and Marom, and in addition explain synaptic pruning...
2017: PloS One
https://www.readbyqxmd.com/read/28148725/the-role-of-orbitofrontal-amygdala-interactions-in-updating-action-outcome-valuations-in-macaques
#17
Emily C Fiuzat, Sarah E V Rhodes, Elisabeth A Murray
A previous study revealed that although monkeys with bilateral lesions of either the orbitofrontal cortex (OFC) or the amygdala could learn an action-outcome task, they could not adapt their choices in response to devalued outcomes. Specifically, they could not adjust their choice between two actions after the value of the outcome associated with one of the actions had decreased. Here we examined whether OFC needs to functionally interact with the amygdala in mediating such choices. Rhesus monkeys were trained to make two mutually exclusive actions on a touch-sensitive screen: 'tap' and 'hold'...
February 1, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
https://www.readbyqxmd.com/read/28148471/a-self-regulation-theory-based-asthma-management-mobile-app-for-adolescents-a-usability-assessment
#18
Adam Sage, Courtney Roberts, Lorie Geryk, Betsy Sleath, Deborah Tate, Delesha Carpenter
BACKGROUND: Self-regulation theory suggests people learn to influence their own behavior through self-monitoring, goal-setting, feedback, self-reward, and self-instruction, all of which smartphones are now capable of facilitating. Several mobile apps exist to manage asthma; however, little evidence exists about whether these apps employ user-centered design processes that adhere to government usability guidelines for mobile apps. OBJECTIVE: Building upon a previous study that documented adolescent preferences for an asthma self-management app, we employed a user-centered approach to assess the usability of a high-fidelity wireframe for an asthma self-management app intended for use by adolescents with persistent asthma...
February 1, 2017: JMIR Human Factors
https://www.readbyqxmd.com/read/28146248/disruption-of-reward-processing-in-addiction-an-image-based-meta-analysis-of-functional-magnetic-resonance-imaging-studies
#19
Maartje Luijten, Arnt F Schellekens, Simone Kühn, Marise W J Machielse, Guillaume Sescousse
Importance: Disrupted reward processing, mainly driven by striatal dysfunction, is a key characteristic of addictive behaviors. However, functional magnetic resonance imaging (fMRI) studies have reported conflicting results, with both hypoactivations and hyperactivations during anticipation and outcome notification of monetary rewards in addiction. Objective: To determine the nature and direction of reward-processing disruptions during anticipation and outcome notification of monetary rewards in individuals with addiction using image-based meta-analyses of fMRI studies...
February 1, 2017: JAMA Psychiatry
https://www.readbyqxmd.com/read/28143961/motor-learning-enhances-use-dependent-plasticity
#20
Firas Mawase, Shintaro Uehara, Amy Bastian, Pablo Celnik
Motor behaviors are shaped not only by current sensory signals but also by the history of recent experiences. For instance, repeated movements toward a particular target bias the subsequent movements toward that target direction. This process, called use-dependent plasticity (UDP), is considered a basic and goal-independent way of forming motor memories. Most studies consider movement history as the critical component that leads to UDP (Classen et al., 1998; Verstynen and Sabes, 2011)...
January 31, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
