model-based reinforcement learning
https://www.readbyqxmd.com/read/28626011/effects-of-ventral-striatum-lesions-on-stimulus-versus-action-based-reinforcement-learning
#1
Kathryn M Rothenhoefer, Vincent D Costa, Ramón Bartolo, Raquel Vicario-Feliciano, Elisabeth A Murray, Bruno B Averbeck
Learning the values of actions versus stimuli may depend on separable neural circuits. In the current study, we evaluated ventral striatum (VS) lesioned macaques' performance on a two-arm bandit task that had randomly interleaved blocks of stimulus-based and action-based reinforcement learning (RL). Compared to controls, monkeys with VS lesions had deficits in learning to select rewarding images but not rewarding actions. We used an RL model to quantify learning and choice consistency and found that, in stimulus-based RL, the VS lesion monkeys were more influenced by negative feedback and had lower choice consistency than controls...
June 16, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
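As a rough illustration of what "an RL model to quantify learning and choice consistency" can look like for entry #1, here is a minimal sketch of a delta-rule learner with a softmax choice rule on a two-arm bandit. It is not the authors' model: the function names, the split into `alpha_pos` and `alpha_neg` learning rates, the starting values of 0.5, and the use of an inverse temperature `beta` as the choice-consistency parameter are all assumptions made for this example.

```python
import math
import random


def softmax_choice_prob(q_values, beta):
    """Return choice probabilities; beta (inverse temperature) is the
    assumed 'choice consistency' parameter: higher beta = more consistent."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def update_value(q, reward, alpha_pos, alpha_neg):
    """Delta-rule update with separate (hypothetical) learning rates for
    positive and negative prediction errors."""
    delta = reward - q
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta


def simulate_bandit(reward_probs, n_trials, alpha_pos, alpha_neg, beta, seed=0):
    """Simulate an agent on a two-arm bandit with binary rewards."""
    rng = random.Random(seed)
    q = [0.5, 0.5]  # assumed neutral starting values
    choices = []
    for _ in range(n_trials):
        p = softmax_choice_prob(q, beta)
        choice = 0 if rng.random() < p[0] else 1
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        q[choice] = update_value(q[choice], reward, alpha_pos, alpha_neg)
        choices.append(choice)
    return q, choices
```

In a sketch like this, a larger `alpha_neg` would correspond to being "more influenced by negative feedback", and a smaller `beta` to "lower choice consistency"; fitting such parameters to behavior is a separate step not shown here.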
https://www.readbyqxmd.com/read/28612515/the-benefits-of-a-peer-assisted-mock-paces
#2
Sarim Siddiqui, Samee Siddiqui, Qamar Mustafa, Abeer F Rizvi, Ibtesham T Hossain
BACKGROUND: Peer-assisted learning (PAL) and mock examinations have been credited as effective teaching tools; however, there is a lack of research into their effectiveness in PACES (practical assessment of clinical examination skills). This study demonstrates an effective model and the benefits of PAL after its implementation in a mock PACES at Imperial College London. METHODS: A mock PACES was designed for fifth-year medical students...
June 14, 2017: Clinical Teacher
https://www.readbyqxmd.com/read/28599832/model-based-control-in-dimensional-psychiatry
#3
REVIEW
Valerie Voon, Andrea Reiter, Miriam Sebold, Stephanie Groman
We use parallel interacting goal-directed and habitual strategies to make our daily decisions. The arbitration between these strategies is relevant to inflexible repetitive behaviors in psychiatric disorders. Goal-directed control, also known as model-based control, is based on an affective outcome relying on a learned internal model to prospectively make decisions. In contrast, habit control, also known as model-free control, is based on an integration of previous reinforced learning autonomous of the current outcome value and is implicit and more efficient but at the cost of greater inflexibility...
April 23, 2017: Biological Psychiatry
https://www.readbyqxmd.com/read/28585051/the-role-of-the-putamen-in-language-a-meta-analytic-connectivity-modeling-study
#4
Nestor Viñas-Guasch, Yan Jing Wu
The putamen is a subcortical structure that forms part of the dorsal striatum of basal ganglia, and has traditionally been associated with reinforcement learning and motor control, including speech articulation. However, recent studies have shown involvement of the left putamen in other language functions such as bilingual language processing (Abutalebi et al. 2012) and production, with some authors arguing for functional segregation of anterior and posterior putamen (Oberhuber et al. 2013). A further step in exploring the role of putamen in language would involve identifying the network of coactivations of not only the left, but also the right putamen, given the involvement of right hemisphere in high order language functions (Vigneau et al...
June 5, 2017: Brain Structure & Function
https://www.readbyqxmd.com/read/28581478/reinstated-episodic-context-guides-sampling-based-decisions-for-reward
#5
Aaron M Bornstein, Kenneth A Norman
How does experience inform decisions? In episodic sampling, decisions are guided by a few episodic memories of past choices. This process can yield choice patterns similar to model-free reinforcement learning; however, samples can vary from trial to trial, causing decisions to vary. Here we show that context retrieved during episodic sampling can cause choice behavior to deviate sharply from the predictions of reinforcement learning. Specifically, we show that, when a given memory is sampled, choices (in the present) are influenced by the properties of other decisions made in the same context as the sampled event...
June 5, 2017: Nature Neuroscience
https://www.readbyqxmd.com/read/28575424/association-between-habenula-dysfunction-and-motivational-symptoms-in-unmedicated-major-depressive-disorder
#6
Wen-Hua Liu, Vincent Valton, Ling-Zhi Wang, Yu-Hua Zhu, Jonathan P Roiser
The lateral habenula plays a central role in reward and punishment processing and has been suggested to drive the cardinal symptom of anhedonia in depression. This hypothesis is largely based on observations of habenula hypermetabolism in animal models of depression, but the activity of habenula and its relationship with clinical symptoms in patients with depression remains unclear. High-resolution functional magnetic resonance imaging (fMRI) and computational modelling were used to investigate the activity of the habenula during a probabilistic reinforcement learning task with rewarding and punishing outcomes in 21 unmedicated patients with major depression and 17 healthy participants...
May 29, 2017: Social Cognitive and Affective Neuroscience
https://www.readbyqxmd.com/read/28573384/a-simple-computational-algorithm-of-model-based-choice-preference
#7
Asako Toyama, Kentaro Katahira, Hideki Ohira
A broadly used computational framework posits that two learning systems operate in parallel during the learning of choice preferences, namely the model-free and model-based reinforcement-learning systems. In this study, we examined another possibility, through which model-free learning is the basic system and model-based information is its modulator. Accordingly, we proposed several modified versions of a temporal-difference learning model to explain the choice-learning process. Using the two-stage decision task developed by Daw, Gershman, Seymour, Dayan, and Dolan (2011), we compared their original computational model, which assumes a parallel learning process, and our proposed models, which assume a sequential learning process...
June 1, 2017: Cognitive, Affective & Behavioral Neuroscience
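The "parallel" framework mentioned in entry #7 is often summarized as a weighted mix of model-free and model-based values for the first-stage choice in a two-stage task. The sketch below illustrates that general idea only; it is not the sequential models proposed in the paper, and the function names, the weight `w`, and the example data layout are assumptions introduced here.

```python
def model_based_values(transition_probs, stage2_q):
    """First-stage values computed from a transition model.

    transition_probs[a][s]: assumed P(second-stage state s | first-stage action a)
    stage2_q[s]: list of learned action values in second-stage state s
    """
    best = [max(qs) for qs in stage2_q]  # best available value in each state
    return [sum(p * b for p, b in zip(row, best)) for row in transition_probs]


def hybrid_values(q_mf, q_mb, w):
    """Weighted combination: w = 1 is purely model-based, w = 0 purely model-free."""
    return [w * mb + (1.0 - w) * mf for mf, mb in zip(q_mf, q_mb)]


def td_update(q, target, alpha):
    """Temporal-difference style update of a model-free value toward a target."""
    return q + alpha * (target - q)
```

For example, with transition probabilities `[[0.7, 0.3], [0.3, 0.7]]` and second-stage values `[[0.2, 0.8], [0.4, 0.1]]`, `model_based_values` weights the best value in each second-stage state (0.8 and 0.4) by how likely each first-stage action is to lead there.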
https://www.readbyqxmd.com/read/28559955/a-plausible-neural-circuit-for-decision-making-and-its-formation-based-on-reinforcement-learning
#8
Hui Wei, Dawei Dai, Yijie Bu
The behavior of humans, and of lower insects, is dominated by the nervous system. Each stable behavior has its own inner steps and control rules, and is regulated by a neural circuit. Understanding how the brain influences perception, thought, and behavior is a central mandate of neuroscience. The phototactic flight of insects is a widely observed deterministic behavior. Since its movement is not stochastic, the behavior should be dominated by a neural circuit. Based on the basic firing characteristics of biological neurons and the neural circuit's constitution, we designed a plausible neural circuit for this phototactic behavior from a logic perspective...
June 2017: Cognitive Neurodynamics
https://www.readbyqxmd.com/read/28548893/dynamic-decision-making-learning-processes-and-new-research-directions
#9
Cleotilde Gonzalez, Pegah Fakhari, Jerome Busemeyer
OBJECTIVE: The aim of this manuscript is to provide a review of contemporary research and applications on dynamic decision making (DDM). BACKGROUND: Since early DDM studies, there has been little systematic progress in understanding decision making in complex, dynamic systems. Our review contributes to better understanding of decision making processes in dynamic tasks. METHOD: We discuss new research directions in DDM to highlight the value of simplification in the study of complex decision processes, divided into experimental and theoretical/computational approaches, and focus on problems involving control tasks and search-and-choice tasks...
May 1, 2017: Human Factors
https://www.readbyqxmd.com/read/28499738/cognitive-effects-of-subdiaphragmatic-vagal-deafferentation-in-rats
#10
Melanie Klarer, Ulrike Weber-Stadlbauer, Myrtha Arnold, Wolfgang Langhans, Urs Meyer
Vagal afferents are a crucial neuronal component of the gut-brain axis and mediate the information flow from the viscera to the central nervous system. Based on the findings provided by experiments involving vagus nerve stimulation, it has been suggested that vagal afferent signaling may influence various cognitive functions such as recognition memory and cognitive flexibility. Here, we examined this hypothesis using a rat model of subdiaphragmatic vagal deafferentation (SDA), the most complete and selective abdominal vagal deafferentation method existing to date...
May 9, 2017: Neurobiology of Learning and Memory
https://www.readbyqxmd.com/read/28495350/decaying-relevance-of-clinical-data-towards-future-decisions-in-data-driven-inpatient-clinical-order-sets
#11
Jonathan H Chen, Muthuraman Alagappan, Mary K Goldstein, Steven M Asch, Russ B Altman
OBJECTIVE: Determine how varying longitudinal historical training data can impact prediction of future clinical decisions. Estimate the "decay rate" of clinical data source relevance. MATERIALS AND METHODS: We trained a clinical order recommender system, analogous to Netflix or Amazon's "Customers who bought A also bought B..." product recommenders, based on a tertiary academic hospital's structured electronic health record data. We used this system to predict future (2013) admission orders based on different subsets of historical training data (2009 through 2012), relative to existing human-authored order sets...
June 2017: International Journal of Medical Informatics
https://www.readbyqxmd.com/read/28473250/extinction-of-pavlovian-conditioning-the-influence-of-trial-number-and-reinforcement-history
#12
REVIEW
C K J Chan, Justin A Harris
Pavlovian conditioning is sensitive to the temporal relationship between the conditioned stimulus (CS) and the unconditioned stimulus (US). This has motivated models that describe learning as a process that continuously updates associative strength during the trial or specifically encodes the CS-US interval. These models predict that extinction of responding is also continuous, such that response loss is proportional to the cumulative duration of exposure to the CS without the US. We review evidence showing that this prediction is incorrect, and that extinction is trial-based rather than time-based...
May 1, 2017: Behavioural Processes
https://www.readbyqxmd.com/read/28468745/erosion-of-digital-professionalism-during-medical-students-core-clinical-clerkships
#13
Arash Mostaghimi, Aleksandra E Olszewski, Sigall K Bell, David H Roberts, Bradley H Crotty
BACKGROUND: The increased use of social media, cloud computing, and mobile devices has led to the emergence of guidelines and novel teaching efforts to guide students toward the appropriate use of technology. Despite this, violations of professional conduct are common. OBJECTIVE: We sought to explore professional behaviors specific to appropriate use of technology by looking at changes in third-year medical students' attitudes and behaviors at the beginning and conclusion of their clinical clerkships...
May 3, 2017: JMIR Medical Education
https://www.readbyqxmd.com/read/28441518/learning-to-allocate-limited-time-to-decisions-with-different-expected-outcomes
#14
Arash Khodadadi, Pegah Fakhari, Jerome R Busemeyer
The goal of this article is to investigate how human participants allocate their limited time to decisions with different properties. We report the results of two behavioral experiments. In each trial of the experiments, the participant must accumulate noisy information to make a decision. The participants received positive and negative rewards for their correct and incorrect decisions, respectively. The stimulus was designed such that decisions based on more accumulated information were more accurate but took longer...
June 2017: Cognitive Psychology
https://www.readbyqxmd.com/read/28421669/misfortune-may-be-a-blessing-in-disguise-fairness-perception-and-emotion-modulate-decision-making
#15
Hong-Hsiang Liu, Yin-Dir Hwang, Ming H Hsieh, Yung-Fong Hsu, Wen-Sung Lai
Fairness perception and equality during social interactions frequently elicit affective arousal and affect decision making. By integrating the dictator game and a probabilistic gambling task, this study aimed to investigate the effects of a negative experience induced by perceived unfairness on decision making using behavioral, model fitting, and electrophysiological approaches. Participants were randomly assigned to the neutral, harsh, or kind groups, which consisted of various asset allocation scenarios to induce different levels of perceived unfairness...
April 19, 2017: Psychophysiology
https://www.readbyqxmd.com/read/28417944/variable-admittance-control-based-on-fuzzy-reinforcement-learning-for-minimally-invasive-surgery-manipulator
#16
Zhijiang Du, Wei Wang, Zhiyuan Yan, Wei Dong, Weidong Wang
In order to get natural and intuitive physical interaction in the pose adjustment of the minimally invasive surgery manipulator, a hybrid variable admittance model based on Fuzzy Sarsa(λ)-learning is proposed in this paper. The proposed model provides continuous variable virtual damping to the admittance controller to respond to human intentions, and it effectively enhances the comfort level during the task execution by modifying the generated virtual damping dynamically. A fuzzy partition defined over the state space is used to capture the characteristics of the operator in physical human-robot interaction...
April 12, 2017: Sensors
https://www.readbyqxmd.com/read/28408878/reward-based-motor-adaptation-mediated-by-basal-ganglia
#17
Taegyo Kim, Khaldoun C Hamade, Dmitry Todorov, William H Barnett, Robert A Capps, Elizaveta M Latash, Sergey N Markin, Ilya A Rybak, Yaroslav I Molkov
It is widely accepted that the basal ganglia (BG) play a key role in action selection and reinforcement learning. However, despite a considerable number of studies, the BG architecture and function are not completely understood. Action selection and reinforcement learning are facilitated by the activity of dopaminergic neurons, which encode reward prediction errors when reward outcomes are higher or lower than expected. The BG are thought to select proper motor responses by gating appropriate actions and suppressing inappropriate ones...
2017: Frontiers in Computational Neuroscience
https://www.readbyqxmd.com/read/28383956/hold-it-the-influence-of-lingering-rewards-on-choice-diversification-and-persistence
#18
Christin Schulze, Don van Ravenzwaaij, Ben R Newell
Learning to choose adaptively when faced with uncertain and variable outcomes is a central challenge for decision makers. This study examines repeated choice in dynamic probability learning tasks in which outcome probabilities changed either as a function of the choices participants made or independently of those choices. This presence/absence of sequential choice-outcome dependencies was implemented by manipulating a single task aspect between conditions: the retention/withdrawal of reward across individual choice trials...
April 6, 2017: Journal of Experimental Psychology. Learning, Memory, and Cognition
https://www.readbyqxmd.com/read/28383500/a-qos-optimization-approach-in-cognitive-body-area-networks-for-healthcare-applications
#19
Tauseef Ahmed, Yannick Le Moullec
Wireless body area networks are increasingly featuring cognitive capabilities. This work deals with the emerging concept of cognitive body area networks. In particular, the paper addresses two important issues, namely spectrum sharing and interferences. We propose methods for channel and power allocation. The former builds upon a reinforcement learning mechanism, whereas the latter is based on convex optimization. Furthermore, we also propose a mathematical channel model for off-body communication links in line with the IEEE 802...
April 6, 2017: Sensors
https://www.readbyqxmd.com/read/28362620/finite-horizon-h%C3%A2-tracking-control-for-unknown-nonlinear-systems-with-saturating-actuators
#20
Huaguang Zhang, Xiaohong Cui, Yanhong Luo, He Jiang
In this paper, a neural network (NN)-based online model-free integral reinforcement learning algorithm is developed to solve the finite-horizon H∞ optimal tracking control problem for completely unknown nonlinear continuous-time systems with disturbance and saturating actuators (constrained control input). An augmented system is constructed with the tracking error system and the command generator system. A time-varying Hamilton-Jacobi-Isaacs (HJI) equation is formulated for the augmented problem, which is extremely difficult or impossible to solve due to its time-dependent property and nonlinearity...
March 1, 2017: IEEE Transactions on Neural Networks and Learning Systems