model-based reinforcement learning

https://www.readbyqxmd.com/read/28441518/learning-to-allocate-limited-time-to-decisions-with-different-expected-outcomes
#1
Arash Khodadadi, Pegah Fakhari, Jerome R Busemeyer
The goal of this article is to investigate how human participants allocate their limited time to decisions with different properties. We report the results of two behavioral experiments. In each trial of the experiments, the participant must accumulate noisy information to make a decision. The participants received positive and negative rewards for their correct and incorrect decisions, respectively. The stimulus was designed such that decisions based on more accumulated information were more accurate but took longer...
April 19, 2017: Cognitive Psychology
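A minimal simulation sketch (not the authors' task code) of the speed-accuracy trade-off this entry describes: noisy evidence is accumulated until a decision bound is reached, so higher bounds yield slower but more accurate decisions. All parameter values (drift, noise, bounds, rewards) are illustrative assumptions.

    import random

    def simulate_trial(bound, drift=0.1, noise=1.0, reward=1.0, penalty=-1.0):
        # Accumulate noisy evidence until it crosses +/- bound.
        evidence, t = 0.0, 0
        while abs(evidence) < bound:
            evidence += drift + random.gauss(0.0, noise)
            t += 1
        correct = evidence > 0   # with positive drift, the upper bound is the correct choice
        return (reward if correct else penalty), t

    for bound in (2.0, 8.0):
        outcomes = [simulate_trial(bound) for _ in range(5000)]
        accuracy = sum(r > 0 for r, _ in outcomes) / len(outcomes)
        mean_time = sum(t for _, t in outcomes) / len(outcomes)
        print(f"bound={bound}: accuracy={accuracy:.2f}, mean decision time={mean_time:.1f} steps")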
https://www.readbyqxmd.com/read/28421669/misfortune-may-be-a-blessing-in-disguise-fairness-perception-and-emotion-modulate-decision-making
#2
Hong-Hsiang Liu, Yin-Dir Hwang, Ming H Hsieh, Yung-Fong Hsu, Wen-Sung Lai
Fairness perception and equality during social interactions frequently elicit affective arousal and affect decision making. By integrating the dictator game and a probabilistic gambling task, this study aimed to investigate the effects of a negative experience induced by perceived unfairness on decision making using behavioral, model fitting, and electrophysiological approaches. Participants were randomly assigned to the neutral, harsh, or kind groups, which consisted of various asset allocation scenarios to induce different levels of perceived unfairness...
April 19, 2017: Psychophysiology
https://www.readbyqxmd.com/read/28417944/variable-admittance-control-based-on-fuzzy-reinforcement-learning-for-minimally-invasive-surgery-manipulator
#3
Zhijiang Du, Wei Wang, Zhiyuan Yan, Wei Dong, Weidong Wang
In order to achieve natural and intuitive physical interaction during pose adjustment of the minimally invasive surgery manipulator, a hybrid variable admittance model based on Fuzzy Sarsa(λ)-learning is proposed in this paper. The proposed model provides continuous variable virtual damping to the admittance controller to respond to human intentions, and it effectively enhances the comfort level during task execution by modifying the generated virtual damping dynamically. A fuzzy partition defined over the state space is used to capture the characteristics of the operator in physical human-robot interaction...
April 12, 2017: Sensors
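The admittance model above builds on Sarsa(λ). For reference, a minimal tabular Sarsa(λ) backup with accumulating eligibility traces is sketched below; the paper's fuzzy variant instead distributes the update over membership degrees of a fuzzy partition of the continuous state space. The environment sizes and hyper-parameters are illustrative assumptions.

    import numpy as np

    n_states, n_actions = 10, 3
    alpha, gamma, lam, epsilon = 0.1, 0.95, 0.9, 0.1
    Q = np.zeros((n_states, n_actions))    # action values
    E = np.zeros_like(Q)                   # eligibility traces

    def choose(s):
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            return np.random.randint(n_actions)
        return int(np.argmax(Q[s]))

    def sarsa_lambda_step(s, a, r, s_next, a_next, done):
        # One on-policy Sarsa(lambda) backup.
        target = r + (0.0 if done else gamma * Q[s_next, a_next])
        delta = target - Q[s, a]
        E[s, a] += 1.0                     # accumulate trace for the visited pair
        Q[:] += alpha * delta * E          # credit all recently visited pairs
        E[:] *= gamma * lam                # decay traces
        if done:
            E[:] = 0.0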
https://www.readbyqxmd.com/read/28408878/reward-based-motor-adaptation-mediated-by-basal-ganglia
#4
Taegyo Kim, Khaldoun C Hamade, Dmitry Todorov, William H Barnett, Robert A Capps, Elizaveta M Latash, Sergey N Markin, Ilya A Rybak, Yaroslav I Molkov
It is widely accepted that the basal ganglia (BG) play a key role in action selection and reinforcement learning. However, despite a considerable number of studies, the BG architecture and function are not completely understood. Action selection and reinforcement learning are facilitated by the activity of dopaminergic neurons, which encode reward prediction errors when reward outcomes are higher or lower than expected. The BG are thought to select proper motor responses by gating appropriate actions and suppressing inappropriate ones...
2017: Frontiers in Computational Neuroscience
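This entry describes dopaminergic reward prediction errors training basal-ganglia action gating. A minimal actor-critic sketch of that idea is given below as an illustration, not as the authors' model: a critic's prediction error both updates state values and adjusts the actor's gating preferences. All sizes and learning rates are assumptions.

    import numpy as np

    n_states, n_actions = 8, 4
    alpha_v, alpha_p, gamma = 0.1, 0.05, 0.95
    V = np.zeros(n_states)                   # critic: state values
    prefs = np.zeros((n_states, n_actions))  # actor: gating preferences

    def gate_action(s):
        # Softmax over gating preferences selects (gates) an action.
        p = np.exp(prefs[s] - prefs[s].max())
        p /= p.sum()
        return int(np.random.choice(n_actions, p=p))

    def learn(s, a, r, s_next, done):
        target = r + (0.0 if done else gamma * V[s_next])
        rpe = target - V[s]                  # dopamine-like prediction error
        V[s] += alpha_v * rpe                # critic update
        prefs[s, a] += alpha_p * rpe         # strengthen or weaken the gated action
        return rpe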
https://www.readbyqxmd.com/read/28383956/hold-it-the-influence-of-lingering-rewards-on-choice-diversification-and-persistence
#5
Christin Schulze, Don van Ravenzwaaij, Ben R Newell
Learning to choose adaptively when faced with uncertain and variable outcomes is a central challenge for decision makers. This study examines repeated choice in dynamic probability learning tasks in which outcome probabilities changed either as a function of the choices participants made or independently of those choices. This presence/absence of sequential choice-outcome dependencies was implemented by manipulating a single task aspect between conditions: the retention/withdrawal of reward across individual choice trials...
April 6, 2017: Journal of Experimental Psychology. Learning, Memory, and Cognition
https://www.readbyqxmd.com/read/28383500/a-qos-optimization-approach-in-cognitive-body-area-networks-for-healthcare-applications
#6
Tauseef Ahmed, Yannick Le Moullec
Wireless body area networks are increasingly featuring cognitive capabilities. This work deals with the emerging concept of cognitive body area networks. In particular, the paper addresses two important issues, namely spectrum sharing and interference. We propose methods for channel and power allocation. The former builds upon a reinforcement learning mechanism, whereas the latter is based on convex optimization. Furthermore, we propose a mathematical channel model for off-body communication links in line with the IEEE 802...
April 6, 2017: Sensors
https://www.readbyqxmd.com/read/28362620/finite-horizon-h%C3%A2-tracking-control-for-unknown-nonlinear-systems-with-saturating-actuators
#7
Huaguang Zhang, Xiaohong Cui, Yanhong Luo, He Jiang
In this paper, a neural network (NN)-based online model-free integral reinforcement learning algorithm is developed to solve the finite-horizon H∞ optimal tracking control problem for completely unknown nonlinear continuous-time systems with disturbance and saturating actuators (constrained control input). An augmented system is constructed with the tracking error system and the command generator system. A time-varying Hamilton-Jacobi-Isaacs (HJI) equation is formulated for the augmented problem, which is extremely difficult or impossible to solve due to its time-dependent property and nonlinearity...
March 1, 2017: IEEE Transactions on Neural Networks and Learning Systems
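A generic statement of this problem class may help place the abstract: the standard zero-sum, constrained-input H∞ tracking formulation is sketched below in LaTeX. The notation is illustrative and not necessarily the paper's exact formulation.

    % Augmented dynamics (tracking error plus command generator), with |u_i| <= \bar{u}:
    %   \dot{x} = f(x) + g(x)u + k(x)d
    V^*(x,t) = \min_u \max_d \int_t^{t_f} \big[\, x^\top Q x + W(u) - \gamma^2 d^\top d \,\big]\, d\tau,
    \qquad
    W(u) = 2 \sum_i \int_0^{u_i} \bar{u}\, r_i \tanh^{-1}(\xi/\bar{u})\, d\xi
    % Time-varying HJI equation with terminal condition:
    -\frac{\partial V^*}{\partial t} = x^\top Q x + W(u^*) - \gamma^2 {d^*}^\top d^*
      + (\nabla_x V^*)^\top \big( f(x) + g(x)u^* + k(x)d^* \big),
    \qquad V^*(x, t_f) = \psi\big(x(t_f)\big)
    % Saturated optimal control and worst-case disturbance:
    u^* = -\bar{u} \tanh\!\Big( \tfrac{1}{2\bar{u}} R^{-1} g(x)^\top \nabla_x V^* \Big),
    \qquad
    d^* = \tfrac{1}{2\gamma^2} k(x)^\top \nabla_x V^*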
https://www.readbyqxmd.com/read/28326050/what-to-choose-next-a-paradigm-for-testing-human-sequential-decision-making
#8
Elisa M Tartaglia, Aaron M Clarke, Michael H Herzog
Many of the decisions we make in our everyday lives are sequential and entail sparse rewards. While sequential decision-making has been extensively investigated in theory (e.g., by reinforcement learning models), there is no systematic experimental paradigm to test it. Here, we developed such a paradigm and investigated key components of reinforcement learning models: the eligibility trace (i.e., the memory trace of previous decision steps), the external reward, and the ability to exploit the statistics of the environment's structure (model-free vs...
2017: Frontiers in Psychology
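The truncated contrast at the end of this entry (model-free vs. model-based use of the environment's structure) can be illustrated with a minimal Dyna-style sketch, which is not the authors' paradigm: a model-free agent updates Q only from the experienced transition, while a model-based agent also learns a one-step model and replays simulated transitions from it. Sizes and rates are assumptions.

    import random
    import numpy as np

    n_states, n_actions, alpha, gamma = 5, 2, 0.1, 0.95
    Q = np.zeros((n_states, n_actions))
    model = {}                                    # (s, a) -> (r, s') learned one-step model

    def q_update(s, a, r, s_next):
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

    def model_free_step(s, a, r, s_next):
        q_update(s, a, r, s_next)                 # learn only from real experience

    def model_based_step(s, a, r, s_next, n_planning=10):
        q_update(s, a, r, s_next)
        model[(s, a)] = (r, s_next)               # update the learned world model
        for _ in range(n_planning):               # plan with simulated experience
            (ps, pa), (pr, ps_next) = random.choice(list(model.items()))
            q_update(ps, pa, pr, ps_next)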
https://www.readbyqxmd.com/read/28320846/working-memory-load-strengthens-reward-prediction-errors
#9
Anne G E Collins, Brittany Ciullo, Michael J Frank, David Badre
Reinforcement learning (RL) in simple instrumental tasks is usually modeled as a monolithic process in which reward prediction errors (RPEs) are used to update expected values of choice options. This modeling ignores the different contributions of different memory and decision-making systems thought to contribute even to simple learning. In an fMRI experiment, we investigated how working memory (WM) and incremental RL processes interact to guide human learning. WM load was manipulated by varying the number of stimuli to be learned across blocks...
April 19, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
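The monolithic RL account this entry starts from updates the expected value of the chosen option with a reward prediction error (RPE). A minimal sketch of that delta-rule update with softmax choice is given below; the learning rate and temperature are illustrative assumptions, and the paper's point is that working memory contributes alongside this incremental process.

    import numpy as np

    alpha, beta = 0.1, 5.0             # learning rate, softmax inverse temperature
    values = np.zeros(3)               # expected value of each choice option

    def choose():
        p = np.exp(beta * values)
        p /= p.sum()
        return int(np.random.choice(len(values), p=p))

    def update(choice, reward):
        rpe = reward - values[choice]  # reward prediction error
        values[choice] += alpha * rpe  # incremental value update
        return rpe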
https://www.readbyqxmd.com/read/28316564/functional-circuitry-effect-of-ventral-tegmental-area-deep-brain-stimulation-imaging-and-neurochemical-evidence-of-mesocortical-and-mesolimbic-pathway-modulation
#10
Megan L Settell, Paola Testini, Shinho Cho, Jannifer H Lee, Charles D Blaha, Hang J Jo, Kendall H Lee, Hoon-Ki Min
Background: The ventral tegmental area (VTA), containing mesolimbic and mesocortical dopaminergic neurons, is implicated in processes involving reward, addiction, reinforcement, and learning, which are associated with a variety of neuropsychiatric disorders. Electrical stimulation of the VTA or the medial forebrain bundle and its projection target the nucleus accumbens (NAc) is reported to improve depressive symptoms in patients affected by severe, treatment-resistant major depressive disorder (MDD) and depressive-like symptoms in animal models of depression...
2017: Frontiers in Neuroscience
https://www.readbyqxmd.com/read/28298887/automated-operant-conditioning-in-the-mouse-home-cage
#11
Nikolas A Francis, Patrick O Kanold
Recent advances in neuroimaging and genetics have made mice an advantageous animal model for studying the neurophysiology of sensation, cognition, and locomotion. A key benefit of mice is that they provide a large population of test subjects for behavioral screening. Reflex-based assays of hearing in mice, such as the widely used acoustic startle response, are less accurate than operant conditioning in measuring auditory processing. To date, however, there are few cost-effective options for scalable operant conditioning systems...
2017: Frontiers in Neural Circuits
https://www.readbyqxmd.com/read/28293206/five-year-olds-systematic-errors-in-second-order-false-belief-tasks-are-due-to-first-order-theory-of-mind-strategy-selection-a-computational-modeling-study
#12
Burcu Arslan, Niels A Taatgen, Rineke Verbrugge
Studies on second-order false belief reasoning have generally focused on investigating the roles of executive functions and language through correlational studies. In contrast to those studies, we focus on the question of how 5-year-olds select and revise reasoning strategies in second-order false belief tasks by constructing two computational cognitive models of this process: an instance-based learning model and a reinforcement learning model. Unlike the reinforcement learning model, the instance-based learning model predicted that children who fail second-order false belief tasks would give answers based on first-order theory of mind (ToM) reasoning as opposed to zero-order reasoning...
2017: Frontiers in Psychology
https://www.readbyqxmd.com/read/28286265/vicarious-extinction-learning-during-reconsolidation-neutralizes-fear-memory
#13
Armita Golkar, Cathelijn Tjaden, Merel Kindt
BACKGROUND: Previous studies have suggested that fear memories can be updated when recalled, a process referred to as reconsolidation. Given the beneficial effects of model-based safety learning (i.e. vicarious extinction) in preventing the recovery of short-term fear memory, we examined whether consolidated long-term fear memories could be updated with safety learning accomplished through vicarious extinction learning initiated within the reconsolidation time-window. We assessed this in a final sample of 19 participants that underwent a three-day within-subject fear-conditioning design, using fear-potentiated startle as our primary index of fear learning...
February 22, 2017: Behaviour Research and Therapy
https://www.readbyqxmd.com/read/28282439/iterative-free-energy-optimization-for-recurrent-neural-networks-inferno
#14
Alexandre Pitti, Philippe Gaussier, Mathias Quoy
The intra-parietal lobe, coupled with the basal ganglia, forms a working memory that demonstrates strong planning capabilities for generating robust yet flexible neuronal sequences. Neurocomputational models, however, often fail to control long-range neural synchrony in recurrent spiking networks due to spontaneous activity. As a novel framework based on the free-energy principle, we propose to treat the problem of spike synchrony as an optimization problem over the neurons' sub-threshold activity for the generation of long neuronal chains...
2017: PloS One
https://www.readbyqxmd.com/read/28274725/consolidation-of-vocabulary-during-sleep-the-rich-get-richer
#15
REVIEW
Emma James, M Gareth Gaskell, Anna Weighall, Lisa Henderson
Sleep plays a role in strengthening new words and integrating them with existing vocabulary knowledge, consistent with neural models of learning in which sleep supports the transfer of memories from hippocampal to neocortical stores. Such models are based on adult research, yet neural maturation may mean that the mechanisms supporting word learning vary across development. Here, we propose a model in which children may capitalise on larger amounts of slow-wave sleep to support a greater demand on learning and neural reorganisation, whereas adults may benefit from a richer knowledge base to support consolidation...
March 6, 2017: Neuroscience and Biobehavioral Reviews
https://www.readbyqxmd.com/read/28268267/reward-gain-model-describes-cortical-use-dependent-plasticity
#16
Firas Mawase, Nicholas Wymbs, Shintaro Uehara, Pablo Celnik
Consistent repetitions of an action lead to plastic change in the motor cortex and cause a shift in the direction of future movements. This process is known as use-dependent plasticity (UDP), one of the basic forms of motor memory. We have recently demonstrated in a physiological study that success-related reinforcement signals could modulate the strength of UDP. We tested this idea by developing a computational approach that modeled the shift in the direction of future action as a change in the preferred direction of population activity of neurons in the primary motor cortex...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28265866/unpacking-buyer-seller-differences-in-valuation-from-experience-a-cognitive-modeling-approach
#17
Thorsten Pachur, Benjamin Scheibehenne
People often indicate a higher price for an object when they own it (i.e., as sellers) than when they do not (i.e., as buyers), a phenomenon known as the endowment effect. We develop a cognitive modeling approach to formalize, disentangle, and compare alternative psychological accounts (e.g., loss aversion, loss attention, strategic misrepresentation) of such buyer-seller differences in pricing decisions of monetary lotteries. To also be able to test possible buyer-seller differences in memory and learning, we study pricing decisions from experience, obtained with the sampling paradigm, where people learn about a lottery's payoff distribution from sequential sampling...
March 6, 2017: Psychonomic Bulletin & Review
https://www.readbyqxmd.com/read/28254083/classification-techniques-on-computerized-systems-to-predict-and-or-to-detect-apnea-a-systematic-review
#18
REVIEW
Nuno Pombo, Nuno Garcia, Kouamana Bousson
BACKGROUND AND OBJECTIVE: Sleep apnea syndrome (SAS), which can significantly decrease quality of life, is associated with major health risks such as increased cardiovascular disease, sudden death, depression, irritability, hypertension, and learning difficulties. Thus, it is relevant and timely to present a systematic review describing significant computational intelligence-based applications for SAS, including their performance, benefits and challenges, and modeling for decision-making across multiple scenarios...
March 2017: Computer Methods and Programs in Biomedicine
https://www.readbyqxmd.com/read/28248958/fidelity-of-the-representation-of-value-in-decision-making
#19
Paul M Bays, Ben A Dowding
The ability to make optimal decisions depends on evaluating the expected rewards associated with different potential actions. This process is critically dependent on the fidelity with which reward value information can be maintained in the nervous system. Here we directly probe the fidelity of value representation following a standard reinforcement learning task. The results demonstrate a previously unrecognized bias in the representation of value: extreme reward values, both low and high, are stored significantly more accurately and precisely than intermediate rewards...
March 2017: PLoS Computational Biology
https://www.readbyqxmd.com/read/28240598/feedback-for-reinforcement-learning-based-brain-machine-interfaces-using-confidence-metrics
#20
Noeline W Prins, Justin C Sanchez, Abhishek Prasad
OBJECTIVE: For brain-machine interfaces (BMI) to be used in activities of daily living by paralyzed individuals, the BMI should be as autonomous as possible. One of the challenges is how the feedback is extracted and utilized in the BMI. Our long-term goal is to create autonomous BMIs that can utilize evaluative feedback from the brain to update the decoding algorithm and use it intelligently in order to adapt the decoder. In this study, we show how to extract the necessary evaluative feedback from a biologically realistic (synthetic) source, how to use both the quantity and the quality of the feedback, and how that feedback information can be incorporated into a reinforcement learning (RL) controller architecture to maximize its performance...
February 27, 2017: Journal of Neural Engineering
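One plausible way to use both the quality and the quantity of an evaluative feedback signal in an RL decoder, sketched below purely as an assumption and not as the authors' architecture, is to scale each update by a decoded confidence estimate and to skip updates when confidence is too low. The decoder form, feature size, and thresholds are illustrative.

    import numpy as np

    n_features, n_actions = 16, 4
    alpha = 0.05
    W = np.zeros((n_actions, n_features))     # linear action-value decoder weights

    def update_decoder(x, action, feedback, confidence, min_conf=0.2):
        # x: feature vector; feedback: +1/-1 evaluative signal; confidence: 0..1 reliability.
        if confidence < min_conf:
            return 0.0                         # ignore unreliable feedback
        q = float(W[action] @ x)
        error = feedback - q                   # treat the evaluative signal as the target
        W[action] += alpha * confidence * error * x
        return error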
