Reinforcement learning

https://www.readbyqxmd.com/read/28227964/cross-entropy-optimization-for-neuromodulation
#1
Harleen K Brar, Yunpeng Pan, Babak Mahmoudi, Evangelos A Theodorou
This study presents a reinforcement learning approach for the optimization of the proportional-integral gains of the feedback controller represented in a computational model of epilepsy. The chaotic oscillator model provides a feedback control systems view of the dynamics of an epileptic brain with an internal feedback controller representative of the natural seizure suppression mechanism within the brain circuitry. Normal and pathological brain activity is simulated in this model by adjusting the feedback gain values of the internal controller...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28227165/reward-value-is-encoded-in-primary-somatosensory-cortex-and-can-be-decoded-from-neural-activity-during-performance-of-a-psychophysical-task
#2
David B McNiel, John S Choi, John P Hessburg, Joseph T Francis
Encoding of reward valence has been shown in various brain regions, including deep structures such as the substantia nigra as well as cortical structures such as the orbitofrontal cortex. While the correlation between these signals and reward valence has been shown in aggregated data comprising many trials, little work has been done investigating the feasibility of decoding reward valence on a single-trial basis. Towards this goal, one non-human primate (Macaca radiata) was trained to grip and hold a target level of force in order to earn zero, one, two, or three juice rewards...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28227163/maximum-correntropy-based-attention-gated-reinforcement-learning-designed-for-brain-machine-interface
#3
Hongbao Li, Fang Wang, Qiaosheng Zhang, Shaomin Zhang, Yiwen Wang, Xiaoxiang Zheng, Jose C Principe
Reinforcement learning is an effective approach for brain-machine interfaces (BMIs): it interprets the mapping between plastic neural activity and kinematics. Exploring a large state-action space is difficult when a complicated BMI needs to assign credit over both time and space. For BMIs, attention-gated reinforcement learning (AGREL) has been developed to classify multiple actions in spatial credit assignment tasks with better efficiency. However, outliers in the neural signals still make the neural-action mapping difficult to interpret...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28227144/optimal-medication-dosing-from-suboptimal-clinical-examples-a-deep-reinforcement-learning-approach
#4
Shamim Nemati, Mohammad M Ghassemi, Gari D Clifford
Misdosing medications with sensitive therapeutic windows, such as heparin, can place patients at unnecessary risk, increase length of hospital stay, and lead to wasted hospital resources. In this work, we present a clinician-in-the-loop sequential decision-making framework, which provides an individualized dosing policy adapted to each patient's evolving clinical phenotype. We employed retrospective data from the publicly available MIMIC II intensive care unit database, and developed a deep reinforcement learning algorithm that learns an optimal heparin dosing policy from sample dosing trials and their associated outcomes in large electronic medical records...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28226432/reinforcement-learning-for-stabilizing-an-inverted-pendulum-naturally-leads-to-intermittent-feedback-control-as-in-human-quiet-standing
#5
Kenjiro Michimoto, Yasuyuki Suzuki, Ken Kiyono, Yasushi Kobayashi, Pietro Morasso, Taishin Nomura
Intermittent feedback control for stabilizing human upright stance is a promising alternative to the standard time-continuous stiffness control. Here we show that such an intermittent controller can be established naturally through reinforcement learning. To this end, we used a single inverted pendulum model of the upright posture and a very simple reward function that gives a certain amount of punishment when the inverted pendulum falls or changes its position in the state space. We found that the acquired feedback controller exhibits hallmarks of the intermittent feedback control strategy, namely the action of the feedback controller is switched off intermittently when the state of the pendulum is located near the stable manifold of the unstable saddle-type upright equilibrium of the inverted pendulum with no active control: this action provides an opportunity to exploit transiently converging dynamics toward the unstable upright position with no help of the active feedback control...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28226424/reward-gain-model-describes-cortical-use-dependent-plasticity
#6
Firas Mawase, Nicholas Wymbs, Shintaro Uehara, Pablo Celnik
Consistent repetitions of an action lead to plastic change in the motor cortex and cause a shift in the direction of future movements. This process is known as use-dependent plasticity (UDP), one of the basic forms of motor memory. We have recently demonstrated in a physiological study that success-related reinforcement signals could modulate the strength of UDP. We tested this idea by developing a computational approach that modeled the shift in the direction of future action as a change in preferred direction of population activity of neurons in the primary motor cortex...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28225034/simulating-future-value-in-intertemporal-choice
#7
Alec Solway, Terry Lohrenz, P Read Montague
The laboratory study of how humans and other animals trade off value and time has a long and storied history, and is the subject of a vast literature. However, despite a long history of study, there is no agreed-upon mechanistic explanation of how intertemporal choice preferences arise. Several theorists have recently proposed model-based reinforcement learning as a candidate framework. This framework describes a suite of algorithms by which a model of the environment, in the form of a state transition function and reward function, can be converted on-line into a decision...
February 22, 2017: Scientific Reports
https://www.readbyqxmd.com/read/28223940/reconsideration-of-serial-visual-reversal-learning-in-octopus-octopus-vulgaris-from-a-methodological-perspective
#8
Alexander Bublitz, Severine R Weinhold, Sophia Strobel, Guido Dehnhardt, Frederike D Hanke
Octopuses (Octopus vulgaris) are generally considered to possess extraordinary cognitive abilities including the ability to successfully perform in a serial reversal learning task. During reversal learning, an animal is presented with a discrimination problem and after reaching a learning criterion, the signs of the stimuli are reversed: the former positive becomes the negative stimulus and vice versa. If an animal improves its performance over reversals, it is ascribed advanced cognitive abilities. Reversal learning has been tested in octopus in a number of studies...
2017: Frontiers in Physiology
https://www.readbyqxmd.com/read/28216120/no-impact-of-repeated-extinction-exposures-on-operant-responding-maintained-by-different-reinforcer-rates
#9
John Y H Bai, Christopher A Podlesnik
Greater rates of intermittent reinforcement in the presence of discriminative stimuli generally produce greater resistance to extinction, consistent with predictions of behavioral momentum theory. Other studies reveal more rapid extinction with higher rates of reinforcers - the partial reinforcement extinction effect. Further, repeated extinction often produces more rapid decreases in operant responding due to learning a discrimination between training and extinction contingencies. The present study examined extinction repeatedly with training with different rates of intermittent reinforcement in a multiple schedule...
February 16, 2017: Behavioural Processes
https://www.readbyqxmd.com/read/28216068/stages-of-dysfunctional-decision-making-in-addiction
#10
REVIEW
Antonio Verdejo-Garcia, Trevor T-J Chong, Julie C Stout, Murat Yücel, Edythe D London
Drug use is a choice with immediate positive outcomes, but long-term negative consequences. Thus, the repeated use of drugs in the face of negative consequences suggests dysfunction in the cognitive mechanisms underpinning decision-making. This cognitive dysfunction can be mapped into three stages: the formation of preferences involving valuation of decision options; choice implementation including motivation, self-regulation and inhibitory processes; and feedback processing implicating reinforcement learning...
February 16, 2017: Pharmacology, Biochemistry, and Behavior
https://www.readbyqxmd.com/read/28215377/healthy-empowered-youth-a-positive-youth-development-program-for-native-youth
#11
Stephanie N Craig Rushing, Nichole L Hildebrandt, Carol J Grimes, Amanda J Rowsell, Benjamin C Christensen, William E Lambert
INTRODUCTION: During 2010-2012, Oregon Health & Science University's Prevention Research Center, a Northwest Tribe, and the Northwest Portland Area Indian Health Board, collaborated to evaluate the Healthy & Empowered Youth Project, a school- and community-based positive youth development program for American Indian and Alaska Native high school students. METHODS: The Native STAND (Students Together Against Negative Decisions) curriculum was enhanced with hands-on learning activities in media design to engage students in sexual and reproductive health topics covered by the curriculum...
March 2017: American Journal of Preventive Medicine
https://www.readbyqxmd.com/read/28213812/the-neuroscience-of-human-decision-making-through-the-lens-of-learning-and-memory
#12
Lesley K Fellows
We are called upon to make decisions, large and small, many times a day. Whether in the voting booth, the stock exchange, or the cafeteria line, we identify potential options, estimate and compare their subjective values, and make a choice. Decision-making has only recently become a focus for cognitive neuroscience. The last two decades have seen rapid progress in our understanding of the brain basis of at least some aspects of this rather complex aspect of cognition. This work has provided fresh perspectives on poorly understood brain regions, such as orbitofrontal cortex and ventral striatum...
February 18, 2017: Current Topics in Behavioral Neurosciences
https://www.readbyqxmd.com/read/28210998/on-t%C3%AF-he-distinction-between-value-driven-attention-and-selection-history-evidence-from-individuals-with-depressive-symptoms
#13
Brian A Anderson, Michelle Chiu, Michelle M DiBartolo, Stephanie L Leal
When predictive of extrinsic reward as targets, stimuli rapidly acquire the ability to automatically capture attention. Attentional biases for former targets of visual search also can develop without reward feedback but typically require much longer training. These learned biases towards former targets often are conceptualized within a single framework and might differ merely in degree. That is, both are the result of the reinforcement of selection history, with extrinsic reward for correct report of the target providing greater reinforcement than correct report alone...
February 16, 2017: Psychonomic Bulletin & Review
https://www.readbyqxmd.com/read/28205186/optimisation-of-cognitive-performance-in-rodent-operant-touchscreen-testing-evaluation-and-effects-of-reinforcer-strength
#14
Benjamin U Phillips, Christopher J Heath, Zofia Ossowska, Timothy J Bussey, Lisa M Saksida
Operant testing is a widely used and highly effective method of studying cognition in rodents. Performance on such tasks is sensitive to reinforcer strength. It is therefore advantageous to select effective reinforcers to minimize training times and maximize experimental throughput. To quantitatively investigate the control of behavior by different reinforcers, performance of mice was tested with either strawberry milkshake or a known powerful reinforcer, super saccharin (1.5% or 2% (w/v) saccharin/1.5% (w/v) glucose/water mixture)...
February 15, 2017: Learning & Behavior
https://www.readbyqxmd.com/read/28202319/the-women-s-empower-survey-women-s-knowledge-and-awareness-of-treatment-options-for-vulvar-and-vaginal-atrophy-remains-inadequate
#15
Michael Krychman, Shelli Graham, Brian Bernick, Sebastian Mirkin, Sheryl A Kingsberg
INTRODUCTION: Postmenopausal women's knowledge about vulvar and vaginal atrophy (VVA) and available treatment options has historically been inadequate. Recent direct-to-consumer marketing and educational efforts would have been expected to increase awareness and treatment options. AIM: To compare results of the Women's EMPOWER survey with other available VVA surveys to assess progress in women's understanding and approaches to treatment of VVA. METHODS: The Women's EMPOWER survey, an internet-based survey of US women with VVA symptoms, assessed women's awareness of VVA and their behaviors and attitudes associated with symptom treatment...
February 12, 2017: Journal of Sexual Medicine
https://www.readbyqxmd.com/read/28197082/the-role-of-acetaldehyde-in-the-increased-acceptance-of-ethanol-after-prenatal-ethanol-exposure
#16
Mirari Gaztañaga, Asier Angulo-Alcalde, Norman E Spear, M Gabriela Chotro
Recent studies show that acetaldehyde, the first metabolite in the oxidation of ethanol, can be responsible for both the appetitive and the aversive effects produced by ethanol intoxication. More specifically, it has been hypothesized that acetaldehyde produced in the periphery by the liver is responsible for the aversive effects of ethanol, while the appetitive effects relate to the acetaldehyde produced centrally through the catalase system. On the other hand, from studies in our and other laboratories, it is known that ethanol exposure during the last gestational days (GD) consistently enhances the postnatal acceptance of ethanol when measured during early ontogeny in the rat...
2017: Frontiers in Behavioral Neuroscience
https://www.readbyqxmd.com/read/28194682/psychopharmacology-prescribing-workshops-a-novel-method-for-teaching-psychiatry-residents-how-to-talk-to-patients-about-medications
#17
Eileen P Kavanagh, John Cahill, Melissa R Arbuckle, Alison E Lenet, Kalyani Subramanyam, Ronald M Winchel, Ilana Nossel, Ravi DeSilva, Rachel A Caravella, Marra Ackerman, Henry C Park, David A Ross
OBJECTIVE: Traditional, lecture-based methods of teaching pharmacology may not translate into the skills needed to communicate effectively with patients about medications. In response, the authors developed an interactive course for third-year psychiatry residents to reinforce prescribing skills. METHODS: Residents participate in a facilitated group discussion combined with a role-play exercise where they mock-prescribe medication to their peers. Each session is focused on one medication or class of medications with an emphasis on various aspects of informed consent (such as describing the indication, dosing, expected benefits, potential side effects, and necessary work-up and follow up)...
February 13, 2017: Academic Psychiatry
https://www.readbyqxmd.com/read/28194005/genetic-inhibition-of-neurotransmission-reveals-role-of-glutamatergic-input-to-dopamine-neurons-in-high-effort-behavior
#18
M A Hutchison, X Gu, M F Adrover, M R Lee, T S Hnasko, V A Alvarez, W Lu
Midbrain dopamine neurons are crucial for many behavioral and cognitive functions. As the major excitatory input, glutamatergic afferents are important for control of the activity and plasticity of dopamine neurons. However, the role of glutamatergic input as a whole onto dopamine neurons remains unclear. Here we developed a mouse line in which glutamatergic inputs onto dopamine neurons are specifically impaired, and utilized this genetic model to directly test the role of glutamatergic inputs in dopamine-related functions...
February 14, 2017: Molecular Psychiatry
https://www.readbyqxmd.com/read/28193692/goal-directed-and-habit-like-modulations-of-stimulus-processing-during-reinforcement-learning
#19
David Luque, Tom Beesley, Richard Morris, Bradley N Jack, Oren Griffiths, Thomas Whitford, Mike E Le Pelley
Recent research has shown that perceptual processing of stimuli previously associated with high-value rewards is automatically prioritized, even when rewards are no longer available. It has been hypothesized that such reward-related modulation of stimulus salience is conceptually similar to an 'attentional habit'. Recording event-related potentials in humans during a reinforcement learning task, we show strong evidence in favor of this hypothesis. Resistance to outcome devaluation (the defining feature of a habit) was shown by the stimulus-locked P1 component, reflecting activity in the extrastriate visual cortex...
February 13, 2017: Journal of Neuroscience: the Official Journal of the Society for Neuroscience
https://www.readbyqxmd.com/read/28193602/preference-for-and-learning-of-amino-acids-in-larval-drosophila
#20
Nana Kudow, Daisuke Miura, Michael Schleyer, Naoko Toshima, Bertram Gerber, Teiichi Tanimura
Relative to other nutrients, less is known about how animals sense amino acids and how behaviour is organized accordingly. This is a significant gap in our knowledge, because amino acids are required for protein synthesis, and hence for life as we know it. Choosing larvae as a study case, we provide the first systematic analysis of both the preference behaviour for and the learning of all 20 canonical amino acids in Drosophila. We report that preference for individual amino acids differs according to the kind of amino acid, both in first-instar and in third-instar larvae...
February 13, 2017: Biology Open