Online Reinforcement Learning Using a Probability Density Estimation.

Alejandro Agostini, Enric Celaya

Neural Computation 2017 January

Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to dominate the approximation in the less sampled regions. The nonstationarity comes from the recursive nature of the estimations typical of temporal difference methods. This nonstationarity has a local profile, varying not only along the learning process but also along different regions of the state space. We propose to deal with these problems using an estimation of the probability density of samples represented with a gaussian mixture model. To deal with the nonstationarity problem, we use the common approach of introducing a forgetting factor in the updating formula. However, instead of using the same forgetting factor for the whole domain, we make it dependent on the local density of samples, which we use to estimate the nonstationarity of the function at any given input point. To address the biased sampling problem, the forgetting factor applied to each mixture component is modulated according to the new information provided in the updating, rather than forgetting depending only on time, thus avoiding undesired distortions of the approximation in less sampled regions.
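The idea of modulating forgetting by the information each mixture component receives can be illustrated with a small sketch. This is not the update rule from the paper, which the abstract does not state; it is one plausible reading, assuming a 1-D Gaussian mixture stored as accumulated statistics and a forgetting factor raised to the power of the component's responsibility for the new sample. The names `make_component`, `component_params`, `update_mixture`, and `base_forgetting` are hypothetical.

```python
import math


def make_component(mean, var, n):
    """Accumulated statistics for one 1-D Gaussian component (hypothetical helper)."""
    return {"n": n, "sx": n * mean, "sxx": n * (var + mean ** 2)}


def component_params(c, var_floor=1e-6):
    """Recover mean and variance from the accumulated statistics."""
    mean = c["sx"] / c["n"]
    var = max(c["sxx"] / c["n"] - mean ** 2, var_floor)
    return mean, var


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_mixture(components, x, base_forgetting=0.9):
    """One online update with responsibility-modulated forgetting.

    Assumption (not from the paper): each component forgets with factor
    base_forgetting ** r, where r is its responsibility for sample x.
    A component with r close to 0 keeps its statistics; a component with
    r close to 1 forgets at the full base rate.
    Returns the estimated sample density at x before the update.
    """
    total = sum(c["n"] for c in components)
    weighted = []
    for c in components:
        mean, var = component_params(c)
        weighted.append((c["n"] / total) * gaussian_pdf(x, mean, var))
    density = sum(weighted)
    if density == 0.0:
        # No component explains the sample; this sketch leaves the model unchanged.
        return density
    for c, w in zip(components, weighted):
        r = w / density                    # responsibility of this component
        forget = base_forgetting ** r      # forgetting tied to new information
        c["n"] = forget * c["n"] + r
        c["sx"] = forget * c["sx"] + r * x
        c["sxx"] = forget * c["sxx"] + r * x * x
    return density


# Usage: samples concentrated near 0.1 update only the nearby component.
near = make_component(0.0, 1.0, 5.0)
far = make_component(10.0, 1.0, 5.0)
for _ in range(50):
    update_mixture([near, far], 0.1)
```

With a purely time-based factor, the distant component would lose weight on every step even though it receives no new samples. Here its responsibility is near zero, so its statistics stay essentially unchanged while the nearby component tracks the new data.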
