Human-level control through deep reinforcement learning.

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis

Nature 2015 February 27

The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
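The temporal difference learning mentioned in the abstract can be illustrated with a minimal tabular Q-learning update. This is only a sketch under stated assumptions, not the paper's deep Q-network: the agent described above estimates action values with a deep neural network fed raw pixels, whereas the table `q` here is a plain list of lists. The function name `td_update` and the parameters `alpha` (learning rate) and `gamma` (discount factor) are hypothetical names chosen for illustration.

```python
def td_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    q is a list of lists: q[state][action] holds the current value estimate.
    alpha and gamma are illustrative defaults, not values from the paper.
    """
    # Bootstrapped estimate of future return; zero once the episode has ended.
    best_next = 0.0 if done else max(q[next_state])
    # TD error: gap between the one-step target and the current estimate.
    td_error = reward + gamma * best_next - q[state][action]
    # Move the estimate a fraction alpha of the way toward the target.
    q[state][action] += alpha * td_error
    return td_error


# Two states, two actions, all estimates initialised to zero.
q = [[0.0, 0.0], [0.0, 0.0]]
error = td_update(q, state=0, action=1, reward=1.0, next_state=1, done=True)
print(error, q[0][1])
```

The TD error computed here plays the role of the prediction-error signal that the abstract compares with phasic dopaminergic activity; a deep Q-network would minimise a loss built from the same quantity instead of writing into a table.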
