Most recent papers with the keyword model-based reinforcement learning

#1

JOURNAL ARTICLE

A novel method-based reinforcement learning with deep temporal difference network for flexible double shop scheduling problem.

Xiao Wang, Peisi Zhong, Mei Liu, Chao Zhang, Shihao Yang

This paper studies the flexible double shop scheduling problem (FDSSP), which considers the job shop and the assembly shop simultaneously. This raises the problem of coordinating the scheduling of related tasks. To this end, a reinforcement learning algorithm with a deep temporal difference network is proposed to minimize the makespan. Firstly, the FDSSP is defined as the mathematical model of the flexible job-shop scheduling problem joined to the assembly constraint level. It is translated into a Markov decision process that directly selects behavioral strategies according to historical machining state data...

38641689

April 20, 2024: Scientific Reports

#2

JOURNAL ARTICLE

Active learning using adaptable task-based prioritisation.

Shaheer U Saeed, João Ramalhinho, Mark Pinnock, Ziyi Shen, Yunguan Fu, Nina Montaña-Brown, Ester Bonmati, Dean C Barratt, Stephen P Pereira, Brian Davidson, Matthew J Clarkson, Yipeng Hu

Supervised machine learning-based medical image computing applications necessitate expert label curation, while unlabelled image data might be relatively abundant. Active learning methods aim to prioritise a subset of available image data for expert annotation, for label-efficient model training. We develop a controller neural network that measures priority of images in a sequence of batches, as in batch-mode active learning, for multi-class segmentation tasks. The controller is optimised by rewarding positive task-specific performance gain, within a Markov decision process (MDP) environment that also optimises the task predictor...

38640779

April 16, 2024: Medical Image Analysis

#3

JOURNAL ARTICLE

A GRU-CNN model for auditory attention detection using microstate and recurrence quantification analysis.

MohammadReza EskandariNasab, Zahra Raeisi, Reza Ahmadi Lashaki, Hamidreza Najafi

Attention, as a cognitive ability, plays a crucial role in perception, helping humans to concentrate on specific objects in the environment while discarding others. In this paper, auditory attention detection (AAD) is investigated using different dynamic features extracted from multichannel electroencephalography (EEG) signals when listeners attend to a target speaker in the presence of a competing talker. To this end, microstate and recurrence quantification analysis are utilized to extract different types of features that reflect changes in the brain state during cognitive tasks...

38632246

April 17, 2024: Scientific Reports

#4

JOURNAL ARTICLE

Real-world humanoid locomotion with reinforcement learning.

Ilija Radosavovic, Tete Xiao, Bike Zhang, Trevor Darrell, Jitendra Malik, Koushil Sreenath

Humanoid robots that can autonomously operate in diverse environments have the potential to help address labor shortages in factories, assist elderly at home, and colonize new planets. Although classical controllers for humanoid robots have shown impressive results in a number of settings, they are challenging to generalize and adapt to new environments. Here, we present a fully learning-based approach for real-world humanoid locomotion. Our controller is a causal transformer that takes the history of proprioceptive observations and actions as input and predicts the next action...

38630806

April 17, 2024: Science Robotics

#5

JOURNAL ARTICLE

The emergence of identity, agency and consciousness from the temporal dynamics of neural elaboration.

Riccardo Fesce

Identity (differentiating self from external reality) and agency (being the author of one's acts) are generally considered intrinsic properties of awareness and looked at as mental constructs generated by consciousness. Here a different view is proposed. All physiological systems display complex time-dependent regulations to adapt to or anticipate external changes. To interact with rapid changes, an animal needs a nervous system capable of modelling and predicting (not simply representing) them. Different algorithms must be employed to predict the momentary location of an object based on sensory information (received with a delay), or to design in advance and direct the trajectory of movement...

38628469

2024: Front Netw Physiol

#6

JOURNAL ARTICLE

Growing Project BioEYES: A Reflection on 20 Years of Developing and Replicating a K-12 Science Outreach Program.

Jamie R Shuda, Valerie G Butler, Theresa M Nelson, Jaqueline M Davidson, Auset M Taylor, Steven A Farber

Project BioEYES celebrated 20 years in K-12 schools during the 2022-2023 school year. Using live zebrafish (Danio rerio) during week-long science experiments sparks the interest of students and teachers from school districts, locally and globally. Over the past two decades, BioEYES has been replicated in different ways based on the interest and capacity of our partners. This article discusses several of the successful models, the common challenges, and how each BioEYES site has adopted guiding principles to help foster their success...

38621216

April 2024: Zebrafish

#7

JOURNAL ARTICLE

On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm.

Ukjo Hwang, Songnam Hong

Robust reinforcement learning (RRL) aims to seek a robust policy by optimizing the worst case performance over an uncertainty set. This set contains some perturbed Markov decision processes (MDPs) from a nominal MDP (N-MDP) that generate samples for training, which reflects some potential mismatches between the training simulator (i.e., N-MDP) and real-world settings (i.e., the testing environments). Unfortunately, existing RRL algorithms are only applied to the tabular setting and it is still an open problem to extend them into more general continuous state space...

38619960

April 15, 2024: IEEE Transactions on Neural Networks and Learning Systems

#8

JOURNAL ARTICLE

A variable speed limit control approach for freeway tunnels based on the model-based reinforcement learning framework with safety perception.

Jieling Jin, Ye Li, Helai Huang, Yuxuan Dong, Pan Liu

To improve the traffic safety and efficiency of freeway tunnels, this study proposes a novel variable speed limit (VSL) control strategy based on the model-based reinforcement learning framework (MBRL) with safety perception. The MBRL framework is designed by developing a multi-lane cell transmission model for freeway tunnels as an environment model, which is built so that agents can interact with the environment model while interacting with the real environment to improve the sampling efficiency of reinforcement learning...

38614052

April 12, 2024: Accident; Analysis and Prevention

#9

JOURNAL ARTICLE

Dynamic Intelligent Scheduling in Low-Carbon Heterogeneous Distributed Flexible Job Shops with Job Insertions and Transfers.

Yi Chen, Xiaojuan Liao, Guangzhu Chen, Yingjie Hou

With the rapid development of economic globalization and green manufacturing, traditional flexible job shop scheduling has evolved into the low-carbon heterogeneous distributed flexible job shop scheduling problem (LHDFJSP). Additionally, modern smart manufacturing processes encounter complex and diverse contingencies, necessitating the ability to address dynamic events in real-world production activities. To date, there are limited studies that comprehensively address the intricate factors associated with the LHDFJSP, including workshop heterogeneity, job insertions and transfers, and considerations of low-carbon objectives...

38610462

March 31, 2024: Sensors

#10

JOURNAL ARTICLE

Deep Reinforcement Learning-Based Resource Management in Maritime Communication Systems.

Xi Yao, Yingdong Hu, Yicheng Xu, Ruifeng Gao

With the growing maritime economy, ensuring the quality of communication for maritime users has become imperative. The maritime communication system based on nearshore base stations enhances the communication rate of maritime users through dynamic resource allocation. A virtual queue-based deep reinforcement learning beam allocation scheme is proposed in this paper, aiming to maximize the communication rate. More particularly, to reduce the complexity of resource management, we employ a grid-based method to discretize the maritime environment...

38610458

March 31, 2024: Sensors

#11

JOURNAL ARTICLE

Multi-User Computation Offloading and Resource Allocation Algorithm in a Vehicular Edge Network.

Xiangyan Liu, Jianhong Zheng, Meng Zhang, Yang Li, Rui Wang, Yun He

In Vehicular Edge Computing Network (VECN) scenarios, the mobility of vehicles causes the uncertainty of channel state information, which makes it difficult to guarantee the Quality of Service (QoS) in the process of computation offloading and the resource allocation of a Vehicular Edge Computing Server (VECS). A multi-user computation offloading and resource allocation optimization model and a computation offloading and resource allocation algorithm based on the Deep Deterministic Policy Gradient (DDPG) are proposed to address this problem...

38610415

March 29, 2024: Sensors

#12

JOURNAL ARTICLE

Deep Reinforcement Learning-Empowered Cost-Effective Federated Video Surveillance Management Framework.

Dilshod Bazarov Ravshan Ugli, Alaelddin F Y Mohammed, Taeheum Na, Joohyung Lee

Video surveillance systems are integral to bolstering safety and security across multiple settings. With the advent of deep learning (DL), a specialization within machine learning (ML), these systems have been significantly augmented to facilitate DL-based video surveillance services with notable precision. Nevertheless, DL-based video surveillance services, which necessitate the tracking of object movement and motion tracking (e.g., to identify unusual object behaviors), can demand a significant portion of computational and memory resources...

38610369

March 27, 2024: Sensors

#13

JOURNAL ARTICLE

Task Offloading Strategy for Unmanned Aerial Vehicle Power Inspection Based on Deep Reinforcement Learning.

Wei Zhuang, Fanan Xing, Yuhang Lu

With the ongoing advancement of electric power Internet of Things (IoT), traditional power inspection methods face challenges such as low efficiency and high risk. Unmanned aerial vehicles (UAVs) have emerged as a more efficient solution for inspecting power facilities due to their high maneuverability, excellent line-of-sight communication capabilities, and strong adaptability. However, UAVs typically grapple with limited computational power and energy resources, which constrain their effectiveness in handling computationally intensive and latency-sensitive inspection tasks...

38610282

March 24, 2024: Sensors

#14

JOURNAL ARTICLE

Adaptive Control for Virtual Synchronous Generator Parameters Based on Soft Actor Critic.

Chuang Lu, Xiangtao Zhuan

This paper introduces a model-free optimization method based on reinforcement learning (RL) aimed at resolving the issues of active power and frequency oscillations present in a traditional virtual synchronous generator (VSG). The RL agent utilizes the active power and frequency response of the VSG as state information inputs and generates actions to adjust the virtual inertia and damping coefficients for an optimal response. Distinctively, this study incorporates a setting-time term into the reward function design, alongside power and frequency deviations, to avoid prolonged system transients due to over-optimization...

38610247

March 22, 2024: Sensors

#15

REVIEW

Prediction of drug-target binding affinity based on deep learning models.

Hao Zhang, Xiaoqian Liu, Wenya Cheng, Tianshi Wang, Yuanyuan Chen

The prediction of drug-target binding affinity (DTA) plays an important role in drug discovery. Computerized virtual screening techniques have been used for DTA prediction, greatly reducing the time and economic costs of drug discovery. However, these techniques have not succeeded in reversing the low success rate of new drug development. In recent years, the continuous development of deep learning (DL) technology has brought new opportunities for drug discovery through the DTA prediction. This shift has moved the prediction of DTA from traditional machine learning methods to DL...

38608327

April 8, 2024: Computers in Biology and Medicine

#16

JOURNAL ARTICLE

Unsupervised machine learning for flaw detection in automated ultrasonic testing of carbon fibre reinforced plastic composites.

Vedran Tunukovic, Shaun McKnight, Richard Pyle, Zhiming Wang, Ehsan Mohseni, S Gareth Pierce, Randika K W Vithanage, Gordon Dobie, Charles N MacLeod, Sandy Cochran, Tom O'Hare

The use of Carbon Fibre Reinforced Plastic (CFRP) composite materials for critical components has significantly surged within the energy and aerospace industry. With this rapid increase in deployment, reliable post-manufacturing Non-Destructive Evaluation (NDE) is critical for verifying the mechanical integrity of manufactured components. To this end, an automated Ultrasonic Testing (UT) NDE process delivered by an industrial manipulator was developed, greatly increasing the measurement speed, repeatability, and locational precision, while increasing the throughput of data generated by the selected NDE modality...

38603904

April 6, 2024: Ultrasonics

#17

JOURNAL ARTICLE

Fuzzy-based collective pitch control for wind turbine via deep reinforcement learning.

Abdelhamid Nabeel, Ahmed Lasheen, Abdel Latif Elshafei, Essam Aboul Zahab

Wind turbines (WTs) have highly nonlinear and uncertain dynamics due to aerodynamic complexity, mechanical factors, and fluctuations in wind conditions. Turbulence and wind shear add complexity to modelling, especially in constant power region (region 3). Thus, an effective control design demands a deep understanding of the nonlinearities and uncertainties. This paper suggests a novel model-free reinforcement learning (RL) collective pitch angle controller to operate efficiently in region 3. The proposed controller stabilizes generator speed, maximizes power output, and minimizes fluctuations while accommodating system uncertainties, nonlinearity, and pitch limits...

38599929

March 26, 2024: ISA Transactions

#18

JOURNAL ARTICLE

Maze-solving in a plasma system based on functional analogies to reinforcement-learning model.

Osamu Sakai, Toshifusa Karasaki, Tsuyohito Ito, Tomoyuki Murakami, Manabu Tanaka, Makoto Kambara, Satoshi Hirayama

Maze-solving is a classical mathematical task, and has recently been achieved analogously using various eccentric media and devices, such as living tissues, chemotaxis, and memristors. Plasma generated in a labyrinth of narrow channels can also play a role as a route finder to the exit. In this study, we experimentally observe the function of maze-route finding in a plasma system based on a mixed discharge scheme of direct-current (DC) volume mode and alternating-current (AC) surface dielectric-barrier discharge, and computationally generalize this function in a reinforcement-learning model...

38598429

2024: PLoS One

#19

JOURNAL ARTICLE

LensePro: label noise-tolerant prototype-based network for improving cancer detection in prostate ultrasound with limited annotations.

Minh Nguyen Nhat To, Fahimeh Fooladgar, Paul Wilson, Mohamed Harmanani, Mahdi Gilany, Samira Sojoudi, Amoon Jamzad, Silvia Chang, Peter Black, Parvin Mousavi, Purang Abolmaesumi

PURPOSE: The standard of care for prostate cancer (PCa) diagnosis is the histopathological analysis of tissue samples obtained via transrectal ultrasound (TRUS) guided biopsy. Models built with deep neural networks (DNNs) hold the potential for direct PCa detection from TRUS, which allows targeted biopsy and subsequently enhances outcomes. Yet, there are ongoing challenges with training robust models, stemming from issues such as noisy labels, out-of-distribution (OOD) data, and limited labeled data...

38598142

April 10, 2024: International Journal of Computer Assisted Radiology and Surgery

#20

JOURNAL ARTICLE

Pontryagin's Minimum Principle-Guided RL for Minimum-Time Exploration of Spatiotemporal Fields.

Zhuo Li, Jian Sun, Antonio G Marques, Gang Wang, Keyou You

This article studies the trajectory planning problem of an autonomous vehicle for exploring a spatiotemporal field subject to a constraint on cumulative information. Since the resulting problem depends on the signal strength distribution of the field, which is unknown in practice, we advocate the use of a model-free reinforcement learning (RL) method to find the solution. Given the vehicle's dynamical model, a critical (and open) question is how to judiciously merge the model-based optimality conditions into the model-free RL framework for improved efficiency and generalization, for which this work provides some positive results...

38593018

April 9, 2024: IEEE Transactions on Neural Networks and Learning Systems
