Mathematics, Journal Year: 2024, Volume and Issue: 12(19), P. 3140 - 3140, Published: Oct. 7, 2024
In urban logistics, effective maintenance is crucial for ensuring the reliability and efficiency of energy supply systems, impacting both asset performance and operational stability. This paper addresses the scheduling and routing of maintenance plans for power generation assets over a multi-period horizon. We model this problem as a team orienteering problem. To address this challenge, we propose a dual approach: a novel reinforcement learning (RL) framework and a biased-randomized heuristic algorithm. The RL-based method dynamically learns from real-time data and evolving conditions, adapting to changes in asset health and failure probabilities to optimize decision making. In addition, we develop and apply a biased-randomized heuristic algorithm designed to provide solutions within practical computational limits. Our approach is validated through a series of experiments comparing the RL framework with the heuristic. The results demonstrate that, when properly trained, the RL framework is able to offer equivalent or even superior solutions compared with the heuristic.
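For readers unfamiliar with biased randomization, the sketch below illustrates the general idea behind such heuristics for a team orienteering problem: feasible candidate nodes are ranked greedily, but the next node is drawn from a skewed (here geometric) distribution over the ranked list rather than always taking the best. The node data, the reward-per-travel-time scoring rule, and the `beta` parameter are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a biased-randomized constructive step for a team
# orienteering problem (TOP). Assumed, illustrative design choices:
# geometric bias with parameter `beta`, and scoring by reward per travel time.
import math
import random

def biased_pick(sorted_candidates, beta=0.3):
    """Pick an index from a best-first sorted list using a geometric bias."""
    # Smaller beta -> closer to uniform; larger beta -> closer to pure greedy.
    idx = int(math.log(1.0 - random.random()) / math.log(1.0 - beta))
    return sorted_candidates[idx % len(sorted_candidates)]

def build_route(depot, nodes, t_max, travel_time, reward):
    """Greedy-but-randomized route construction for one team member."""
    route, used, current = [depot], 0.0, depot
    remaining = set(nodes)
    while remaining:
        # Keep only nodes that still allow a feasible return to the depot.
        feasible = [
            n for n in remaining
            if used + travel_time(current, n) + travel_time(n, depot) <= t_max
        ]
        if not feasible:
            break
        # Rank candidates by reward earned per unit of travel time.
        feasible.sort(key=lambda n: reward(n) / travel_time(current, n),
                      reverse=True)
        nxt = biased_pick(feasible)
        used += travel_time(current, nxt)
        route.append(nxt)
        remaining.discard(nxt)
        current = nxt
    route.append(depot)
    return route
```

In a typical multi-start scheme, `build_route` would be called many times with different random draws (and once per team member), keeping the best set of routes found within the computational budget.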
IEEE Transactions on Intelligent Transportation Systems, Journal Year: 2022, Volume and Issue: 23(12), P. 23781 - 23796, Published: July 18, 2022
This paper investigates the stochastic on-time arrival (SOTA) problem in transportation networks. We propose a fourth moment approach (FMA), which calculates a tight lower bound on a given routing policy's on-time-arrival probability by estimating the first four moments of travel time. Then, we employ a generalized policy iteration (GPI) scheme to gradually improve the policy towards the optimal one. Different from state-of-the-art algorithms for the SOTA problem, which require the full travel-time distribution and usually incur high computational cost due to convolution and integration operations, the FMA only requires travel-time statistics, which are easily estimated from a statistical perspective. Moreover, the algorithm's complexity analysis indicates a relatively light computational load for the FMA. Experimental results on a range of networks show the FMA's superior performance over the state of the art.
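As a rough illustration of working with travel-time moments, the sketch below propagates the first four moments of independent edge travel times along a path via cumulants, and then applies a Cantelli (two-moment) lower bound on the on-time-arrival probability. This is only a stand-in: the paper's FMA bound is tighter and uses all four moments, the exact inequality is not reproduced here, and the edge statistics in the example are made up.

```python
# Sketch: moment propagation along a path plus a two-moment lower bound.
# Assumptions: independent edge travel times, made-up edge statistics, and a
# Cantelli bound as an illustrative surrogate for the paper's fourth-moment bound.

def path_cumulants(edge_moments):
    """Sum cumulants of independent edge travel times along a path.

    edge_moments: list of (mean, variance, third_central, fourth_central).
    Returns cumulants (k1, k2, k3, k4) of the total path travel time.
    """
    k1 = k2 = k3 = k4 = 0.0
    for mean, var, m3, m4 in edge_moments:
        k1 += mean
        k2 += var
        k3 += m3               # third cumulant equals third central moment
        k4 += m4 - 3 * var**2  # fourth cumulant from fourth central moment
    return k1, k2, k3, k4

def on_time_lower_bound(deadline, k1, k2):
    """Cantelli lower bound on P(travel time <= deadline); FMA would also use k3, k4."""
    slack = deadline - k1
    if slack <= 0:
        return 0.0
    return slack**2 / (k2 + slack**2)

# Example: three edges with assumed (mean, var, m3, m4) statistics.
edges = [(5.0, 1.0, 0.5, 4.0), (8.0, 2.0, 1.0, 14.0), (4.0, 0.5, 0.2, 1.0)]
k1, k2, k3, k4 = path_cumulants(edges)
print(on_time_lower_bound(deadline=22.0, k1=k1, k2=k2))  # ~0.877
```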
The goal of Railcar Itinerary Optimization in Marshalling Yards (RIO-MY) is to achieve an effective integrated operation plan for both train shunting operations and train makeup, with the aim of minimizing railcar dwell time in a railway marshalling yard. Due to the complex and interdependent decisions in the disassembly and assembly process of trains, conventional optimization methods for this problem face challenges in addressing the dynamic nature of traffic in the yard and in offering highly efficient solutions. This paper introduces a novel approach to RIO-MY using a graph neural network based deep reinforcement learning method. First, we model the solving process as a Markov decision process, utilizing a tripartite graph to represent the operational state. Then we design a tripartite graph isomorphism network (TGIN) to learn informative embeddings on the graph, which are exploited to reason out a joint action that simultaneously decides hump sequencing and classification track assignment. The TGIN policy is trained by a proximal policy optimization algorithm, with a reward function tailored to estimate the value of each state well. Moreover, we develop a discrete-event-based simulation of the yard, which serves as the training environment and integrates typical heuristic rules for outbound locomotive scheduling. Extensive experiments on two real-world yards demonstrate that the proposed method outperforms baseline algorithms. Furthermore, it achieves competitive performance against mixed integer nonlinear programming with significantly less computational time. In addition, the trained networks can favorably generalize to scenarios unseen during training and effectively handle disturbances during the operational process.
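The tripartite graph isomorphism network described above builds on the standard GIN aggregation step. The NumPy sketch below shows only that generic update, with assumed feature sizes and a toy adjacency matrix; it is not the paper's TGIN architecture or its tripartite state encoding.

```python
# Sketch of one standard GIN layer: h_v <- MLP((1 + eps) * h_v + sum of neighbours).
# Feature dimensions, weights, and the toy graph below are assumptions for illustration.
import numpy as np

def gin_layer(node_feats, adjacency, weight1, weight2, eps=0.0):
    """One GIN message-passing layer with a two-layer ReLU MLP.

    node_feats: (N, d) node embeddings; adjacency: (N, N) 0/1 matrix;
    weight1: (d, h) and weight2: (h, d_out) MLP weights.
    """
    aggregated = (1.0 + eps) * node_feats + adjacency @ node_feats
    hidden = np.maximum(aggregated @ weight1, 0.0)  # ReLU
    return np.maximum(hidden @ weight2, 0.0)

# Tiny example: 4 nodes with 3-dimensional features, one round of message passing.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
w1, w2 = rng.normal(size=(3, 8)), rng.normal(size=(8, 3))
print(gin_layer(x, adj, w1, w2).shape)  # (4, 3)
```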
2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC), Journal Year: 2024, Volume and Issue: unknown, P. 0291 - 0300, Published: Jan. 8, 2024
Autonomous Mobility-on-Demand (AMoD) systems have become an integral part of modern urban life, reshaping transportation dynamics. However, the challenge of orchestrating multiple vehicles to maintain a dynamic equilibrium between supply and demand in a multi-agent environment remains a critical concern. To address this challenge, we propose a novel Multiagent Deep Reinforcement Learning framework called Priority Double Deep-Q-Network (Pr-DDQN) with a new cooperative reward mechanism to optimize the repositioning routes of vacant vehicles within an AMoD system. Through rigorous experimentation using a city-scale dataset comprising 48,000 trip requests on a weekday in Chicago, we assess the scalability and efficiency of our approach. Comparative results demonstrate that Pr-DDQN outperforms existing methods, showcasing superior performance across key metrics, including Service Rate, Satisfaction Index, and Repositioning Time. These findings underscore the efficacy of our approach in enhancing the overall operational efficiency of AMoD systems.
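As a point of reference for the Pr-DDQN name, the sketch below shows the two standard ingredients it suggests: a Double DQN bootstrap target and proportional (TD-error-based) priorities for replay sampling. The cooperative reward mechanism and the multi-agent wiring of the paper are not reproduced, and all shapes and hyperparameters below are assumptions.

```python
# Sketch of Double DQN targets plus proportional replay priorities.
# gamma, alpha, and all example values are assumed for illustration only.
import numpy as np

def double_dqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Double DQN: the online net selects the action, the target net evaluates it."""
    best_actions = np.argmax(next_q_online, axis=1)
    bootstrap = next_q_target[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * bootstrap

def replay_priorities(td_errors, alpha=0.6, eps=1e-3):
    """Proportional prioritization: larger TD error -> higher sampling weight."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()

# Example with a batch of 3 transitions and 4 actions.
rng = np.random.default_rng(1)
r = np.array([1.0, 0.0, -0.5])
done = np.array([0.0, 0.0, 1.0])
q_online, q_target = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
q_taken = rng.normal(size=3)                    # placeholder current Q(s, a) values
targets = double_dqn_targets(r, q_online, q_target, done)
print(replay_priorities(targets - q_taken))     # sampling weights per transition
```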
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Journal Year: 2024, Volume and Issue: unknown, P. 884 - 895, Published: Aug. 24, 2024
Existing neural constructive solvers for routing problems have predominantly employed transformer architectures, conceptualizing route construction as a set-to-sequence learning task. However, their efficacy has primarily been demonstrated on entirely random problem instances that inadequately capture real-world scenarios. In this paper, we introduce realistic Traveling Salesman Problem (TSP) scenarios relevant to industrial settings and derive the following insights: (1) The optimal next node (or city) to visit often lies within proximity of the current node, suggesting the potential benefits of biasing choices based on node locations. (2) Effectively solving the TSP requires robust tracking of unvisited nodes and warrants succinct grouping strategies. Building upon these insights, we propose integrating a learnable choice layer inspired by Hypernetworks to prioritize choices based on location, and an approximate clustering algorithm inspired by Expectation-Maximization to facilitate the grouping of unvisited cities. Together, these two contributions form a hierarchical approach that considers both the immediate local neighbourhood and an intermediate set of representations. Our approach yields superior performance compared with classical and recent transformer models, showcasing the effectiveness of the key designs.
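To make the grouping insight concrete, the sketch below clusters unvisited cities with plain k-means, which can be read as hard-assignment EM on isotropic Gaussians. It stands in for, and is much simpler than, the learnable approximate clustering the paper proposes; the cluster count, iteration budget, and city coordinates are illustrative assumptions.

```python
# Sketch: group unvisited TSP cities with k-means (hard-assignment EM).
# A constructive solver could then bias its next-node choice toward the
# cluster containing the current node. All values below are illustrative.
import numpy as np

def kmeans_groups(coords, k=3, iters=20, seed=0):
    """Return a cluster label per city and the cluster centres (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    centers = coords[rng.choice(len(coords), size=k, replace=False)]
    for _ in range(iters):
        # E-step: assign each city to its nearest centre.
        dists = np.linalg.norm(coords[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-step: move each centre to the mean of its assigned cities.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = coords[labels == j].mean(axis=0)
    return labels, centers

# Example: 30 random 2-D city locations grouped into 3 clusters.
cities = np.random.default_rng(7).uniform(size=(30, 2))
labels, centers = kmeans_groups(cities)
print(np.bincount(labels, minlength=3))  # cluster sizes
```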