| 系統識別號 | U0002-1108202513085300 |
|---|---|
| DOI | 10.6846/tku202500696 |
| 論文名稱(中文) | 基於多任務學習與強化學習的計程車預期派遣系統 |
| 論文名稱(英文) | An Anticipatory Taxi Dispatch System Based on Multi-Task Learning and Reinforcement Learning |
| 第三語言論文名稱 | |
| 校院名稱 | 淡江大學 |
| 系所名稱(中文) | 資訊工程學系全英語碩士班 |
| 系所名稱(英文) | Master's Program, Department of Computer Science and Information Engineering (English-taught program) |
| 外國學位學校名稱 | |
| 外國學位學院名稱 | |
| 外國學位研究所名稱 | |
| 學年度 | 113 |
| 學期 | 2 |
| 出版年 | 114 |
| 研究生(中文) | 鍾心悦 |
| 研究生(英文) | Hsin-Yueh Chung |
| 學號 | 611780064 |
| 學位類別 | 碩士 |
| 語言別 | 英文 |
| 第二語言別 | |
| 口試日期 | 2025-06-14 |
| 論文頁數 | 103頁 |
| 口試委員 | 指導教授 - 武士戎(wushihjung@mail.tku.edu.tw) 共同指導教授 - 張峯誠(135170@mail.tku.edu.tw) 口試委員 - 張志勇 口試委員 - 廖文華 |
| 關鍵字(中) | 計程車派遣、時空圖神經網路(STGAT)、強化學習、路徑規劃、多任務預測、Q-learning、智慧交通、空車調度 |
| 關鍵字(英) | Taxi Dispatch, STGAT, Reinforcement Learning, Route Planning, Multi-task Learning, Q-learning, Intelligent Transportation, Vehicle Scheduling |
| 第三語言關鍵字 | |
| 學科別分類 | |
| 中文摘要 |
計程車預期派遣系統是智慧交通管理中的關鍵技術,其核心目標為提前預測乘客需求並主動調度車輛資源,以提升整體營運效率並降低空駛成本。傳統方法往往將派遣流程視為靜態決策問題,缺乏對城市時空變化的即時掌握,致使系統在應對複雜環境與動態需求時難以維持穩定效能。現有研究在派遣需求預測方面,普遍仰賴單一時間序列或歷史平均值模型,難以捕捉區域間的時空依賴關係;在派遣策略制定方面,多以靜態規則或啟發式方法為主,未能根據預測結果調整全局資源配置;在派遣路徑規劃方面,則多採用傳統最短距離或最短時間演算法,無法動態適應即時交通狀況。

本論文提出一套整合時空圖預測與強化學習之智慧派遣系統(STGAT-Q),分別在三個層面進行技術突破:(1) 在需求預測方面,利用時空圖注意力網路(STGAT)同時預測區域需求量、空車數與道路速率,捕捉城市交通之時空耦合特性;(2) 在策略制定方面,透過多任務預測結果導出供需差異矩陣,進一步生成全域派遣策略與車輛調度建議;(3) 在路徑規劃方面,引入強化學習模型(Q-learning),根據預測速率學習成本最小之派遣路徑,實現動態與智能化的空車引導。

本論文的主要貢獻如下:一、設計一個端到端的智慧派遣與路徑優化架構,涵蓋預測、策略與路徑三階段;二、建構一套以紐約市為實驗基礎的時空圖派遣系統並完成模組整合實作;三、透過十四組實驗從多項指標(如R_idle、Path MAE、R_total)驗證模型效能、模組貢獻與泛化能力;四、首次實現預測、派遣與路徑間之閉環整合,提升系統於動態場景中的實用性。

實驗結果顯示,基於2013年紐約市14,776,615筆行程記錄的驗證,本系統在需求滿足度、路徑準確性與資源利用率等指標皆顯著優於現有方法。十四組對比實驗證實各模組的有效性,跨時段與跨區域測試驗證了系統的泛化能力,在不同時段與跨城市測試中皆保持穩定效能,證實STGAT-Q架構具備高度適應性與部署潛力。 |
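As a rough illustration of the supply-demand difference matrix described in the abstract, the sketch below derives a per-zone gap vector from predicted demand and predicted idle-vehicle counts, then greedily pairs surplus zones with shortage zones to produce dispatch suggestions. The zone counts, numbers, and the greedy pairing rule are all hypothetical assumptions for illustration, not the thesis's actual global dispatch algorithm:

```python
import numpy as np

# Hypothetical STGAT outputs for 5 zones (illustrative values only):
demand = np.array([12, 3, 8, 1, 6])   # predicted passenger demand per zone
idle   = np.array([2, 9, 4, 7, 5])    # predicted idle vehicles per zone

gap = demand - idle                    # positive -> shortage, negative -> surplus

# Greedy rebalancing: repeatedly move vehicles from the zone with the
# largest surplus to the zone with the largest shortage.
moves = []                             # (source_zone, target_zone, vehicles)
g = gap.astype(float).copy()
while g.max() > 0 and g.min() < 0:
    dst = int(np.argmax(g))            # most under-supplied zone
    src = int(np.argmin(g))            # most over-supplied zone
    qty = min(g[dst], -g[src])
    moves.append((src, dst, int(qty)))
    g[dst] -= qty
    g[src] += qty

print(moves)
```

Any residual positive gap after rebalancing reflects a system-wide vehicle shortage that no redistribution can cover.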
| 英文摘要 |
Taxi predictive dispatch systems represent a critical technology in intelligent transportation management, aiming to proactively predict passenger demand and schedule vehicle resources for enhanced operational efficiency. Traditional approaches treat dispatch as static decision problems, lacking real-time awareness of urban spatio-temporal dynamics. Existing research limitations include: single time-series models in demand prediction that fail to capture inter-regional dependencies; static rule-based strategy formulation unable to adjust global resource allocation; and conventional routing algorithms that cannot adapt to real-time traffic conditions.

This paper proposes STGAT-Q, an intelligent dispatch system integrating spatio-temporal graph prediction and reinforcement learning. The system achieves breakthroughs in three aspects: (1) employing Spatio-Temporal Graph Attention Networks to simultaneously predict regional demand, idle vehicle counts, and road speeds; (2) generating global dispatch strategies through supply-demand difference matrices derived from multi-task predictions; (3) utilizing Q-learning to determine cost-efficient routes based on predicted traffic speeds.

The main contributions include: designing an end-to-end architecture integrating prediction, strategy, and routing; implementing a complete spatio-temporal graph dispatch system validated on New York City data; comprehensive evaluation through fourteen experimental groups using metrics including R_idle, Path MAE, and R_total; and achieving the first closed-loop integration of prediction, dispatch, and routing modules.

Experimental results based on 14,776,615 NYC taxi records demonstrate that STGAT-Q significantly outperforms existing methods in demand satisfaction, routing accuracy, and resource utilization, with consistent performance across different time periods and urban settings. |
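The routing stage described in the abstract learns cost-efficient dispatch paths with Q-learning over predicted travel times. The minimal tabular sketch below treats travel time as a negative reward and masks non-adjacent zone pairs in the Q-value matrix; the toy four-zone graph, edge times, and hyperparameters are assumptions for illustration, not the thesis's Manhattan configuration:

```python
import random
import numpy as np

random.seed(0)

# Toy zone graph: (from_zone, to_zone) -> predicted travel time in minutes.
travel_time = {(0, 1): 5, (1, 0): 5, (1, 2): 4, (2, 1): 4,
               (0, 2): 12, (2, 0): 12, (2, 3): 3, (3, 2): 3,
               (1, 3): 9, (3, 1): 9}
n, goal = 4, 3                         # goal = dispatch target zone
alpha, gamma, eps, episodes = 0.5, 0.9, 0.2, 2000

Q = np.full((n, n), -np.inf)           # -inf masks non-adjacent zone pairs
for (i, j) in travel_time:
    Q[i, j] = 0.0

for _ in range(episodes):
    s = random.randrange(n)
    while s != goal:
        nbrs = [j for j in range(n) if np.isfinite(Q[s, j])]
        # epsilon-greedy action selection over adjacent zones
        a = random.choice(nbrs) if random.random() < eps else max(nbrs, key=lambda j: Q[s, j])
        r = -travel_time[(s, a)]       # shorter travel time -> higher reward
        bootstrap = 0.0 if a == goal else max(Q[a, j] for j in range(n) if np.isfinite(Q[a, j]))
        Q[s, a] += alpha * (r + gamma * bootstrap - Q[s, a])
        s = a

# Execution phase: greedy rollout from zone 0 to the goal zone.
path, s = [0], 0
while s != goal:
    s = max((j for j in range(n) if np.isfinite(Q[s, j])), key=lambda j: Q[s, j])
    path.append(s)
print(path)
```

On this toy graph the learned policy prefers the 0-1-2-3 route (12 minutes) over the direct 0-2-3 route (15 minutes), which is the behavior the abstract attributes to speed-aware route learning.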
| 第三語言摘要 | |
| 論文目次 |
LIST OF CONTENT
誌謝 I
LIST OF CONTENT VII
LIST OF FIGURE IX
LIST OF TABLE XII
CHAPTER 1 INTRODUCTION 1
1.1 Research Background 1
1.2 Research Motivation 3
1.3 Research Objectives 5
1.4 Research Contributions 7
Chapter 2 Related Work 10
2.1 Demand Forecasting for Dispatch 10
2.2 Dispatch Strategy Formulation 14
2.3 Dispatch Path Planning 16
2.4 Overview 18
Chapter 3 Background Knowledge 21
3.1 Spatio-Temporal Graph Feature Engineering 21
3.2 STGAT: Spatial-Temporal Graph Attention Network 25
3.3 Multi-Task Learning 30
3.4 Reinforcement Learning: Q-learning 33
Chapter 4 System Design 38
4.1 System Architecture 38
4.2 Feature Graph Structure Modeling 40
4.2.1 Data Collection and Preprocessing 41
4.2.2 Training Feature Construction 46
4.3 Multi-Task Prediction Module 51
4.3.1 STGAT Prediction Module 53
4.3.2 Task 1: Formulating the Dispatch Strategy 57
4.3.3 Task 2: Shortest Travel-Time Path 61
4.4 Reinforcement Learning: Dispatch Path Learning 64
4.4.1 Q-learning Training Data Preprocessing 65
4.4.2 Q-learning System Design 68
4.4.3 Training Phase: Q-Value Learning and Policy Optimization 72
4.4.4 Execution Phase: Route Planning 77
Chapter 5 Experimental Analysis 81
5.1 Dataset 81
5.2 Environment and System Parameter Settings 82
5.3 Experimental Results 85
Chapter 6 Conclusion 97
6.1 Summary of Contributions 97
6.2 Future Work 98
REFERENCE 100
LIST OF FIGURE
Figure 1. Motivation: Pain Points and Improvement Directions in Taxi Dispatch 4
Figure 2. Research Objective: Enhancing User Experience and Dispatch Quality 6
Figure 3. Overall Framework 7
Figure 4. Research Contributions 9
Figure 5. Taxi Administrative Zones in Manhattan, New York City [38] 22
Figure 6. Illustration of Urban Region Graph Structure [11] 23
Figure 7. Spatio-Temporal Graph Representation of Urban Traffic Dynamics 23
Figure 8. Graph-Structured Representation of Traffic Data [8] 26
Figure 9. Framework of the Spatial-Temporal Graph Attention Network [8] 27
Figure 10. Structure of the Gated Temporal Convolution Layer [8] 27
Figure 11. Residual Network Structure [33] 28
Figure 12. Gating Control Unit [33] 29
Figure 13. GTCN Architecture and Processing Flow [33] 29
Figure 14. GAT Architecture [34] 30
Figure 15. Architecture of Multi-Task Learning [35] 32
Figure 16. Reinforcement Learning Framework [37] 34
Figure 17. Q-learning System Architecture [37] 35
Figure 18. Illustration of Mouse-in-Maze Scenario [37] 36
Figure 19. Initial Q-table and Random Actions [37] 36
Figure 20. Q-value Update Illustration [37] 36
Figure 21. Overall System Architecture and Workflow 39
Figure 22. Workflow of Feature Graph Construction 41
Figure 23. Manhattan Taxi Zone Aggregation 42
Figure 24. Node Mapping of Historical Trip Data 45
Figure 25. Required Features in the Graph Structure 47
Figure 26. Construction and Alignment of Passenger Demand 47
Figure 27. Construction of Vacant Vehicle Features 48
Figure 28. Computation of Inter-node Road Distances 49
Figure 29. Speed Allocation Algorithm Based on Trip Paths 49
Figure 30. Construction of Graph Sequences via Sliding Window 50
Figure 31. Node Adjacency Matrix for Manhattan Zones 51
Figure 32. Overall Architecture of the Multi-Task Prediction Module 53
Figure 33. Structure of the Gated Temporal Convolution Layer (GTCN) 54
Figure 34. Illustration of Graph Attention Mechanism (GAT) 55
Figure 35. Architecture of the Dual-Channel Gated Fusion Mechanism 56
Figure 36. Multi-Task Prediction and Weighted Loss Design 57
Figure 37. Dispatch Strategy Formulation Architecture 58
Figure 38. Architecture of the Global Dispatch Module 60
Figure 39. Design Method of the Global Dispatch Strategy 60
Figure 40. Architecture of the Shortest Travel Time Path Module 61
Figure 41. Method for Calculating Shortest Travel-Time Paths 63
Figure 42. Architecture of Dispatch Path Learning Module 65
Figure 43. Input Data Format of Historical Trips 66
Figure 44. Node Adjacency Structure and Matrix Representation 67
Figure 45. Example of Transformed Training Sample Format 68
Figure 46. Overall Architecture of the Reinforcement Learning Module 69
Figure 47. State Design 70
Figure 48. Reward Function Design and Learning Objectives 71
Figure 49. Design of Negative Reward Constraints 71
Figure 50. Initialization and Masking of Q-Value Matrix 73
Figure 51. Initial Reward Matrix Structure at Training Start 74
Figure 52. Training Data Sample and Its Source 75
Figure 53. Reward Calculation and Real-Time Update for a Single Sample 75
Figure 54. Average Reward after One Training Epoch 76
Figure 55. Learning Trend in Q-Value Matrix Post Update 76
Figure 56. Dispatch Path Query Mechanism 78
Figure 57. Route Query Process 79
Figure 58. Heatmap of Demand Gap (Numerical Grid) 91
Figure 59. Heatmap of Demand Gap (Geographical View) 91
Figure 60. Radar Chart of Idle Vehicle Distribution 92
Figure 61. Heatmap of Idle Vehicle Flow (Spatiotemporal View) 92
Figure 62. STGAT Task Prediction Stability 93
Figure 63. Radar Chart of System Stability and Performance 94
Figure 64. Heatmap of Feature–Performance Correlation 96
LIST OF TABLE
Table 1. Comparative Summary of Dispatch Demand Forecasting Methods 19
Table 2. Comparative Summary of Dispatch Strategy Formulation Methods 20
Table 3. Comparative Summary of Dispatch Path Planning Methods 20
Table 4. Hardware Configuration 83
Table 5. Software Configuration 83
Table 6. STGAT Prediction Module Hyperparameter Settings 84
Table 7. Q-learning Path Learning Module Hyperparameter Settings 85
Table 8. Dispatch Strategy Performance by Time Period 87
Table 9. Path Planning Performance Across Time Periods 88
Table 10. Overall System Performance Comparison 88
Table 11. Ablation Study Results Comparison 89
Table 12. System Performance Variance Analysis 94
Table 13. Pearson Correlation Between Variables and System Metrics 95
Table 14. Cross-City System Performance Comparison 96 |
| 參考文獻 |
REFERENCE
[1] X. Yin, G. Wu, J. Wei, Y. Shen, H. Yang, and S. Kurdyka, "Deep learning on traffic prediction: Methods, analysis and future directions," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 6, pp. 4927-4943, June 2022.
[2] J. Ye, L. Zhao, L. Sun, F. Du, and L. Zhang, "Coupled layer-wise graph convolution for transportation demand prediction," in Proc. AAAI Conf. Artif. Intell., vol. 35, no. 5, pp. 4617-4625, 2021.
[3] B. Du, H. Peng, S. Wang, M. Z. A. Bhuiyan, L. Wang, Q. Gong, L. Liu, and J. Li, "Deep irregular convolutional residual LSTM for urban traffic passenger flows prediction," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 3, pp. 972-985, Mar. 2020.
[4] C. Ma, Y. Zhao, G. Dai, X. Xu, and S. C. Wong, "A novel STFSA-CNN-GRU hybrid model for short-term traffic speed prediction," IEEE Trans. Intell. Transp. Syst., vol. 24, no. 4, pp. 3728-3737, Apr. 2023.
[5] H. Tang, D. Yin, and S. Wang, "Graph multi-attention network for travel time estimation," IEEE Trans. Knowl. Data Eng., vol. 34, no. 7, pp. 3154-3167, July 2022.
[6] M. Ibrahim and F. Mubarek, "Improving prediction for taxi demand forecasting using ensemble machine learning," in Proc. IEEE Data Sci. Eng. Conf., pp. 145-150, 2021.
[7] L. Meng, X. Chen, and Y. Wang, "Hotspot clustering and neural network approach for taxi demand prediction," in Proc. IEEE Int. Conf. Data Sci. Cyber. Anal., pp. 245-250, 2020.
[8] X. Kong, X. Xu, G. Yan, S. Zhu, and Y. Wen, "STGAT: Spatial-temporal graph attention networks for traffic flow forecasting," IEEE Access, vol. 8, pp. 134363-134372, 2020.
[9] B. Kim, J. Kim, S. Huh, S. You, and I. Yang, "Multi-objective predictive taxi dispatch via network flow optimization," IEEE Access, vol. 8, pp. 21437-21452, 2020.
[10] Z. Liu, J. Li, and K. Wu, "Context-aware taxi dispatching at city-scale using deep reinforcement learning," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 1996-2009, Mar. 2022.
[11] X. Zhou, L. Wu, Y. Zhang, Z.-S. Chen, and S. Jiang, "A robust deep reinforcement learning approach to driverless taxi dispatching under uncertain demand," Inf. Sci., vol. 646, art. no. 119401, Sept. 2023.
[12] X. Hu, Q. Wang, W. Zhang, and C. Xu, "RGMARL: Vehicle dispatching based on road information and supply-demand distribution," in Proc. IEEE 26th Int. Conf. Intell. Transp. Syst. (ITSC), pp. 1487-1493, Sept. 2023.
[13] Y. Yu, K. Zhang, and G. Zhu, "Urban taxi dispatching optimization: A reinforcement learning-based approach," in Proc. IEEE 20th Int. Conf. Mobility, Sens. Netw. (MSN), pp. 1061-1066, Dec. 2024.
[14] L. Li, M. Zhang, W. Hua, and X. Zhou, "Fast query decomposition for batch shortest path processing in road networks," in Proc. IEEE 36th Int. Conf. Data Eng. (ICDE), pp. 1189-1200, Apr. 2020.
[15] J. Tang, Y. Wang, W. Hao, F. Liu, H. Huang, and Y. Wang, "A mixed path size logit-based taxi customer-search model considering spatio-temporal factors in route choice," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 4, pp. 1347-1358, Apr. 2020.
[16] B. Qu, W. Yang, G. Cui, and X. Wang, "Profitable taxi travel route recommendation based on big taxi trajectory data," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 2, pp. 653-668, Feb. 2020.
[17] W. Tu, Q. Li, Z. Fang, S.-L. Shaw, B. Zhou, and X. Chang, "Real-time route recommendations for e-taxis leveraging GPS trajectories," IEEE Trans. Ind. Informat., vol. 17, no. 5, pp. 3133-3142, May 2021.
[18] G. Sun, K. Liu, G. O. Boateng, G. Liu, and W. Jiang, "Intelligent cruise guidance and vehicle resource management with deep reinforcement learning," IEEE Internet Things J., vol. 9, no. 5, pp. 3574-3585, Mar. 2022.
[19] S. He and K. G. Shin, "Spatio-temporal capsule-based reinforcement learning for mobility-on-demand coordination," IEEE Trans. Knowl. Data Eng., vol. 34, no. 3, pp. 1446-1461, Mar. 2022.
[20] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[21] J. L. Elman, "Finding structure in time," Cogn. Sci., vol. 14, no. 2, pp. 179-211, Apr. 1990.
[22] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, "The graph neural network model," IEEE Trans. Neural Netw., vol. 20, no. 1, pp. 61-80, Jan. 2009.
[23] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Proc. 31st Int. Conf. Neural Inf. Process. Syst. (NIPS), pp. 5998-6008, 2017.
[24] Y. Zhang and Q. Yang, "An overview of multi-task learning," Natl. Sci. Rev., vol. 5, no. 1, pp. 30-43, Jan. 2018.
[25] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proc. 2014 Conf. Empirical Methods Natural Lang. Process. (EMNLP), pp. 1724-1734, Oct. 2014.
[26] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5-32, Oct. 2001.
[27] C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273-297, Sept. 1995.
[28] F. Aslan and S. S. Kozat, "Handling irregularly sampled signals with gated temporal convolutional networks," Signal Image Video Process., vol. 17, no. 3, pp. 817-823, Mar. 2023.
[29] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, "Graph attention networks," in Proc. 6th Int. Conf. Learn. Represent. (ICLR), 2018.
[30] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, Sept. 2015.
[31] S. Fujimoto, H. van Hoof, and D. Meger, "Addressing function approximation error in actor-critic methods," in Proc. 35th Int. Conf. Mach. Learn. (ICML), Stockholm, Sweden, July 2018, pp. 1587–1596.
[32] C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3–4, pp. 279–292, 1992.
[33] C. Pan, Y. Wang, and L. Yang, "Traffic prediction of space-integrated ground information network using the GTCN algorithm," Electronics, vol. 11, no. 6, p. 906, 2022.
[34] Z. Liu, Y. Zhou, J. Wang, and H. Zhang, "A temporal–geospatial deep learning framework for crop yield prediction," Electronics, vol. 13, no. 21, p. 4273, Oct. 2024.
[35] R. Ranjan, S. Sankaranarayanan, C. D. Castillo, and R. Chellappa, "An all-in-one convolutional neural network for face analysis," in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), pp. 17–24, 2017.
[36] R. S. Sutton and A. G. Barto, "Reinforcement learning: An introduction," MIT Press, Cambridge, MA, USA, 1998.
[37] E. Balagurusamy, "Q-learning vs. deep Q-learning vs. deep Q-network," Baeldung on Computer Science, July 2022. [Online]. Available: https://www.baeldung.com/cs/q-learning-vs-deep-q-learning-vs-deep-q-network [Accessed: Aug. 5, 2025].
[38] New York City Taxi and Limousine Commission, "TLC Trip Record Data," NYC.gov. [Online]. Available: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page [Accessed: Aug. 5, 2025].
[39] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numer. Math., vol. 1, no. 1, pp. 269–271, Dec. 1959.
[40] R. W. Floyd, "Algorithm 97: Shortest path," Commun. ACM, vol. 5, no. 6, p. 345, June 1962.
[41] S. Warshall, "A theorem on Boolean matrices," J. ACM, vol. 9, no. 1, pp. 11–12, Jan. 1962.
[42] A. A. Hagberg, D. A. Schult, and P. J. Swart, "Exploring network structure, dynamics, and function using NetworkX," in Proc. 7th Python Sci. Conf. (SciPy), Pasadena, CA, USA, pp. 11–15, 2008.
[43] City of Chicago, "Chicago Traffic Tracker – Historical Congestion Estimates by Segment," City of Chicago Data Portal. [Online]. Available: https://data.cityofchicago.org/Transportation/Chicago-Traffic-Tracker-Historical-Congestion-Esti/77hq-huss [Accessed: Aug. 5, 2025].
[44] D. S. Johnson, "Approximation algorithms for combinatorial problems," J. Comput. Syst. Sci., vol. 9, no. 3, pp. 256–278, Dec. 1974.
[45] V. Mnih, K. Kavukcuoglu, D. Silver, et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015.
[46] K. Lin, R. Zhao, Z. Xu, and J. Zhou, "Efficient large-scale fleet management via multi-agent deep reinforcement learning," in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, July 2018, pp. 1774–1783.
[47] T. Oda and C. Joe-Wong, "MOVI: A model-free approach to dynamic fleet management," in Proc. IEEE Conf. Comput. Commun. (INFOCOM), Apr. 2018, pp. 2708–2716.
[48] M. Qu, H. Zhu, J. Liu, G. Liu, and H. Xiong, "A cost-effective recommender system for taxi drivers," in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2014, pp. 45–54.
[49] Z. Liu, J. Li, and K. Wu, "Context-aware taxi dispatching at city-scale using deep reinforcement learning," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 1996–2009, Mar. 2022.
[50] P. E. Hart, N. J. Nilsson, and B. Raphael, "A formal basis for the heuristic determination of minimum cost paths," IEEE Trans. Syst. Sci. Cybern., vol. 4, no. 2, pp. 100–107, July 1968.
[51] H. W. Kuhn, "The Hungarian method for the assignment problem," Naval Res. Logist. Q., vol. 2, no. 1–2, pp. 83–97, Mar. 1955.
[52] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2016, pp. 785–794.
[53] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," in Proc. Int. Conf. Learn. Represent. (ICLR), Apr. 2017, pp. 1–14. |
| 論文全文使用權限 | |