§ Browse Thesis Bibliographic Record

System ID U0002-1108202513085300
DOI 10.6846/tku202500696
Title (Chinese) 基於多任務學習與強化學習的計程車預期派遣系統
Title (English) An Anticipatory Taxi Dispatch System Based on Multi-Task Learning and Reinforcement Learning
Title (Third Language)
University Tamkang University
Department (Chinese) 資訊工程學系全英語碩士班
Department (English) Master's Program, Department of Computer Science and Information Engineering (English-taught program)
Foreign Degree School Name
Foreign Degree College Name
Foreign Degree Institute Name
Academic Year 113
Semester 2
Publication Year 114
Author (Chinese) 鍾心悦
Author (English) Hsin-Yueh Chung
Student ID 611780064
Degree Master's
Language English
Second Language
Oral Defense Date 2025-06-14
Number of Pages 103
Committee Advisor - 武士戎 (wushihjung@mail.tku.edu.tw)
Co-advisor - 張峯誠 (135170@mail.tku.edu.tw)
Committee Member - 張志勇
Committee Member - 廖文華
Keywords (Chinese) Taxi dispatch
Spatio-temporal graph neural network (STGAT)
Reinforcement learning
Route planning
Multi-task prediction
Q-learning
Intelligent transportation
Idle-vehicle dispatching
Keywords (English) Taxi Dispatch
STGAT
Reinforcement Learning
Route Planning
Multi-task Learning
Q-learning
Intelligent Transportation
Vehicle Scheduling
Keywords (Third Language)
Subject Classification
Chinese Abstract
An anticipatory taxi dispatch system is a key technology in intelligent transportation management. Its core goal is to predict passenger demand ahead of time and proactively allocate vehicle resources, improving overall operational efficiency and reducing empty-cruising costs. Traditional approaches often treat dispatching as a static decision problem and lack real-time awareness of urban spatio-temporal change, so they struggle to maintain stable performance under complex environments and dynamic demand. In demand forecasting, existing studies generally rely on single time-series or historical-average models that cannot capture spatio-temporal dependencies between regions; in strategy formulation, they mostly use static rules or heuristics that do not adjust global resource allocation according to predictions; and in route planning, they mostly adopt conventional shortest-distance or shortest-time algorithms that cannot adapt dynamically to real-time traffic conditions.
This thesis proposes STGAT-Q, an intelligent dispatch system integrating spatio-temporal graph prediction and reinforcement learning, with technical advances on three levels: (1) for demand forecasting, a Spatio-Temporal Graph Attention Network (STGAT) simultaneously predicts regional demand, idle-vehicle counts, and road speeds, capturing the spatio-temporal coupling of urban traffic; (2) for strategy formulation, a supply-demand difference matrix derived from the multi-task predictions is used to generate a global dispatch strategy and vehicle-scheduling suggestions; (3) for route planning, a reinforcement learning model (Q-learning) learns minimum-cost dispatch routes from the predicted speeds, enabling dynamic, intelligent guidance of idle vehicles.
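The supply-demand strategy step described above can be sketched in a few lines; note that the zone count, the predicted numbers, and the greedy surplus-to-shortage matching below are illustrative assumptions, not the thesis's actual dispatch algorithm:

```python
import numpy as np

# Illustrative only: 5 zones, multi-task predictions for one time slot.
pred_demand = np.array([12, 3, 8, 1, 6])   # predicted passenger demand per zone
pred_idle   = np.array([ 4, 9, 2, 7, 5])   # predicted idle vehicles per zone

# Supply-demand difference matrix (here a vector for one slot):
# positive = shortage of vehicles, negative = surplus.
gap = pred_demand - pred_idle

# Greedy global dispatch suggestion: shift idle vehicles from the largest
# surplus zones toward the largest shortage zones until one side runs out.
surplus = {z: -g for z, g in enumerate(gap) if g < 0}
shortage = {z: g for z, g in enumerate(gap) if g > 0}
moves = []                                 # (from_zone, to_zone, n_vehicles)
for dst in sorted(shortage, key=shortage.get, reverse=True):
    for src in sorted(surplus, key=surplus.get, reverse=True):
        n = min(surplus[src], shortage[dst])
        if n > 0:
            moves.append((src, dst, n))
            surplus[src] -= n
            shortage[dst] -= n

print(moves)
```

Each tuple is a scheduling suggestion; a real system would then hand the (src, dst) pairs to the route-planning module.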
The main contributions of this thesis are: (1) an end-to-end intelligent dispatch and route-optimization architecture covering the three stages of prediction, strategy, and routing; (2) a spatio-temporal graph dispatch system built and fully integrated on New York City data; (3) fourteen experiments validating model performance, module contributions, and generalization across metrics such as R_idle, Path MAE, and R_total; (4) the first closed-loop integration of prediction, dispatch, and routing, improving the system's practicality in dynamic scenarios.
Experimental results on 14,776,615 New York City trip records from 2013 show that the system significantly outperforms existing methods in demand satisfaction, routing accuracy, and resource utilization. Fourteen comparative experiments confirm the effectiveness of each module, and cross-period and cross-region tests verify the system's generalization ability: performance remains stable across time periods and across cities, demonstrating the adaptability and deployment potential of the STGAT-Q architecture.
English Abstract
Taxi predictive dispatch systems represent a critical technology in intelligent transportation management, aiming to proactively predict passenger demand and schedule vehicle resources for enhanced operational efficiency. Traditional approaches treat dispatch as static decision problems, lacking real-time awareness of urban spatio-temporal dynamics. Existing research limitations include: single time-series models in demand prediction that fail to capture inter-regional dependencies; static rule-based strategy formulation unable to adjust global resource allocation; and conventional routing algorithms that cannot adapt to real-time traffic conditions.
This paper proposes STGAT-Q, an intelligent dispatch system integrating spatio-temporal graph prediction and reinforcement learning. The system achieves breakthroughs in three aspects: (1) Employing Spatio-Temporal Graph Attention Networks to simultaneously predict regional demand, idle vehicle counts, and road speeds; (2) Generating global dispatch strategies through supply-demand difference matrices derived from multi-task predictions; (3) Utilizing Q-learning to determine cost-efficient routes based on predicted traffic speeds.
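The Q-learning routing component in (3) can be illustrated with a minimal tabular sketch; the toy zone graph, edge travel times, reward design, and hyperparameters below are assumptions for illustration, not the thesis's configuration:

```python
import random

random.seed(0)

# Illustrative 4-zone graph; edge weight = predicted travel time (minutes).
graph = {0: {1: 5, 2: 2}, 1: {3: 1}, 2: {1: 1, 3: 8}, 3: {}}
goal = 3
alpha, gamma, eps = 0.5, 0.9, 0.2

# Q[s][a]: learned value of taking edge s -> a (reward = negative travel time,
# so maximizing Q minimizes expected remaining travel time).
Q = {s: {a: 0.0 for a in nbrs} for s, nbrs in graph.items()}

for _ in range(500):                      # training episodes
    s = random.choice([0, 1, 2])
    while s != goal:
        nbrs = graph[s]
        # epsilon-greedy action selection over outgoing edges
        a = (random.choice(list(nbrs)) if random.random() < eps
             else max(Q[s], key=Q[s].get))
        r = -nbrs[a]                      # negative travel time as reward
        nxt = max(Q[a].values(), default=0.0)
        Q[s][a] += alpha * (r + gamma * nxt - Q[s][a])
        s = a

# Greedy rollout from zone 0 recovers the learned dispatch route.
s, route = 0, [0]
while s != goal:
    s = max(Q[s], key=Q[s].get)
    route.append(s)
print(route)
```

The greedy rollout avoids the direct but slow edge 2→3 in favor of the cheaper detour through zone 1, which is the behavior the reward design is meant to induce.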
The main contributions include: designing an end-to-end architecture integrating prediction, strategy, and routing; implementing a complete spatio-temporal graph dispatch system validated on New York City data; comprehensive evaluation through fourteen experimental groups using metrics including R_idle, Path MAE, and R_total; and achieving the first closed-loop integration of prediction, dispatch, and routing modules.
Experimental results based on 14,776,615 NYC taxi records demonstrate that STGAT-Q significantly outperforms existing methods in demand satisfaction, routing accuracy, and resource utilization, with consistent performance across different time periods and urban settings.
Third-Language Abstract
Table of Contents

Acknowledgements	I
TABLE OF CONTENTS	VII
LIST OF FIGURES	IX
LIST OF TABLES	XII
Chapter 1 Introduction	1
1.1 Research Background	1
1.2 Research Motivation	3
1.3 Research Objectives	5
1.4 Research Contributions	7
Chapter 2 Related Work	10
2.1 Demand Forecasting for Dispatch	10
2.2 Dispatch Strategy Formulation	14
2.3 Dispatch Path Planning	16
2.4 Overview	18
Chapter 3 Background Knowledge	21
3.1 Spatio-Temporal Graph Feature Engineering	21
3.2 STGAT: Spatial-Temporal Graph Attention Network	25
3.3 Multi-Task Learning	30
3.4 Reinforcement Learning: Q-learning	33
Chapter 4 System Design	38
4.1 System Architecture	38
4.2 Feature Graph Structure Modeling	40
4.2.1 Data Collection and Preprocessing	41
4.2.2 Training Feature Construction	46
4.3 Multi-Task Prediction Module	51
4.3.1 STGAT Prediction Module	53
4.3.2 Task 1: Formulating the Dispatch Strategy	57
4.3.3 Task 2: Shortest Travel-Time Path	61
4.4 Reinforcement Learning: Dispatch Path Learning	64
4.4.1 Q-learning Training Data Preprocessing	65
4.4.2 Q-learning System Design	68
4.4.3 Training Phase: Q-Value Learning and Policy Optimization	72
4.4.4 Execution Phase: Route Planning	77
Chapter 5 Experimental Analysis	81
5.1 Dataset	81
5.2 Environment and System Parameter Settings	82
5.3 Experimental Results	85
Chapter 6 Conclusion	97
6.1 Summary of Contributions	97
6.2 Future Work	98
REFERENCES	100

LIST OF FIGURES
Figure 1. Motivation: Pain Points and Improvement Directions in Taxi Dispatch	4
Figure 2. Research Objective: Enhancing User Experience and Dispatch Quality	6
Figure 3. Overall Framework	7
Figure 4. Research Contributions	9
Figure 5. Taxi Administrative Zones in Manhattan, New York City [38]	22
Figure 6. Illustration of Urban Region Graph Structure [11]	23
Figure 7. Spatio-Temporal Graph Representation of Urban Traffic Dynamics	23
Figure 8. Graph-Structured Representation of Traffic Data [8]	26
Figure 9. Framework of the Spatial-Temporal Graph Attention Network [8]	27
Figure 10. Structure of the Gated Temporal Convolution Layer [8]	27
Figure 11. Residual Network Structure [33]	28
Figure 12. Gating Control Unit [33]	29
Figure 13. GTCN Architecture and Processing Flow [33]	29
Figure 14. GAT Architecture [34]	30
Figure 15. Architecture of Multi-Task Learning [35]	32
Figure 16. Reinforcement Learning Framework [37]	34
Figure 17. Q-learning System Architecture [37]	35
Figure 18. Illustration of Mouse-in-Maze Scenario [37]	36
Figure 19. Initial Q-table and Random Actions [37]	36
Figure 20. Q-value Update Illustration [37]	36
Figure 21. Overall System Architecture and Workflow	39
Figure 22. Workflow of Feature Graph Construction	41
Figure 23. Manhattan Taxi Zone Aggregation	42
Figure 24. Node Mapping of Historical Trip Data	45
Figure 25. Required Features in the Graph Structure	47
Figure 26. Construction and Alignment of Passenger Demand	47
Figure 27. Construction of Vacant Vehicle Features	48
Figure 28. Computation of Inter-node Road Distances	49
Figure 29. Speed Allocation Algorithm Based on Trip Paths	49
Figure 30. Construction of Graph Sequences via Sliding Window	50
Figure 31. Node Adjacency Matrix for Manhattan Zones	51
Figure 32. Overall Architecture of the Multi-Task Prediction Module	53
Figure 33. Structure of the Gated Temporal Convolution Layer (GTCN)	54
Figure 34. Illustration of Graph Attention Mechanism (GAT)	55
Figure 35. Architecture of the Dual-Channel Gated Fusion Mechanism	56
Figure 36. Multi-Task Prediction and Weighted Loss Design	57
Figure 37. Dispatch Strategy Formulation Architecture	58
Figure 38. Architecture of the Global Dispatch Module	60
Figure 39. Design Method of the Global Dispatch Strategy	60
Figure 40. Architecture of the Shortest Travel Time Path Module	61
Figure 41. Method for Calculating Shortest Travel-Time Paths	63
Figure 42. Architecture of Dispatch Path Learning Module	65
Figure 43. Input Data Format of Historical Trips	66
Figure 44. Node Adjacency Structure and Matrix Representation	67
Figure 45. Example of Transformed Training Sample Format	68
Figure 46. Overall Architecture of the Reinforcement Learning Module	69
Figure 47. State Design	70
Figure 48. Reward Function Design and Learning Objectives	71
Figure 49. Design of Negative Reward Constraints	71
Figure 50. Initialization and Masking of Q-Value Matrix	73
Figure 51. Initial Reward Matrix Structure at Training Start	74
Figure 52. Training Data Sample and Its Source	75
Figure 53. Reward Calculation and Real-Time Update for a Single Sample	75
Figure 54. Average Reward after One Training Epoch	76
Figure 55. Learning Trend in Q-Value Matrix Post Update	76
Figure 56. Dispatch Path Query Mechanism	78
Figure 57. Route Query Process	79
Figure 58. Heatmap of Demand Gap (Numerical Grid)	91
Figure 59. Heatmap of Demand Gap (Geographical View)	91
Figure 60. Radar Chart of Idle Vehicle Distribution	92
Figure 61. Heatmap of Idle Vehicle Flow (Spatiotemporal View)	92
Figure 62. STGAT Task Prediction Stability	93
Figure 63. Radar Chart of System Stability and Performance	94
Figure 64. Heatmap of Feature–Performance Correlation	96

LIST OF TABLES
Table 1. Comparative Summary of Dispatch Demand Forecasting Methods	19
Table 2. Comparative Summary of Dispatch Strategy Formulation Methods	20
Table 3. Comparative Summary of Dispatch Path Planning Methods	20
Table 4. Hardware Configuration	83
Table 5. Software Configuration	83
Table 6. STGAT Prediction Module Hyperparameter Settings	84
Table 7. Q-learning Path Learning Module Hyperparameter Settings	85
Table 8. Dispatch Strategy Performance by Time Period	87
Table 9. Path Planning Performance Across Time Periods	88
Table 10. Overall System Performance Comparison	88
Table 11. Ablation Study Results Comparison	89
Table 12. System Performance Variance Analysis	94
Table 13. Pearson Correlation Between Variables and System Metrics	95
Table 14. Cross-City System Performance Comparison	96

REFERENCES
[1]	X. Yin, G. Wu, J. Wei, Y. Shen, H. Yang, and S. Kurdyka, "Deep learning on traffic prediction: Methods, analysis and future directions," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 6, pp. 4927-4943, June 2022.
[2]	J. Ye, L. Zhao, L. Sun, F. Du, and L. Zhang, "Coupled layer-wise graph convolution for transportation demand prediction," in Proc. AAAI Conf. Artif. Intell., vol. 35, no. 5, pp. 4617-4625, 2021.
[3]	B. Du, H. Peng, S. Wang, M. Z. A. Bhuiyan, L. Wang, Q. Gong, L. Liu, and J. Li, "Deep irregular convolutional residual LSTM for urban traffic passenger flows prediction," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 3, pp. 972-985, Mar. 2020.
[4]	C. Ma, Y. Zhao, G. Dai, X. Xu, and S. C. Wong, "A novel STFSA-CNN-GRU hybrid model for short-term traffic speed prediction," IEEE Trans. Intell. Transp. Syst., vol. 24, no. 4, pp. 3728-3737, Apr. 2023.
[5]	H. Tang, D. Yin, and S. Wang, "Graph multi-attention network for travel time estimation," IEEE Trans. Knowl. Data Eng., vol. 34, no. 7, pp. 3154-3167, July 2022.
[6]	M. Ibrahim and F. Mubarek, "Improving prediction for taxi demand forecasting using ensemble machine learning," in Proc. IEEE Data Sci. Eng. Conf., pp. 145-150, 2021.
[7]	L. Meng, X. Chen, and Y. Wang, "Hotspot clustering and neural network approach for taxi demand prediction," in Proc. IEEE Int. Conf. Data Sci. Cyber. Anal., pp. 245-250, 2020.
[8]	X. Kong, X. Xu, G. Yan, S. Zhu, and Y. Wen, "STGAT: Spatial-temporal graph attention networks for traffic flow forecasting," IEEE Access, vol. 8, pp. 134363-134372, 2020.
[9]	B. Kim, J. Kim, S. Huh, S. You, and I. Yang, "Multi-objective predictive taxi dispatch via network flow optimization," IEEE Access, vol. 8, pp. 21437-21452, 2020.
[10]	Z. Liu, J. Li, and K. Wu, "Context-aware taxi dispatching at city-scale using deep reinforcement learning," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 1996-2009, Mar. 2022.
[11]	X. Zhou, L. Wu, Y. Zhang, Z.-S. Chen, and S. Jiang, "A robust deep reinforcement learning approach to driverless taxi dispatching under uncertain demand," Inf. Sci., vol. 646, art. no. 119401, Sept. 2023.
[12]	X. Hu, Q. Wang, W. Zhang, and C. Xu, "RGMARL: Vehicle dispatching based on road information and supply-demand distribution," in Proc. IEEE 26th Int. Conf. Intell. Transp. Syst. (ITSC), pp. 1487-1493, Sept. 2023.
[13]	Y. Yu, K. Zhang, and G. Zhu, "Urban taxi dispatching optimization: A reinforcement learning-based approach," in Proc. IEEE 20th Int. Conf. Mobility, Sens. Netw. (MSN), pp. 1061-1066, Dec. 2024.
[14]	L. Li, M. Zhang, W. Hua, and X. Zhou, "Fast query decomposition for batch shortest path processing in road networks," in Proc. IEEE 36th Int. Conf. Data Eng. (ICDE), pp. 1189-1200, Apr. 2020.
[15]	J. Tang, Y. Wang, W. Hao, F. Liu, H. Huang, and Y. Wang, "A mixed path size logit-based taxi customer-search model considering spatio-temporal factors in route choice," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 4, pp. 1347-1358, Apr. 2020.
[16]	B. Qu, W. Yang, G. Cui, and X. Wang, "Profitable taxi travel route recommendation based on big taxi trajectory data," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 2, pp. 653-668, Feb. 2020.
[17]	W. Tu, Q. Li, Z. Fang, S.-L. Shaw, B. Zhou, and X. Chang, "Real-time route recommendations for e-taxis leveraging GPS trajectories," IEEE Trans. Ind. Informat., vol. 17, no. 5, pp. 3133-3142, May 2021.
[18]	G. Sun, K. Liu, G. O. Boateng, G. Liu, and W. Jiang, "Intelligent cruise guidance and vehicle resource management with deep reinforcement learning," IEEE Internet Things J., vol. 9, no. 5, pp. 3574-3585, Mar. 2022.
[19]	S. He and K. G. Shin, "Spatio-temporal capsule-based reinforcement learning for mobility-on-demand coordination," IEEE Trans. Knowl. Data Eng., vol. 34, no. 3, pp. 1446-1461, Mar. 2022.
[20]	Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[21]	J. L. Elman, "Finding structure in time," Cogn. Sci., vol. 14, no. 2, pp. 179-211, Apr. 1990.
[22]	F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, "The graph neural network model," IEEE Trans. Neural Netw., vol. 20, no. 1, pp. 61-80, Jan. 2009.
[23]	A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Proc. 31st Int. Conf. Neural Inf. Process. Syst. (NIPS), pp. 5998-6008, 2017.
[24]	Y. Zhang and Q. Yang, "An overview of multi-task learning," Natl. Sci. Rev., vol. 5, no. 1, pp. 30-43, Jan. 2018.
[25]	K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proc. 2014 Conf. Empirical Methods Natural Lang. Process. (EMNLP), pp. 1724-1734, Oct. 2014.
[26]	L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5-32, Oct. 2001.
[27]	C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273-297, Sept. 1995.
[28]	F. Aslan and S. S. Kozat, "Handling irregularly sampled signals with gated temporal convolutional networks," Signal Image Video Process., vol. 17, no. 3, pp. 817-823, Mar. 2023.
[29]	P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, "Graph attention networks," in Proc. 6th Int. Conf. Learn. Represent. (ICLR), 2018.
[30]	T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, Sept. 2015.
[31]	S. Fujimoto, H. Van Hoof, and D. Meger, "Addressing function approximation error in actor-critic methods," in Proc. 35th Int. Conf. Mach. Learn. (ICML), Stockholm, Sweden, July 2018, pp. 1587–1596.
[32]	C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3–4, pp. 279–292, 1992.
[33]	C. Pan, Y. Wang, and L. Yang, "Traffic prediction of space-integrated ground information network using the GTCN algorithm," Electronics, vol. 11, no. 6, p. 906, 2022.
[34]	Z. Liu, Y. Zhou, J. Wang, and H. Zhang, "A temporal–geospatial deep learning framework for crop yield prediction," Electronics, vol. 13, no. 21, p. 4273, Oct. 2024.
[35]	R. Ranjan, S. Sankaranarayanan, C. D. Castillo, and R. Chellappa, "An all-in-one convolutional neural network for face analysis," in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), pp. 17–24, 2017.
[36]	R. S. Sutton and A. G. Barto, "Reinforcement learning: An introduction," MIT Press, Cambridge, MA, USA, 1998.
[37]	E. Balagurusamy, "Q-learning vs. deep Q-learning vs. deep Q-network," Baeldung on Computer Science, July 2022. [Online]. Available: https://www.baeldung.com/cs/q-learning-vs-deep-q-learning-vs-deep-q-network. [Accessed: Aug. 5, 2025].
[38]	New York City Taxi and Limousine Commission, "TLC Trip Record Data," NYC.gov, [Online]. Available: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page [Accessed: Aug. 5, 2025].
[39]	E. W. Dijkstra, "A note on two problems in connexion with graphs," Numer. Math., vol. 1, no. 1, pp. 269–271, Dec. 1959.
[40]	R. W. Floyd, "Algorithm 97: Shortest path," Commun. ACM, vol. 5, no. 6, p. 345, June 1962.
[41]	S. Warshall, "A theorem on Boolean matrices," J. ACM, vol. 9, no. 1, pp. 11–12, Jan. 1962.
[42]	A. A. Hagberg, D. A. Schult, and P. J. Swart, "Exploring network structure, dynamics, and function using NetworkX," in Proc. 7th Python Sci. Conf. (SciPy), Pasadena, CA, USA, pp. 11–15, 2008.
[43]	City of Chicago, "Chicago Traffic Tracker – Historical Congestion Estimates by Segment," City of Chicago Data Portal. [Online]. Available: https://data.cityofchicago.org/Transportation/Chicago-Traffic-Tracker-Historical-Congestion-Esti/77hq-huss [Accessed: Aug. 5, 2025].
[44]	D. S. Johnson, "Approximation algorithms for combinatorial problems," J. Comput. Syst. Sci., vol. 9, no. 3, pp. 256–278, Dec. 1974.
[45]	V. Mnih, K. Kavukcuoglu, D. Silver, et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015.
[46]	K. Lin, R. Zhao, Z. Xu, and J. Zhou, "Efficient large-scale fleet management via multi-agent deep reinforcement learning," in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2018, pp. 1774–1783.
[47]	T. Oda and C. Joe-Wong, "MOVI: A model-free approach to dynamic fleet management," in Proc. IEEE Conf. Comput. Commun. (INFOCOM), Apr. 2018, pp. 2708–2716.
[48]	M. Qu, H. Zhu, J. Liu, G. Liu, and H. Xiong, "A cost-effective recommender system for taxi drivers," in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2014, pp. 45–54.
[49]	Z. Liu, J. Li, and K. Wu, "Context-aware taxi dispatching at city-scale using deep reinforcement learning," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 1996–2009, Mar. 2022.
[50]	P. E. Hart, N. J. Nilsson, and B. Raphael, "A formal basis for the heuristic determination of minimum cost paths," IEEE Trans. Syst. Sci. Cybern., vol. 4, no. 2, pp. 100–107, July 1968.
[51]	H. W. Kuhn, "The Hungarian method for the assignment problem," Naval Res. Logist. Q., vol. 2, no. 1–2, pp. 83–97, Mar. 1955.
[52]	T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2016, pp. 785–794.
[53]	T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," in Proc. Int. Conf. Learn. Represent. (ICLR), Apr. 2017, pp. 1–14.
Thesis Full-Text Usage Authorization
National Central Library
Free-of-charge authorization to the National Central Library: not granted
On campus
Print copy available on campus immediately
Worldwide public access to the electronic full text: granted
Electronic thesis available on campus immediately
Off campus
Licensing to database vendors: granted
Electronic thesis available off campus immediately

For any questions, please contact the Library Digital Information Section at (02) 2621-5656 ext. 2487 or by email.