§ Browse Thesis Bibliographic Record
  
System ID U0002-0508202506221300
DOI 10.6846/TKU_Electronic Theses & Dissertations Service202500381
Title (Chinese) 基於深度學習的游泳動作準確度分析
Title (English) Deep Learning Based Swimming Motion Accuracy Analysis
Title (Third Language)
Institution 淡江大學 (Tamkang University)
Department (Chinese) 資訊工程學系碩士班
Department (English) Department of Computer Science and Information Engineering
Foreign Degree School
Foreign Degree College
Foreign Degree Institute
Academic Year 113
Semester 2
Year of Publication 114
Author (Chinese) 胡中奕
Author (English) Chung-I Hu
Student ID 613410124
Degree Master's
Language Traditional Chinese
Second Language
Defense Date 2025-06-23
Number of Pages 53
Committee Advisor - 陳建彰 (ccchen34@mail.tku.edu.tw)
Committee Member - 許哲銓
Committee Member - 林其誼
Keywords (Chinese) 游泳姿勢評分
非穿戴式偵測
YOLO
長短期記憶網路
雙向多層長短期記憶網路
關鍵點骨架
時序回歸
Keywords (English) Swimming Posture Assessment
Non-wearable Detection
YOLO
Long Short-Term Memory
Keypoint Skeleton
Bidirectional LSTM
Multi-layer LSTM
Temporal Regression
Keywords (Third Language)
Subject Classification
Abstract (Chinese)
本研究旨在建立一套無需穿戴裝置的自動化游泳姿勢評分系統,以解決傳統人工評分的主觀性,以及水中穿戴感測器對動作限制及流體阻力增加造成的干擾問題。系統流程包含影像前處理(色彩校正、影像旋轉)、骨架偵測、資料清洗與模型訓練四階段。首先進行影像前處理,將處理後的影像資料輸入 YOLO Pose 模型,偵測水中自由式游泳者的骨架;接著執行資料清洗步驟,再將九種時序特徵送入多種 LSTM 模型,進行動作序列建模與回歸評分。資料擷取自自行攝製並人工標註之影片,將動作品質標記為 0、50、100 分,並以此進行模型訓練與測試,最終將連續預測分數轉換為三分類以計算 F1-score。實驗結果顯示,Bi-LSTM 在學習時序特徵方面表現最佳,預測分數與教練標註具有高度相似度,其三分類 F1-score 顯著優於單層 LSTM 與 Multi-LSTM。整體而言,本系統可作為教練即時輔助工具,提升泳姿評分之客觀性與效率,並為後續多泳姿擴充與雲端評分平台奠定基礎。
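The data-cleaning stage described above (linear-interpolation frame filling and smoothing, cf. §3.4.1–3.4.2 of the thesis) can be sketched in plain Python. This is a minimal illustration, not the thesis's actual code; the function names, the use of `None` as a missing-frame marker, and the window size are assumptions.

```python
def interpolate_missing(seq):
    """Fill missing frames (None) by linear interpolation between the
    nearest known neighbors; leading/trailing gaps copy the nearest value."""
    out = list(seq)
    known = [i for i, v in enumerate(out) if v is not None]
    if not known:
        return out
    for i, v in enumerate(out):
        if v is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:          # leading gap: copy first known value
            out[i] = out[nxt]
        elif nxt is None:         # trailing gap: copy last known value
            out[i] = out[prev]
        else:                     # interior gap: linear interpolation
            t = (i - prev) / (nxt - prev)
            out[i] = out[prev] + t * (out[nxt] - out[prev])
    return out


def moving_average(seq, window=3):
    """Centered moving average; the window shrinks at the boundaries."""
    half = window // 2
    return [
        sum(seq[max(0, i - half):i + half + 1])
        / len(seq[max(0, i - half):i + half + 1])
        for i in range(len(seq))
    ]


# Example: one keypoint coordinate over 5 frames with two dropped detections.
track = [10.0, None, None, 16.0, 14.0]
filled = interpolate_missing(track)   # [10.0, 12.0, 14.0, 16.0, 14.0]
smoothed = moving_average(filled)
```

In a real pipeline each of the tracked keypoints would get this treatment per coordinate before feature extraction.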
Abstract (English)
This study aims to develop an automated swimming posture evaluation system that requires no wearable devices, addressing the subjectivity of traditional manual scoring and the interference caused by wearable sensors in water, such as movement restriction and increased fluid resistance. The system consists of four main stages: image preprocessing (including color correction and horizontal rotation), skeleton detection, data cleansing, and model training. After preprocessing, the enhanced video frames are fed into the YOLO Pose model to detect the skeletons of freestyle swimmers underwater. Subsequently, nine types of temporal features are extracted and input into various LSTM-based models to perform sequence modeling and regression-based scoring. The dataset comprises self-recorded and manually labeled videos, with swimming performance annotated as 0, 50, or 100 points. These labels are used for model training and evaluation. The continuous predicted scores are then converted into three categories to calculate the F1-score. Experimental results demonstrate that the Bi-LSTM model achieves the best performance in learning temporal features, producing predictions closely aligned with expert annotations. Its three-class F1-score significantly outperforms those of the single-layer LSTM and Multi-LSTM models. Overall, the proposed system can serve as a real-time assistive tool for coaches, enhancing the objectivity and efficiency of swimming posture evaluation. It also lays the foundation for future extensions to multiple swimming styles and cloud-based evaluation platforms.
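The conversion from continuous regression outputs to the three score classes (0/50/100) for the F1 computation can be sketched as follows. The abstract does not specify the exact mapping or averaging scheme, so the nearest-label rule and macro averaging below are assumptions, and the function names are illustrative.

```python
LABELS = (0, 50, 100)

def to_class(score):
    """Snap a continuous predicted score to the nearest of the three labels."""
    return min(LABELS, key=lambda c: abs(score - c))

def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over the three score classes."""
    f1s = []
    for c in LABELS:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Example: quantize model outputs, then score against coach labels.
preds = [to_class(s) for s in [8.2, 47.9, 61.0, 93.5]]   # [0, 50, 50, 100]
truth = [0, 50, 100, 100]
score = macro_f1(truth, preds)
```

Per-class F1 (as reported in Tables 5–8) falls out of the same loop before averaging.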
Abstract (Third Language)
Table of Contents
Contents
Chapter 1 Introduction	1
1.1 Research Background and Motivation	1
1.2 Research Objectives	2
1.3 Thesis Organization	3
Chapter 2 Literature Review	4
2.1 Traditional Sensor-Based Detection	4
2.1.1 Sensor Principles and Application Scope	4
2.1.2 Motion Temporal Detection and Classification Techniques	6
2.1.3 Challenges and Limitations of Traditional Sensor Technologies	7
2.2 Applications and Evolution of Computer Vision in Swimming Posture Assessment	9
2.2.1 Non-contact Skeleton Recognition Techniques	10
2.2.2 Image Preprocessing and Underwater Compensation Methods	11
2.2.3 Smoothing and Missing-Data Completion for Recognition Data	11
2.3 Applications of Deep Learning Models in Swimming Posture Assessment	12
2.3.1 Model Selection and Design Principles	13
2.3.2 Loss Function and Evaluation Metric Selection	16
Chapter 3 Deep Learning Based Swimming Motion Accuracy Analysis	19
3.1 System Pipeline Architecture	19
3.2 Skeleton Data and Feature Extraction	20
3.3 Underwater Image Preprocessing	22
3.3.1 Image Rotation	23
3.3.2 Color Compensation	23
3.4 Temporal Data Construction and Normalization	25
3.4.1 Linear Interpolation Frame Filling	26
3.4.2 Smoothing	26
3.4.3 Normalization	27
3.4.4 Motion Interval Detection	27
3.4.5 Temporal Data Construction and Length Unification	28
3.5 Model Training and Parameter Design	29
3.5.1 Model Type Comparison	29
3.5.2 Dataset	30
3.5.3 Parameter Selection	30
Chapter 4 Experimental Results	31
4.1 Model and Preprocessing Description	31
4.1.1 YOLO Model Selection	31
4.1.2 Underwater Image Preprocessing	33
4.1.3 Training Data Processing	34
4.2 Model Training	35
4.2.1 LSTM	35
4.2.2 Bi-LSTM	37
4.2.3 Multi-LSTM	40
4.2.4 Bi-Multi-LSTM	42
4.3 Experiment Summary	45
Chapter 5 Conclusions and Future Work	49
References	51
 
List of Figures
Figure 1. Schematic of an IMU sensor mounted on the forearm [18]	5
Figure 2. Periodic acceleration and angular-velocity waveforms under different strokes, corresponding to each cycle and phase [18]	7
Figure 3. Schematic of the limitations of wearable inertial sensors in swimming training [18]	8
Figure 4. The 17 COCO keypoints of YOLO-Pose	10
Figure 5. Filtering applied to remove high-frequency fluctuations in skeleton data [20]	12
Figure 6. (a) LSTM memory-cell structure; (b) LSTM model architecture flowchart [21]	13
Figure 7. Schematic of the Bi-LSTM structure [7]	14
Figure 8. Multi-LSTM (multi-layer LSTM) model architecture [22]	15
Figure 9. Schematic of video regression scoring [23]	17
Figure 10. System pipeline architecture	20
Figure 11. Schematic of the keypoints used	22
Figure 12. Comparison before and after color compensation: (a) before; (b) after	25
Figure 13. Smoothing comparison: (a) with moving-average smoothing; (b) without	34
Figure 14. LSTM validation loss: (a) LSTM (16); (b) LSTM (32)	35
Figure 15. LSTM confusion matrices: (a) LSTM (16); (b) LSTM (32)	36
Figure 16. Bi-LSTM validation loss: (a) Bi-LSTM (16, 16); (b) Bi-LSTM (32, 16); (c) Bi-LSTM (32, 32)	38
Figure 17. Bi-LSTM confusion matrices: (a) Bi-LSTM (16, 16); (b) Bi-LSTM (32, 16); (c) Bi-LSTM (32, 32)	39
Figure 18. Multi-LSTM validation loss: (a) Multi-LSTM (16); (b) Multi-LSTM (32)	41
Figure 19. Multi-LSTM confusion matrices: (a) Multi-LSTM (16); (b) Multi-LSTM (32)	41
Figure 20. Bi-Multi-LSTM validation loss: (a) Bi-Multi-LSTM (16, 16); (b) Bi-Multi-LSTM (32, 16); (c) Bi-Multi-LSTM (32, 32)	43
Figure 21. Bi-Multi-LSTM confusion matrices: (a) Bi-Multi-LSTM (16, 16); (b) Bi-Multi-LSTM (32, 16); (c) Bi-Multi-LSTM (32, 32)	44
 
List of Tables
Table 1. Comparison of wearable sensors and image-based recognition	9
Table 2. Uses of the regression scores	17
Table 3. YOLO model comparison	32
Table 4. Comparison before and after color correction	33
Table 5. F1-score of LSTM for each score label	37
Table 6. F1-score of Bi-LSTM for each score label	40
Table 7. F1-score of Multi-LSTM for each score label	42
Table 8. F1-score of Bi-Multi-LSTM for each score label	45
Table 9. Overall F1-scores	46
Table 10. Precision and recall of all models for each score	48

References
[1]	Y. Wang, J. Guo, H. Gao, and H. Yue, “UIEC²-Net: CNN-based underwater image enhancement using two color space,” arXiv preprint, arXiv:2103.07138v2, 2021.
[2]	W. D. Xiang, P. Yang, S. Wang, B. Xu, and H. Liu, “Underwater image enhancement based on red channel weighted compensation and gamma correction model,” Opto-Electronic Advances, vol. 1, no. 10, p. 180024, 2018.
[3]	J. Ouyang, D. Trinh, and C. C. Choo, “Optimization of swim pose estimation and recognition with data augmentation,” in Proc. IEEE Southwest Symp. Image Anal. Interpret. (SSIAI), Santa Fe, NM, USA, 2024, pp. 101–104.
[4]	Y. Ben and X. Lu, “Underwater image enhancement based on weighted fusion algorithm,” J. Appl. Opt., vol. 60, no. 16, pp. 1610007–1610007, 2024.
[5]	S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[6]	A. Graves and J. Schmidhuber, “Framewise phoneme classification with bidirectional LSTM and other neural network architectures,” Neural Netw., vol. 18, no. 5–6, pp. 602–610, 2005.
[7]	D. Tarasevičius, “Uni-directional and bi-directional LSTM comparison on sensor based swimming data,” Int. J. Adv. Res., vol. 8, no. 5, pp. 735–741, 2020.
[8]	K. Hakozaki, N. Kato, M. Tanabiki, J. Furuyama, Y. Sato, and Y. Aoki, “Swimmer’s stroke estimation using CNN and MultiLSTM,” J. Signal Process., vol. 22, no. 4, pp. 219–222, Jul. 2018.
[9]	Y. Ohgi, H. Ichikawa, and M. Homma, “A wearable motion sensor system for swimming stroke analysis,” Sports Eng., vol. 17, no. 4, pp. 235–243, 2014.
[10]	F. Dadashi, G. P. Millet, K. Aminian, and B. Mariani, “Automatic front-crawl temporal phase detection using adaptive filtering of inertial signals,” J. Sports Sci., vol. 33, no. 11, pp. 1141–1150, 2015.
[11]	A. Mannini and A. M. Sabatini, “A hidden Markov model-based technique for gait segmentation using a foot-mounted gyroscope,” Biomed. Eng. Online, vol. 10, no. 3, 2011.
[12]	S. E. Slawson, P. P. Conway, and A. A. West, “The development and testing of a system to measure the force exerted by a swimmer during the front crawl stroke,” Procedia Eng., vol. 60, pp. 162–167, 2013.
[13]	Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, “OpenPose: Realtime multi-person 2D pose estimation using part affinity fields,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 1, pp. 172–186, 2021.
[14]	K. Sun, B. Xiao, D. Liu, and J. Wang, “Deep high-resolution representation learning for human pose estimation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2019, pp. 5693–5703.
[15]	A. M. Ferrari, S. Micera, and K. Aminian, “Sensor fusion and inertial sensors for swimming performance analysis,” Sensors, vol. 16, no. 4, p. 668, 2016.
[16]	T. Shimojo, H. Yoneyama, T. Takagi, and M. Homma, “A new method for determining the stroke phases in front crawl using inertial sensors,” Procedia Eng., vol. 112, pp. 418–423, 2015.
[17]	T. Stamm, R. James, and D. Thiel, “Speed-based parameter estimation for front crawl swimming using inertial measurement units,” IEEE Sensors J., vol. 18, no. 9, pp. 3752–3761, 2018.
[18]	M. J. Connick and D. G. G. Li, “Inertial sensor technology for elite swimming performance analysis: A systematic review,” Sensors, vol. 22, no. 1, p. 348, 2022.
[19]	Y. Wang, L. Zhang, Y. Li, and Q. Liu, “Missing keypoint recovery using spatio-temporal graph convolutional networks,” IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 11, pp. 7440–7453, Nov. 2022. 
[20]	A. Savitzky and M. J. E. Golay, “Smoothing and differentiation of data by simplified least squares procedures,” Anal. Chem., vol. 36, no. 8, pp. 1627–1639, 1964.
[21]	L. Guo, J. Ye, and B. Yang, “Cyber-attack detection for electric vehicles using physics-guided machine learning,” IEEE Trans. Transp. Electrific., Early Access.
[22]	F. Li, M. Valero, L. Zhao, and Y. Mahmoud, “Cybersecurity strategy against cyber attacks towards smart grids with PVs,” in KSU Proc. Cybersecurity Educ. Res. Pract., Oct. 2020.
[23]	P. Parmar and B. Tran Morris, “Learning to score Olympic events,” arXiv preprint, arXiv:1611.05125v3, 2017.
[24]	H. Wang and Q. Yang, “Comparison of MAE and MSE in evaluating prediction accuracy,” J. Stat. Comput., vol. 22, no. 4, pp. 35–42, 2020.
[25]	S. Pfister, A. Westermann, M. Maurer, and B. Schauerte, “Predicting human movement with regression models: A comparative study,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 5, pp. 1251–1265, 2020.
[26]	D. Aoyagi and A. R. Hargens, “IMU-based estimation of human lower body kinematics and applications to extravehicular operations,” Ph.D. dissertation, Massachusetts Institute of Technology, 2016.
[27]	P. Liashchynskyi and P. Liashchynskyi, “Grid search, random search, genetic algorithm: A big comparison for NAS,” arXiv preprint, arXiv:1912.06059, 2019.
[28]	Y. Zhao, R. Yang, G. Chevalier, R. C. Shah, and R. Romijnders, “Applying deep bidirectional LSTM and mixture density network for basketball trajectory prediction,” arXiv preprint arXiv:1708.05824, 2017. 
Full-Text Access Rights
National Central Library
Does not agree to grant a royalty-free license to the National Central Library
On campus
Print copy available on campus immediately
Agrees to license the electronic full text for on-campus access
Electronic full text available on campus immediately
Off campus
Agrees to license to database vendors
Electronic full text available off campus immediately
