§ Browse Thesis Bibliographic Record
  
System ID	U0002-2608201818391200
DOI	10.6846/TKU.2018.00855
Title (Chinese)	基於行為複製之機械手臂的物件夾取
Title (English)	Object Grasping for Robot Manipulator Based on Behavioral Cloning
Title (third language)
Institution	Tamkang University
Department (Chinese)	電機工程學系機器人工程碩士班
Department (English)	Master's Program in Robotics Engineering, Department of Electrical and Computer Engineering
Foreign degree school name
Foreign degree college name
Foreign degree institute name
Academic year	106
Semester	2
Year of publication	107
Author (Chinese)	黃宥竣
Author (English)	You-Jun Huang
Student ID	605470201
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2018-06-27
Number of pages	65
Oral defense committee	Advisor - 翁慶昌
Co-advisor - 蔡奇謚
Committee member - 龔宗鈞
Committee member - 翁慶昌
Committee member - 李世安
Keywords (Chinese)	機械手臂
物件夾取
模仿學習
行為複製
深度神經網路
Keywords (English)	Robot Manipulator
Object Grasping
Imitation Learning
Behavioral Cloning
Deep Neural Network
Keywords (third language)
Subject classification
Abstract (Chinese)
This thesis implements a method for a robot manipulator to grasp objects based on behavioral cloning, and proposes a dual vision network model that lets a deep neural network learn task-related features effectively so that the manipulator can perform the desired object grasping behavior. There are three main parts: (1) imitation learning, (2) deep neural network, and (3) training sample collection. In the imitation learning part, behavioral cloning is combined with the dataset aggregation algorithm so that the deep neural network learns the behavior shown in the demonstration data while the compounding error of the trained network is reduced. In the deep neural network part, a dual vision network model based on convolutional neural networks is proposed to improve the network model's learning of recognition, localization, and task-related features of the target object. The inputs of the dual vision network model are the RGB images of an external camera and of the camera on the manipulator's hand, together with the feedback outputs of the manipulator. The RGB images of the two cameras are first fed into their corresponding convolutional layers; one fully connected layer then joins each convolutional output with the manipulator feedback, and several further fully connected layers process the two joined results. The outputs of the network model are the commands that control the manipulator and the gripper. In the training sample collection part, domain randomization and the dataset aggregation algorithm are used to generate diverse training samples and make the deep neural network more robust. In the experimental results, the success rates of the object grasping task are compared for three network models: the baseline network model, dual vision network model V1, and dual vision network model V2. The results show that the proposed dual vision network model V2 indeed improves the learning performance of the deep neural network. In addition, when a grasping attempt fails, the deep neural network executes the grasping behavior again so that the manipulator can still complete the desired behavior.
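The dual vision network model described above feeds each camera's RGB image through its own convolutional branch, joins each visual feature with the manipulator's feedback through one fully connected layer, and maps the two joined results to control commands through several more fully connected layers. The following is a minimal, hypothetical Python (PyTorch) sketch of such an architecture; the layer sizes, 64x64 image resolution, 10-dimensional feedback state, and 5-dimensional command are illustrative assumptions and do not reproduce the thesis's exact model.

```python
# A minimal, hypothetical sketch of a dual vision network: two convolutional branches
# (external camera and eye-in-hand camera), each joined with the manipulator feedback
# by one fully connected layer, followed by several fully connected layers that output
# the control command. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class DualVisionNet(nn.Module):
    def __init__(self, state_dim=10, command_dim=5):
        super().__init__()

        def conv_branch():
            # Small CNN that turns a 3-channel RGB image into a flat feature vector.
            return nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
            )

        self.external_branch = conv_branch()   # external camera image
        self.hand_branch = conv_branch()       # eye-in-hand camera image
        feat_dim = 32 * 4 * 4
        # One fully connected layer joins each visual feature with the robot feedback.
        self.join_external = nn.Linear(feat_dim + state_dim, 128)
        self.join_hand = nn.Linear(feat_dim + state_dim, 128)
        # Several fully connected layers process the two joined results and
        # output the command for the manipulator and the gripper.
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, command_dim),
        )

    def forward(self, img_external, img_hand, robot_state):
        f_ext = torch.relu(self.join_external(
            torch.cat([self.external_branch(img_external), robot_state], dim=1)))
        f_hand = torch.relu(self.join_hand(
            torch.cat([self.hand_branch(img_hand), robot_state], dim=1)))
        return self.head(torch.cat([f_ext, f_hand], dim=1))

# Example: a batch of two samples with 64x64 RGB images and a 10-dimensional feedback state.
net = DualVisionNet()
command = net(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64), torch.randn(2, 10))
print(command.shape)  # torch.Size([2, 5])
```

Joining each camera stream with the feedback separately, before merging the two streams, mirrors the structure described in the abstract; when a grasp fails, the same network is simply invoked again on new images and feedback.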
Abstract (English)
A method based on behavioral cloning for a robot manipulator to grasp objects is implemented, and a dual vision network model is proposed to enable the deep neural network (DNN) to effectively learn task-related features so that the robot manipulator can perform the desired object grasping behavior. There are three main parts: (1) imitation learning, (2) deep neural network, and (3) training sample collection. In the imitation learning part, the behavioral cloning method combined with the dataset aggregation algorithm is used to let the DNN learn the behaviors shown in the demonstration data and to reduce the compounding errors of the trained neural network. In the deep neural network part, a dual vision network model based on convolutional neural networks is proposed to improve the network model's learning of the recognition, localization, and task-related features of the target object. The inputs of the dual vision network model are the RGB images of the external camera and of the eye-in-hand camera, together with the feedback outputs of the manipulator. First, the images of the two cameras are fed into their corresponding convolutional layers. One fully connected layer then joins each convolutional output with the feedback outputs of the manipulator, and the two joined results are processed by multiple fully connected layers. Finally, the outputs of the network model are the commands that control the robot manipulator and the gripper. In the training sample collection part, domain randomization and the dataset aggregation algorithm are used to generate diverse training samples, which make the DNN more robust. In the experimental results, the success rates of the object grasping task are compared for three network models: the baseline network model, dual vision network model V1, and dual vision network model V2. The experimental results illustrate that the proposed dual vision network model V2 can indeed improve the learning performance of the DNN. Moreover, when the grasping task fails, the DNN performs the grasping behavior again so that the robot manipulator can complete the desired behavior.
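As a companion to the abstract's description of combining behavioral cloning with the dataset aggregation (DAgger) algorithm, the sketch below illustrates that training loop. The environment `env` (Gym-style reset()/step()), the demonstrator `expert_action(obs)`, the observation tuple (external image, hand image, robot state), and all hyperparameters are assumptions made for illustration, not interfaces from the thesis.

```python
# A minimal sketch of behavioral cloning with dataset aggregation (DAgger).
# Assumed (hypothetical) interfaces: `env` follows the Gym reset()/step() convention,
# `expert_action(obs)` returns the demonstrator's action tensor for an observation,
# and `policy(*obs)` maps the observation tuple (external image, hand image, robot
# state), each batched as a tensor, to a command tensor.
import torch
import torch.nn as nn

def dagger_train(policy, env, expert_action, iterations=10, episodes_per_iter=5, epochs=20):
    dataset = []  # aggregated (observation, expert action) pairs over all iterations
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    for it in range(iterations):
        # Collect trajectories. The first iteration rolls out the expert (plain
        # behavioral cloning); later iterations roll out the current policy so that
        # the states it actually visits are relabeled by the expert, which is what
        # reduces the compounding error.
        for _ in range(episodes_per_iter):
            obs, done = env.reset(), False
            while not done:
                dataset.append((obs, expert_action(obs)))
                with torch.no_grad():
                    action = expert_action(obs) if it == 0 else policy(*obs)
                obs, _, done, _ = env.step(action)
        # Supervised training of the policy on the aggregated dataset.
        for _ in range(epochs):
            for obs, target in dataset:
                optimizer.zero_grad()
                loss = loss_fn(policy(*obs), target)
                loss.backward()
                optimizer.step()
    return policy
```

In the thesis's setting, domain randomization would additionally be applied when the simulated scenes are generated, so that the aggregated samples vary in appearance; that step is omitted from this sketch.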
Abstract (third language)
Table of contents
Contents
Abstract (Chinese)	I
Abstract (English)	II
Contents	III
List of Figures	V
List of Tables	VIII
List of Symbols	IX
Chapter 1 Introduction	1
1.1 Research Background	1
1.2 Research Motivation	4
1.3 Thesis Organization	5
Chapter 2 Simulation Environment and Experimental Platform	6
2.1 Simulation Environment	6
2.2 Robot Manipulator and End Effector	7
2.3 Hardware and Software System	8
Chapter 3 Kinematics of the Robot Manipulator	10
3.1 D-H Link Parameter Table	11
3.2 Forward Kinematics	15
3.3 Inverse Kinematics	20
Chapter 4 Artificial Neural Networks	25
4.1 Concepts of Artificial Neural Networks	25
4.2 Convolutional Neural Networks	28
Chapter 5 Object Grasping Method for the Robot Manipulator	35
5.1 Reinforcement Learning	35
5.2 Imitation Learning	36
5.3 Design of the Deep Neural Network	41
5.4 Training Sample Collection	46
Chapter 6 Experimental Results	50
6.1 Test Results of the Three Network Models	50
6.2 Execution Results of the Object Grasping Method for the Robot Manipulator	55
6.3 Adding Unknown Objects to the Simulation Environment	58
Chapter 7 Conclusions and Future Work	60
7.1 Conclusions	60
7.2 Future Work	61
References	62

List of Figures
Figure 1.1 KUKA robot manipulator	2
Figure 1.2 Input-output relationship of the deep Q-network	3
Figure 2.1 Simulation environment of the robot manipulator	7
Figure 2.2 Configuration of the manipulator joints and end effector	8
Figure 2.3 Architecture of the hardware and software system	9
Figure 3.1 Schematic of the case where Z_{i-1} and Z_i are not coplanar	11
Figure 3.2 Schematic of the case where Z_{i-1} and Z_i are parallel	12
Figure 3.3 Schematic of the first case where Z_{i-1} and Z_i intersect	13
Figure 3.4 Schematic of the second case where Z_{i-1} and Z_i intersect	13
Figure 3.5 Coordinate frame assignment of the robot manipulator	14
Figure 3.6 Three orientation vectors of the manipulator's two-finger parallel gripper	16
Figure 3.7 Schematic of rotation by Euler angles	17
Figure 3.8 Schematic of the spherical joint	20
Figure 3.9 Schematic of kinematic decoupling	21
Figure 3.10 Geometric relationship of the links of the first three joints	21
Figure 4.1 Schematic of the structure of a neuron	25
Figure 4.2 Schematic of a single-layer neural network	26
Figure 4.3 Schematic of a multilayer perceptron	27
Figure 4.4 Architecture of a convolutional neural network	29
Figure 4.5 Schematic of the convolution operation	30
Figure 4.6 Schematic of the ReLU activation function	31
Figure 4.7 Schematic of the max-pooling operation	32
Figure 4.8 Schematic of fully connecting the convolutional layer to the fully connected layer	32
Figure 4.9 Architecture of LeNet	33
Figure 4.10 Architecture of AlexNet	34
Figure 5.1 Schematic of the reinforcement learning algorithm	35
Figure 5.2 Schematic of the imitation learning algorithm	36
Figure 5.3 Schematic of the state flow	37
Figure 5.4 Schematic of the behavioral cloning algorithm	38
Figure 5.5 Schematic of supervised neural network training	38
Figure 5.6 Flowchart of the dataset aggregation algorithm	40
Figure 5.7 Pseudocode of the dataset aggregation algorithm	40
Figure 5.8 Architecture of the baseline network model	41
Figure 5.9 Architecture of dual vision network model V1	43
Figure 5.10 Architecture of dual vision network model V2	44
Figure 5.11 Storyboard of the three task stages	47
Figure 5.12 Training samples generated with domain randomization	48
Figure 5.13 Flowchart of training sample generation by dataset aggregation	49
Figure 6.1 Line chart comparing success rates of the stage-three task	54
Figure 6.2 Flowchart for testing the network models	55
Figure 6.3 Storyboard of the end effector moving above the target object	56
Figure 6.4 Storyboard of the end effector descending toward the target object	56
Figure 6.5 Storyboard of the end effector closing its fingers	57
Figure 6.6 Storyboard of a failed object grasping task	57
Figure 6.7 Storyboard of the experiment with added unknown distractor objects	58
Figure 6.8 Storyboard of the experiment with the target object's texture replaced	59

List of Tables
Table 2.1 Specifications of the robot manipulator	7
Table 2.2 Specifications of the end effector	8
Table 2.3 Specifications of the personal computer	9
Table 3.1 Description of the D-H link parameters	14
Table 3.2 D-H link parameters of the robot manipulator	14
Table 6.1 Success rates of the three task stages for the baseline network model	51
Table 6.2 Success rates of the three task stages for dual vision network model V1	52
Table 6.3 Success rates of the three task stages for dual vision network model V2	53
Full-text access rights
On campus
The print thesis will be made publicly available 5 years after submission of the authorization form
The author agrees to make the full-text electronic thesis publicly available on campus
The on-campus electronic thesis will be made publicly available 5 years after submission of the authorization form
Off campus
Authorization granted
The off-campus electronic thesis will be made publicly available 5 years after submission of the authorization form
