Electronic Theses and Dissertations Service

§ Browse Thesis Bibliographic Record

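The electronic full text of this thesis has been publicly available off campus since 2021-09-16.
The print copy of this thesis has been publicly available since 2021-09-16.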

System ID	U0002-0309202119475200
DOI	10.6846/TKU.2021.00086
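Title (Chinese)	運用語音轉文本技術之固話先進留言機
Title (English)	Advanced Answering Machine for Fixed Line Using Speech to Text Technology
Title (third language)
University	淡江大學 (Tamkang University)
Department (Chinese)	電機工程學系碩士在職專班
Department (English)	Department of Electrical and Computer Engineering
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute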
Academic year	109 (ROC calendar; 2020-2021)
Semester	2
Year of publication	110 (ROC calendar; 2021)
Author (Chinese)	趙天辰
Author (English)	Tien-Chen Chao
Student ID	707440037
Degree	Master's
Language	Traditional Chinese
Second language
Oral defense date	2021-07-05
Number of pages	71
Oral defense committee	Advisor: 楊淳良; Committee member: 李三良; Committee member: 周肇基
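Keywords (Chinese)	谷歌AIY語音套件 (Google AIY Voice Kit); 語音轉文本 (speech-to-text); 留言機 (answering machine)
Keywords (English)	Google AIY Voice Kit; Speech-to-Text; Answering Machine
Keywords (third language)
Subject classification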
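Abstract (Chinese)	This thesis studies an advanced answering machine for fixed-line telephones based on speech-to-text technology. Using the Google AIY Voice Kit (Google's do-it-yourself artificial intelligence voice kit), the system takes messages left on an unanswered fixed line, or urgent spoken requests from elderly people and young children at home, converts the speech to text in real time, and sends the text over a Wi-Fi interface to a configured LINE Notify service account. The text therefore promptly reaches the people who need to know about the situation or can handle it, which is the purpose of this research. In Architecture 2, the recognition rate is only 68% even at the shortest microphone-to-source distance of 0 m, and it drops severely as that distance increases. In Architecture 3, the microphone-to-source distance can be held at 0 m, but the recognition rate is still only 51% when the USB Jabra Speaker and the Google AIY Voice Kit are 0 m apart, and it falls slightly as the distance between them increases. In Architecture 4, audio reaches the Google AIY Voice Kit entirely over wired connections, which removes the distance-related loss of recognition rate. The recognition rate reaches 97%, better than Architecture 1 (an iPhone smartphone running the Google Translate app), and nearly the full meaning of the caller's message is conveyed to the configured LINE Notify service account.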
Abstract (English)	This thesis proposes an advanced answering machine for fixed-line telephones that uses speech-to-text technology. It employs the Google AIY Voice Kit to handle messages left on unanswered fixed-line calls and urgent verbal requests from elderly people and young children at home. The system converts the audio to a text message and delivers it to the LINE Notify service over a Wi-Fi interface, so that the message promptly reaches the people who need to know about or handle the situation. The second architecture achieves a best recognition rate of only 68%, obtained when the distance between the microphone and the sound source (the digital answering machine's speaker) is zero, and the rate decreases severely as that distance grows. The third architecture keeps the microphone at zero distance from the digital answering machine's speaker, yet its best recognition rate is only 51%, obtained at zero distance between the USB Jabra Speaker and the Google AIY Voice Kit; the rate decreases slightly as that distance grows. In the fourth architecture, a USB sound card plugged into the Google AIY Voice Kit is connected directly to the telephone adapter by an audio cable, which raises the recognition rate to 97%. This is also better than the first architecture, which uses an iPhone smartphone with the Google Translate app. The fourth architecture can therefore convey almost the complete meaning of the caller's message to the specified LINE Notify service account.
Abstract (third language)
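Table of contents
	Chapter 1 Introduction 1
	1.1 Preface 1
	1.2 Research Objectives 2
	1.3 Thesis Organization 3
	Chapter 2 Architectures in This Study 4
	2.1 Overview of Existing Products 4
	2.2 Architecture 1 5
	2.3 Architecture 2 6
	2.4 Architecture 3 7
	2.5 Architecture 4 8
	Chapter 3 System Operation Flow 9
	3.1 Architecture 1 9
	3.2 Architecture 2 10
	3.3 Architecture 3 11
	3.4 Architecture 4 12
	Chapter 4 Hardware 13
	4.1 Digital Answering Machine 13
	4.2 Telephone Set 17
	4.3 USB Jabra Speakers 18
	4.4 Raspberry Pi Model 3B Development Board 19
	4.5 USB Sound Card 22
	Chapter 5 Voice Kit Setup 23
	5.1 Flashing the System Image 23
	5.2 Voice Kit Hardware Assembly 24
	5.3 Connecting to Wi-Fi 26
	5.4 Parameter Settings 26
	5.5 Obtaining Credentials 28
	Chapter 6 Experimental Results 34
	6.1 Architecture 1 34
	6.1.1 Architecture 1, 0 m 34
	6.1.2 Architecture 1, 1 m 36
	6.1.3 Architecture 1, 2 m 38
	6.2 Architecture 2 40
	6.2.1 Architecture 2, 0 m 40
	6.2.2 Architecture 2, 1 m 42
	6.2.3 Architecture 2, 2 m 44
	6.3 Architecture 3 46
	6.3.1 Architecture 3, 0 m 46
	6.3.2 Architecture 3, 1 m 48
	6.3.3 Architecture 3, 2 m 50
	6.3.4 Architecture 3, 3 m 52
	6.3.5 Architecture 3, 4 m 54
	6.3.6 Architecture 3, 5 m 56
	6.3.7 Architecture 3, 6 m 58
	6.3.8 Architecture 3, 7 m 60
	6.4 Architecture 4 63
	Chapter 7 Conclusions and Future Work 67
	7.1 Conclusions 67
	7.2 Future Work 69
	References 70
List of figures
	Figure 2.1 iFLYTEK smart earphones [11] 4
	Figure 2.2 Block diagram of Architecture 1 5
	Figure 2.3 Block diagram of Architecture 2 6
	Figure 2.4 Block diagram of Architecture 3 7
	Figure 2.5 Block diagram of Architecture 4 8
	Figure 3.1 Operation flow of Architecture 1 9
	Figure 3.2 Operation flow of Architecture 2 10
	Figure 3.3 Operation flow of Architecture 3 11
	Figure 3.4 Operation flow of Architecture 4 12
	Figure 4.1 Digital answering machine 13
	Figure 4.2 Digital answering machine functions 14
	Figure 4.3 Digital answering machine specifications 16
	Figure 4.4 Telephone set 17
	Figure 4.5 Jabra Speaker 510 18
	Figure 4.6 Jabra Speaker 510 icons and operating instructions 19
	Figure 4.7 Raspberry Pi Model 3B development board 20
	Figure 4.8 GPIO pins of the Raspberry Pi Model 3B development board 21
	Figure 4.9 USB Sound Card 22
	Figure 5.1 Flashing the voice kit system image 23
	Figure 5.2 Google AIY Voice Kit hardware assembly 24
	Figure 5.3 Inserting the Micro SD card 25
	Figure 5.4 Google AIY Voice Kit peripherals 25
	Figure 5.5 Connecting the Google AIY Voice Kit to Wi-Fi 26
	Figure 5.6 Setting the time zone 26
	Figure 5.7 Installing Blueman 27
	Figure 5.8 Signing in to Google Cloud Platform and starting the free trial 28
	Figure 5.9 Creating a new project 29
	Figure 5.10 Enabling the Google Assistant API 30
	Figure 5.11 Creating credentials 30
	Figure 5.12 Downloading the JSON file 31
	Figure 5.13 Setting up the assistant.json file 31
	Figure 5.14 Enabling the cloud speech-to-text function 32
	Figure 5.15 Running speech-to-text 32
	Figure 5.16 Successful recognition 33
	Figure 6.1 Architecture 1, 0 m: experimental setup 34
	Figure 6.2 Architecture 1, 0 m: experimental results 35
	Figure 6.3 Architecture 1, 1 m: experimental setup 36
	Figure 6.4 Architecture 1, 1 m: experimental results 37
	Figure 6.5 Architecture 1, 2 m: experimental setup 38
	Figure 6.6 Architecture 1, 2 m: experimental results 39
	Figure 6.7 Architecture 2, 0 m: experimental setup 40
	Figure 6.8 Architecture 2, 0 m: experimental results 41
	Figure 6.9 Architecture 2, 1 m: experimental setup 42
	Figure 6.10 Architecture 2, 1 m: experimental results 43
	Figure 6.11 Architecture 2, 2 m: experimental setup 44
	Figure 6.12 Architecture 2, 2 m: experimental results 45
	Figure 6.13 Architecture 3, 0 m: experimental setup 46
	Figure 6.14 Architecture 3, 0 m: experimental results 47
	Figure 6.15 Architecture 3, 1 m: experimental setup 48
	Figure 6.16 Architecture 3, 1 m: experimental results 49
	Figure 6.17 Architecture 3, 2 m: experimental setup 50
	Figure 6.18 Architecture 3, 2 m: experimental results 51
	Figure 6.19 Architecture 3, 3 m: experimental setup 52
	Figure 6.20 Architecture 3, 3 m: experimental results 53
	Figure 6.21 Architecture 3, 4 m: experimental setup 54
	Figure 6.22 Architecture 3, 4 m: experimental results 55
	Figure 6.23 Architecture 3, 5 m: experimental setup 56
	Figure 6.24 Architecture 3, 5 m: experimental results 57
	Figure 6.25 Architecture 3, 6 m: experimental setup 58
	Figure 6.26 Architecture 3, 6 m: experimental results 59
	Figure 6.27 Architecture 3, 7 m: experimental setup 60
	Figure 6.28 Architecture 3, 7 m: experimental results 61
	Figure 6.29 Architecture 4: experimental setup 63
	Figure 6.30 Architecture 4: experimental results 64
	Figure 6.31 Architecture 4: experimental results 66
	Figure 7.1 Comparison of recognition rates across the architectures 68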
References
	[1] P. Singh, P. Nayak, A. Datta, D. Sani, G. Raghav and R. Tejpal, "Voice Control Device using Raspberry Pi," 2019 Amity International Conference on Artificial Intelligence (AICAI), 2019, pp. 723-728, doi: 10.1109/AICAI.2019.8701409.
	[2] Digital answering machine (數位留言機), website: https://www.dmecom.com.tw/products-show.php?no=3&p=75#tab-1
	[3] Digital answering machine (數位留言機), website: https://www.dmecom.com.tw/products-show.php?no=3&p=75#tab-1
	[4] Digital answering machine (數位留言機), website: https://www.dmecom.com.tw/products-show.php?no=3&p=75#tab-1
	[5] Telephone set (話機), website: https://www.etmall.com.tw/TECO東元來電顯示有線電話機-XYFXC302-二色/i/2077719
	[6] Jabra Speaker 510, website: https://www.jabra.com/_/media/Jabra_VXi_Product-Documentation/Jabra-SPEAK-510-Series/User-Manuals/RevK/Jabra-Speak-510-user-manual_CHT-RevK.pdf
	[7] Jabra Speaker 510 icons and operating instructions, website: https://www.jabra.com/_/media/Jabra_VXi_Product-Documentation/Jabra-SPEAK-510-Series/User-Manuals/RevK/Jabra-Speak-510-user-manual_CHT-RevK.pdf
	[8] Raspberry Pi Model 3B development board, website: https://zh.wikipedia.org/wiki/樹莓派
	[9] GPIO pins of the Raspberry Pi Model 3B development board, website: https://ithelp.ithome.com.tw/articles/10215294
	[10] USB Sound Card, website: https://tw.mall.yahoo.com/item/電腦USB聲卡-7-1聲道-免驅動-即插即用-筆電-USB耳機轉-p0005206286237
	[11] iFLYTEK smart Bluetooth earphones (IFLYTEK科大訊飛智能藍芽耳機), website: https://www.papichang.com/products/iflytek-iflybuds
	[12] I. Dogaru, D. Stan and R. Dogaru, "Compact Isolated Speech Recognition on Raspberry-Pi based on Reaction Diffusion Transform," 2019 6th International Symposium on Electrical and Electronics Engineering (ISEEE), 2019, pp. 1-4, doi: 10.1109/ISEEE48094.2019.9136152.
	[13] A. P. Pant, K.-R. Wu and Y.-C. Tseng, "Speak to Action: Offline and Hybrid Language Recognition on Embedded Board for Smart Control System," 2020 International Computer Symposium (ICS), 2020, pp. 85-90, doi: 10.1109/ICS51289.2020.00026.
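Full-text access rights	On campus: print thesis available immediately; electronic full text authorized for on-campus access; electronic thesis available immediately. Off campus: electronic thesis authorized for off-campus access, available immediately.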
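
The abstract describes the final step of the pipeline as sending the recognized text to a LINE Notify service account over Wi-Fi. This record does not include the thesis's own code, so the following is only a minimal sketch of what that step could look like, not the author's implementation. It assumes the publicly documented LINE Notify endpoint (`https://notify-api.line.me/api/notify`, a bearer token, and a form field named `message`); the function names `build_notify_request` and `send_transcript` and the token value are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: the LINE Notify REST endpoint as publicly documented.
LINE_NOTIFY_URL = "https://notify-api.line.me/api/notify"


def build_notify_request(token: str, transcript: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the transcript.

    `token` is a personal access token issued for a LINE Notify account;
    `transcript` is the text produced by the speech-to-text step.
    """
    body = urllib.parse.urlencode({"message": transcript}).encode("utf-8")
    return urllib.request.Request(
        LINE_NOTIFY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def send_transcript(token: str, transcript: str, timeout: float = 10.0) -> int:
    """Send the transcript and return the HTTP status code (needs network access)."""
    request = build_notify_request(token, transcript)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Building the request separately from sending it keeps the message formatting testable without a network connection or a valid token; on the device, `send_transcript` would be called once per recognized message.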

