System ID | U0002-1906200611070500 |
---|---|
DOI | 10.6846/TKU.2006.00559 |
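Thesis title (Chinese) | 以內容為基礎之影像查詢研究 |
Thesis title (English) | A Study on Content-Based Image Retrieval |
Thesis title (third language) | |
University | 淡江大學 (Tamkang University) |
Department (Chinese) | 資訊工程學系博士班 |
Department (English) | Department of Computer Science and Information Engineering |
Foreign degree university | |
Foreign degree college | |
Foreign degree graduate institute | |
Academic year | 94 (ROC calendar; 2005-2006) |
Semester | 2 |
Publication year | 95 (ROC calendar; 2006) |
Graduate student (Chinese) | 高楊達 |
Graduate student (English) | Yang-Ta Kao |
Student ID | 889190087 |
Degree | Doctoral |
Language | English |
Second language | |
Oral defense date | 2006-06-13 |
Number of pages | 91 |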
Oral defense committee |
Advisor - 林慧珍 (hjlin@cs.tku.edu.tw)
Committee member - 徐道義
Committee member - 黃俊堯
Committee member - 施國琛
Committee member - 顏淑惠 |
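Keywords (Chinese) |
content-based image retrieval; Mountain-Climbing-Sequence representation; Mountain-Climbing-Sequence-with-Intersection-Points representation; edge detection; branch-and-bound; orientation and distance; local edge map; wavelet coefficients; similarity measure; matching; AdaBoost algorithm |
Keywords (English) |
content-based image retrieval (CBIR); Mountain-Climbing-Sequence (MCS); Mountain-Climbing-Sequence-with-Intersection-Points (MCSIP); branch-and-bound; orientation-distance; local edge map; wavelet coefficient; similarity measure; matching; AdaBoost algorithm |
Keywords (third language) | |
Subject classification | |
Abstract (Chinese) |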
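Research on content-based image retrieval (CBIR) covers feature selection, feature representation, and matching. Commonly used features include shape, color, spatial position, and texture. A good image representation must cope with translation, rotation, and scaling of the image, and must still yield good matching results when the image is damaged to a limited degree. These requirements are important issues when queries are based on image content. With these requirements in mind, this dissertation proposes three content-based image retrieval systems. The first is a study of object contours: we use a fast edge detection algorithm to detect all possible edge points in an image and propose a new object representation, the Mountain-Climbing-Sequence (MCS). This representation is invariant to the translation, rotation, and scaling mentioned above. In addition, because edge detection, in current research experience, cannot guarantee recovery of an object's complete shape, we also attempt, within the existing shape representation, to handle incompletely extracted shapes, so that the proposed system obtains good matching results even when a small part of the object is occluded. The second is a study of the relationship between edge points and their neighboring edge points. In real images it is difficult to detect an object's complete set of edge points accurately, so we compute the relationships among the edge points in the image and build an orientation-distance histogram to represent the real image; this representation overcomes the scaling and rotation problems faced during retrieval. Finally, we combine many different kinds of image features to represent a real image, including the orientation-distance histogram, local edge point relations, wavelet coefficients, a color distribution histogram, a color saturation histogram, and an intensity distribution histogram, for a total of more than one thousand feature dimensions. For matching over these high-dimensional features, we use a modified AdaBoost algorithm to retrieve similar images quickly and efficiently. Experiments show that the proposed methods indeed satisfy the requirements of image retrieval: speed and efficiency. |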
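The invariance requirements named in this abstract can be illustrated with a generic centroid-distance signature. The sketch below is not the dissertation's MCS method: the function name `centroid_distance_signature`, the max-distance normalization, and the choice of the farthest point as the starting point are all assumptions made for illustration, and the sketch assumes the contour has a single farthest point.

```python
import math


def centroid_distance_signature(contour):
    """Hypothetical helper: distances from the centroid to each contour point.

    Subtracting the centroid removes translation, dividing by the largest
    distance removes scale, and starting the sequence at the farthest point
    removes the dependence on rotation and on the starting point
    (assuming that farthest point is unique).
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    peak = max(dists)
    start = dists.index(peak)
    rotated = dists[start:] + dists[:start]
    return [d / peak for d in rotated]


# A small asymmetric contour, then the same contour rotated by 90 degrees,
# scaled by 2, translated, and traversed from a different starting point.
base = [(0, 0), (4, 0), (4, 1), (7, 3), (4, 2), (0, 2)]
moved = [(-2 * y + 5, 2 * x - 3) for x, y in base]
moved = moved[2:] + moved[:2]

sig_a = centroid_distance_signature(base)
sig_b = centroid_distance_signature(moved)
same = all(math.isclose(p, q, rel_tol=1e-9) for p, q in zip(sig_a, sig_b))
print(same)
```

Mirroring (reversion) and shapes with several equally distant farthest points are not handled by this sketch; they would need an additional comparison step.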
Abstract (English) |
Recent advances in computer technology and in the way information is processed have increased interest in automatic information retrieval from huge databases. In particular, content-based image retrieval (CBIR) has become a hot research topic, and improving the technology behind content-based querying systems has become correspondingly more challenging. The work of CBIR includes feature selection, object representation, and matching. A good image representation should be invariant to translation, rotation, scaling, and reversion, and should tolerate deformation of query images. In this dissertation, we propose three robust and efficient methods for CBIR. First, we propose an efficient and robust shape-based image retrieval system. It introduces a shape representation method, the Mountain-Climbing-Sequence (MCS), which makes retrieval invariant to translation, rotation, scaling, and reversion. Second, we introduce a new feature, the orientation-distance histogram, and propose a CBIR system based on it. This system transforms a given image from the RGB color model to the HSV color model, detects edge points using the H component, and then evaluates the orientation-distance histogram for each detected edge point to form a feature vector. With normalization of the feature vectors and the use of MCS sequences, the system is invariant to scaling and rotation. Last, we collect a variety of features for representing real images, including the orientation-distance histogram, local edge map, wavelet coefficients, color information, and intensity information, forming feature vectors of dimension up to one thousand. Over such a large set of features, we use an improved version of the AdaBoost algorithm to select the most important features as indicated by the user, and thereby achieve effective retrieval results efficiently. |
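The role of feature normalization in scaling and rotation invariance can be sketched on a one-dimensional orientation histogram. This is a simplified illustration, not the dissertation's procedure: the function name `normalize_and_align` and the rule of shifting the largest bin to the front are assumptions, whereas the dissertation reports using MCS sequences to determine the alignment.

```python
import math


def normalize_and_align(hist):
    """Hypothetical helper: unit-sum normalization plus a circular shift.

    Dividing by the total count removes the dependence on how many edge
    points were found (which changes with image scale). Shifting the
    largest bin to the front removes a circular offset of the orientation
    bins (which is what an image rotation by a multiple of the bin width
    produces).
    """
    total = sum(hist)
    if total == 0:
        return list(hist)
    unit = [count / total for count in hist]
    start = unit.index(max(unit))
    return unit[start:] + unit[:start]


# Edge counts per orientation bin for an image, and for a scaled-down,
# rotated version of it (half the counts, bins shifted by one position).
original = [2, 8, 4, 6]
transformed = [3, 1, 4, 2]

feat_a = normalize_and_align(original)
feat_b = normalize_and_align(transformed)
match = all(math.isclose(p, q) for p, q in zip(feat_a, feat_b))
print(match)
```

Aligning on the largest bin is ambiguous when two bins tie, which is one reason a more robust alignment rule is preferable in practice.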
Abstract (third language) | |
Table of contents |
List of Figures
CHAPTER 1 Introduction
1.1 Motivation
1.2 Related Work
1.3 Organization of the Dissertation
CHAPTER 2 Shape-Based Study
2.1 Shape Representation
2.1.1 Polar Representation and Distance Sequences
2.1.2 Objects with Concavity/Convexity
2.1.3 Invariance to Transformations
2.2 Tackling Request of Invariance
2.2.1 Rotation Invariance
2.2.2 Scaling Invariance
2.2.3 Reversion Invariance
2.3 Shape Matching
2.4 Experimental Results and Comparison
2.4.1 Experiment 1: Tested on Sebastian et al. shape database
2.4.2 Experiment 2: Tested on Mokhtarian et al. shape database
CHAPTER 3 Edge-Based Study
3.1 Prompt Edge Detection
3.2 Feature Extraction
3.3 Feature Normalization
3.3.1 Scaling Invariance
3.3.2 Rotation Invariance
3.4 Similarity Measure
3.5 Experimental Results
CHAPTER 4 Combination of Various Features
4.1 Feature Description
4.2 Retrieval Method
4.3 Experimental Results
4.3.1 Performance Based on Each Type of Feature
4.3.2 Performance Based on the Combination of All Features
4.3.3 Modified Version versus the Conventional Adaboost
CHAPTER 5 Conclusion and Future Work
Bibliography
Publication List
Appendix A. Database of Shape Images
Appendix B. Database of Real Images
List of Figures
Figure 2-1. Distance d and angle θ of a contour point (x,y) related to the centroid (x, y).
Figure 2-2. Graph of polar equation for the contour shown in Figure 2-1.
Figure 2-3. The nearest intersection points are extracted from each of objects A and B.
Figure 2-4. The farthest intersection points are extracted from each of object B.
Figure 2-5. Each of the objects in (a) and (b) has two maximal distances from the centroid.
Figure 2-6. Test images provided by Sebastian et al.
Figure 3-1. (a) The original image. (b) The edge points detected based on the V component. (c) The edge points detected based on the H component.
Figure 3-2. The 8 neighbors of the center point.
Figure 3-3. (a) points of 1-distance, (b) points of 3-distance.
Figure 3-4. (a) An original image, (b) The orientation-distance histogram of (a).
Figure 3-5. (a) The 50%-scaled-down version of 3-3(a), (b) The orientation-distance histogram of 3-5(a), (c) The normalized version of 3-4(b), (d) The normalized version of 3-5(b).
Figure 3-6. (a) A rotated (by 90 degrees) version of the image in 3-4(a), (b) The orientation-distance histogram of 3-6(a), (c) The shifted and normalized result of 3-4(b) with s = 3, (d) The shifted and normalized result of 3-6(b) with s = 12.
Figure 3-7. Examples of query results of (a) the method of F. Mahmoudi et al., and (b) our method.
Figure 3-8. The Pr-Re curves of (a) the method of F. Mahmoudi et al., and (b) our method.
Figure 4-1. (a) A 3x3 window W3x3, (b) A window wLEP of binomial multipliers
Figure 4-2. (a) Gray values of pixel (x,y) and its 8 neighbors, (b) Neighborhood of (x,y) on the edge map, (c) window Tx,y, (d) The convolution value of Tx,y with wLEP is 145.
Figure 4-3. HSV color space.
Figure 4-4. The operation of the modified Adaboost. (a) The table g with the classification error rates err, (b) The table gs with error rates es for the best b features evaluated and selected in Steps 3 and 4, (c) The table G evaluated in Step 5, (d) The normalized weights and weighted errors evaluated in Step 5, (e) The final features (or weak classifiers) with their original indices selected in Step 6.
Figure 4-5. Various types of testing images.
Figure 4-6. The precision rate and recall rate based on OD feature.
Figure 4-7. The precision rate and recall rate based on LEP feature.
Figure 4-8. The precision rate and recall rate based on Wavelet Energy feature.
Figure 4-9. precision rate and recall rate based on color-based feature.
Figure 4-10. precision rate and recall rate based on intensity feature.
Figure 4-11. precision rate and recall rate based on combination of all features.
Figure 4-12. The average precision rate and recall rate based on different features.
Figure 4-13. The precision and recall rates on various values of T.
Figure 4-14. Comparison of precision and recall rates of the proposed modified version and conventional Adaboost. |
References |
[1]. J. Shanbehzadeh, A. M. E. Moghadam, and F. Mahmoudi, “Image indexing and retrieval techniques: past, present and next”, Proceedings of the SPIE: Storage and Retrieval for Multimedia Database Vol. 3972, San Jose, California, USA, 2000, pp. 461-470.
[2]. D. Zhang and G. Lu, “Review of shape representation and description techniques”, Pattern Recognition 37, 2004, pp. 1-19.
[3]. T. Wang, Y. Rui, and J. G. Sun, “Constraint based region matching for image retrieval”, International Journal of Computer Vision, 2003, pp. 37-45.
[4]. J. Peng, “Multi-class relevance feedback content-based image retrieval”, Computer Vision and Image Understanding 90, 2003, pp. 42-67.
[5]. QBIC(TM) -- IBM's Query By Image Content, http://wwwqbic.almaden.ibm.com/.
[6]. J. H. Chang, K. C. Fan, and Y. L. Chang, “Multi-modal gray-level histogram modeling and decomposition”, Image and Vision Computing 20, 2002, pp. 203-216.
[7]. R. Brunelli and O. Mich, “Histograms analysis for image retrieval”, Pattern Recognition 34, 2001, pp. 1625-1637.
[8]. H. Wei and D. Y. Y. Yun, “Illumination-invariant image indexing using directional gradient angular histogram”, Proceedings of the IASTED International Conference Computer Graphics and Imaging, Honolulu, Hawaii, USA, 2001, pp. 13-16.
[9]. Y. Chen, X. Zhou, and T. Huang, “One-class SVM for learning in image retrieval”, Proceedings of the IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001, pp. 815-818.
[10]. F. Mahmoudi, J. Shanbehzadeh, and A. M. Eftekhari-Moghadam, “Image retrieval based on shape similarity by edge orientation autocorrelogram”, Pattern Recognition 36, 2003, pp. 1725-1736.
[11]. T. Bernier and J. A. Landry, “A new method for representing and matching shapes of natural objects”, Pattern Recognition 36, 2003, pp. 1711-1723.
[12]. J. Zhang, X. Zhang, H. Krim, and G. G. Walter, “Object representation and recognition in shape spaces”, Pattern Recognition 36, 2003, pp. 1143-1154.
[13]. H. Nishida, “Structural feature indexing for retrieval of partially visible shapes”, Pattern Recognition 35, 2002, pp. 55-67.
[14]. A. Bonnassie, F. Peyrin, and D. Attali, “A new method for analyzing local shape in three-dimensional images based on medial axis transformation”, IEEE Transactions on Systems, Man, and Cybernetics-Part B, Cybernetics 33, 2003, pp. 700-705.
[15]. P. Yushkevich, P. Fletcher, S. Joshi, A. Thall, and S. M. Pizer, “Continuous medial representations for geometric object modeling in 2D and 3D”, Image and Vision Computing 21, 2003, pp. 17-27.
[16]. C. D. Ruberto, “Recognition of shapes by attributed skeletal graphs”, Pattern Recognition 37, 2004, pp. 21-23.
[17]. M. Stricker and M. Orengo, “Similarity of color images”, SPIE Storage and Retrieval for Image and Video Databases, 1995, pp. 381-392.
[18]. M. Swain and D. Ballard, “Color indexing”, International Journal of Computer Vision, Vol. 7, No. 1, 1991, pp. 11-32.
[19]. G. Lu and J. Phillips, “Using Perceptually Weighted Histograms for Colour-based Image Retrieval”, Proceedings of IEEE International Conference on Signal Processing, 1998, pp. 1150-1153.
[20]. R. L. Kashyap and A. Khotanzad, “A model-based method for rotation invariant texture classification”, IEEE Transactions on Pattern Analysis and Machine Intelligence 8, 1986, pp. 472-481.
[21]. M. Leung and A. M. Peterson, “Scale and rotation invariant texture classification”, Proceedings of the International Conference on Acoustics, Speech and Signal Processing, 1991, pp. 461-465.
[22]. M. S. Choi and W. Y. Kim, “A novel two stage template matching method for rotation and illumination invariance”, Pattern Recognition 35, 2002, pp. 119-129.
[23]. H. Araujo and J. M. Dias, “An introduction to the Log-polar mapping”, Proceedings of Cybernetic Vision, Second Workshop, 1996, pp. 139-144.
[24]. D. G. Kendall, “Shape manifolds, Procrustean metrics, and complex projective spaces”, Bulletin London Mathematics Society 16, 1984, pp. 81-121.
[25]. T. B. Sebastian, P. N. Klein, and B. B. Kimia, “Recognition of shapes by editing shock graphs”, Proceedings of the 8th IEEE International Conference on Computer Vision, ICCV 1, 2001, pp. 755-762.
[26]. F. Mokhtarian, S. Abbasi, and J. Kittler, “Robust and efficient shape indexing through curvature scale space”, Proceedings of the 6th British Machine Vision Conference, BMVC ‘96, Edinburgh, UK, 1996, pp. 53-62.
[27]. H. J. Lin, Y. T. Kao, S. H. Yen, and C. J. Wang, “A study of shape-based image retrieval”, Proceedings of the 6th International Workshop on Multimedia Network Systems and Applications, 2004, pp. 118-123.
[28]. H. J. Lin and Y. T. Kao, “A prompt contour detection method”, International Conference on the Distributed Multimedia Systems, 2001.
[29]. R. C. Gonzalez and R. E. Woods, Digital Image Processing.
[30]. F. Stein and G. Medioni, “Structural indexing: efficient 2-D object recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence 14 (12), 1992, pp. 1198-1204.
[31]. K. K. Yu and N. C. Hu, “Two-dimensional gray-level object recognition using shape-specific points”, Journal of the Chinese Institute of Engineers 24(2), 2001, pp. 245-252.
[32]. Y. Freund and R. E. Schapire, “A decision-theoretic generalization of online learning and an application to boosting”, Journal of Computer and System Sciences 55(1), 1997, pp. 119-139. |
Full-text access permissions |