§ Thesis Bibliographic Record
  
System ID U0002-1906200611070500
DOI 10.6846/TKU.2006.00559
Title (Chinese) 以內容為基礎之影像查詢研究
Title (English) A Study on Content-Based Image Retrieval
Title (third language)
University 淡江大學 (Tamkang University)
Department (Chinese) 資訊工程學系博士班
Department (English) Department of Computer Science and Information Engineering
Foreign degree university name
Foreign degree college name
Foreign degree institute name
Academic year 94 (2005–2006)
Semester 2
Year of publication 95 (2006)
Author (Chinese) 高楊達
Author (English) Yang-Ta Kao
Student ID 889190087
Degree Doctoral
Language English
Second language
Date of oral defense 2006-06-13
Number of pages 91
Committee Advisor - 林慧珍 (hjlin@cs.tku.edu.tw)
Member - 徐道義
Member - 黃俊堯
Member - 施國琛
Member - 顏淑惠
Keywords (Chinese) 影像內容查詢
爬山式序列表示法
爬山式序列及交叉點表示法
邊緣偵測
分支及黏合
方向及距離
區域邊緣點地圖
小波係數
相似度測量
比對
AdaBoost演算法
Keywords (English) content-based image retrieval (CBIR)
Mountain-Climbing-Sequence (MCS)
Mountain-Climbing-Sequence-with-Intersection-Points (MCSIP)
branch-and-bound
orientation-distance
local edge map
wavelet coefficient
similarity measure
matching
AdaBoost algorithm
Keywords (third language)
Subject classification
Abstract (Chinese)
Research on content-based image retrieval (CBIR) covers feature selection, feature representation, and result matching. Commonly used features include shape, color, spatial location, and texture. A good image representation must cope with translation, rotation, and scaling of an image, and it must still yield good matching results even when the image is damaged to a certain extent. These requirements are all important issues when images are queried by their content.
With these requirements in mind, this dissertation proposes three content-based image retrieval systems. The first focuses on object contours: a fast edge-point detection algorithm finds all likely edge points in an image, and a new object representation, the Mountain-Climbing Sequence (MCS), is proposed. This representation achieves invariance to the translation, rotation, and scaling problems mentioned above. In addition, since in our experience edge detection cannot be guaranteed to recover the complete outline of an object, we also try, within the proposed contour representation, to cope with incompletely extracted outlines; the proposed system still obtains good matching results even when a small part of the object is occluded. The second study concerns the relations between an edge point and its neighboring edge points: because the complete set of an object's edge points is hard to detect accurately in real images, we compute the relations between pairs of edge points and build an orientation-distance histogram to represent the image. This representation overcomes the scaling and rotation problems faced during retrieval.
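For illustration, the sketch below shows a centroid-distance contour signature of the general kind that the MCS representation builds on: distances from the centroid are sampled along the contour, normalized by the maximum distance for scale invariance, and matched under cyclic shifts for rotation invariance. The function names, sampling length, and matching cost are illustrative assumptions, not the dissertation's exact MCS algorithm.

```python
# A minimal sketch (illustrative only, not the dissertation's exact MCS
# algorithm) of a centroid-distance contour signature: scale is handled by
# normalizing the distances, rotation by matching over all cyclic shifts.
import numpy as np

def distance_signature(contour, n_samples=128):
    """contour: (N, 2) array of boundary points traced along an object's outline."""
    centroid = contour.mean(axis=0)
    d = np.linalg.norm(contour - centroid, axis=1)   # distance of each boundary point to the centroid
    idx = np.linspace(0, len(d) - 1, n_samples)
    d = np.interp(idx, np.arange(len(d)), d)         # resample to a fixed length
    return d / d.max()                               # normalize for scale invariance

def signature_distance(a, b):
    """Rotation invariance: compare under every cyclic shift and keep the best."""
    return min(float(np.abs(a - np.roll(b, s)).sum()) for s in range(len(b)))
```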
Finally, we combine many different kinds of image features to represent a real image, including the orientation-distance histogram, local edge-point relations, wavelet coefficients, and the color, saturation, and intensity distribution histograms, forming a feature vector of more than one thousand dimensions. For matching in this high-dimensional feature space, we use a modified AdaBoost algorithm to retrieve similar images quickly and efficiently. Experiments show that the proposed methods indeed meet the requirements of image retrieval: speed and efficiency.
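The modified AdaBoost itself is described in Chapter 4 of the dissertation; as a hedged illustration of the underlying idea, the sketch below performs conventional boosting-style feature selection, where each round picks the single feature dimension whose threshold stump has the lowest weighted error and then re-weights the training samples. All names, the number of rounds, and the stump threshold choice are assumptions for illustration only.

```python
# A minimal sketch (hypothetical, not the modified AdaBoost of Chapter 4) of
# boosting-style feature selection over a high-dimensional feature vector.
import numpy as np

def select_features(X, y, rounds=10):
    """X: (n_samples, n_features) feature vectors; y: labels in {-1, +1}."""
    n, m = X.shape
    w = np.full(n, 1.0 / n)             # sample weights
    chosen = []
    for _ in range(rounds):
        best = None
        for j in range(m):              # try a threshold stump on every feature dimension
            thr = X[:, j].mean()
            pred = np.where(X[:, j] > thr, 1, -1)
            err = w[pred != y].sum()
            if err > 0.5:               # flip the stump if it is worse than chance
                pred, err = -pred, 1.0 - err
            if best is None or err < best[0]:
                best = (err, j, pred)
        err, j, pred = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
        w *= np.exp(-alpha * y * pred)  # re-weight: misclassified samples gain weight
        w /= w.sum()
        chosen.append((j, alpha))
    return chosen                       # selected feature indices and their weights
```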
Abstract (English)
Because of recent advances in computer technology and the revolution in the way information is processed, interest in automatic information retrieval from huge databases has been growing. In particular, content-based image retrieval (CBIR) has become a hot research topic, and improving the technology behind content-based querying systems has consequently become more challenging.
The work of content-based image retrieval includes feature selection, object representation, and matching. A good image representation should meet several requirements, including invariance to translation, rotation, scaling, and reversion, and robustness to deformation of query images.
In this dissertation, we propose three robust and efficient methods for CBIR. First, an efficient and robust shape-based image retrieval system is proposed. We introduce a shape representation, the Mountain-Climbing-Sequence (MCS), which makes retrieval invariant to translation, rotation, scaling, and reversion. Second, a new feature called the orientation-distance histogram is introduced, and a CBIR system based on this feature is proposed. This system converts a given image from the RGB color model to the HSV color model, detects edge points using the H component, and then evaluates the orientation-distance histogram over the detected edge points to form a feature vector. With normalization of the feature vectors and the use of MCS sequences, this system is invariant to scaling and rotation. Last, we collect a variety of features for representing real images, including the orientation-distance histogram, local edge map, wavelet coefficients, color information, and intensity information, forming feature vectors of up to one thousand dimensions. Over such a large set of features, we use an improved version of the AdaBoost algorithm to select the most important features indicated by the user, so as to achieve effective retrieval results efficiently.
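As a rough illustration of the second system's front end, the sketch below converts an image to HSV, detects edge points on the H component, and accumulates an orientation-distance style histogram over pairs of edge points, normalized so that the feature does not depend on image size. A standard Canny detector stands in here for the dissertation's own prompt edge-detection method, and the bin counts and subsampling are illustrative assumptions rather than the exact feature of Chapter 3.

```python
# A minimal sketch of an HSV-based, H-channel orientation-distance histogram.
# Canny is a stand-in for the dissertation's prompt edge detection; the
# binning and subsampling choices below are illustrative assumptions.
import cv2
import numpy as np

def orientation_distance_histogram(bgr_image, angle_bins=8, dist_bins=4):
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    h = hsv[:, :, 0]
    edges = cv2.Canny(h, 50, 150)                    # edge points from the H channel
    ys, xs = np.nonzero(edges)
    pts = np.stack([xs, ys], axis=1).astype(float)
    if len(pts) > 400:                               # subsample for tractability (illustrative choice)
        pts = pts[np.random.choice(len(pts), 400, replace=False)]
    hist = np.zeros((angle_bins, dist_bins))
    max_d = np.hypot(*bgr_image.shape[:2])           # image diagonal, for distance normalization
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            dx, dy = pts[j] - pts[i]
            ang = (np.arctan2(dy, dx) % np.pi) / np.pi   # pair orientation in [0, 1)
            d = np.hypot(dx, dy) / max_d                 # pair distance in [0, 1)
            hist[int(ang * angle_bins), min(int(d * dist_bins), dist_bins - 1)] += 1
    return hist / max(hist.sum(), 1)                 # normalize so the feature is size-independent
```

Two images would then be compared by a distance between their normalized histograms, for example an L1 distance, before any further invariance handling.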
Abstract (third language)
Table of Contents
List of Figures	III
CHAPTER 1 Introduction	1
1.1	Motivation	1
1.2	Related Work	1
1.3	Organization of the Dissertation	4
CHAPTER 2 Shape-Based Study	5
2.1	Shape Representation	5
2.1.1 Polar Representation and Distance Sequences	5
2.1.2 Objects with Concavity/Convexity	7
2.1.3 Invariance to Transformations	8
2.2 Tackling Request of Invariance	11
2.2.1 Rotation Invariance	12
2.2.2 Scaling Invariance	12
2.2.3 Reversion Invariance	12
2.3 Shape Matching	13
2.4 Experimental Results and Comparison	15
2.4.1 Experiment 1—Tested on Sebastian et al. shape database	16
2.4.2 Experiment 2—Tested on Mokhtarian et al. shape database	18
CHAPTER 3 Edge-Based Study	20
3.1	Prompt Edge Detection	20
3.2 Feature Extraction	22
3.3 Feature Normalization	24
3.3.1 Scaling Invariance	24
3.3.2 Rotation Invariance	25
3.4 Similarity Measure	26
3.5 Experimental Results	27
CHAPTER 4 Combination of Various Features	30
4.1	Feature Description	30
4.2 Retrieval Method	34
4.3 Experimental Results	40
4.3.1 Performance Based on Each Type of Feature	42
4.3.2 Performance Based on the Combination of All Features	46
4.3.3 Modified Version versus the Conventional Adaboost	48
CHAPTER 5 Conclusion and Future Work	50
Bibliography	52
Publication List	56
Appendix A. Database of Shape Images	58
Appendix B. Database of Real Images	77

List of Figures
Figure 2-1. Distance d and angle θ of a contour point (x, y) relative to the centroid (x̄, ȳ).	6
Figure 2-2. Graph of the polar equation for the contour shown in Figure 2-1.	7
Figure 2-3. The nearest intersection points extracted from each of objects A and B.	8
Figure 2-4. The farthest intersection points extracted from object B.	8
Figure 2-5. Each of the objects in (a) and (b) has two maximal distances from the centroid.	10
Figure 2-6. Test images provided by Sebastian et al.	17
Figure 3-1. (a) The original image. (b) The edge points detected based on the V component. (c) The edge points detected based on the H component.	21
Figure 3-2. The 8 neighbors of the center point.	21
Figure 3-3. (a) Points of 1-distance, (b) points of 3-distance.	23
Figure 3-4. (a) An original image, (b) the orientation-distance histogram of (a).	23
Figure 3-5. (a) The 50%-scaled-down version of 3-3(a), (b) the orientation-distance histogram of 3-5(a), (c) the normalized version of 3-4(b), (d) the normalized version of 3-5(b).	25
Figure 3-6. (a) A rotated (by 90 degrees) version of the image in 3-4(a), (b) the orientation-distance histogram of 3-6(a), (c) the shifted and normalized result of 3-4(b) with s = 3, (d) the shifted and normalized result of 3-6(b) with s = 12.	26
Figure 3-7. Examples of query results of (a) the method of F. Mahmoudi et al., and (b) our method.	29
Figure 3-8. The Pr-Re curves of (a) the method of F. Mahmoudi et al., and (b) our method.	29
Figure 4-1. (a) A 3x3 window W3x3, (b) a window wLEP of binomial multipliers.	32
Figure 4-2. (a) Gray values of pixel (x, y) and its 8 neighbors, (b) neighborhood of (x, y) on the edge map, (c) window Tx,y, (d) the convolution value of Tx,y with wLEP is 145.	32
Figure 4-3. HSV color space.	34
Figure 4-4. The operation of the modified AdaBoost. (a) The table g with the classification error rates err, (b) the table gs with error rates es for the best b features evaluated and selected in Steps 3 and 4, (c) the table G evaluated in Step 5, (d) the normalized weights and weighted errors evaluated in Step 5, (e) the final features (or weak classifiers) with their original indices selected in Step 6.	40
Figure 4-5. Various types of testing images.	42
Figure 4-6. The precision rate and recall rate based on the OD feature.	43
Figure 4-7. The precision rate and recall rate based on the LEP feature.	44
Figure 4-8. The precision rate and recall rate based on the wavelet energy feature.	44
Figure 4-9. The precision rate and recall rate based on the color-based feature.	45
Figure 4-10. The precision rate and recall rate based on the intensity feature.	46
Figure 4-11. The precision rate and recall rate based on the combination of all features.	47
Figure 4-12. The average precision rate and recall rate based on different features.	48
Figure 4-13. The precision and recall rates for various values of T.	48
Figure 4-14. Comparison of precision and recall rates of the proposed modified version and the conventional AdaBoost.	49
References
[1].	J. Shanbehzadeh, A. M. E. Moghadam, and F. Mahmoudi, “Image indexing and retrieval techniques: past, present and next”, Proceedings of the SPIE: Storage and Retrieval for Multimedia Database Vol. 3972, San Jose, California, USA, 2000, pp. 461-470.
[2].	D. Zhang and G. Lu, “Review of shape representation and description techniques”, Pattern Recognition 37, 2004, pp. 1-19.
[3].	T. Wang, Y. Rui, and J. G. Sun, “Constraint based region matching for image retrieval”, International Journal of Computer Vision, 2003, pp. 37-45.
[4].	J. Peng, “Multi-class relevance feedback content-based image retrieval”, Computer Vision and Image Understanding 90, 2003, pp. 42-67.
[5].	QBIC(TM) -- IBM's Query By Image Content, http://wwwqbic.almaden.ibm.com/.
[6].	J. H. Chang, K. C. Fan, and Y. L. Chang, “Multi-modal gray-level histogram modeling and decomposition”, Image and Vision Computing 20, 2002, pp. 203-216.
[7].	R. Brunelli and O. Mich, “Histograms analysis for image retrieval”, Pattern Recognition 34, 2001, pp. 1625-1637.
[8].	H. Wei and D. Y. Y. Yun, “Illumination-invariant image indexing using directional gradient angular histogram”, Proceedings of the IASTED International Conference Computer Graphics and Imaging, Honolulu, Hawaii, USA, 2001, pp. 13-16.
[9].	Y. Chen, X. Zhou, and T. Huang, “One-class SVM for learning in image retrieval”, Proceedings of the IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001, pp. 815-818.
[10].	F. Mahmoudi, J. Shanbehzadeh, and A. M. Eftekhari-Moghadam, “Image retrieval based on shape similarity by edge orientation autocorrelogram”, Pattern Recognition 36, 2003, pp. 1725-1736.
[11].	T. Bernier and J. A. Landry, “A new method for representing and matching shapes of natural objects”, Pattern Recognition 36, 2003, pp. 1711-1723.
[12].	J. Zhang, X. Zhang, H. Krim, and G. G. Walter, “Object representation and recognition in shape spaces”, Pattern Recognition 36, 2003, pp. 1143-1154.
[13].	H. Nishida, “Structural feature indexing for retrieval of partially visible shapes”, Pattern Recognition 35, 2002, pp. 55-67.
[14].	A. Bonnassie, F. Peyrin, and D. Attali, “A new method for analyzing local shape in three-dimensional images based on medial axis transformation”, IEEE Transactions on Systems, Man, and Cybernetics-Part B, Cybernetics 33, 2003, pp. 700-705.
[15].	P. Yushkevich, P. Fletcher, S. Joshi, A. Thall, and S. M. Pizer, “Continuous medial representations for geometric object modeling in 2D and 3D”, Image and Vision Computing 21, 2003, pp. 17-27.
[16].	C. D. Ruberto, “Recognition of shapes by attributed skeletal graphs”, Pattern Recognition 37, 2004, pp. 21-23.
[17].	M. Stricker and M. Orengo, “Similarity of color images”, SPIE Storage and Retrieval for Image and Video Databases, 1995, pp. 381-392.
[18].	M. Swain and D. Ballard, “Color indexing”, International Journal of Computer Vision, Vol. 7, No. 1, 1991, pp. 11-32.
[19].	G. Lu and J. Phillips, “Using Perceptually Weighted Histograms for Colour-based Image Retrieval”, Proceedings of IEEE International Conference on Signal Processing, 1998, pp. 1150-1153.
[20].	R. L. Kashyap and A. Khotanzad, “A model-based method for rotation invariant texture classification”, IEEE Transactions on Pattern Analysis and Machine Intelligence 8, 1986, pp. 472-481.
[21].	M. Leung and A. M. Peterson, “Scale and rotation invariant texture classification”, Proceedings of the International Conference on Acoustics, Speech and Signal Processing, 1991, pp. 461-465.
[22].	M. S. Choi and W. Y. Kim, “A novel two stage template matching method for rotation and illumination invariance”, Pattern Recognition 35, pp. 119-129, 2002.
[23].	H. Araujo and J. M. Dias, “An introduction to the Log-polar mapping”, Proceedings of Cybernetic Vision, Second Workshop, pp. 139-144, 1996.
[24].	D. G. Kendall, “Shape manifolds, Procrustean metrics, and complex projective spaces”, Bulletin London Mathematics Society 16, pp. 81-121, 1984.
[25].	T. B. Sebastian, P. N. Klein, and B. B. Kimia, “Recognition of shapes by editing shock graphs”, Proceedings of the 8th IEEE International Conference on Computer Vision, ICCV 1, pp. 755-762, 2001.
[26].	F. Mokhtarian, S. Abbasi, and J. Kittler, “Robust and efficient shape indexing through curvature scale space”, Proceedings of the 6th British Machine Vision Conference, BMVC ‘96, Edinburgh, UK, 1996, pp. 53-62.
[27].	H. J. Lin, Y. T. Kao, S. H. Yen, and C. J. Wang, “A study of shape-based image retrieval”, Proceedings of the 6th International Workshop on Multimedia Network Systems and Applications, 2004, pp. 118-123.
[28].	H. J. Lin and Y. T. Kao, “A prompt contour detection method”, Proceedings of the International Conference on Distributed Multimedia Systems, 2001.
[29].	R. C. Gonzalez and R. E. Woods, Digital Image Processing.
[30].	F. Stein and G. Medioni, “Structural indexing: efficient 2-D object recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence 14 (12), 1992, pp. 1198-1204.
[31].	K. K. Yu and N. C. Hu, “Two-dimensional gray-level object recognition using shape-specific points”, Journal of the Chinese Institute of Engineers 24(2), 2001, pp. 245-252.
[32].	Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting”, Journal of Computer and System Sciences 55(1), 1997, pp. 119-139.
Full-Text Availability
On campus
Print copy available on campus from 2006-09-15
Electronic full text authorized for on-campus access
Electronic copy available on campus from 2006-09-15
Bibliographic record available on campus immediately
Off campus
Authorization granted
Electronic copy available off campus from 2006-09-15
