§ Thesis bibliographic record
  
System ID U0002-2007201709245400
DOI 10.6846/TKU.2017.00699
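Thesis title (Chinese) 基於累積切片平均估計的非線性維度縮減法
Thesis title (English) Nonlinear dimension reduction based on the cumulative slicing mean estimation
Thesis title (third language)
University 淡江大學 (Tamkang University)
Department (Chinese) 數學學系碩士班
Department (English) Department of Mathematics
Foreign degree: university
Foreign degree: college
Foreign degree: graduate institute
Academic year 105 (ROC calendar, 2016-2017)
Semester 2
Publication year 106 (ROC calendar, 2017)
Author (Chinese) 王子豪
Author (English) Tzu-Hao Wang
Student ID 604190206
Degree Master's
Language English
Second language
Oral defense date 2017-07-14
Number of pages 57
Oral defense committee Advisor - 吳漢銘 (hmwu@gm.ntpu.edu.tw)
Co-advisor - 黃逸輝 (yhhuang@mail.tku.edu.tw)
Committee member - 蘇家玉 (emilysu@tmu.edu.tw)
Committee member - 陳怡如 (viviyjchen@stat.tku.edu.tw)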
Keywords (Chinese) 累積切片估計
等距特徵映射
流形學習
非線性維度縮減
切片逆迴歸
Keywords (English) Cumulative slicing estimation
isometric feature mapping
manifold learning
nonlinear dimension reduction
sliced inverse regression
Keywords (third language)
Subject classification
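Chinese abstract (translated)
Nonlinear dimension reduction for manifold learning has been studied
extensively in the literature. Among the proposed methods, isometric sliced
inverse regression (ISOSIR), a semi-supervised learning algorithm, has been
shown to explore effectively the geometric structure hidden in nonlinear
manifold data such as the Swiss roll. ISOSIR applies K-means as its base
clustering method to a precomputed isometric distance matrix of the data
set. However, the ordering information of the response variable within and
between clusters is ignored after clustering, although ordering structure is
one of the most important characteristics of nonlinear data. In addition,
when the data carry class labels, the computation of the isometric distance
matrix does not take them into account. In this study, we extend ISOSIR and
isometric cumulative slicing mean estimation and propose a supervised
algorithm that addresses both problems. Simulation studies and real data
analyses show that the proposed method reveals the geometric structure of
nonlinear manifold data and performs comparably to supervised ISOSIR. We
further apply the extracted low-dimensional features to classification and
regression problems on real data.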
English abstract
Nonlinear dimension reduction for manifold learning has been studied
extensively in the literature. Among the proposed methods, isometric sliced
inverse regression (ISOSIR), a semi-supervised learning algorithm, has been
shown to be useful for exploring the embedded geometric structure of
nonlinear manifold data sets such as the Swiss roll. ISOSIR applies K-means
as a base clustering method to the pre-calculated isometric distance matrix
of the data set. However, the ordering information of the response, both
within and between the resulting clusters, is ignored, even though the
ordering structure is one of the most important characteristics of a
nonlinear manifold data set. In addition, the construction of the isometric
distance matrix does not consider the class labels of the data when they
are available. In this study, we address these two defects and propose
supervised extensions of ISOSIR and of isometric cumulative slicing mean
estimation. Simulation studies and real data analyses show that the
proposed method can reveal the geometric structure of a nonlinear manifold
data set, with results comparable to those of the supervised ISOSIR. We
further investigate the use of the extracted features in classification and
regression problems on real-world data sets.
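As an illustration of the cumulative slicing idea named in the abstract, the following is a minimal NumPy sketch of a plain (non-isometric, unsupervised-distance) cumulative slicing estimator. It is not the thesis's implementation. It assumes a continuous response without ties, uniform weights, and a nonsingular sample covariance. The function name `cume_directions` and the toy data are hypothetical, and the isometric and supervised extensions described above are not shown.

```python
import numpy as np

def cume_directions(X, y, n_dirs=1):
    """Sketch of cumulative slicing estimation (assumed uniform weights)."""
    n = X.shape[0]
    # Standardize predictors: Z = (X - mean) @ Sigma^{-1/2}
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Row j of cum_means estimates m(y_(j)) = E[Z * 1{Y <= y_(j)}],
    # obtained as a running sum over the response-sorted sample.
    order = np.argsort(y, kind="stable")
    cum_means = np.cumsum(Z[order], axis=0) / n
    # Kernel matrix: average of m(y_j) m(y_j)^T over observed responses
    kernel = cum_means.T @ cum_means / n
    _, vecs = np.linalg.eigh(kernel)
    top = vecs[:, ::-1][:, :n_dirs]      # leading eigenvectors
    B = inv_sqrt @ top                   # map back to the original X scale
    return B / np.linalg.norm(B, axis=0)

# Toy single-index example (hypothetical data): y depends on X only via X[:, 0]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + 0.5 * X[:, 0] ** 3 + 0.1 * rng.normal(size=500)
B = cume_directions(X, y, n_dirs=1)
print(np.round(np.abs(B[:, 0]), 2))
```

Because the estimator accumulates means over every observed response value, it needs no choice of slice count, which is the usual motivation for cumulative slicing over fixed-slice SIR.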
Third-language abstract
Table of contents
1 Introduction
2 Brief review of dimension reduction techniques
2.1 The classical SIR
2.2 The cumulative slicing estimation (CUME)
2.3 The geodesic distance approximation and ISOMAP
2.4 The isometric SIR (ISOSIR)
3 Extensions of ISOSIR and CUME
3.1 The geodesic distance approximation revisited
3.2 The extensions of ISOSIR and ISOCUME
4 The slicing and the seriation strategies
4.1 The slicing strategy for SIR-based methods
4.2 The seriation strategy for CUME-based methods
5 Simulation studies
6 Applications
6.1 Data visualization
6.2 Classification problems
6.3 Regression problems
7 Conclusion and discussion
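Full-text access rights
On campus
Print thesis available on campus immediately
Author consents to on-campus access to the electronic full text
Electronic thesis available on campus immediately
Off campus
Authorization granted
Electronic thesis available off campus immediately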
