Tamkang University Chueh Sheng Memorial Library (TKU Library)
(Electronic full text may be downloaded only from Tamkang University IP addresses.)
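System ID U0002-2007201709245400
Title (Chinese) 基於累積切片平均估計的非線性維度縮減法
Title (English) Nonlinear dimension reduction based on the cumulative slicing mean estimation
Institution Tamkang University (淡江大學)
Department (Chinese) 數學學系碩士班
Department (English) Department of Mathematics (Master's Program)
Academic year 105 (ROC calendar; 2016-2017)
Semester 2
Year of publication 106 (ROC calendar; 2017)
Student name (Chinese) 王子豪
Student name (English) Tzu-Hao Wang
Student ID 604190206
Degree Master's
Language English
Oral defense date 2017-07-14
Number of pages 60
Oral defense committee Advisor: 吳漢銘
Co-advisor: 黃逸輝
Committee member: 蘇家玉
Committee member: 陳怡如
Keywords (Chinese) 累積切片估計  等距特徵映射  流形學習  非線性維度縮減  切片逆迴歸
Keywords (English) Cumulative slicing estimation  isometric feature mapping  manifold learning  nonlinear dimension reduction  sliced inverse regression
Subject classification Natural sciences: Mathematics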
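Abstract (translated from Chinese) Nonlinear dimension reduction for manifold
learning has been studied extensively in the literature. Among the proposed
methods, isometric sliced inverse regression (ISOSIR), a semi-supervised
learning algorithm, has been shown to explore effectively the geometric
structure hidden in nonlinear manifold data such as the Swiss roll. ISOSIR
uses K-means as its base clustering method, applied to a precomputed
isometric distance matrix of the data set. After clustering, however, the
ordering information of the response within and between clusters is ignored,
even though ordering structure is one of the most important features of
nonlinear data. In addition, when the data carry class labels, the
computation of the isometric distance matrix does not take them into
account. In this study we extend ISOSIR and isometric cumulative slicing
mean estimation and propose supervised algorithms that address these two
problems. Simulation studies and real data analyses show that the proposed
method can reveal the geometric structure of nonlinear manifold data and
performs comparably to the supervised ISOSIR. We further study the use of
the extracted low-dimensional features in classification and regression
problems on real data.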
Abstract (English) A number of studies in the literature have addressed
nonlinear dimension reduction for manifold learning. Among the resulting
methods, isometric sliced inverse regression (ISOSIR), a semi-supervised
learning algorithm, has been proposed and shown to be useful for exploring
the embedded geometric structure of nonlinear manifold data sets such as
the Swiss roll. ISOSIR applies K-means as a base clustering method to the
pre-calculated isometric distance matrix of the data set. However, the
ordering information of the response, both within and between the resulting
clusters, is ignored, although ordering structure is one of the most
important characteristics of a nonlinear manifold data set. Moreover, the
construction of the isometric distance matrix does not use the class labels
of the data when they are available. In this study we address these two
shortcomings by proposing supervised extensions of ISOSIR and of isometric
cumulative slicing mean estimation. We conducted simulation studies and
real data analyses and showed that the proposed method can reveal the
geometric structure of a nonlinear manifold data set, with results
comparable to those of the supervised ISOSIR. We further investigated the
use of the extracted features in classification and regression problems on
real-world data sets.
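The "isometric distance matrix" mentioned in the abstract can be illustrated with a small sketch. It assumes the usual ISOMAP-style approximation (a k-nearest-neighbour graph followed by all-pairs shortest paths); the record itself does not give the thesis's exact construction, and the function name `geodesic_distances`, the parameter `n_neighbors`, and the half-circle toy data are illustrative choices, not taken from the thesis.

```python
import numpy as np

def geodesic_distances(X, n_neighbors=5):
    """Approximate geodesic distances with a kNN graph and shortest paths.

    Illustrative ISOMAP-style sketch; not the thesis's implementation.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep only edges to each point's k nearest neighbours (symmetrised);
    # every other pair starts at infinite distance.
    graph = np.full((n, n), np.inf)
    nearest = np.argsort(euclid, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    graph[rows, cols] = euclid[rows, cols]
    graph[cols, rows] = euclid[cols, rows]
    np.fill_diagonal(graph, 0.0)
    # Floyd-Warshall all-pairs shortest paths: O(n^3), fine for small n.
    for k in range(n):
        graph = np.minimum(graph, graph[:, [k]] + graph[[k], :])
    return graph

# Toy manifold: 100 points on a half circle of radius 1.
theta = np.linspace(0.0, np.pi, 100)
X = np.column_stack([np.cos(theta), np.sin(theta)])

D = geodesic_distances(X, n_neighbors=4)
end_to_end = D[0, -1]                      # distance measured along the curve
chord = np.linalg.norm(X[0] - X[-1])       # straight-line distance
print(f"graph distance {end_to_end:.3f} vs. Euclidean distance {chord:.3f}")
```

On this toy curve the graph distance between the two end points follows the arc rather than the straight chord, which is the property that lets a clustering or slicing step applied to such a matrix respect the manifold's geometry.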
Table of contents 1 Introduction
2 Brief review of dimension reduction techniques
2.1 The classical SIR
2.2 The cumulative slicing estimation (CUME)
2.3 The geodesic distance approximation and ISOMAP
2.4 The isometric SIR (ISOSIR)
3 Extensions of ISOSIR and CUME
3.1 The geodesic distance approximation revisited
3.2 The extensions of ISOSIR and ISOCUME
4 The slicing and the seriation strategies
4.1 The slicing strategy for SIR-based methods
4.2 The seriation strategy for CUME-based methods
5 Simulation studies
6 Applications
6.1 Data visualization
6.2 Classification problems
6.3 Regression problems
7 Conclusion and discussion
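Usage permissions
  • The author grants, free of charge, permission for the print copy to be reproduced by in-library readers for academic purposes; available from 2017-07-25.
  • The author authorizes browsing and printing of the electronic full text; available from 2017-07-25.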