Sangam: A Confluence of Knowledge Streams

Cluster-Based Bounded Influence Regression

dc.contributor Statistics
dc.contributor Birch, Jeffrey B.
dc.contributor Ye, Keying
dc.contributor Anderson-Cook, Christine M.
dc.contributor Terrell, George R.
dc.contributor Smith, Eric P.
dc.creator Lawrence, David E.
dc.date 2014-03-14T20:14:32Z
dc.date 2003-07-17
dc.date 2003-07-30
dc.date 2006-10-12
dc.date 2003-08-14
dc.date.accessioned 2023-02-28T18:22:13Z
dc.date.available 2023-02-28T18:22:13Z
dc.identifier etd-07302003-183651
dc.identifier http://hdl.handle.net/10919/28455
dc.identifier http://scholar.lib.vt.edu/theses/available/etd-07302003-183651/
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/269792
dc.description In linear regression analysis, a single outlier can dramatically influence ordinary least squares estimation, while low-breakdown procedures such as M regression and bounded influence regression may be unable to combat a small percentage of outliers. A high-breakdown procedure such as least trimmed squares (LTS) regression can accommodate up to 50% of the data (in the limit) being outlying with respect to the general trend. Two available one-step improvement procedures based on LTS are Mallows 1-step (M1S) regression and Schweppe 1-step (S1S) regression (the current state-of-the-art method). Issues with these methods include (1) computational approximations and sub-sampling variability, (2) dramatic coefficient sensitivity to very slight differences in initial values, (3) internal instability when determining the general trend, and (4) performance in low-breakdown scenarios. A new high-breakdown regression procedure is introduced that addresses these issues and offers an insightful summary of the presence and structure of multivariate outliers. The proposed method blends a cluster analysis phase with a controlled bounded influence regression phase and is therefore referred to as cluster-based bounded influence regression, or CBI. The data space is represented by a special set of anchor points, and a collection of point-addition OLS regression estimators forms the basis of a metric that defines the similarity between any two observations. Cluster analysis then yields a main cluster "halfset" of observations, with the remaining observations forming one or more minor clusters. An initial regression estimator arises from the main cluster, and a multiple-point-addition DFFITS argument is used to carefully activate the minor clusters through a bounded influence regression framework. CBI achieves a 50% breakdown point; is regression equivariant, scale equivariant, and affine equivariant; and is asymptotically normal.
Case studies and Monte Carlo studies demonstrate the performance advantage of CBI over S1S and the other high-breakdown methods in coefficient stability, scale estimation, and standard errors. A dendrogram of the clustering process is one graphical display available for multivariate outlier detection. Overall, the proposed methodology represents an advancement in the field of robust regression, offering a distinct philosophical viewpoint on data analysis and the marriage of estimation with diagnostic summary.
dc.description Ph. D.
dc.format application/pdf
dc.publisher Virginia Tech
dc.relation Chapter3.pdf
dc.relation Back.pdf
dc.relation Front.pdf
dc.relation Chapter2.pdf
dc.relation Chapter4.pdf
dc.relation Chapter6.pdf
dc.relation Chapter8.pdf
dc.relation Chapter9.pdf
dc.relation Chapter7.pdf
dc.relation Chapter5.pdf
dc.relation Chapter1.pdf
dc.rights In Copyright
dc.rights http://rightsstatements.org/vocab/InC/1.0/
dc.subject High-breakdown
dc.subject Robust
dc.subject Linear
dc.subject Outlier
dc.subject LTS
dc.title Cluster-Based Bounded Influence Regression
dc.type Dissertation


Files in this item

File          Size      Format
Back.pdf      225.9 KB  application/pdf
Chapter1.pdf  261.0 KB  application/pdf
Chapter2.pdf  285.1 KB  application/pdf
Chapter3.pdf  141.8 KB  application/pdf
Chapter4.pdf  179.2 KB  application/pdf
Chapter5.pdf  594.9 KB  application/pdf
Chapter6.pdf  381.5 KB  application/pdf
Chapter7.pdf  451.7 KB  application/pdf
Chapter8.pdf  390.9 KB  application/pdf
Chapter9.pdf  128.9 KB  application/pdf
Front.pdf     236.7 KB  application/pdf
