Diagnostics
wls
LinRegOutliers.OrdinaryLeastSquares.wls — Function
wls(X, y, wts)
Estimate a weighted least squares regression and create an OLS object with the estimated parameters.
Arguments
X::AbstractMatrix{Float64}: Design matrix.
y::AbstractVector{Float64}: Response vector.
wts::AbstractVector{Float64}: Weights vector.
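As a rough illustration of what a weighted least squares fit computes, the sketch below solves the weighted normal equations directly. It is not the package's implementation: `wls_sketch` is a hypothetical helper, the data are synthetic, and it returns only the coefficient vector rather than an OLS object.

```julia
using LinearAlgebra

# Hypothetical helper (not the package's source): solve the weighted
# normal equations (X'WX) beta = X'Wy with W = Diagonal(w).
function wls_sketch(X::AbstractMatrix{Float64}, y::AbstractVector{Float64},
                    w::AbstractVector{Float64})
    W = Diagonal(w)
    return (X' * W * X) \ (X' * W * y)
end

# Synthetic data, for illustration only.
x = collect(1.0:10.0)
X = hcat(ones(10), x)
y = 2.0 .+ 3.0 .* x
y[8:10] .+= 50.0                        # contaminate the last three responses

w = ones(10)
w[8:10] .= 0.0                          # zero weight removes those rows from the fit

beta_wls = wls_sketch(X, y, w)
beta_ols_subset = X[1:7, :] \ y[1:7]    # ordinary least squares on the retained rows
```

A weight of zero drops an observation from the fit, so both coefficient vectors agree here. The documented example uses the same idea when it zeroes the weights of observations 15 to 20.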
Examples
julia> X = hcat(ones(24), phones[:,"year"]);
julia> y = phones[:,"calls"];
julia> w = ones(24);
julia> w[15:20] .= 0.0;
julia> reg = wls(X, y, w);
julia> reg.betas
2-element Vector{Float64}:
-63.481644325290425
1.3040571939231453
dffit
LinRegOutliers.Diagnostics.dffit — Function
dffit(setting, i)
Calculate the effect of the ith observation on the linear regression fit.
Arguments
setting::RegressionSetting: A regression setting object.
i::Int: Index of the observation.
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> dffit(reg, 1)
2.3008326745719785
julia> dffit(reg, 15)
2.7880619386124295
julia> dffit(reg, 16)
3.1116532421969794
julia> dffit(reg, 17)
4.367981450347031
julia> dffit(reg, 21)
-5.81610150322166
References
Belsley, David A., Edwin Kuh, and Roy E. Welsch. Regression diagnostics: Identifying influential data and sources of collinearity. Vol. 571. John Wiley & Sons, 2005.
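The sketch below assumes that dffit measures the unscaled change in the ith fitted value when observation i is left out. That is a reading of the example output above, not something confirmed from the package source. The scaled DFFITS statistic of Belsley, Kuh, and Welsch divides this change by an additional standard error term. `dffit_sketch` and the data are hypothetical.

```julia
using LinearAlgebra

# Synthetic data, for illustration only.
x = collect(1.0:12.0)
X = hcat(ones(12), x)
y = 1.0 .+ 0.5 .* x .+ sin.(x)

# Hypothetical helper: change in the i-th fitted value when row i is dropped,
# obtained by refitting the model without that row.
function dffit_sketch(X, y, i)
    keep = setdiff(1:size(X, 1), i)
    beta_full = X \ y
    beta_loo = X[keep, :] \ y[keep]
    return dot(X[i, :], beta_full) - dot(X[i, :], beta_loo)
end

# Equivalent closed form that needs no refit: h_ii * e_i / (1 - h_ii).
H = X * inv(X' * X) * X'
e = y - H * y
closed_form = [H[i, i] * e[i] / (1 - H[i, i]) for i in 1:12]
brute_force = [dffit_sketch(X, y, i) for i in 1:12]
```

The closed form is usually preferred in practice because it avoids one refit per observation.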
dffits
LinRegOutliers.Diagnostics.dffits — Function
dffits(setting)
Calculate dffit for all observations.
Arguments
setting::RegressionSetting: A regression setting object.
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> dffits(reg)
24-element Vector{Float64}:
2.3008326745719785
1.2189579001467337
0.35535667547543426
-0.14458523141740898
-0.5558346324846752
-0.8441316814464983
-1.0329184407957257
-1.16600692151232
-1.2005633711667656
-1.2549187193476428
-1.3195581500053777
-1.42383876236147
-1.5917690629803474
-1.6582086833534504
2.7880619386124295
3.1116532421969794
4.367981450347031
5.927603041427858
8.442860517217582
12.370243663029527
-5.81610150322166
-10.089153963127842
-12.10803256546825
-14.67006851119936
References
Belsley, David A., Edwin Kuh, and Roy E. Welsch. Regression diagnostics: Identifying influential data and sources of collinearity. Vol. 571. John Wiley & Sons, 2005.
hatmatrix
LinRegOutliers.Diagnostics.hatmatrix — Function
hatmatrix(setting)
Calculate the n x n hat matrix for a given regression setting with n observations.
Arguments
setting::RegressionSetting: A regression setting object.
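For a design matrix X with full column rank, the hat matrix is conventionally defined as H = X (X'X)^-1 X'. The sketch below builds it for a small synthetic design, not the phones data. Whether hatmatrix evaluates exactly this expression internally is an assumption.

```julia
using LinearAlgebra

# Synthetic design matrix with an intercept column and one regressor.
x = collect(1.0:8.0)
X = hcat(ones(8), x)

# Textbook definition of the hat (projection) matrix.
H = X * inv(X' * X) * X'

size(H)      # (8, 8): one row and one column per observation
leverage = diag(H)
```

H is symmetric and idempotent, and its trace equals the number of columns of X. Its diagonal holds the leverage values used by several of the other diagnostics on this page.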
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> size(hatmatrix(reg))
(24, 24)
studentizedResiduals
LinRegOutliers.Diagnostics.studentizedResiduals — Function
studentizedResiduals(setting)
Calculate Studentized residuals for a given regression setting.
Arguments
setting::RegressionSetting: A regression setting object.
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> studentizedResiduals(reg)
24-element Vector{Float64}:
0.2398783264505892
0.1463945666608097
0.04934549995087145
-0.023289236798461784
-0.10408303320973748
-0.18382934382804111
-0.2609395640240455
-0.33934473417314376
-0.3973205657179429
-0.46258080183149236
-0.5261488085924144
-0.5918396227060093
-0.6616423337899147
-0.6611792918262785
1.0277190922689816
1.0297863954540103
1.2712201589839855
1.4974523565936426
1.8386296155264197
2.316394853333409
-0.9368354141338643
-1.4009989983319822
-1.4541520919831887
-1.529459974327181
adjustedResiduals
LinRegOutliers.Diagnostics.adjustedResiduals — Function
adjustedResiduals(setting)
Calculate adjusted residuals for a given regression setting.
Arguments
setting::RegressionSetting: A regression setting object.
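The example outputs listed for adjustedResiduals and studentizedResiduals differ by a constant factor of roughly 56.2. That is consistent with the textbook definitions: adjusted residuals are e_i / sqrt(1 - h_ii), and internally Studentized residuals divide those by the residual standard error s. The sketch below illustrates that relation on synthetic data. It is an inference from the printed values, not the package's code.

```julia
using LinearAlgebra

# Synthetic data, for illustration only.
x = collect(1.0:10.0)
X = hcat(ones(10), x)
y = 3.0 .- 0.7 .* x .+ cos.(x)
n, p = size(X)

H = X * inv(X' * X) * X'
h = diag(H)                        # leverages
e = y - H * y                      # ordinary residuals
s = sqrt(sum(abs2, e) / (n - p))   # residual standard error

adjusted = e ./ sqrt.(1 .- h)      # residuals corrected for leverage
studentized = adjusted ./ s        # internally Studentized residuals
```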
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> adjustedResiduals(reg)
24-element Vector{Float64}:
13.486773572526268
8.2307993473897
2.774371467851612
-1.3093999279776498
-5.851901346871404
-10.335509559699863
-14.670907823058053
-19.07911256736661
-22.338710565623828
-26.00786250934617
-29.58187157605512
-33.27523207616458
-37.19977737822219
-37.173743587631165
57.781855070799956
57.898085871534626
71.47231139524963
84.19185329435882
103.37399662263209
130.23557965295348
-52.6720662600165
-78.76891816539992
-81.75736547266746
-85.9914301855088
jacknifedS
LinRegOutliers.Diagnostics.jacknifedS — Function
jacknifedS(setting, k)
Estimate the standard error of the regression when the kth observation is dropped.
Arguments
setting::RegressionSetting: A regression setting object.
k::Int: Index of the omitted observation.
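A minimal sketch of the leave-one-out standard error, assuming it is the usual residual standard error of the model refitted without row k. `jackknifed_s_sketch` and `shortcut` are hypothetical names, and the data are synthetic.

```julia
using LinearAlgebra

# Synthetic data, for illustration only.
x = collect(1.0:10.0)
X = hcat(ones(10), x)
y = 3.0 .- 0.7 .* x .+ cos.(x)
n, p = size(X)

# Hypothetical helper: residual standard error after refitting without row k.
function jackknifed_s_sketch(X, y, k)
    keep = setdiff(1:size(X, 1), k)
    Xk, yk = X[keep, :], y[keep]
    r = yk - Xk * (Xk \ yk)
    return sqrt(sum(abs2, r) / (length(yk) - size(X, 2)))
end

# Shortcut without refitting: SSE(k) = SSE - e_k^2 / (1 - h_kk).
H = X * inv(X' * X) * X'
e = y - H * y
shortcut(k) = sqrt((sum(abs2, e) - e[k]^2 / (1 - H[k, k])) / (n - p - 1))
```

Both routes give the same value for every k. The shortcut only needs the full-sample residuals and leverages.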
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> jacknifedS(reg, 2)
57.518441664761035
julia> jacknifedS(reg, 15)
56.14810222161477
cooks
LinRegOutliers.Diagnostics.cooks — Function
cooks(setting)
Calculate Cook's distances for all observations in a regression setting.
Arguments
setting::RegressionSetting: A regression setting object.
Examples
julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> cooks(reg)
24-element Vector{Float64}:
0.005344774190779822
0.0017088194691033689
0.00016624914057962608
3.1644452583114795e-5
0.0005395058666404081
0.0014375008774859539
0.0024828140956511258
0.0036279720445167277
0.004357605989540906
0.005288503758364767
0.006313578057565415
0.0076561205696857254
0.009568574875389256
0.009970039008782357
0.02610396373381051
0.029272523880917646
0.05091236198400663
0.08176555044049343
0.14380266904640235
0.26721539425047447
0.051205153558783356
0.13401084683481085
0.16860324592350226
0.2172819114905912
References
Cook, R. Dennis. "Detection of influential observation in linear regression." Technometrics 19.1 (1977): 15-18.
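Cook's distance for observation i can be written either as a scaled distance between the full and leave-one-out coefficient vectors, or in closed form from residuals and leverages. The sketch below computes both on synthetic data. It follows the definition in Cook (1977) and is not taken from the package source. `cooks_refit` is a hypothetical helper.

```julia
using LinearAlgebra

# Synthetic data, for illustration only.
x = collect(1.0:10.0)
X = hcat(ones(10), x)
y = 3.0 .- 0.7 .* x .+ cos.(x)
n, p = size(X)

H = X * inv(X' * X) * X'
h = diag(H)
e = y - H * y
s2 = sum(abs2, e) / (n - p)

# Closed form: D_i = e_i^2 * h_ii / (p * s^2 * (1 - h_ii)^2).
cooks_closed = (e .^ 2 ./ (p * s2)) .* h ./ (1 .- h) .^ 2

# Definition: scaled distance between full and leave-one-out coefficients.
beta = X \ y
function cooks_refit(i)
    keep = setdiff(1:n, i)
    d = beta - (X[keep, :] \ y[keep])
    return dot(d, (X' * X) * d) / (p * s2)
end
```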
cooksoutliers
LinRegOutliers.Diagnostics.cooksoutliers — Function
cooksoutliers(setting; alpha = 0.5)
Calculate Cook's distances for a given regression setting and report the potential outliers.
Arguments
setting::RegressionSetting: RegressionSetting object with a formula and dataset.
alpha::Float: Probability for cutoff value. quantile(Fdist(p, n-p), alpha) is used for cutoff. Default is 0.5.
Output
["distance"]: Cook's distances.
["cutoff"]: Quantile of the F distribution.
["potentials"]: Vector of indices of potential regression outliers.
mahalanobisSquaredMatrix
LinRegOutliers.Diagnostics.mahalanobisSquaredMatrix — Function
mahalanobisSquaredMatrix(data::DataFrame; meanvector=nothing, covmatrix=nothing)
Calculate Mahalanobis distances.
Arguments
data::DataFrame: A DataFrame object of the multivariate data.
meanvector::AbstractVector{Float64}: Optional mean vector of variables.
covmatrix::AbstractMatrix{Float64}: Optional covariance matrix of data.
References
Mahalanobis, Prasanta Chandra. "On the generalized distance in statistics." National Institute of Science of India, 1936.
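The function name suggests that the result is an n x n matrix whose diagonal holds the squared Mahalanobis distances. The text here does not state the return shape, so treat that as an assumption. Under it, the computation can be sketched with a plain matrix in place of a DataFrame, using the sample mean and sample covariance as stand-ins for the optional arguments.

```julia
using LinearAlgebra, Statistics

# Synthetic data: 6 observations of 2 variables.
data = [1.0 2.0; 2.0 1.5; 3.0 3.5; 4.0 3.0; 5.0 5.5; 6.0 4.0]
n, p = size(data)

mu = vec(mean(data, dims=1))   # stand-in for the optional mean vector
S = cov(data)                  # stand-in for the optional covariance matrix (n - 1 denominator)

Z = data .- mu'                # centered observations, one per row
M = Z * inv(S) * Z'            # n x n matrix of Mahalanobis cross-products
d2 = diag(M)                   # squared Mahalanobis distance of each observation
```

With the sample covariance, the squared distances sum to (n - 1) * p, which gives a quick sanity check.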
dfbeta
LinRegOutliers.Diagnostics.dfbeta — Function
dfbeta(setting, omittedIndex)
Apply the DFBETA diagnostic for a given regression setting and observation index.
Arguments
setting::RegressionSetting: A regression setting object.
omittedIndex::Int: Index of the omitted observation.
Example
julia> setting = createRegressionSetting(@formula(calls ~ year), phones);
julia> dfbeta(setting, 1)
2-element Vector{Float64}:
9.643915678524024
-0.14686166007904422
dfbetas
LinRegOutliers.Diagnostics.dfbetas — Function
dfbetas(setting)
Apply the DFBETA diagnostic to all observations of a given regression setting.
Arguments
setting::RegressionSetting: A regression setting object.
See also: dfbeta
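DFBETA for observation i is the change in the coefficient vector when that observation is omitted. The sketch below takes it as the full-sample estimate minus the leave-one-out estimate. That sign convention is an assumption, chosen because it matches the example output for dfbeta. Stacking one row per observation for the all-observations case is also an assumption about the shape dfbetas returns. `dfbeta_refit` is a hypothetical helper and the data are synthetic.

```julia
using LinearAlgebra

# Synthetic data, for illustration only.
x = collect(1.0:10.0)
X = hcat(ones(10), x)
y = 3.0 .- 0.7 .* x .+ cos.(x)
n, p = size(X)

beta = X \ y
XtX_inv = inv(X' * X)
e = y - X * beta
h = [dot(X[i, :], XtX_inv * X[i, :]) for i in 1:n]   # leverages

# Hypothetical helper: coefficient change from dropping row i, by refitting.
function dfbeta_refit(i)
    keep = setdiff(1:n, i)
    return beta - (X[keep, :] \ y[keep])
end

# Closed form for every observation: (X'X)^-1 x_i e_i / (1 - h_ii),
# stacked so that row i holds the coefficient changes for observation i.
dfbetas_closed = permutedims(reduce(hcat,
    [XtX_inv * X[i, :] .* (e[i] / (1 - h[i])) for i in 1:n]))
```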
covratio
LinRegOutliers.Diagnostics.covratio — Function
covratio(setting, omittedIndex)
Apply the covariance ratio diagnostic for a given regression setting and observation index.
Arguments
setting::RegressionSetting: A regression setting object.
omittedIndex::Int: Index of the omitted observation.
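The covariance ratio is usually defined as the determinant of the estimated coefficient covariance matrix without observation i, divided by the same determinant for the full sample. The sketch below computes that ratio by refitting and through the equivalent closed form (s(i)^2 / s^2)^p / (1 - h_ii). It assumes this textbook definition. `covratio_refit`, `s2_drop`, and `covratio_closed` are hypothetical names, and the data are synthetic.

```julia
using LinearAlgebra

# Synthetic data, for illustration only.
x = collect(1.0:10.0)
X = hcat(ones(10), x)
y = 3.0 .- 0.7 .* x .+ cos.(x)
n, p = size(X)

H = X * inv(X' * X) * X'
h = diag(H)
e = y - H * y
s2 = sum(abs2, e) / (n - p)

# Ratio of determinants of the coefficient covariance matrices, by refitting.
function covratio_refit(i)
    keep = setdiff(1:n, i)
    Xi, yi = X[keep, :], y[keep]
    ri = yi - Xi * (Xi \ yi)
    s2i = sum(abs2, ri) / (n - 1 - p)
    return det(s2i * inv(Xi' * Xi)) / det(s2 * inv(X' * X))
end

# Closed form using only full-sample residuals and leverages.
s2_drop(i) = (sum(abs2, e) - e[i]^2 / (1 - h[i])) / (n - p - 1)
covratio_closed(i) = (s2_drop(i) / s2)^p / (1 - h[i])
```

Values well above or below 1 indicate that dropping the observation noticeably changes the precision of the estimates.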
Example
julia> setting = createRegressionSetting(@formula(calls ~ year), phones);
julia> covratio(setting, 1)
1.2945913799871505
hadimeasure
LinRegOutliers.Diagnostics.hadimeasure — Function
hadimeasure(setting; c = 2.0)
Apply Hadi's regression diagnostic for a given regression setting.
Arguments
setting::RegressionSetting: A regression setting object.
c::Float64: Critical value, selected between 2.0 and 3.0. The default is 2.0.
Example
julia> setting = createRegressionSetting(@formula(calls ~ year), phones);
julia> hadimeasure(setting)
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley & Sons, 2012.
diagnose
LinRegOutliers.Diagnostics.diagnose — Function
diagnose(setting; alpha = 0.5)
Diagnose a regression setting and report potential outliers using dffits, dfbetas, cooks, and hatmatrix.
Arguments
setting::RegressionSetting: A regression setting object.
alpha: Alpha value for the Cook's distance cutoff. See cooksoutliers.