Diagnostics

wls

LinRegOutliers.OrdinaryLeastSquares.wls - Function
wls(X, y, wts)

Estimate a weighted least squares regression and return an OLS object holding the estimated parameters.

Arguments

  • X::AbstractMatrix{Float64}: Design matrix.
  • y::AbstractVector{Float64}: Response vector.
  • wts::AbstractVector{Float64}: Weights vector.

Examples

julia> X = hcat(ones(24), phones[:,"year"]);
julia> y = phones[:,"calls"];
julia> w = ones(24);
julia> w[15:20] .= 0.0;
julia> reg = wls(X, y, w);
julia> reg.betas
2-element Vector{Float64}:
 -63.481644325290425
   1.3040571939231453
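
To make the role of the weights concrete, here is a minimal sketch of weighted least squares through the normal equations. The helper name wls_sketch and the toy data are assumptions made for illustration; this is not the package's implementation, and it returns a plain coefficient vector rather than an OLS object.

```julia
using LinearAlgebra

# Hypothetical helper, not part of LinRegOutliers: solve the weighted
# normal equations (X'WX) b = X'Wy with W = Diagonal(w).
function wls_sketch(X::AbstractMatrix{Float64}, y::AbstractVector{Float64}, w::AbstractVector{Float64})
    W = Diagonal(w)
    return (X' * W * X) \ (X' * W * y)
end

# Toy data: y = 1 + 2x exactly, except for one gross outlier at the end.
x = collect(1.0:6.0)
X = hcat(ones(6), x)
y = 1.0 .+ 2.0 .* x
y[6] = 100.0

w = ones(6)
w[6] = 0.0                      # a zero weight removes observation 6 from the fit

betas = wls_sketch(X, y, w)     # approximately [1.0, 2.0]
```

As in the phones example above, setting weights to zero is equivalent to fitting the regression without those observations.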

dffit

LinRegOutliers.Diagnostics.dffit - Function
dffit(setting, i)

Calculate the effect of the ith observation on the linear regression fit.

Arguments

  • setting::RegressionSetting: A regression setting object.
  • i::Int: Index of the observation.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> dffit(reg, 1)
2.3008326745719785

julia> dffit(reg, 15)
2.7880619386124295

julia> dffit(reg, 16)
3.1116532421969794

julia> dffit(reg, 17)
4.367981450347031

julia> dffit(reg, 21)
-5.81610150322166

References

Belsley, David A., Edwin Kuh, and Roy E. Welsch. Regression diagnostics: Identifying influential data and sources of collinearity. Vol. 571. John Wiley & Sons, 2005.


dffits

LinRegOutliers.Diagnostics.dffits - Function
dffits(setting)

Calculate dffit for all observations.

Arguments

  • setting::RegressionSetting: A regression setting object.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);

julia> dffits(reg)
24-element Vector{Float64}:
   2.3008326745719785
   1.2189579001467337
   0.35535667547543426
  -0.14458523141740898
  -0.5558346324846752
  -0.8441316814464983
  -1.0329184407957257
  -1.16600692151232
  -1.2005633711667656
  -1.2549187193476428
  -1.3195581500053777
  -1.42383876236147
  -1.5917690629803474
  -1.6582086833534504
   2.7880619386124295
   3.1116532421969794
   4.367981450347031
   5.927603041427858
   8.442860517217582
  12.370243663029527
  -5.81610150322166
 -10.089153963127842
 -12.10803256546825
 -14.67006851119936

References

Belsley, David A., Edwin Kuh, and Roy E. Welsch. Regression diagnostics: Identifying influential data and sources of collinearity. Vol. 571. John Wiley & Sons, 2005.


hatmatrix

LinRegOutliers.Diagnostics.hatmatrix - Function
hatmatrix(setting)

Calculate the n x n hat matrix for a given regression setting with n observations.

Arguments

  • setting::RegressionSetting: A regression setting object.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> size(hatmatrix(reg))
(24, 24)
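
For readers who want to see what such a matrix looks like, the sketch below builds a hat matrix from a design matrix using the textbook formula H = X (X'X)^-1 X'. The helper name hat_sketch and the small design matrix are assumptions for illustration; the package may compute the matrix differently.

```julia
using LinearAlgebra

# Hypothetical helper, not the package's implementation.
hat_sketch(X::AbstractMatrix{Float64}) = X * inv(X' * X) * X'

# Design matrix with an intercept column and one regressor.
X = hcat(ones(5), collect(1.0:5.0))
H = hat_sketch(X)

size(H)        # (5, 5): one row and one column per observation
tr(H)          # approximately 2, the number of columns of X
H * H ≈ H      # true: the hat matrix is idempotent
```

The diagonal entries of H are the leverages of the observations, which several of the diagnostics on this page build on.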

studentizedResiduals

LinRegOutliers.Diagnostics.studentizedResiduals - Function
studentizedResiduals(setting)

Calculate Studentized residuals for a given regression setting.

Arguments

  • setting::RegressionSetting: A regression setting object.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);

julia> studentizedResiduals(reg)
24-element Vector{Float64}:
  0.2398783264505892
  0.1463945666608097
  0.04934549995087145
 -0.023289236798461784
 -0.10408303320973748
 -0.18382934382804111
 -0.2609395640240455
 -0.33934473417314376
 -0.3973205657179429
 -0.46258080183149236
 -0.5261488085924144
 -0.5918396227060093
 -0.6616423337899147
 -0.6611792918262785
  1.0277190922689816
  1.0297863954540103
  1.2712201589839855
  1.4974523565936426
  1.8386296155264197
  2.316394853333409
 -0.9368354141338643
 -1.4009989983319822
 -1.4541520919831887
 -1.529459974327181

adjustedResiduals

LinRegOutliers.Diagnostics.adjustedResiduals - Function
adjustedResiduals(setting)

Calculate adjusted residuals for a given regression setting.

Arguments

  • setting::RegressionSetting: A regression setting object.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> adjustedResiduals(reg)
24-element Vector{Float64}:
  13.486773572526268
   8.2307993473897
   2.774371467851612
  -1.3093999279776498
  -5.851901346871404
 -10.335509559699863
 -14.670907823058053
 -19.07911256736661
 -22.338710565623828
 -26.00786250934617
 -29.58187157605512
 -33.27523207616458
 -37.19977737822219
 -37.173743587631165
  57.781855070799956
  57.898085871534626
  71.47231139524963
  84.19185329435882
 103.37399662263209
 130.23557965295348
 -52.6720662600165
 -78.76891816539992
 -81.75736547266746
 -85.9914301855088

jacknifedS

LinRegOutliers.Diagnostics.jacknifedS - Function
jacknifedS(setting, k)

Estimate the standard error of the regression when the kth observation is dropped.

Arguments

  • setting::RegressionSetting: A regression setting object.
  • k::Int: Index of the omitted observation.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);
julia> jacknifedS(reg, 2)
57.518441664761035

julia> jacknifedS(reg, 15)
56.14810222161477
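
The idea of a leave-one-out standard error can be sketched as follows. The helper name loo_sigma, the toy data, and the choice of (n - 1) - p degrees of freedom are assumptions for illustration; the package's own definition may differ in detail.

```julia
using LinearAlgebra

# Hypothetical helper: residual standard error after dropping row k.
function loo_sigma(X::AbstractMatrix{Float64}, y::AbstractVector{Float64}, k::Int)
    keep = setdiff(1:size(X, 1), k)
    Xk, yk = X[keep, :], y[keep]
    betas = Xk \ yk                      # least squares fit without row k
    res = yk .- Xk * betas
    dof = length(keep) - size(X, 2)      # assumed: (n - 1) - p degrees of freedom
    return sqrt(sum(abs2, res) / dof)
end

# Toy data: an exact line with one gross outlier in the last row.
x = collect(1.0:6.0)
X = hcat(ones(6), x)
y = 1.0 .+ 2.0 .* x
y[6] = 100.0

s_without_outlier = loo_sigma(X, y, 6)   # close to 0: remaining points lie on a line
s_with_outlier = loo_sigma(X, y, 1)      # large: the outlier is still in the fit
```

A sharp drop in the estimate when one observation is omitted suggests that the observation inflates the residual variance.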

cooks

LinRegOutliers.Diagnostics.cooks - Function
cooks(setting)

Calculate Cook's distances for all observations in a regression setting.

Arguments

  • setting::RegressionSetting: A regression setting object.

Examples

julia> reg = createRegressionSetting(@formula(calls ~ year), phones);

julia> cooks(reg)
24-element Vector{Float64}:
 0.005344774190779822
 0.0017088194691033689
 0.00016624914057962608
 3.1644452583114795e-5
 0.0005395058666404081
 0.0014375008774859539
 0.0024828140956511258
 0.0036279720445167277
 0.004357605989540906
 0.005288503758364767
 0.006313578057565415
 0.0076561205696857254
 0.009568574875389256
 0.009970039008782357
 0.02610396373381051
 0.029272523880917646
 0.05091236198400663
 0.08176555044049343
 0.14380266904640235
 0.26721539425047447
 0.051205153558783356
 0.13401084683481085
 0.16860324592350226
 0.2172819114905912
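
As a rough illustration of the quantity being reported, the sketch below evaluates the textbook form of Cook's distance, D_i = e_i^2 / (p s^2) * h_ii / (1 - h_ii)^2, from the residuals e_i and leverages h_ii. The helper name cooks_sketch and the toy data are assumptions; this is not the package's code.

```julia
using LinearAlgebra

# Hypothetical helper: textbook Cook's distances for an OLS fit.
function cooks_sketch(X::AbstractMatrix{Float64}, y::AbstractVector{Float64})
    n, p = size(X)
    H = X * inv(X' * X) * X'         # hat matrix
    e = y .- H * y                   # ordinary residuals
    s2 = sum(abs2, e) / (n - p)      # residual variance estimate
    h = diag(H)                      # leverages
    return (e .^ 2 ./ (p * s2)) .* h ./ (1 .- h) .^ 2
end

# Toy data: an exact line with one gross outlier in the last row.
x = collect(1.0:6.0)
X = hcat(ones(6), x)
y = 1.0 .+ 2.0 .* x
y[6] = 100.0

d = cooks_sketch(X, y)
worst = argmax(d)                    # 6: the outlying, high-leverage point
```

Observations with both a large residual and a high leverage receive the largest distances.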

References

Cook, R. Dennis. "Detection of influential observation in linear regression." Technometrics 19.1 (1977): 15-18.


cooksoutliers

LinRegOutliers.Diagnostics.cooksoutliers - Function
cooksoutliers(setting; alpha = 0.5)

Calculate Cook's distances for a given regression setting and report the potential outliers.

Arguments

  • setting::RegressionSetting: RegressionSetting object with a formula and dataset.
  • alpha::Float: Probability used for the cutoff value. The cutoff is quantile(Fdist(p, n-p), alpha), where n is the number of observations and p is the number of regression parameters. Default is 0.5.

Output

  • ["distance"]: Cook's distances.
  • ["cutoff"]: Quantile of the F distribution.
  • ["potentials"]: Vector of indices of potential regression outliers.

mahalanobisSquaredMatrix

LinRegOutliers.Diagnostics.mahalanobisSquaredMatrix - Function
mahalanobisSquaredMatrix(data::DataFrame; meanvector=nothing, covmatrix=nothing)

Calculate Mahalanobis distances.

Arguments

  • data::DataFrame: A DataFrame object of the multivariate data.
  • meanvector::AbstractVector{Float64}: Optional mean vector of variables.
  • covmatrix::AbstractMatrix{Float64}: Optional covariance matrix of data.
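
The following sketch shows one way squared Mahalanobis distances can be computed when the mean vector and covariance matrix are estimated from the data. The helper name mahalanobis_sq_sketch and the sample matrix are assumptions, and it works on a plain matrix rather than a DataFrame; the shape of the value returned by the package function is not documented here and may differ.

```julia
using LinearAlgebra
using Statistics

# Hypothetical helper: squared Mahalanobis distance of each row of M
# from the sample mean, using the sample covariance matrix.
function mahalanobis_sq_sketch(M::AbstractMatrix{Float64})
    mu = mean(M, dims = 1)            # 1 x p row of column means
    S = cov(M)                        # p x p sample covariance
    C = M .- mu                       # centered data
    return diag(C * inv(S) * C')      # one squared distance per row
end

M = [1.0 2.0; 2.0 1.0; 3.0 5.0; 4.0 3.0; 10.0 -4.0]
d2 = mahalanobis_sq_sketch(M)

sum(d2)    # approximately 8, i.e. (n - 1) * p for n = 5 rows and p = 2 columns
```

Passing a robust mean vector and covariance matrix instead of the sample estimates is the usual reason for making those two arguments optional.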

References

Mahalanobis, Prasanta Chandra. "On the generalized distance in statistics." National Institute of Science of India, 1936.


dfbeta

LinRegOutliers.Diagnostics.dfbeta - Function
dfbeta(setting, omittedIndex)

Apply DFBETA diagnostic for a given regression setting and observation index.

Arguments

  • setting::RegressionSetting: A regression setting object.
  • omittedIndex::Int: Index of the omitted observation.

Example

julia> setting = createRegressionSetting(@formula(calls ~ year), phones);
julia> dfbeta(setting, 1)
2-element Vector{Float64}:
  9.643915678524024
 -0.14686166007904422

dfbetas

LinRegOutliers.Diagnostics.dfbetas - Function
dfbetas(setting)

Apply the DFBETA diagnostic to every observation in a given regression setting.

Arguments

  • setting::RegressionSetting: A regression setting object.

See also: dfbeta


covratio

LinRegOutliers.Diagnostics.covratio - Function
covratio(setting, omittedIndex)

Apply covariance ratio diagnostic for a given regression setting and observation index.

Arguments

  • setting::RegressionSetting: A regression setting object.
  • omittedIndex::Int: Index of the omitted observation.

Example

julia> setting = createRegressionSetting(@formula(calls ~ year), phones);
julia> covratio(setting, 1)
1.2945913799871505

hadimeasure

LinRegOutliers.Diagnostics.hadimeasure - Function
hadimeasure(setting; c = 2.0)

Apply Hadi's regression diagnostic to a given regression setting.

Arguments

  • setting::RegressionSetting: A regression setting object.
  • c::Float64: Critical value, chosen between 2.0 and 3.0. The default is 2.0.

Example

julia> setting = createRegressionSetting(@formula(calls ~ year), phones);
julia> hadimeasure(setting)

References

Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley & Sons, 2012.


diagnose