WSEAS Transactions on Mathematics
Print ISSN: 1109-2769, E-ISSN: 2224-2880
Volume 22, 2023
Comparison of Logistic Regression and Discriminant Analysis for Classification of Multicollinearity Data
Author:
Abstract: This study compares logistic regression and discriminant analysis as classification methods, using simulated datasets and a liver-patient dataset as the actual data. In both cases a binary dependent variable depends on correlated independent variables, a situation known as multicollinearity. Logistic regression is the standard classification method: it models a dichotomous dependent variable through the probability given by the logit function. The parameters of the logit function are estimated by an iterative procedure and describe the relationship between the binary dependent variable and the independent variables. Discriminant analysis is a powerful family of classifiers that includes linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and regularized discriminant analysis (RDA). These methods determine decision boundaries by building a classifier on the multivariate normal distribution. LDA assumes a common covariance matrix shared by all classes, whereas QDA estimates a separate covariance matrix for each class. RDA extends QDA by introducing a regularization parameter into the estimate of the covariance matrix. In the simulation study, the independent variables are generated from a multivariate normal distribution with a constant correlation, which produces the multicollinearity problem, and the binary response variable is then generated from the logit function. In the application to actual data, the dependent variable classifies patients as liver or non-liver patients, and the nine independent variables are taken from the patients' personal information. The methods are compared by their average percentage of accuracy, with the highest value indicating the best performance. The results show that logistic regression performed best when the number of independent variables was small, whereas RDA performed best when the number of independent variables was large.
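The simulation design described in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It assumes standard-normal predictors with one constant pairwise correlation `rho`, obtained from a shared latent factor, and a Bernoulli response drawn from the logit probability. The function names, the coefficient values, and the choice `rho = 0.9` are hypothetical and serve only as an illustration.

```python
import math
import random


def simulate_multicollinear(n, p, rho, beta0, beta, seed=0):
    """Draw n rows of p equicorrelated N(0, 1) predictors and a binary response.

    Assumption (not from the paper): each predictor is
    sqrt(rho) * shared + sqrt(1 - rho) * own_noise, which gives unit variance
    and pairwise correlation rho for 0 <= rho <= 1.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)  # common factor that induces the correlation
        x = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(p)
        ]
        eta = beta0 + sum(b * v for b, v in zip(beta, x))  # linear predictor
        prob = 1.0 / (1.0 + math.exp(-eta))                # inverse logit
        labels.append(1 if rng.random() < prob else 0)     # Bernoulli draw
        rows.append(x)
    return rows, labels


def sample_correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical settings: 5000 observations, 3 predictors, correlation 0.9.
X, y = simulate_multicollinear(5000, 3, 0.9, 0.0, [1.0, 1.0, 1.0], seed=1)
r01 = sample_correlation([row[0] for row in X], [row[1] for row in X])
```

With a large sample, `r01` should lie close to the chosen `rho`, which makes it easy to check that the generated predictors are strongly collinear before any classifier is fitted to `X` and `y`.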
Keywords: linear discriminant analysis, quadratic discriminant analysis, regularized discriminant analysis
Pages: 120-131
DOI: 10.37394/23206.2023.22.15