How To Deal With Missing Covariates In Logistic Regression? A Bayesian Approach

Gh. Jandaghi, K. Azam, M. Karimlou, R. Wolfe, A. Forbes, K. Mohammad, M.R. Meskhani, S. Nedjat


Logistic regression is an analytical tool widely used in medical and epidemiological research. Many studies yield data sets in which some values are not reported, that is, are missing. The simplest way of dealing with such data is to ignore the subjects with missing observations and analyze only the complete cases, which is inefficient because the partially observed subjects are discarded. We consider methods for analyzing data with logistic regression models when some covariates (Z) are completely observed but other covariates (X) are missing for some subjects. When data on X are missing at random (MAR), we present a likelihood approach for the observed data that allows an analysis similar to the one that would be carried out if the data were complete. Following this approach, parameter estimation is carried out using both maximum likelihood and Bayesian methods through the Markov chain Monte Carlo numerical computation scheme, and the results are compared. The illustrative example considered in this article involves a cross-sectional data set on lung auscultation taken from a health survey in Tehran.
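
As an illustration of the observed-data likelihood idea, the following minimal Python sketch may help. It is not the authors' implementation. It assumes, purely for illustration, a single binary X, a single fully observed Z, a logistic model for Y given (X, Z) and a second logistic model for X given Z; the function name observed_loglik and the parameter names beta and gamma are hypothetical. Under MAR, a subject with X observed contributes P(Y | X, Z) P(X | Z), and a subject with X missing contributes the sum of that product over the possible values of X.

```python
import numpy as np

def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-t))

def observed_loglik(beta, gamma, y, x, z):
    """Observed-data log-likelihood under MAR (illustrative assumptions only).

    beta  = (b0, bx, bz): assumed logistic model for Y given X and Z
    gamma = (g0, gz):     assumed logistic model for binary X given Z
    x uses np.nan to mark a missing value; y and z are fully observed.
    """
    b0, bx, bz = beta
    g0, gz = gamma
    y, x, z = (np.asarray(a, dtype=float) for a in (y, x, z))

    p_x1 = expit(g0 + gz * z)  # P(X = 1 | Z)

    def p_y_given(xval):
        # P(Y = y | X = xval, Z) for every subject
        p = expit(b0 + bx * xval + bz * z)
        return np.where(y == 1, p, 1.0 - p)

    joint1 = p_y_given(1.0) * p_x1          # P(Y, X = 1 | Z)
    joint0 = p_y_given(0.0) * (1.0 - p_x1)  # P(Y, X = 0 | Z)

    missing = np.isnan(x)
    # Missing X: sum over both values of X; observed X: use the matching term.
    contrib = np.where(missing, joint0 + joint1,
                       np.where(x == 1, joint1, joint0))
    return float(np.sum(np.log(contrib)))
```

A function of this kind could be passed to a numerical optimizer for maximum likelihood, or combined with log-priors on beta and gamma to give an unnormalized log-posterior for an MCMC sampler; both uses are sketched here only as possibilities.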


Missing covariates, Missing at random, Markov Chain Monte Carlo, Logistic Regression, Lung Auscultation, Likelihood

