Predictive mean matching

From Wikipedia, the free encyclopedia

Predictive mean matching (PMM)[1] is a widely used[2] statistical imputation method for missing values, first proposed by Donald B. Rubin in 1986[3] and R. J. A. Little in 1988.[4]

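It aims to reduce the bias introduced into a dataset by imputation by filling each missing entry with a real value drawn from the observed data.[5] For each observation with a missing value, a prediction model is used to select a small set of candidate donors: complete observations whose predicted values are closest to the predicted value for the missing entry. One donor is then drawn at random and its observed value is used as the imputation.[1]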
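The matching step can be illustrated with a minimal Python sketch. It assumes a single incomplete variable, an ordinary least-squares prediction model, and a donor pool of size k; the function name pmm_impute and its parameters are hypothetical and not taken from any particular library. Implementations intended for multiple imputation usually also draw the regression parameters at random before predicting, which this simplified version omits.

```python
import numpy as np


def pmm_impute(x, y, k=5, rng=None):
    """Fill NaN entries of y with observed y values from nearby donors.

    Simplified sketch: the predictions come from a single least-squares
    fit, with no random draw of the regression parameters.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]

    design = np.column_stack([np.ones(len(y)), x])  # intercept + predictors
    missing = np.isnan(y)
    observed = ~missing

    # Fit the prediction model on the complete cases only.
    beta, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    pred_obs = design[observed] @ beta
    pred_mis = design[missing] @ beta

    y_obs = y[observed]
    k = min(k, y_obs.size)

    imputed = []
    for p in pred_mis:
        # The k complete cases whose predictions are closest to p.
        donors = np.argsort(np.abs(pred_obs - p))[:k]
        # Draw one donor at random and borrow its observed value.
        imputed.append(y_obs[rng.choice(donors)])

    result = y.copy()
    result[missing] = imputed
    return result


# Usage on made-up data: two of twenty outcomes are missing.
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0
y[[3, 10]] = np.nan
print(pmm_impute(x, y, k=3, rng=0))
```

Because every imputed value is copied from a donor, the result contains only values that actually occur in the observed data.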

Compared to other imputation methods, it usually imputes fewer implausible values (e.g. negative incomes), because every imputed value is one that was actually observed, and it takes heteroscedastic data into account more appropriately.[6]
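The difference in plausibility can be seen in a small Python sketch that compares plain regression imputation with a nearest-donor match. The experience and income figures are invented for illustration, and the example uses a single deterministic donor (a pool of size one) rather than a random draw from several donors, which is a simplification.

```python
import numpy as np

# Invented toy data: years of experience and income (arbitrary units).
# The income of the first person is missing.
experience = np.array([0.0, 5.0, 6.0, 7.0, 8.0, 9.0])
income = np.array([np.nan, 12.0, 20.0, 31.0, 39.0, 50.0])

observed = ~np.isnan(income)

# Fit a straight line to the complete cases and predict for everyone.
slope, intercept = np.polyfit(experience[observed], income[observed], deg=1)
predicted = slope * experience + intercept

# Regression imputation uses the prediction itself, which extrapolates
# below zero for this person.
regression_value = predicted[~observed][0]

# Nearest-donor matching instead borrows the observed income of the
# complete case whose prediction is closest.
gap = np.abs(predicted[observed] - regression_value)
pmm_value = income[observed][np.argmin(gap)]

print("regression imputation:", regression_value)
print("nearest-donor imputation:", pmm_value)
```

In this toy case the regression prediction is negative, while the matched value is an income that was actually recorded.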

References

  1. "3.4 Predictive mean matching". stefvanbuuren.name. Retrieved 30 June 2019.
  2. "Web of Science [v.5.32] – All Databases Results". apps.webofknowledge.com. Retrieved 30 June 2019.
  3. Rubin, Donald B. (30 June 1986). "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations". Journal of Business & Economic Statistics. 4 (1): 87–94. doi:10.2307/1391390. JSTOR 1391390.
  4. Little, Roderick J. A. (30 June 1988). "Missing-Data Adjustments in Large Surveys". Journal of Business & Economic Statistics. 6 (3): 287–296. doi:10.2307/1391878. JSTOR 1391878.
  5. "Imputation by Predictive Mean Matching: Promise & Peril – Statistical Horizons". statisticalhorizons.com. Retrieved 30 June 2019.
  6. "Predictive Mean Matching Imputation (Example in R)". Statistics Globe. Retrieved 18 September 2020.