Evaluating Model-assisted Estimators: A Comparative Study in High-dimensional Survey Data

Chhalotre, Rakesh and Naik, B. Samuel and Karthik, V C and Varma, Manoj and Singh, Akarsh and Balan C and Gupta, Ashish (2024) Evaluating Model-assisted Estimators: A Comparative Study in High-dimensional Survey Data. Journal of Scientific Research and Reports, 30 (9). pp. 707-718. ISSN 2320-0227

Varma3092024JSRR123076.pdf - Published Version


Abstract

Model-assisted estimators have gained significant attention because they make efficient use of auxiliary information during estimation. These estimators rely on a working model that links the survey variable to the auxiliary variables; the model is fitted to the sample data to generate predictions, which are then incorporated into the estimation procedure. In this study, we explore several model-assisted estimators: Generalized Regression (GREG), Ridge regression, Lasso regression, CART (Classification and Regression Tree), Random Forest, Cubist, and Principal Components Regression (PCR). The analysis involved 2,000 samples of size 50 (n/N ≈ 10%) and employed a stepwise variable selection method to identify the most significant auxiliary variables, adding them to the model incrementally. The performance of the estimators was assessed using relative bias (RB), relative root mean square error (RRMSE), and relative efficiency (RE). Our findings reveal that tree-based estimators such as CART and Random Forest, and penalized regression estimators such as Ridge and Lasso, remain robust as the number of auxiliary variables increases. Among all the estimators, Random Forest consistently yielded the lowest RRMSE, particularly with five auxiliary variables, demonstrating superior efficiency. Conversely, the performance of the GREG estimator deteriorated as the number of auxiliary variables increased. This study underscores the importance of selecting a model-assisted estimation procedure suited to the characteristics of the data and to the relationship between the survey and auxiliary variables in this high-dimensional dataset.
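
The evaluation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data. It assumes simple random sampling without replacement, a synthetic population, and a linear working model fitted by ordinary least squares (a GREG-type estimator), and it compares that estimator with the plain expansion (Horvitz-Thompson) estimator. The function names, the population size N = 500, the simulated coefficients, and the definitions of RB, RRMSE, and RE used here (percentages relative to the true total, and RE as the ratio of the expansion estimator's MSE to the model-assisted MSE) are all assumptions made for illustration; the paper may define them differently.

```python
import numpy as np


def model_assisted_total(y_s, X_s, X_U, N):
    """Model-assisted estimate of a population total under SRSWOR.

    Sums working-model predictions over the whole population, then adds the
    design-weighted sum of sample residuals as a bias correction.
    """
    n = len(y_s)
    A_s = np.column_stack([np.ones(n), X_s])          # sample design matrix
    A_U = np.column_stack([np.ones(len(X_U)), X_U])   # population design matrix
    beta, *_ = np.linalg.lstsq(A_s, y_s, rcond=None)  # OLS working model
    residuals = y_s - A_s @ beta
    return (A_U @ beta).sum() + (N / n) * residuals.sum()


def simulate(N=500, n=50, p=5, reps=2000, seed=0):
    """Monte Carlo comparison on a synthetic population (hypothetical data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, p))
    y = 10 + X @ np.arange(1, p + 1) + rng.normal(size=N)
    t = y.sum()  # true population total

    ht = np.empty(reps)  # expansion (Horvitz-Thompson) estimates
    ma = np.empty(reps)  # model-assisted estimates
    for r in range(reps):
        s = rng.choice(N, size=n, replace=False)
        ht[r] = N * y[s].mean()
        ma[r] = model_assisted_total(y[s], X[s], X, N)

    def rb(est):     # relative bias, in percent
        return 100 * (est.mean() - t) / t

    def rrmse(est):  # relative root mean square error, in percent
        return 100 * np.sqrt(np.mean((est - t) ** 2)) / t

    return {
        "rb_ht": rb(ht), "rb_ma": rb(ma),
        "rrmse_ht": rrmse(ht), "rrmse_ma": rrmse(ma),
        # Assumed convention: RE > 1 means the model-assisted estimator wins.
        "re": np.mean((ht - t) ** 2) / np.mean((ma - t) ** 2),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Replacing the least-squares fit inside model_assisted_total with a penalized or tree-based predictor would give the corresponding estimator in the same prediction-plus-residual-correction form, although the exact constructions used in the paper may differ.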

Item Type: Article
Subjects: Archive Digital > Multidisciplinary
Depositing User: Unnamed user with email support@archivedigit.com
Date Deposited: 11 Sep 2024 08:01
Last Modified: 11 Sep 2024 08:01
URI: http://eprints.ditdo.in/id/eprint/2304
