Paper Title
SELECTING PROXIES FOR INPUTS WITH LIMITED DATA IN DATA ENVELOPMENT ANALYSIS

Abstract
Model selection is an important issue in Data Envelopment Analysis (DEA). A specific case is choosing proxies for inputs or outputs when the required data are not available. When several candidates in the data can capture the characteristics of a theoretical variable, the researcher usually chooses a proxy based on experience. However, choices based on experience are often regarded as subjective and lacking theoretical grounds. This paper adopts the benefit-of-the-doubt principle to explore systematic ways of selecting a proper proxy for an input or output. We observe that this line of literature selects a proxy by choosing the candidate that brings the data closer to the empirical production frontier. Following this line of research, this paper suggests three approaches for selecting a proxy from several candidates. When one candidate dominates the others as a proxy for a variable, our method selects it objectively. All approaches discussed in this paper are applied to three industries in China from 2017 to 2019. For capital, there are three candidate input proxies: total assets, non-current assets and current assets. Although non-current assets might be expected to be an appropriate proxy for capital, they are overwhelmingly outperformed by total assets and current assets. Since these three variables are the proxies for capital most commonly available in published data, our empirical results are valuable to applied researchers of the Chinese economy.
Keywords - Model selection; goodness-of-fit measure; selecting input/output proxy; Data Envelopment Analysis
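The selection idea summarized above, choosing the candidate that places the data closest to the empirical frontier, can be illustrated with a minimal sketch. It is not one of the three approaches proposed in this paper, which the abstract does not specify. It assumes the simplest possible setting: one output, one candidate input at a time, and constant returns to scale, where the DEA efficiency of a unit reduces to its output/input ratio divided by the best observed ratio. The function names `crs_efficiency` and `select_proxy`, the candidate labels, the mean-efficiency criterion, and all numbers are hypothetical and serve only as illustration.

```python
def crs_efficiency(inputs, outputs):
    """Efficiency scores under constant returns to scale for the
    one-input, one-output case: each unit's output/input ratio
    divided by the largest ratio observed in the sample."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def select_proxy(candidates, outputs):
    """Return the candidate whose scores have the highest mean
    (an assumed goodness-of-fit criterion: a higher mean score means
    the data lie closer to the empirical frontier), together with
    the mean score of every candidate."""
    mean_scores = {}
    for name, inputs in candidates.items():
        scores = crs_efficiency(inputs, outputs)
        mean_scores[name] = sum(scores) / len(scores)
    best_name = max(mean_scores, key=mean_scores.get)
    return best_name, mean_scores


# Hypothetical data for four decision-making units (not from the paper).
outputs = [10.0, 20.0, 30.0, 40.0]
candidates = {
    "candidate_a": [5.0, 10.0, 15.0, 20.0],  # exactly proportional to output
    "candidate_b": [2.0, 8.0, 9.0, 30.0],
    "candidate_c": [3.0, 2.0, 6.0, 10.0],
}

best_name, mean_scores = select_proxy(candidates, outputs)
print(best_name, mean_scores)
```

In this made-up sample, `candidate_a` is proportional to the output, so every unit lies on the frontier and its mean score is 1. With several inputs or outputs, the ratio would have to be replaced by a linear program solved for each unit.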