TOP-CODED

In econometrics and statistics, a top-coded dataset is one in which values above a chosen upper bound are censored: any observation exceeding the bound is recorded only as being at or above it, so its exact value is not known. This is often done to preserve the anonymity of survey participants. For example, if a survey included a person with wealth of $51 billion, that record would not be anonymous, because people would know it is Bill Gates.

Example: Top Coding of Wealth


id   age   income    note
1    26    24778     exact value
2    32    26750     exact value
3    45    26780     exact value
4    32    30000+    top coded
5    45    30000+    top coded
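
The coding step in this example can be sketched in Python. This is a minimal illustration, not a standard routine: the function name `top_code`, the added `top_coded` flag, and the true incomes shown for ids 4 and 5 are all hypothetical, since the real values behind a top-coded entry are by definition not published.

```python
def top_code(records, field, cap):
    """Return copies of the records with `field` censored at `cap`.

    Values at or above `cap` are replaced by the label "<cap>+",
    and a boolean `top_coded` flag is added to every record.
    """
    coded = []
    for rec in records:
        out = dict(rec)  # copy so the raw data is left untouched
        if rec[field] >= cap:
            out[field] = f"{cap}+"
            out["top_coded"] = True
        else:
            out["top_coded"] = False
        coded.append(out)
    return coded


# Hypothetical raw survey data: the incomes for ids 4 and 5 are invented.
raw = [
    {"id": 1, "age": 26, "income": 24778},
    {"id": 2, "age": 32, "income": 26750},
    {"id": 3, "age": 45, "income": 26780},
    {"id": 4, "age": 32, "income": 41200},
    {"id": 5, "age": 45, "income": 98500},
]

for row in top_code(raw, "income", 30000):
    print(row)
```

After coding, the two highest incomes can no longer be told apart: both appear only as 30000+.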

Implications for Ordinary Least Squares

★ If top-coded observations are assigned the lower bound of the top-coded group (30000 in the example above) and that value is used as the regressor, OLS is biased and inconsistent.

★ The top-coded group can be omitted from the regression entirely. Provided there are no systematic differences between the omitted group and the included groups, OLS is consistent and unbiased.

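★ The Tobit procedure accommodates top coding of the dependent variable by modelling the censoring explicitly. Under its assumptions (normally distributed, homoskedastic errors) it gives consistent estimates.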
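
The first two points can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not an empirical result: the data-generating process (a normally distributed regressor in thousands, a true slope of 0.5, a cap of 30) and the helper names `ols_slope` and `simulate` are invented for illustration. It uses only the Python standard library and does not implement the Tobit estimator.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate(n=5000, cap=30.0, true_slope=0.5, seed=0):
    """Compare OLS slopes under three treatments of a top-coded regressor."""
    rng = random.Random(seed)
    # Hypothetical regressor (e.g. income in thousands) and outcome.
    x = [rng.gauss(25.0, 8.0) for _ in range(n)]
    y = [2.0 + true_slope * xi + rng.gauss(0.0, 1.0) for xi in x]

    # Treatment 1: replace top-coded values with the bound itself.
    x_capped = [min(xi, cap) for xi in x]

    # Treatment 2: omit the top-coded observations entirely.
    kept = [(xi, yi) for xi, yi in zip(x, y) if xi < cap]

    return {
        "full": ols_slope(x, y),
        "top_coded_at_bound": ols_slope(x_capped, y),
        "top_coded_dropped": ols_slope([k[0] for k in kept],
                                       [k[1] for k in kept]),
    }


results = simulate()
for name, slope in results.items():
    print(f"{name}: {slope:.3f}")
```

Under this setup, the slope estimated with the capped values tends to be inflated relative to the true 0.5, because capping compresses the spread of the regressor. The slope from the sample that omits the top-coded rows stays close to 0.5, since here the omitted group differs from the rest only in the value of the regressor.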

See Also



Tobit model

Heckit model

Truncated data

References



★ Tobin, James (1958). "Estimation of relationships for limited dependent variables". Econometrica 26 (1), 24–36.

This article provided by Wikipedia.