Fractional outcomes are common. You might be modeling participation rates in a 401(k) pension plan, the pass rate on standardized tests, expenditure shares, or the like.
Fractional response models are a flexible and intuitive way to model outcomes that lie between 0 and 1. Unlike linear models, they cannot yield predictions outside the 0 to 1 range, and unlike log-odds models, they remain defined when the outcome is exactly 0 or 1. Fractional response models can be fit using the fracreg command.
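As a point of reference, a fractional probit that treats every covariate as exogenous might be fit as sketched below. The variable names are the ones used in the example later in this section; the webuse 401k dataset is an assumption about which data that example is based on.

```stata
* Load the 401(k) data (assumed here to be the dataset behind the example)
webuse 401k, clear

* Fractional probit with all covariates, including mrate, treated as exogenous
fracreg probit prate mrate c.ltotemp##c.ltotemp i.sole

* Fitted conditional means; these should all lie strictly between 0 and 1
predict phat
summarize phat
```

This baseline is useful mainly for comparison: if a covariate is in fact endogenous, its coefficient here will generally be inconsistent.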
What if you are concerned that one or more of your model covariates are endogenous? With the new ivfprobit command, you can fit a model for a fractional dependent variable and account for endogeneity in one or more of the covariates.
Let’s see it work
We want to study the 401(k) participation rate (prate). We believe that corporate employment size (ltotemp) and its square are determinants of participation rates, as are an indicator for whether the 401(k) is the sole pension plan (sole) and the plan matching rate (mrate). We believe, however, that the plan matching rate is endogenous: there are unobserved determinants of participation rates that also affect the plan matching rate. For instance, both the matching rate and the participation rate might be associated with industry and regional practices that are not observable in the data. To address endogeneity, we instrument the matching rate using the age of the plan (age) and its square.
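One way to write this setup down is as a two-equation system. The notation below is a schematic sketch, not necessarily the exact parameterization used by the estimator: the symbols \beta, \gamma, \pi_1, \pi_2, u, and v are introduced here for illustration, with x collecting the exogenous covariates (ltotemp, its square, sole, and a constant) and z collecting the instruments (age and its square).

```latex
\begin{aligned}
E(\text{prate} \mid \text{mrate}, \mathbf{x}, u)
  &= \Phi\!\left(\beta\,\text{mrate} + \mathbf{x}\boldsymbol{\gamma} + u\right) \\
\text{mrate}
  &= \mathbf{x}\boldsymbol{\pi}_1 + \mathbf{z}\boldsymbol{\pi}_2 + v
\end{aligned}
```

Here \Phi is the standard normal distribution function. Endogeneity of mrate corresponds to a nonzero correlation between the unobservables u and v; if that correlation were zero, the first equation could be fit on its own as an ordinary fractional probit.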
We type
. ivfprobit prate c.ltotemp##c.ltotemp i.sole (mrate = c.age##c.age)
Inside the parentheses are the endogenous variable and, after the equal sign, the instrumental variables we use to model it. Outside the parentheses are the exogenous variables that affect prate directly. We get
We find a positive effect of the matching rate on the participation rate. Additionally, we see that the estimated correlation between the unobservables, corr(e.mrate, e.prate), is different from zero, which supports our conjecture that the matching rate is endogenous.
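Because the coefficients are on the probit index scale, the size of the matching-rate effect is easier to read as an average marginal effect on the participation rate itself. A minimal sketch, assuming margins with its default prediction is available after ivfprobit and that the estimation results from the command above are still in memory:

```stata
* Average marginal effect of the matching rate on the participation rate
* (assumes the ivfprobit results above are the active estimation results)
margins, dydx(mrate)
```

The result would be interpreted as the average change in the expected participation rate associated with a one-unit change in the matching rate.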