In practice, we start with a relatively high order (value of r), say 4, estimate the parameters, and then conduct a t-test of H0: b_{l,r} = 0, where b_{l,r} is the coefficient on X_l^r. If we cannot reject H0, we drop X_l^r from the regressors, run a new regression with a polynomial of order r - 1, and conduct a test of H0: b_{l,r-1} = 0, and so on, until the coefficient on the highest remaining power is significant.
We should not let R² drive our specification. Adding powers of X_l mechanically increases R² (it can never decrease it), whether or not the extra terms belong in the model.
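The top-down procedure and the behaviour of R² can be sketched on simulated data. This is only an illustration under stated assumptions: the helper names (ols, poly_design, select_order) are hypothetical, the data-generating process (a true quadratic) is made up, and the cutoff 1.96 is the large-sample normal approximation to a two-sided 5% t-test.

```python
import random

def ols(X, y):
    """OLS by the normal equations. Returns (coefficients, standard errors, R^2)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [float(i == j) for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))  # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    ssr = sum(e * e for e in resid)
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    # Homoskedastic standard errors: sqrt(sigma^2 * diag((X'X)^-1)).
    se = [(ssr / (n - k) * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se, 1.0 - ssr / sst

def poly_design(x, order):
    """Design matrix with rows [1, x, x^2, ..., x^order]."""
    return [[xi ** p for p in range(order + 1)] for xi in x]

def select_order(x, y, max_order=4, crit=1.96):
    """Drop the highest power while its t-statistic is insignificant.

    Stops at order 1 at the latest (the linear term itself is not tested).
    """
    order = max_order
    while order > 1:
        beta, se, _ = ols(poly_design(x, order), y)
        if abs(beta[-1] / se[-1]) > crit:  # reject H0: keep this order
            break
        order -= 1
    return order

# Simulated data (assumption): the true model is quadratic in x.
random.seed(0)
x = [random.uniform(-1, 1) for _ in range(500)]
y = [1 + 2 * xi - 1.5 * xi ** 2 + random.gauss(0, 0.3) for xi in x]

order = select_order(x, y)
r2_by_order = [ols(poly_design(x, p), y)[2] for p in range(1, 5)]
print("selected order:", order)
print("R^2 for orders 1..4:", [round(v, 4) for v in r2_by_order])
```

R² never falls as the order goes from 1 to 4, even though the cubic and quartic terms are absent from the simulated model, which is why it is a poor guide to the specification. Since each t-test falsely rejects about 5% of the time, the selected order can occasionally exceed the true one.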
Assume that we have two regressors, for instance age (X1) and gender (X2 = 1 if male and 0 if female). The dependent variable Y is the wage.
If we regress Y = b0 + b1X1 + b2X2 + U, we are assuming that the return to age is the same for men and women (and equal to b1). This may be a strong assumption: wages may increase more or less quickly with age for men than for women.
To allow for this, we may interact the age and gender variables and then estimate the following linear model by OLS:
Y = b0 + b1X1 + b2X2 + b12X1X2 + U
We then have (assuming exogeneity):
∂E(Y | X1 = x1, X2 = x2) / ∂x1 = b1 + b12x2.
The return to age now depends on gender (on x2): it equals b1 for women (x2 = 0) and b1 + b12 for men (x2 = 1).
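A small simulation can make the interpretation of b1 and b12 concrete. Because X2 is a dummy, the point estimates of the interacted model coincide with those from separate regressions of wage on age for women and for men: the slope for women estimates b1, and the difference between the two slopes estimates b12. The sketch below relies on that equivalence; the coefficient values, the sample, and the helper simple_ols are all hypothetical.

```python
import random

def simple_ols(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical "true" coefficients of Y = b0 + b1*X1 + b2*X2 + b12*X1*X2 + U.
b0, b1, b2, b12 = 5.0, 0.3, 1.0, 0.2

random.seed(1)
age = [random.uniform(20, 60) for _ in range(2000)]
male = [i % 2 for i in range(2000)]  # alternate women (0) and men (1)
wage = [b0 + b1 * a + b2 * m + b12 * a * m + random.gauss(0, 1)
        for a, m in zip(age, male)]

# Separate regressions by gender.
_, slope_f = simple_ols([a for a, m in zip(age, male) if m == 0],
                        [w for w, m in zip(wage, male) if m == 0])
_, slope_m = simple_ols([a for a, m in zip(age, male) if m == 1],
                        [w for w, m in zip(wage, male) if m == 1])

b1_hat = slope_f             # return to age for women (x2 = 0)
b12_hat = slope_m - slope_f  # extra return to age for men
print("return to age, women:", round(b1_hat, 3))
print("return to age, men:  ", round(b1_hat + b12_hat, 3))
```

A t-test of H0: b12 = 0 in the interacted regression is then a test of whether the return to age is the same for both groups.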
If a regressor is in logs, its associated coefficient is not affected by a change of scale (or units). The constant, however, is affected:
y = b0 + b ln(αx) = (b0 + b ln(α)) + b ln(x).
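This invariance is easy to check numerically. The sketch below regresses the same y on ln(x) and on ln(αx) and compares the two fits; the data, the value of α, and the helper simple_ols are hypothetical and serve only as an illustration.

```python
import math
import random

def simple_ols(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

random.seed(2)
alpha = 1000.0  # hypothetical change of units, e.g. thousands to units
x = [random.uniform(1, 50) for _ in range(300)]
y = [2.0 + 0.8 * math.log(xi) + random.gauss(0, 0.2) for xi in x]

# Same dependent variable, regressor measured in two different units.
const_x, slope_x = simple_ols([math.log(xi) for xi in x], y)
const_ax, slope_ax = simple_ols([math.log(alpha * xi) for xi in x], y)

print("slopes:   ", slope_x, slope_ax)
print("constants:", const_x, const_ax)
print("constant shift b*ln(alpha):", slope_x * math.log(alpha))
```

The two slopes agree up to floating-point error, while the constant in the ln(x) regression equals the constant in the ln(αx) regression plus b ln(α), as in the equation above.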
If both the regressor X and the dependent variable Y are in logs, the regression coefficient can be interpreted as the elasticity of Y with respect to X: