For function estimation using penalized squared-error criteria, we derive generally applicable risk bounds that exhibit the balance between the accuracy of approximation and the penalty relative to the sample size. Attention is given to linear combinations of terms from a given class (such as those used in neural network models, projection pursuit regression, function aggregation, and multiple linear regression). The risk bounds apply to forward stepwise selection and other relaxed greedy algorithms with a penalty on the number of terms, and to ℓ1-penalized least squares, for which we develop a fast algorithm.
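As an illustrative sketch only (not the paper's algorithm or its penalty levels), forward stepwise selection with a penalty on the number of terms can be set up as follows: greedily add the dictionary column most correlated with the current residual, refit by least squares, and stop when the penalized criterion RSS + penalty × (number of terms) no longer decreases. The dictionary, data, and the per-term `penalty` value below are all hypothetical.

```python
import numpy as np

def forward_stepwise(X, y, penalty):
    """Greedy forward selection minimizing RSS + penalty * (number of terms).

    Illustrative sketch: `penalty` is a generic per-term complexity charge,
    not the specific penalty analyzed in the paper.
    """
    n, p = X.shape
    selected = []
    residual = y.copy()
    best_crit = float(residual @ residual)  # criterion with zero terms
    while len(selected) < p:
        # pick the column most correlated with the current residual
        scores = np.abs(X.T @ residual)
        scores[selected] = -np.inf
        j = int(np.argmax(scores))
        trial = selected + [j]
        # refit least squares on the enlarged set of terms
        beta, *_ = np.linalg.lstsq(X[:, trial], y, rcond=None)
        rss = float(np.sum((y - X[:, trial] @ beta) ** 2))
        crit = rss + penalty * len(trial)
        if crit >= best_crit:
            break  # penalized criterion stopped improving
        selected, best_crit = trial, crit
        residual = y - X[:, trial] @ beta
    return selected, best_crit

# Toy demonstration: two informative columns out of twenty
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.standard_normal(n)
sel, crit = forward_stepwise(X, y, penalty=1.0)
```

With a strong signal and a per-term charge, the greedy pass picks up the informative columns and the penalty stops it from absorbing noise terms; the ℓ1-penalized variant mentioned above replaces the count-of-terms penalty with the sum of absolute coefficients.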