This paper studies sequential forward feature selection that uses the scatter-matrix-based class separability measure. We find that adding a scale factor to each iteration of the conventional sequential selection yields a sequential selection that is guaranteed to attain the global optimum. We prove its optimality through a novel geometric interpretation, which leads to a unified framework covering the optimal sequential selection, the conventional sequential selection, and the best-individual-N selection. In addition, we show that under our formulation feature selection can be treated as a linear fractional maximization problem, which can be solved efficiently by well-developed algorithms from the literature. This gives a non-sequential, globally optimal feature selection algorithm. Both theoretical analysis and experimental study demonstrate the efficiency of the proposed algorithms.
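To make the linear fractional view concrete, the following is a minimal sketch under stated assumptions, not the algorithm of this paper. It assumes the separability criterion restricted to a feature subset S takes the ratio form (sum of f_i over S) / (sum of g_i over S), where f_i and g_i stand for per-feature diagonal entries of two scatter matrices (for example between-class and total scatter) and every g_i is positive. Under that assumption, choosing d features is a linear fractional maximization, and Dinkelbach's method is one well-known way to solve it. The names `ratio`, `select_by_dinkelbach`, and `brute_force`, and the toy values of `f` and `g`, are hypothetical.

```python
from itertools import combinations


def ratio(f, g, subset):
    """Assumed criterion: sum of f over the subset divided by sum of g."""
    return sum(f[i] for i in subset) / sum(g[i] for i in subset)


def select_by_dinkelbach(f, g, d, tol=1e-12, max_iter=100):
    """Pick d features maximizing ratio(f, g, S); assumes every g[i] > 0."""
    n = len(f)
    # Start from any feasible subset so that lam never exceeds the optimum.
    subset = list(range(d))
    lam = ratio(f, g, subset)
    for _ in range(max_iter):
        # For fixed lam the parametric problem is linear in the selection:
        # the best subset is simply the d largest values of f[i] - lam * g[i].
        scores = [f[i] - lam * g[i] for i in range(n)]
        subset = sorted(range(n), key=lambda i: -scores[i])[:d]
        gap = sum(scores[i] for i in subset)
        if gap <= tol:
            # No subset improves on lam, so lam is the optimal ratio.
            break
        lam = ratio(f, g, subset)
    return sorted(subset), lam


def brute_force(f, g, d):
    """Exhaustive reference, only practical for very small problems."""
    return max(combinations(range(len(f)), d), key=lambda s: ratio(f, g, s))


if __name__ == "__main__":
    f = [4.0, 1.0, 3.0, 2.0, 5.0]   # hypothetical per-feature numerators
    g = [5.0, 1.0, 6.0, 2.0, 10.0]  # hypothetical per-feature denominators
    chosen, value = select_by_dinkelbach(f, g, d=2)
    print(chosen, value)
    print(sorted(brute_force(f, g, 2)))
```

On this toy input the iterative routine and the exhaustive search agree on the same subset. Each pass of the loop only sorts n scores, whereas the exhaustive search enumerates every size-d subset, which is what makes the fractional formulation attractive as the number of features grows.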