Classification can often benefit from efficient feature selection. However, the presence of linearly nonseparable data, quick-response
requirements, small-sample problems, and noisy features makes feature selection quite challenging. In this work, a class
separability criterion is developed in a high-dimensional kernel space, and feature selection is performed by maximizing this
criterion. To make this feature selection approach work, the issues of automatic kernel parameter tuning, numerical stability, and
regularization for multiparameter optimization are addressed. Theoretical analysis uncovers the relationship of this criterion to the
radius-margin bound of Support Vector Machines (SVMs), Kernel Fisher Discriminant Analysis (KFDA), and the kernel
alignment criterion, thus providing more insight into feature selection with this criterion. The criterion is applied to a variety of
selection modes using different search strategies. An extensive experimental study demonstrates its efficiency in delivering fast and
robust feature selection.
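
As an illustration of the general idea, the following minimal sketch scores each feature by a scatter-based class separability measure computed in kernel space and ranks features by that score. It rests on several assumptions that are not taken from this work: the criterion is taken to be the ratio tr(S_B)/tr(S_W) of between-class to within-class scatter in the feature space induced by an RBF kernel, the kernel width `gamma` is fixed rather than tuned automatically, and features are ranked one at a time instead of being optimized jointly. The function names `rbf_kernel`, `kernel_separability`, and `rank_features` are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clamp round-off negatives


def kernel_separability(K, y):
    """Assumed criterion: tr(S_B) / tr(S_W) evaluated from the Gram matrix K.

    tr(S_B) = sum_c sum(K_cc) / n_c - sum(K) / n
    tr(S_W) = trace(K) - sum_c sum(K_cc) / n_c
    where K_cc is the block of K restricted to samples of class c.
    """
    n = K.shape[0]
    class_term = 0.0
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        class_term += K[np.ix_(idx, idx)].sum() / idx.size
    tr_sb = class_term - K.sum() / n
    tr_sw = np.trace(K) - class_term
    return tr_sb / (tr_sw + 1e-12)  # small constant guards against division by zero


def rank_features(X, y, gamma=1.0):
    """Score every feature on its own and return indices sorted best-first."""
    scores = np.array([
        kernel_separability(rbf_kernel(X[:, [j]], gamma), y)
        for j in range(X.shape[1])
    ])
    return np.argsort(-scores), scores


# Synthetic data (illustrative only): feature 1 carries the class signal,
# features 0 and 2 are pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 3))
X[:, 1] += 3.0 * y

order, scores = rank_features(X, y)
print("ranking (best first):", order)
print("scores:", scores)
```

Because the traces are computed directly from the Gram matrix, no explicit feature-space mapping is needed. A joint search over feature subsets or feature weights would replace the per-feature loop in `rank_features`.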