In this paper we propose a method for human action recognition based on a string kernel framework. An action is represented as a string in which each symbol is associated with an aclet, that is, an atomic unit of the action that encodes a feature vector extracted from the raw data. Measuring the similarity between two actions thus reduces to designing a similarity measure between strings. We define this string similarity using the global alignment kernel framework. In this context, the similarity between two aclets is computed by a novel soft evaluation method based on an enhanced Gaussian kernel. The main advantage of the proposed approach lies in its ability to deal effectively with actions of different lengths or different temporal scales, as well as with noise introduced during the feature extraction step. The proposed method has been tested on three publicly available datasets, namely MIVIA, CAD and MHAD. The results, compared with several state-of-the-art approaches, confirm the effectiveness of our system and its applicability in real environments, where inexperienced operators can easily configure it.
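To make the idea of a global alignment kernel over strings of aclets concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the aclet similarity here is a plain Gaussian kernel standing in for the enhanced Gaussian kernel, whose exact form is not given in this abstract, and the names `gaussian_kernel`, `global_alignment_kernel`, `normalized_similarity` and the bandwidth `sigma` are hypothetical. The recursion is the standard dynamic program that sums, over all monotone alignments of the two strings, the product of the local similarities along each alignment.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Plain Gaussian similarity between two aclet feature vectors.

    Assumption: a stand-in for the paper's enhanced Gaussian kernel.
    """
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def global_alignment_kernel(s, t, sigma=1.0):
    """Sum over all monotone alignments of s and t of the product of
    local aclet similarities, computed by dynamic programming."""
    n, m = len(s), len(t)
    # M[i][j] holds the kernel value between the prefixes s[:i] and t[:j].
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = gaussian_kernel(s[i - 1], t[j - 1], sigma)
            # An alignment reaches (i, j) by a diagonal, vertical
            # or horizontal step.
            M[i][j] = k * (M[i - 1][j - 1] + M[i - 1][j] + M[i][j - 1])
    return M[n][m]


def normalized_similarity(s, t, sigma=1.0):
    """Cosine-style normalization so that a string compared with itself scores 1."""
    k_st = global_alignment_kernel(s, t, sigma)
    k_ss = global_alignment_kernel(s, s, sigma)
    k_tt = global_alignment_kernel(t, t, sigma)
    return k_st / math.sqrt(k_ss * k_tt)


# Toy strings of one-dimensional aclets (hypothetical data).
walk = [[0.0], [1.0], [2.0]]
slow_walk = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]  # same action, half speed
other = [[5.0], [5.0], [5.0]]

print(normalized_similarity(walk, slow_walk))
print(normalized_similarity(walk, other))
```

Because the sum runs over alignments rather than over position-by-position matches, the two strings may have different lengths, which is why the slowed-down copy of the toy action remains far more similar to the original than the unrelated string does. For long strings the table entries can become very large or very small, so a practical version would likely work in the log domain.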