Inertial sensors are one of the most commonly used sources of data for human activity recognition (HAR) and exercise detection (ED) tasks. The time series produced by these sensors are generally analyzed through numerical methods. Machine learning techniques such as random forests or support vector machines are popular in this field for classification, but they depend on a potentially large number of handcrafted features derived from the raw data. This feature preprocessing step can involve nontrivial digital signal processing (DSP) techniques. However, in many cases, the researchers interested in these types of activity recognition problems do not possess the technical background needed to develop such feature sets. This study aimed to present a novel application of established machine vision methods to provide interested researchers with an easier entry path into the HAR and ED fields. Transfer learning removes the need for deep DSP skills: a pretrained convolutional neural network (CNN) developed for machine vision purposes is retrained for exercise classification. The method simply requires researchers to generate plots of the signals that they would like to build classifiers with, store them as images, and place them in folders according to their training label before retraining the network. We applied a CNN, an established machine vision technique, to the task of ED. TensorFlow, a high-level framework for machine learning, was used to facilitate infrastructure needs. Simple time series plots generated directly from accelerometer and gyroscope signals were used to retrain an openly available neural network (Inception), originally developed for machine vision tasks. Data were collected from 82 healthy volunteers performing 5 different exercises while wearing a lumbar-worn inertial measurement unit (IMU).
The ability of the proposed method to automatically classify the exercise being completed was assessed using this dataset. For comparison, the same dataset was also classified using the more conventional approach of feature extraction followed by random forest classifiers. With the collected dataset and the proposed method, the different exercises could be recognized with 95.89% (3827/3991) accuracy, which is competitive with current state-of-the-art techniques in ED. The high level of accuracy attained with the proposed approach indicates that the waveform morphologies in the time series plots for each of the exercises are sufficiently distinct among the participants to allow the use of machine vision approaches. The use of high-level machine learning frameworks, coupled with the novel use of machine vision techniques instead of complex manually crafted features, may facilitate access to research in the HAR field for individuals without extensive digital signal processing or machine learning backgrounds.
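The data preparation step described above (plot each signal segment, save it as an image, file it in a folder named after its training label) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the signals are synthetic, the labels `squat` and `lunge`, the helper `save_signal_plot`, the image size, and the file naming are all hypothetical, and the network retraining step is not shown.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt


def save_signal_plot(signals, label, index, root_dir):
    """Plot one signal segment and save it under root_dir/<label>/.

    `signals` is a list of 1-D arrays, one per sensor channel.
    The folder name doubles as the training label.
    """
    label_dir = os.path.join(root_dir, label)
    os.makedirs(label_dir, exist_ok=True)

    fig, ax = plt.subplots(figsize=(3, 3), dpi=100)
    for channel in signals:
        ax.plot(channel, linewidth=1)
    ax.axis("off")  # keep only the waveform shape in the image

    path = os.path.join(label_dir, f"{label}_{index:04d}.png")
    fig.savefig(path)
    plt.close(fig)  # free the figure to avoid memory growth in long loops
    return path


# Synthetic stand-in data: the real input would be accelerometer and
# gyroscope segments from the IMU. Labels here are hypothetical.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
root = tempfile.mkdtemp(prefix="imu_plots_")

saved_paths = []
for label, freq in [("squat", 1.0), ("lunge", 2.0)]:
    for i in range(3):
        signals = [
            np.sin(freq * t + phase) + 0.05 * rng.standard_normal(t.size)
            for phase in (0.0, 0.5, 1.0)
        ]
        saved_paths.append(save_signal_plot(signals, label, i, root))
```

The result is one subfolder per label, each holding the plot images for that class, which is the folder-per-label layout the abstract describes as the input to retraining.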
Maynooth University