In the previous section, we introduced studies that efficiently and accurately calibrate soft sensors using machine learning techniques. This section covers applications that go beyond sensor calibration, using the tactile or human-related information obtained from sensor data to perform purposeful tasks.
First, soft sensors have been widely employed to obtain tactile information, such as single- or multi-point contact pressure and vibration, during physical interactions with the environment by mimicking the functionalities and properties of skin. The tasks that involve soft tactile sensors are not limited to contact localization and magnitude estimation. They also include extended applications, such as contact stability estimation, object type or shape recognition, and material classification, especially when the sensors are integrated with grippers. Since these tasks require processing large and non-intuitive sensing datasets to extract meaningful features and the desired outputs, various machine learning techniques have been actively applied. To recognize objects in contact, classification algorithms such as the support vector machine (SVM) and k-nearest neighbors (kNN) are used. Given that typical soft tactile sensors are composed of multiple sensing nodes like human skin, the data collected from them resemble multi-dimensional image data. Hence, a convolutional neural network (CNN), a deep learning architecture specialized for image processing, is commonly used, as sketched below.
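As a minimal illustrative sketch of this idea (not drawn from any cited study), the following PyTorch snippet treats each frame of a hypothetical 16×16 tactile array as a single-channel image and classifies the contacted object with a small CNN; the array size, class count, and all names are illustrative assumptions.

```python
# Minimal sketch: treat a 16x16 tactile pressure frame as a 1-channel image
# and classify the contacted object with a small CNN.
# Array size, class count, and all names are illustrative assumptions.
import torch
import torch.nn as nn

class TactileCNN(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 16x16 -> 8x8
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 8x8 -> 4x4
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):                                 # x: (batch, 1, 16, 16)
        return self.classifier(self.features(x).flatten(1))

model = TactileCNN()
frames = torch.rand(8, 1, 16, 16)                         # a batch of pressure maps
logits = model(frames)                                    # (8, num_classes)
```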
Roberge et al. conducted a study on the classification of gripping states using a soft pressure-sensing pad, to establish whether contacted objects were stably gripped or subject to slippage based on the magnitude and frequency content of the contact force [55]. Sparse coding, a statistical model that can be learned from only a small amount of data, was used to train an initial classifier on the sensing data; the classifier was then re-trained with an SVM based on the initial training results, and the gripping states were estimated. Larson et al. proposed a soft tactile interface that can recognize human gestures and contact location based on a capacitive-type tactile sensor array made of a stretchable carbon nanotube dielectric elastomer embedded in a rubber membrane [30]. A three-dimensional CNN (3D-CNN) model was trained on the sensor data for gesture recognition, and a two-dimensional CNN (2D-CNN) model for contact localization. Calandra et al. attached GelSight high-resolution pressure-mapping sensors to a fingered gripper to analyze the tactile information upon contact between the gripper and the object [56]. An end-to-end action-conditional model based on a deep multimodal convolutional network then predicted the grasp adjustments most likely to yield efficient and stable grasping. The model overcomes the limitations of traditional gripping strategies that depend primarily on visual information, and it provides a strategy for reliable gripping without complex sensor calibration or analytical contact force modeling. Zimmer et al. also conducted a study to estimate effective grasping of a shape-memory-actuated gripper using multiple machine learning methods such as long short-term memory (LSTM), SVM, spatiotemporal hierarchical matching pursuit (ST-HMP) [57], and a feed-forward neural network (FNN) [58].
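A hedged sketch of such a sparse-coding-then-SVM pipeline is shown below using scikit-learn: a sparse dictionary is learned over raw tactile signals, each sample is re-expressed as sparse coefficients, and an SVM is trained on those codes. The data shapes, dictionary size, and grip-state labels are illustrative assumptions, not the setup of [55].

```python
# Sketch of a sparse-coding-then-SVM grip-state classifier.
# Shapes, dictionary size, and labels are illustrative assumptions.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((200, 64))            # 200 tactile samples, 64 raw features each
y = rng.integers(0, 2, size=200)     # 0 = stable grip, 1 = slipping (hypothetical)

# Learn a sparse dictionary and encode each sample as sparse coefficients.
dico = DictionaryLearning(n_components=32, transform_algorithm="lasso_lars",
                          transform_alpha=0.1, random_state=0)
codes = dico.fit_transform(X)

# Train an SVM on the sparse codes and predict grip states.
clf = SVC(kernel="rbf").fit(codes, y)
print(clf.predict(dico.transform(X[:5])))
```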
Furthermore, Yuan et al. estimated the Shore hardness of contacted objects using a CNN and an LSTM to extract features from image frames of the pressure distribution data produced by the GelSight soft sensor [26]. Baishya et al. conducted a study on material classification by attaching a flexible tactile skin to a robot hand [59]. A pressure-mapping sensor (Tekscan; Grip VersaTek 4256E) was used to gather spatiotemporal signals, and six types of materials were classified using a CNN. The performance of the proposed CNN was then compared with that of several classification algorithms, including Gaussian classification, kNN, and SVM, on two datasets with different features; the CNN showed the best classification accuracy. Polic et al. conducted a study to determine the object shape, edge position, orientation, and indentation depth information required for object manipulation using an optical tactile sensor (TacTip) attached to the end effector of a robotic arm, based on a CNN algorithm [60]. The main contribution of this study was an unsupervised feature extraction method using a CNN autoencoder. This model extracted sufficient features from a small dataset and trained rapidly owing to its simple architecture. Masaki et al. conducted a study on the estimation of surface undulation using a strain gauge and an artificial neural network [61]. A system for estimating surface undulation was implemented by attaching the strain gauge, covered with a silicone rubber layer, to an index finger. The signal from the strain gauge was pre-processed and fed into an FNN to estimate the surface undulation levels.
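The following is a hedged sketch of unsupervised feature extraction with a small convolutional autoencoder, in the spirit of the approach in [60]; the 32×32 input size, layer widths, and latent shape are illustrative assumptions, not the cited architecture.

```python
# Sketch of unsupervised tactile feature extraction with a conv autoencoder.
# Input size, layer widths, and latent shape are illustrative assumptions.
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),    # 8 -> 16
            nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid(),  # 16 -> 32
        )

    def forward(self, x):
        z = self.encoder(x)                   # compact tactile features
        return self.decoder(z), z

model = ConvAutoencoder()
x = torch.rand(4, 1, 32, 32)                  # batch of tactile images
recon, features = model(x)
loss = nn.functional.mse_loss(recon, x)       # reconstruction objective
```

The reconstruction loss requires no labels, so the encoder's output can be reused as features for a downstream supervised task with only a small labeled dataset.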
There are also various cases in which multiple soft sensors are used in wearable devices to recognize human motions. Soft pressure sensors have been attached to soft gloves and insoles to recognize gripping states and walking motions by detecting contact with objects or the ground. Soft strain sensors attached to single- or multi-DOF joints, e.g., finger, elbow, shoulder, knee, and ankle joints, are primarily used to estimate upper- or lower-limb motions, such as gait or hand motions [62–65]. For these applications, the data obtained from soft sensors are correlated with human biomechanical and kinematic information such as the gait pattern. However, the relationship is not linear, which increases the complexity of modeling and sensor calibration. Learning-based methods have recently been proposed to overcome these limitations.
Kim et al. proposed a deep full-body motion network (DFM-Net) for estimating full-body human motion from a wearable sensing suit with 20 strain sensors [66]. An encoder-decoder structure was proposed, in which the sensor information is encoded with an LSTM and the kinematic information is decoded with an FNN. Kim et al. also proposed a gait motion generation method based on two microfluidic soft strain sensors [67]. The objective of the algorithm is to reduce the amount of labeled data required by taking a semi-supervised approach. In particular, the gait motion was embedded using an autoencoder and decoded using an FNN.
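Below is a hedged sketch of an LSTM-encoder / FNN-decoder mapping from strain-sensor sequences to joint kinematics, loosely patterned on the structure described in [66]; the sequence length, hidden size, and output dimension are illustrative assumptions.

```python
# Sketch of an LSTM encoder + feed-forward decoder for sensor-to-kinematics
# mapping. Window length, hidden size, and joint count are assumptions.
import torch
import torch.nn as nn

class SensorToKinematics(nn.Module):
    def __init__(self, n_sensors=20, hidden=64, n_joints=14):
        super().__init__()
        self.encoder = nn.LSTM(input_size=n_sensors, hidden_size=hidden,
                               batch_first=True)
        self.decoder = nn.Sequential(             # feed-forward decoder
            nn.Linear(hidden, 128), nn.ReLU(),
            nn.Linear(128, n_joints),
        )

    def forward(self, seq):                        # seq: (batch, time, sensors)
        out, _ = self.encoder(seq)
        return self.decoder(out[:, -1])            # kinematics at the last step

model = SensorToKinematics()
strain = torch.rand(8, 50, 20)                     # 50-step windows, 20 sensors
joint_angles = model(strain)                       # (8, n_joints)
```

The LSTM captures the temporal (and hysteretic) behavior of the strain signals, while the feed-forward decoder maps the resulting embedding to joint-space outputs.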
Various studies have also been conducted on the human hand. Glauser et al. employed neural networks to analyze strain sensor data and recognize hand motion [68]. In particular, several deep learning architectures were compared, including a fully convolutional network (FCN), LSTM, residual neural network (ResNet) [69], U-Net [70], and a conditional generative adversarial network [71]; U-Net yielded the highest accuracy in reconstructing hand motions. In addition, Sundaram et al. conducted a study on grasping using a scalable tactile sensor glove with 548 sensors [25]. Using a CNN, they identified the grasped objects and estimated their weights. The study also revealed key correspondences among different regions of the human hand by measuring tactile patterns during grasping.
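As a hedged sketch of the two tasks in [25], the following two-head CNN jointly classifies the grasped object and regresses its weight from a tactile-glove frame; the 24×24 grid layout, head sizes, and names are illustrative assumptions (the actual glove has 548 irregularly placed taxels).

```python
# Sketch of a shared-trunk, two-head CNN: object classification plus weight
# regression from one tactile frame. Layout and sizes are assumptions.
import torch
import torch.nn as nn

class GraspNet(nn.Module):
    def __init__(self, num_objects=10):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )
        feat = 32 * 6 * 6                              # 24 -> 12 -> 6 after pooling
        self.object_head = nn.Linear(feat, num_objects)  # what is grasped
        self.weight_head = nn.Linear(feat, 1)            # how heavy it is

    def forward(self, x):                              # x: (batch, 1, 24, 24)
        h = self.trunk(x)
        return self.object_head(h), self.weight_head(h)

model = GraspNet()
frame = torch.rand(4, 1, 24, 24)                       # batch of glove frames
obj_logits, weight = model(frame)
```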