Sunday, May 11, 2008

TIKL: Development of a Wearable Vibrotactile Feedback Suit for Improved Human Motor Learning

[Summary]

The goal of this paper is to create a vibrotactile feedback suit that helps people learn motions. Based on this idea, the authors build a system called TIKL (Tactile Interaction for Kinesthetic Learning). With this suit, the learner can learn through multiple channels because it can give tactile feedback on every joint simultaneously. In contrast, a traditional teacher can only correct the joints one by one. Such a system can be used in sports training, motor rehabilitation, dance, postural retraining for health, etc. The feedback system consists of four main modules: the user, a Vicon motion capture system, control software, and motor-system feedback.

They tested it with a simple motion: holding a fixed position with the right arm. 40 people were tested on this motion, of whom only 20 were provided with the additional vibration feedback. Their results using a 5-DOF robotic suit show a 27% improvement in accuracy while performing the target motion, and an accelerated learning rate of up to 23%.

[Discussion]

I like the idea of this paper. It would be particularly useful for training a group of people in the same motions. However, I have a few concerns about this paper. First, the experiment they included in the paper was very simple, and it is hard to imagine how the system would help to improve learning for a more complex dynamic gesture. Second, different people have different skeleton sizes, so it is unclear how to retarget the teacher's reference motion to fit each learner's skeleton. This retargeting is needed because, for different users, the same joint values won't guarantee that the motions look the same or produce the same effect.

FreeDrawer – A Free-Form Sketching System on the Responsive Workbench

[Summary]

This paper presents a sketching system for spline-based free-form surfaces based on a 'Responsive Workbench'. The authors propose 3D tools for curve drawing and deformation techniques for curves and surfaces. The user directly draws curves in the virtual environment, using a tracked stylus as an input device.

They claim their interface has the following advantages: closed-form parametric representations, easy transfer into standard CAD packages, fast triangulation and evaluation algorithms, infinitesimal smoothness of curves and surfaces and efficient deformation algorithms based on variational modeling.

A drawer can draw in the virtual 3D space freely to create a curve network. One can also change it after creation. After this, surfaces can be filled based on these created curves. They also offer a variety of modification tools: curve smoothing and sharpening, curve dragging and surface sculpting. In this paper, they also demonstrate how a drawer can create a seat by going through the above steps.

[Discussion]

I like this paper because the use of splines is pretty clever: it avoids undesired curve shapes when drawing in 3D. But the biggest problem, I think, is how a naive user is supposed to know which curves he or she should draw. Take a look at the teapot result: it would probably be hard for me to draw this. In addition, it seems this system can only generate some simple 3D objects.

American Sign Language Finger Spelling Recognition System


[Summary]


This paper presents a system that recognizes letters in American Sign Language and shows them on the screen along with the sound of the recognized letter. The authors use a neural network to learn the postures. Each posture is an 18-dimensional vector, collected by a CyberGlove.


The neural network they use is a perceptron network because they claim that this generates the best recognition results. The perceptron network was trained using an 18x24 input matrix and a 24x24 target matrix, which is an identity matrix. Because the letters 'J' and 'Z' are not static postures, they simply omitted them from the training set.
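
To make the matrix shapes concrete, here is a minimal sketch of training a single-layer perceptron on 24 postures of 18 values each, with a 24x24 identity matrix as the target. This is my own illustration, not the authors' code: the hard-limit activation, the learning rate, the epoch count, and the random stand-in data are all assumptions, and `train_perceptron` and `classify` are hypothetical names. Note that I store one posture per row, whereas the 18x24 input matrix in the paper has one posture per column.

```python
import random

def train_perceptron(samples, targets, epochs=100, lr=1.0):
    """Train one hard-limit output unit per class with the perceptron rule.

    samples: list of feature vectors (one posture each).
    targets: list of one-hot vectors (rows of an identity matrix here).
    """
    n_features = len(samples[0])
    n_classes = len(targets[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(samples, targets):
            for k in range(n_classes):
                net = sum(w * v for w, v in zip(weights[k], x)) + biases[k]
                y = 1.0 if net >= 0 else 0.0
                err = t[k] - y
                if err != 0:
                    # Perceptron rule: move the weights toward the target.
                    weights[k] = [w + lr * err * v
                                  for w, v in zip(weights[k], x)]
                    biases[k] += lr * err
                    mistakes += 1
        if mistakes == 0:  # every training sample is reproduced exactly
            break
    return weights, biases

def classify(weights, biases, x):
    """Return the index of the output unit with the largest net input."""
    nets = [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]
    return nets.index(max(nets))

# Stand-in data (assumption): 24 random "postures" of 18 sensor values.
rng = random.Random(0)
samples = [[rng.random() for _ in range(18)] for _ in range(24)]
targets = [[1.0 if i == j else 0.0 for j in range(24)] for i in range(24)]
weights, biases = train_perceptron(samples, targets)
```

With one training sample per letter, as the matrix shapes suggest, the network can at best memorize each user's postures, which fits the user-dependent results reported below.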


Their test results showed that the perceptron network achieved a recognition accuracy of 90% for user-dependent testing. They did not train or test the perceptron for the user-independent case.


[Discussion]


This is only a two-page paper, so there is not a lot of information we can gain from it. Everything they showed here is pretty simple. The obvious drawback is that they omitted two letters ('J' and 'Z'), which means the recognition system does not work for the whole set of ASL letters. Another issue is that they only trained and tested postures for the user-dependent case, so it is hard to say how high the accuracy would be for general users.

Invariant features for 3-D gesture recognition

[Summary]

This paper reports the recognition results of 10 different feature vectors for gesture recognition. To investigate this, the authors compared recognition performance on a set of 18 T’ai Chi gestures. They construct a training set of 54 six-gesture “sentences” and test on another set of 54 sentences. A “sentence” is a set of six gestures performed in sequence, which captures the co-articulation exhibited by T’ai Chi.

The recognition method they use is the HMM. Each HMM has 5 states with forward chaining, where jumps are possible to the same state or to each of the next two states. An HMM is trained for every kind of feature. The feature vectors include the raw position, the Cartesian velocity, the polar velocity with an angular velocity term, the polar velocity with a tangential velocity term, and two sets with instantaneous speed and local curvature.
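
The topology described above can be written down as a transition matrix. The sketch below is only my reading of it: the function name `left_right_transitions` is hypothetical, and the uniform initial probabilities are an assumption (training would re-estimate them).

```python
def left_right_transitions(n_states=5, max_jump=2):
    """Build a forward-chaining transition matrix.

    From state i the model may stay in i or move to any of the next
    `max_jump` states; everything else has probability zero.
    Allowed transitions start out uniform (an assumption).
    """
    matrix = []
    for i in range(n_states):
        reachable = range(i, min(i + max_jump, n_states - 1) + 1)
        p = 1.0 / len(reachable)
        matrix.append([p if j in reachable else 0.0
                       for j in range(n_states)])
    return matrix

for row in left_right_transitions():
    print(["%.2f" % p for p in row])
```

The zeros below the diagonal are what make the model "forward": once a state has been left, it cannot be revisited.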

They separate the testing data into three groups: 'original', 'shifted', and 'rotated'. The results showed that the feature vector (dr,dtheta,dz) had the best overall recognition rate (95%), while the raw feature vector (x,y,z) gave the lowest recognition rate (34%). In addition, all feature sets perform worse on shifted and rotated data.
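
To see why velocity-style features can be less sensitive to shifts and rotations than raw positions, here is a small sketch that computes the Cartesian velocity (dx,dy,dz) and the polar velocity (dr,dtheta,dz) from a 3D trajectory. It is my own illustration under assumptions: frame-to-frame differences stand in for velocities, the polar coordinates are taken around the z axis through the origin, and all function names are hypothetical.

```python
import math

def cartesian_velocity(points):
    """Frame-to-frame differences (dx, dy, dz)."""
    return [tuple(b[i] - a[i] for i in range(3))
            for a, b in zip(points, points[1:])]

def polar_velocity(points):
    """Frame-to-frame differences (dr, dtheta, dz) around the z axis."""
    feats = []
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        dr = math.hypot(x1, y1) - math.hypot(x0, y0)
        dtheta = math.atan2(y1, x1) - math.atan2(y0, x0)
        # Wrap the angle difference into [-pi, pi).
        dtheta = (dtheta + math.pi) % (2 * math.pi) - math.pi
        feats.append((dr, dtheta, z1 - z0))
    return feats

def rotate_z(points, angle):
    """Rotate every point about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def shift(points, offset):
    """Translate every point by `offset`."""
    return [tuple(p[i] + offset[i] for i in range(3)) for p in points]

# A made-up spiral trajectory, only for demonstration.
trajectory = [((1 + 0.05 * t) * math.cos(0.1 * t),
               (1 + 0.05 * t) * math.sin(0.1 * t),
               0.03 * t) for t in range(20)]
```

In this toy setup, (dx,dy,dz) is unchanged by `shift` and (dr,dtheta,dz) is unchanged by `rotate_z`, but the polar features do change when the trajectory is shifted away from the axis, so the invariance is only partial.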

[Discussion]

They only reported their results for one specific HMM topology, so I wonder how the results would change with other topologies. In addition, their comparison of different feature vectors only covers two-hand-and-head gestures. However, my biggest concern is why they use T'ai Chi gestures as their test gestures. From the images in the paper, all the gestures are performed by a man sitting in a chair, but T'ai Chi should actually be performed with the full body.

Cyber Composer: Hand Gesture-Driven Intelligent Music Composition and Generation

[Summary]

This paper presents an interface, Cyber Composer, that lets users control the tonality and the melody of the music through hand motions and gestures. The pitch, rhythm and volume of the melody can be controlled and generated in real time by wearing a pair of CyberGloves and a Polhemus Fastrak.

The Cyber Composer system is composed of several interface modules: the music interface, CyberGlove interface, background music generation module, melody generation module, and the main program which links all the components together. Seven musical expressions are mapped to specific gestures: rhythm, pitch, pitch-shifting, dynamics, volume, dual-instrument mode, and cadence.

[Discussion]

The biggest problem with this paper is that the authors didn't mention how they did the gesture recognition or what the results looked like. They only defined a set of gestures to control the characteristics of the music. Although the gestures defined in this paper look reasonable, it is hard to say whether the system is practical to use without any experiments. In addition, we need to notice that they constrained the number of notes that a user can perform by specifying an overall tonal base, which would make the system work only for some simple music.

Thursday, May 1, 2008

Using Ultrasonic Hand Tracking to Augment Motion Analysis Based Recognition of Manipulative Gestures

[Summary]

Instead of using data gloves or accelerometers alone, this paper presents a novel hand gesture capture method that adopts ultrasonic sensors. Because using only the ultrasonic sensors would raise some issues, such as reflections and occlusions, the authors actually combine the ultrasonic sensors with accelerometer and gyroscope sensors.

The authors tried different classification techniques: model based classification, frame based classification and fused classification.

For the model based classification, a set of left-right HMMs is trained to recognize gestures. For frame based classification, they tried a C4.5 classifier and a k-nearest neighbor classifier. For the fused classification, the final classification is based on a combination of the above two classifiers' rankings and the associated probabilities.

To validate their method of fusing inertial and ultrasonic sensor data, they set up an experiment comprising various manipulative gestures based on a bicycle repair task. Three kinds of sensors are used in this experiment: (a) ultrasonic sensors for distance measurement, (b) acceleration sensors and (c) gyroscopes, the latter two to capture the motion of relevant body parts of the user. Three users were asked to perform 21 bicycle repair tasks three different times. The results showed that frame-based classification with the accelerometer and gyroscope data produced 84% accuracy. A model-based time series classification approach gave 65% accuracy. Fused classification improved the classification accuracy. Use of the ultrasonic sensor data resulted in 90% accuracy.
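
My notes do not record the exact fusion rule, so the following is only a guess at what combining two classifiers' probabilities could look like: a weighted sum of per-class probabilities followed by a re-ranking. The function name `fuse_classifiers`, the weight, and the gesture labels are all hypothetical.

```python
def fuse_classifiers(model_probs, frame_probs, model_weight=0.5):
    """Combine per-class probabilities from two classifiers.

    model_probs / frame_probs: dicts mapping a class label to the
    probability assigned by the model based and frame based classifier.
    Returns the fused ranking (best first) and the fused scores.
    The weighted sum is an assumption, not the paper's rule.
    """
    labels = sorted(set(model_probs) | set(frame_probs))
    scores = {
        label: model_weight * model_probs.get(label, 0.0)
        + (1.0 - model_weight) * frame_probs.get(label, 0.0)
        for label in labels
    }
    ranking = sorted(labels, key=lambda label: scores[label], reverse=True)
    return ranking, scores

# Made-up outputs for three bicycle repair gestures.
hmm_output = {"pump": 0.5, "screw": 0.4, "spin_wheel": 0.1}
frame_output = {"pump": 0.2, "screw": 0.7, "spin_wheel": 0.1}
ranking, scores = fuse_classifiers(hmm_output, frame_output)
print(ranking)
```

In this example the two classifiers disagree on the top class, and the fused ranking follows whichever one is more confident.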

[Discussion]

I like this paper since it is the first paper I have read that uses ultrasonic sensors to help collect hand gesture data. I know there was a SIGGRAPH paper last year about using ultrasonic sensors to capture human body motion. I think this paper is pretty novel since the authors say they are the first to use ultrasonic sensors to classify gestures. But I don't know if this is the first paper to use ultrasonic sensors to capture motion data. This paper also points out the problems of using ultrasonic sensors alone, which somewhat limit the application of ultrasonic sensors.

Enabling Fast and Effortless Customisation in Accelerometer Based Gesture Interaction

[Summary]

The purpose of this paper is to make the training of HMMs easier and more efficient. The most time-consuming part of HMM training may be data collection. To get high accuracy, the training data also need to be segmented, which is very expensive since there is no automatic and practical segmentation method so far. This paper tries to alleviate this problem by adding noise to the captured data to generate a larger set of training data.

In this paper, accelerometers are used as the gesture capturing device. The authors use a vector quantized codebook of size 8 and then perform recognition using HMMs. They experimented with two types of noise: uniformly distributed noise and Gaussian distributed noise.

For experiments, the authors tested 8 gestures used to control a DVD player. For this set of eight gestures, each trained with two original gestures and two Gaussian noise-distorted duplicates, the average recognition accuracy was 97%. With two original gestures and four noise-distorted duplicates, the average recognition accuracy was 98%. These numbers were cross-validated on a total data set of 240 gestures. They also found that the Gaussian distributed noise is slightly better than the uniformly distributed noise.
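
The two ideas in this paragraph, noise-distorted duplicates and a codebook of size 8, can be sketched in a few lines. This is an illustration under my own assumptions: the noise scale is arbitrary, the codebook is given rather than learned, and `add_noise`, `augment` and `quantize` are hypothetical names.

```python
import math
import random

def add_noise(gesture, kind="gaussian", scale=0.1, rng=random):
    """Return a noisy copy of a gesture (a list of (x, y, z) frames)."""
    def sample():
        if kind == "gaussian":
            return rng.gauss(0.0, scale)       # scale = standard deviation
        if kind == "uniform":
            return rng.uniform(-scale, scale)  # scale = half-width
        raise ValueError("unknown noise kind: %r" % kind)
    return [tuple(v + sample() for v in frame) for frame in gesture]

def augment(originals, copies_per_original, **noise_args):
    """Originals plus `copies_per_original` noisy duplicates of each."""
    training_set = list(originals)
    for gesture in originals:
        training_set.extend(add_noise(gesture, **noise_args)
                            for _ in range(copies_per_original))
    return training_set

def quantize(gesture, codebook):
    """Map every frame to the index of its nearest codebook vector."""
    return [min(range(len(codebook)),
                key=lambda k: math.dist(frame, codebook[k]))
            for frame in gesture]

# Assumed codebook of size 8: the corners of a cube in acceleration space.
codebook = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
recorded = [(0.9, -0.8, 1.1), (0.2, 0.7, 0.9), (-0.6, 0.8, -0.4)]
training_set = augment([recorded], 2, kind="gaussian", scale=0.1,
                       rng=random.Random(0))
symbols = [quantize(g, codebook) for g in training_set]
```

The discrete symbol sequences in `symbols` are what a discrete HMM would then be trained on, so the noise only matters when it pushes a frame across a codebook boundary.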

[Discussion]

I think adding noise to enlarge the training data set was a good idea. The comparison between different kinds of noise also gives us some experimental evidence for this idea.

However, a few things may still need to be considered. For example, how to determine the parameters of the noise efficiently, how to determine the number of noisy samples that should be added, and whether the added noise could decrease the recognition accuracy for some other gestures.