<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7226341358299927342</id><updated>2011-07-28T10:55:35.198-07:00</updated><title type='text'>Kevin's Blog for Gesture Recognition</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>40</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8401154899672430155</id><published>2008-05-11T11:26:00.001-07:00</published><updated>2008-05-11T11:26:50.904-07:00</updated><title type='text'>TIKL: Development of a Wearable Vibrotactile Feedback Suit for Improved Human Motor Learning</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;The goal of this paper is to create a vibrotactile feedback suit that helps people learn motions. 
Based on this idea, they build a system called TIKL (Tactile Interaction for Kinesthetic Learning). With this suit, the learner can learn through multiple channels because it gives tactile feedback on every joint simultaneously. In contrast, a human teacher can only correct the joints one at a time. Such a system could be used in sports training, motor rehabilitation, dance, postural retraining for health, etc. The feedback system consists of four main modules: users, a Vicon motion capture system, control software and motor-system feedback.&lt;br /&gt;&lt;br /&gt;They tested it with a simple motion: holding a fixed position with the right arm. 40 people were tested on this motion, of whom only 20 were provided with the additional vibration feedback. Their results using a 5-DOF robotic suit show a 27% improvement in accuracy while performing the target motion, and an accelerated learning rate of up to 23%.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I like the idea of this paper. It would be particularly useful for teaching motions to a group of people. However, I have a few concerns about this paper. First, the experiment they included in the paper was very simple, and it is hard to imagine how the system would help to improve learning for a more complex dynamic gesture. Second, different people have different skeleton sizes, so it is unclear how to retarget the teacher's reference motion to fit the learner's skeleton. 
This retargeting is needed because, for different users, the same joint values do not guarantee that the motions look the same or produce the same effect.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8401154899672430155?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8401154899672430155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8401154899672430155' title='43 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8401154899672430155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8401154899672430155'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/tikl-development-of-wearable.html' title='TIKL: Development of a Wearable Vibrotactile Feedback Suit for Improved Human Motor Learning'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>43</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-3915632653829194987</id><published>2008-05-11T10:41:00.000-07:00</published><updated>2008-05-11T10:42:00.551-07:00</updated><title type='text'>FreeDrawer – A Free-Form Sketching System on the Responsive Workbench</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents a sketching system for spline-based free-form surfaces based on a 'Responsive Workbench'. 
They propose 3D tools for curve drawing and deformation techniques for curves and surfaces. The user directly draws curves in the virtual environment, using a tracked stylus as an input device.&lt;br /&gt;&lt;br /&gt;They claim their interface has the following advantages: closed-form parametric representations, easy transfer into standard CAD packages, fast triangulation and evaluation algorithms, infinitesimal smoothness of curves and surfaces and efficient deformation algorithms based on variational modeling.&lt;br /&gt;&lt;br /&gt;A user can draw freely in the virtual 3D space to create a curve network, which can also be changed after creation. Surfaces can then be filled in based on the created curves. They also offer a variety of modification tools: curve smoothing and sharpening, curve dragging and surface sculpting. In this paper, they also demonstrate how a user can create a seat by going through the above steps.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I like this paper because the use of splines is pretty clever and avoids possible undesired curve shapes when drawing in 3D. But the biggest problem, I think, is how a naive user would know which curves he or she should draw. Take a look at the teapot result: it would probably be hard for me to draw this. 
In addition, it seems this system can only generate some simple 3D objects.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-3915632653829194987?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/3915632653829194987/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=3915632653829194987' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3915632653829194987'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3915632653829194987'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/freedrawer-free-form-sketching-system.html' title='FreeDrawer – A Free-Form Sketching System on the Responsive Workbench'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-3534777792056911482</id><published>2008-05-11T10:08:00.001-07:00</published><updated>2008-05-11T10:08:53.079-07:00</updated><title type='text'>American Sign Language Finger Spelling Recognition System</title><content type='html'>&lt;p&gt;&lt;br /&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper presents a system that recognizes letters in American Sign Language and shows them on the screen together with the sound of the recognized letter. They use a neural network trained on the postures. 
Each posture is an 18-dimensional vector, collected by a CyberGlove. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;The neural network they use is a perceptron network because they claim that this generates the best recognition results. The perceptron network was trained using an 18x24 input matrix and a 24x24 target matrix, an identity matrix. Because the letters 'J' and 'Z' are not static postures, they simply omitted them from the training set.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Their testing results showed that the perceptron network reached a recognition accuracy of 90% in user-dependent testing. They did not train or test the perceptron for the user-independent case.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This is only a two-page paper, so there is not a lot of information we can gain from it. Everything they showed here is pretty simple. The obvious drawback is that they omitted two letters ('J' and 'Z'), which means the recognition system does not work for the whole set of ASL letters. 
Another thing is that they only trained and tested postures for the user-dependent case, so it is hard to say how high the accuracy would be for general users.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-3534777792056911482?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/3534777792056911482/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=3534777792056911482' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3534777792056911482'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3534777792056911482'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/american-sign-language-finger-spelling.html' title='American Sign Language Finger Spelling Recognition System'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8776203685664831632</id><published>2008-05-11T09:54:00.003-07:00</published><updated>2008-05-11T09:54:57.380-07:00</updated><title type='text'>Invariant features for 3-D gesture recognition</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper reports the recognition results of 10 different feature vectors for gesture recognition. To investigate this, they compared recognition performance on a set of 18 T’ai Chi gestures. 
They construct a training set of 54 six-gesture “sentences” and test on another set of 54 sentences. A “sentence” is a set of six gestures performed in sequence, which captures the co-articulation exhibited by T’ai Chi.&lt;br /&gt;&lt;br /&gt;The recognition method they use is an HMM. It has 5 states with forward chaining, where jumps are possible to the same state or to each of the next two states. Every kind of feature is trained with this HMM. The feature vectors include the raw position, the Cartesian velocity, the polar velocity with an angular velocity term, the polar velocity with a tangential velocity term, and two sets with instantaneous speed and local curvature.&lt;br /&gt;&lt;br /&gt;They separate the testing data into three groups: 'original', 'shifted', 'rotated'. The results showed that the feature vector (dr,dtheta,dz) had the best overall recognition rate (95%), while the raw feature vector (x,y,z) gave the lowest recognition rate (34%). In addition, all feature sets perform worse on shifted and rotated data.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;They only reported their results for a specific HMM topology, so I wonder how the results would change with other topologies. In addition, their results on the different kinds of feature vectors are only for two-hand-and-head gestures. However, my biggest concern is why they use T'ai Chi gestures as their testing gestures. 
From the images in the paper, all the gestures are performed by a man just sitting in a chair, but actually T'ai Chi should be performed with the full body.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8776203685664831632?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8776203685664831632/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8776203685664831632' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8776203685664831632'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8776203685664831632'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/invariant-features-for-3-d-gesture.html' title='Invariant features for 3-D gesture recognition'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8706810425350644129</id><published>2008-05-11T09:54:00.001-07:00</published><updated>2008-05-11T09:54:29.175-07:00</updated><title type='text'>Cyber Composer: Hand Gesture-Driven Intelligent Music Composition and Generation</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents an interface, Cyber Composer, that lets users control the tonality and the melody of the music through their hand motions and gestures. 
Pitch, rhythm and volume of the melody can be controlled and generated in real time by wearing a pair of CyberGloves and a Polhemus Fastrak.&lt;br /&gt;&lt;br /&gt;The Cyber Composer system is composed of several interface modules: the music interface, CyberGlove interface, background music generation module, melody generation module, and the main program which links all the components together. Seven musical expressions are mapped to specific gestures: rhythm, pitch, pitch-shifting, dynamics, volume, dual-instrument mode, and cadence.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;The biggest problem with this paper is that it does not describe how the gesture recognition was done or what the results looked like. They only defined a set of gestures to control the characteristics of the music. Although the gestures defined in this paper look reasonable, it is hard to say whether the system is practical to use without any experiments. In addition, we need to notice that they constrained the number of notes that a user can perform by specifying an overall tonal base, which would make the system work only for some simple music.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8706810425350644129?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8706810425350644129/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8706810425350644129' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8706810425350644129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8706810425350644129'/><link rel='alternate' type='text/html' 
href='http://kevinhaptics.blogspot.com/2008/05/cyber-composer-hand-gesture-driven.html' title='Cyber Composer: Hand Gesture-Driven Intelligent Music Composition and Generation'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8133318108984030085</id><published>2008-05-01T23:55:00.000-07:00</published><updated>2008-05-11T09:49:01.874-07:00</updated><title type='text'>Using Ultrasonic Hand Tracking to Augment Motion Analysis Based Recognition of Manipulative Gestures</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;Instead of using data gloves or accelerometers alone, this paper presents a novel hand gesture capture method that adopts ultrasonic sensors. Because using only ultrasonic sensors raises issues such as reflections and occlusions, the authors combine the ultrasonic sensors with accelerometer and gyroscope sensors.&lt;br /&gt;&lt;br /&gt;The authors tried different classification techniques: model based classification, frame based classification and fused classification.&lt;br /&gt;&lt;br /&gt;For the model based classification, a set of left-right HMMs is trained to recognize gestures. For frame based classification, they tried a C4.5 decision tree classifier and a k-nearest neighbor classifier. For the fused classification, the final classification is based on a combination of the above two classifiers' rankings and the associated probabilities.&lt;br /&gt;&lt;br /&gt;To validate their method of fusing inertial and ultrasonic sensor data, they set up an experiment comprising various manipulative gestures based on a bicycle repair task. 
In this experiment, three kinds of sensors are used: (a) ultrasonic sensors for distance measurement, (b) acceleration sensors and (c) gyroscopes, the latter two to capture the motion of relevant body parts of the user. Three users were asked to perform 21 bicycle repair tasks three different times. The results showed that frame-based classification with the accelerometer and gyroscope data produced an 84-percent accuracy. A model-based time series classification approach gave 65-percent accuracy. Fusing the classifiers improved the classification accuracy, and using the ultrasonic sensor data resulted in 90% accuracy.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I like this paper since it is the first paper I have read that uses ultrasonic sensors to help collect hand gesture data. I know there was a Siggraph paper last year about using ultrasonic sensors to capture human body motion. I think this paper is pretty novel since the authors say they are the first to use ultrasonic sensors to classify gestures. But I don't know if this is the first paper to use ultrasonic sensors to capture motion data. 
This paper also pointed out the problems of using ultrasonic sensors alone, which somewhat limit the applications of ultrasonic sensors.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8133318108984030085?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8133318108984030085/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8133318108984030085' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8133318108984030085'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8133318108984030085'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/using-ultrasonic-hand-tracking-to.html' title='Using Ultrasonic Hand Tracking to Augment Motion Analysis Based Recognition of Manipulative Gestures'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-667736297890984734</id><published>2008-05-01T22:35:00.000-07:00</published><updated>2008-05-11T09:49:16.506-07:00</updated><title type='text'>Enabling Fast and Effortless Customisation in Accelerometer Based Gesture Interaction</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;The purpose of this paper is to make the training of HMMs easier and more efficient. The most time-consuming part of HMM training may be data collection. 
To get high accuracy, the training data also need to be segmented, which is very expensive since there is no automatic and practical segmentation method so far. This paper tries to alleviate this problem by adding noise to the captured data to generate a larger set of training data.&lt;br /&gt;&lt;br /&gt;In this paper, accelerometers are used as the gesture capturing device. They use a vector quantized codebook of size 8 and then perform recognition using HMMs. They experimented with different types of noise: uniformly distributed noise and Gaussian distributed noise.&lt;br /&gt;&lt;br /&gt;For the experiments, the authors tested 8 gestures used to control a DVD player. For this set of eight gestures, each trained with two original gestures and two Gaussian noise-distorted duplicates, the average recognition accuracy was 97%; with two original gestures and four noise-distorted duplicates, the average recognition accuracy was 98%, cross-validated from a total data set of 240 gestures. They also found that Gaussian distributed noise is slightly better than uniformly distributed noise.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I think it was a good idea to add some noise to enlarge the training data set. This work's comparison between different kinds of noise also gives us some experimental evidence for this idea.&lt;br /&gt;&lt;br /&gt;However, there are a few things that need to be considered. 
For example, how to determine the parameters of the noise efficiently, how to determine the number of noise-distorted samples that should be added, and whether the added noise could decrease the recognition accuracy for some other gestures.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-667736297890984734?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/667736297890984734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=667736297890984734' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/667736297890984734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/667736297890984734'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/enabling-fast-and-effortless.html' title='Enabling Fast and Effortless Customisation in Accelerometer Based Gesture Interaction'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-2983095489684215261</id><published>2008-05-01T21:54:00.000-07:00</published><updated>2008-05-11T09:49:25.110-07:00</updated><title type='text'>3D Visual Detection of Correct NGT Sign Production</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;In this paper, the authors create a system that can help people learn Dutch Sign Language (Nederlandse Gebarentaal, NGT). 
The recognition system is vision-based. Two calibrated video cameras are set above a table where people perform their hand gestures. The user's head and hands are tracked by following skin-colored segments of the image from frame to frame. The head is used as a stationary reference point. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;The adaptive chrominance model can work with different lighting and backgrounds. Skin color is modeled by a 2D Gaussian perpendicular to the main direction of the distribution of the positive skin samples in RGB space. Tracking the hands and head is done separately in both cameras by following their respective blobs over consecutive frames or, when hand blobs cannot be separated (due to occlusion), by performing a template search over skin areas using the gray image of the hand in the previous frame. The hand and head locations are reinitialized using the three largest skin blobs in the image and tracked by finding the nearest blob or best template match in the next frame.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;For classification, fifty different properties have been derived that are related to the 2D/3D location and movement of the hands. These properties are measured in each frame, and a separate classifier is trained for each property.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;A set of 120 different NGT signs performed by 70 individuals is used to test the sign classification. They also perform cross validation, and the overall recognition accuracy is 95%. They compare their results with linear time warping and dynamic time warping. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;One problem of this vision-based recognition system is that it cannot recognize hand-crossing gestures correctly, since it simply treats the left blob as the left hand and the right blob as the right hand. Another problem, which is common to vision-based tracking, is occlusion. 
Because they put the video cameras very close to each other (15cm), both pointing at the hands from a similar direction, it is hard to avoid some occluded parts in the video.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Another thing is that I really don't think training each feature separately is a good idea, since different features are related rather than independent. In addition, 50 features may be too many, so why didn't they consider using some feature selection techniques?&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-2983095489684215261?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/2983095489684215261/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=2983095489684215261' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2983095489684215261'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2983095489684215261'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/3d-visual-detection-of-correct-ngt-sign.html' title='3D Visual Detection of Correct NGT Sign Production'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-2290537313849828910</id><published>2008-05-01T21:01:00.000-07:00</published><updated>2008-05-11T09:49:46.415-07:00</updated><title type='text'>Gesture 
Recognition Using an Acceleration Sensor and Its Application to Musical Performance Control</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents a hand gesture recognition system that is applied to musical performance control. The gestures are captured with a 3D accelerometer. The authors extract a new set of feature parameters from the 3D acceleration instead of using the acceleration directly. The new feature space is still 3-dimensional, and it consists of different combinations of 2D projections.&lt;br /&gt;&lt;br /&gt;In order to recognize a gesture from the acceleration time series, the start of the gesture must be detected. They used a user-dependent magnitude of acceleration to identify the start of gestures. Musical performance control also needs recognition of the musical tempo, which is done by simply looking at the y-z accelerations. In addition, the rhythm points can be identified in real time by detecting the maxima that appear most periodically.&lt;br /&gt;&lt;br /&gt;In this paper, they also tested the musical performance control with 10 gestures. They achieved an accuracy of 100% for user-dependent training, and 70%~100% accuracy for user-independent training.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I think this is an interesting paper in terms of its application of hand gesture recognition. A big challenge in recognizing sequential gestures is segmenting the data streams. This paper did this by detecting the starting point of a gesture. In their experiments, the recognition performance is pretty good. But I guess that is probably because the starting points of the testing gestures are easy to detect. 
I don't believe there is a simple and practical method so far for segmenting and recognizing complicated gestures.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-2290537313849828910?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/2290537313849828910/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=2290537313849828910' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2290537313849828910'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2290537313849828910'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/gesture-recognition-using-acceleration.html' title='Gesture Recognition Using an Acceleration Sensor and Its Application to Musical Performance Control'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8943312596667076119</id><published>2008-05-01T13:14:00.001-07:00</published><updated>2008-05-11T09:49:55.449-07:00</updated><title type='text'>Hand gesture modelling and recognition involving changing shapes and trajectories, using a Predictive EigenTracker</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents an approach to recognize hand gestures in video. 
Both hand shape and hand position information are used in the recognition process. They call their method the "Predictive EigenTracker". In addition, this method allows users to choose a gesture vocabulary so as to maximize recognition accuracy.&lt;br /&gt;&lt;br /&gt;Basically, they first employ a particle filtering (Condensation) predictive framework to track the hand. The dynamic model used in this tracker is a second-order Markov chain with noise. The tracker is initialized by detecting the skin color of the hand.&lt;br /&gt;&lt;br /&gt;After the hand position is tracked in the video, a shape-trajectory eigenspace is modeled by principal component analysis. The Mahalanobis distances between gestures are then computed to help users select a gesture set with the highest accuracy.&lt;br /&gt;&lt;br /&gt;To demonstrate the performance, they show an application of their tracker and recognizer: controlling an audio player with hand gestures. This application achieved 100% accuracy.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I think this system will not be robust. The only result shown in this paper is for a set of very simple gestures against a background of a completely different color, with the performer wearing a black shirt, which makes the tracking problem very easy. Moreover, this system cannot work on a larger set of gestures. 
The paper only shows a set of 8 gestures.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8943312596667076119?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8943312596667076119/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8943312596667076119' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8943312596667076119'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8943312596667076119'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/hand-gesture-modelling-and-recognition.html' title='Hand gesture modelling and recognition involving changing shapes and trajectories, using a Predictive EigenTracker'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8530066161324403499</id><published>2008-05-01T13:10:00.002-07:00</published><updated>2008-05-11T09:50:06.190-07:00</updated><title type='text'>Television Control by Hand Gestures</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents a vision-based approach for controlling a television using hand gestures. To make the gesture-based interface easier to use, the authors use visual feedback on the television display. 
This also avoids the problem of the user having to remember many complicated hand gestures. The user uses only one gesture: the open hand, facing the camera. He controls the television by moving his hand. On the display, a hand icon appears which follows the user's hand. The user can then move his own hand to adjust various graphical controls with the hand icon.&lt;br /&gt;&lt;br /&gt;The open hand presents a characteristic image which the computer can detect and track. They perform a normalized correlation of a template hand with the image to analyze the user's hand. A local orientation representation is used to achieve some robustness to lighting variations.&lt;br /&gt;&lt;br /&gt;They built a real-time prototype that recognizes hand gestures on a computer and then translates them into remote control signals to control a TV. They also demonstrate that there is a tradeoff between system response time and field of view.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I have to say that this television-control application is not that useful, since it is so easy to control a television with just a remote control. To make this work, an extra camera must accompany the television, increasing its cost. In addition, it is obvious that performing hand gestures to control a TV is much more tiring. 
More importantly, it is hard to guarantee that the recognition process is robust enough for this kind of control.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8530066161324403499?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8530066161324403499/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8530066161324403499' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8530066161324403499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8530066161324403499'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/television-control-by-hand-gestures.html' title='Television Control by Hand Gestures'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-3686474740045426530</id><published>2008-05-01T13:10:00.001-07:00</published><updated>2008-05-11T09:50:15.415-07:00</updated><title type='text'>A Method for Recognizing a Sequence of Sign Language Words Represented in a Japanese Sign Language Sentence</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper presents a recognition method for sequential Japanese sign language (JSL) words as part of automatic JSL interpretation. 
They extend previous work on sequential sign language recognition with the following techniques: (1) a method to detect the borders of signed words from ordinary sign-language gestures, (2) a method to detect whether the signed gesture is represented by one hand or both hands, and (3) a method for segregating the segments representing the signed words from the transitional gesture segments.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;A glove-based input device (CyberGlove) is used as input. The word-recognition step identifies each signed word represented in an inputted gesture. The signed words are described by combining gesture primitives, such as hand shape, palm direction, linear motion, and circular motion. During the recognition process, the gesture primitives are identified from the inputted gesture, then the signed word is recognized from the temporal and spatial relationships between the gesture primitives. The gesture-segmentation step detects the borders of the signed words and divides the gestures into several segments representing words or transitions. The hand-determination step determines whether the gesture in each segment is represented by one or both hands. The word-transition distinction step differentiates between the gestures that represent words and those that represent transitions. The word-allocation step analyzes the relationship between the recognized signed words and the segments, then assigns the words to the segments. Finally, the sequence-generation step combines the recognized signed words and generates a sequence of words.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;They collected 200 samples of JSL sentences. The samples included 960 words. Among them, 100 sentences were used to determine the parameters, and the other 100 sentences were used to evaluate the methods. The results show that word accuracy improved from 77.6% to 86.6% and sentence accuracy improved from 46.0% to 58.0% with the developed methods. 
&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;As an extension of previous work, this paper adds three techniques that help improve recognition on sequential gesture data. These three techniques are intuitive and reasonable additions to the system, so the recognition accuracies increased. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;However, the recognition performance (58%) for a signed sentence is far from adequate for a practical system. The authors acknowledge this and suggest that the problem might be solved by improving recognition accuracy for signed words and by developing a method to recognize non-manual gestures such as nods, glances, and facial expressions, which are used to convey grammatical information in sign language.&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-3686474740045426530?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/3686474740045426530/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=3686474740045426530' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3686474740045426530'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3686474740045426530'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/method-for-recognizing-sequence-of-sign.html' title='A Method for Recognizing a Sequence of Sign Language Words Represented in a Japanese Sign Language Sentence'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6304111068347161460</id><published>2008-05-01T13:09:00.004-07:00</published><updated>2008-05-11T09:50:27.877-07:00</updated><title type='text'>American Sign Language Recognition in Game Development for Deaf Children</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents an American Sign Language (ASL) game, CopyCat, that helps deaf children learn and practice ASL. The system recognizes hand postures performed by children and uses the recognized signs to control an animated character. The database of signing samples was collected from user studies of deaf children playing a Wizard of Oz version of the game at the Atlanta Area School for the Deaf (AASD). The dataset consisted of 541 phrase samples and 1,959 individual sign samples of five children signing game phrases from a 22-word vocabulary.&lt;br /&gt;&lt;br /&gt;The children wear small colored gloves with wireless accelerometers mounted on the back of their wrists. The hand shape information is captured from three-axis accelerometer data and a computer-vision-based method. The collected data are used as features to train hidden Markov models (HMMs) for recognition.&lt;br /&gt;&lt;br /&gt;Their recognition approach uses color histogram adaptation for robust hand segmentation and tracking. The vision data are combined with the (x, y, z) values from each accelerometer. This feature vector is then fed into a 4-state, left-right HMM that was implemented with the Georgia Tech Gesture Toolkit.&lt;br /&gt;&lt;br /&gt;They evaluated their approach using leave–one–out validation; this technique iterates through each child, training on data from four children and testing on the remaining child's data. 
They achieved average word accuracies per child ranging from 91.75% to 73.73% for the user–independent models, while the average sentence accuracies are very low: 68% for user-dependent models and 50% for user-independent models.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;I like this paper's idea of using a game to help deaf children learn ASL. I think it is a very good application of hand gesture recognition, and they have relatively complete user-study results.&lt;br /&gt;&lt;br /&gt;Their implementation is based on the Georgia Tech Gesture Toolkit, and the word recognition accuracies are good while the sentence recognition accuracies are poor (50% for the user-independent model). This points to a problem with hidden Markov model based recognition: its performance is not good enough on unsegmented data.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6304111068347161460?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6304111068347161460/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6304111068347161460' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6304111068347161460'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6304111068347161460'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/american-sign-language-recognition-in.html' title='American Sign Language Recognition in Game Development for Deaf Children'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-7676702015507273913</id><published>2008-05-01T13:09:00.003-07:00</published><updated>2008-05-11T09:50:37.885-07:00</updated><title type='text'>A Hidden Markov Model Based Sensor Fusion Approach for Recognizing Continuous Human Grasping Sequences</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper uses hidden Markov models (HMMs) to recognize human grasping motions. The system is part of Programming by Demonstration (PbD), which aims at teaching a robot to accomplish a task by learning from a human demonstration. In this paper, they want robots to 'understand' what a human grasp 'means'.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;To capture the human grasps, they use an 18-sensor CyberGlove and 16 pressure-sensitive sensors. They chose these sensors because computer-vision-based methods usually can neither handle occlusion nor detect contact points between human hands and objects. In addition, the 16 pressure-sensitive sensors give force information rather than just 'touching or not touching'.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The authors classify grasps according to manipulation primitives defined by the Kamakura taxonomy. This taxonomy distinguishes 14 different grasp types: 5 power grasps, 4 intermediate grasps, 4 precision grasps, and 1 thumbless grasp. They chose this taxonomy because it places no restrictions on the handled objects or domain, and because it focuses more on the hand shape and fingers involved than on the purpose of the grasp.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;They learn an HMM for each of the 14 grasp primitives and use the outputs from the CyberGlove and pressure sensors as data features. Another HMM is created to represent the rest motion. 
Besides these, a 'garbage model' is created to model other unwanted motions. Each HMM has 9 states and a flat topology.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The training is based on 112 example motions. They also test the HMMs on another 112 test examples. An accuracy of up to 92.2% for a single-user system and 90.9% for a multiple-user system could be achieved.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Grasping motion recognition is a hard problem, since extra contact information must be modeled besides the gestures themselves. Even for a single object, there can be a lot of different grasps. Grasp recognition can also be part of an augmented reality interface. Most hand gesture applications focus only on the gesture itself without considering the interaction between the human and the environment. Those hand gesture recognition systems are more like a 'commanding system'. But one can easily imagine that an actual augmented reality interface should make the interaction between the human and the visual environment as real as possible, so grasping motion is an essential topic.&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-7676702015507273913?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/7676702015507273913/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=7676702015507273913' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/7676702015507273913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/7676702015507273913'/><link rel='alternate' 
type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/hidden-markov-model-based-sensor-fusion.html' title='A Hidden Markov Model Based Sensor Fusion Approach for Recognizing Continuous Human Grasping Sequences'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6815010716038325746</id><published>2008-05-01T13:09:00.001-07:00</published><updated>2008-05-11T09:50:54.226-07:00</updated><title type='text'>Computer Vision-Based Gesture Recognition for an Augmented Reality Interface</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper presents a computer vision-based gesture recognition system as part of an augmented reality (AR) system. It can recognize a 3D pointing gesture, a click gesture, and five static gestures. Each of the five static gestures has one or several fingers outstretched, e.g., the third gesture has three fingers outstretched. The authors chose these gestures because they consider them the minimum required for an AR interface and because they are easy to recognize.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The task of the low-level segmentation is to detect and recognise the PHO and pointers, as well as hands, in the 2D images captured with the HMC. They use normalised RGB, also called chromaticities, to achieve invariance to intensity; these are calculated by dividing the RGB elements by their first norm. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;After segmenting the hand pixels from the image, they first detect the number of outstretched fingers to recognize the five static gestures and then handle the point and click gestures. 
They simply count the number of "rectangles" which correspond to the fingers. The "point and click" gesture is recognized simply by being in the state where a single finger is extended, and the user "clicks" by quickly extending and retracting the thumb. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The biggest problem with this paper is that they didn't really provide any recognition results, even for this very simple gesture set. They only claimed their recognition was "sufficient" to be used in an AR interface.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Their recognition method is also totally heuristic, and it is impossible to use it on a larger gesture set. Most of the effort actually goes into the computer-vision-based segmentation.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6815010716038325746?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6815010716038325746/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6815010716038325746' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6815010716038325746'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6815010716038325746'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/computer-vision-based-gesture.html' title='Computer Vision-Based Gesture Recognition for an Augmented Reality Interface'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-7969262224106829522</id><published>2008-05-01T13:08:00.001-07:00</published><updated>2008-05-11T09:51:04.923-07:00</updated><title type='text'>A Spatio-temporal Extension to Isomap Nonlinear Dimension Reduction</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;This paper extends the standard Isomap algorithm to data with both spatial and temporal relationships. Basically, an embedding space is learned from the spatio-temporal data, and similar gestures are placed close together. Two instantiations of ST-Isomap are presented, for sequentially continuous and for segmented data. Continuous ST-Isomap is suited for uncovering spatio-temporal manifolds of data exhibiting temporal coherence, where sequentially adjacent samples are incrementally different. Segmented ST-Isomap is suited for uncovering spatio-temporal clusters in segmented data, where the input data is prepartitioned.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;They added two techniques to the standard Isomap: proximal disambiguation of spatially proximal data points in the input space that are structurally different, and distal correspondence of spatially distal data points in the input space that share common structure. With these two extra steps, the algorithm can distinguish behaviors (e.g., "wave left" and "wave right") that should be separated to distal locations in the embedding space, and can also bring together similar gestures (e.g., "low wave" and "high wave") that should be placed in proximity in the resulting embedding.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;They tested ST-Isomap on Robonaut sensor data and human motion data. For the Robonaut sensor data, they applied continuous ST-Isomap. The data are 57-dimensional grasping motions. 
They compare the embeddings from PCA and ST-Isomap, and ST-Isomap gives the more reasonable embedding. For the human motions, ST-Isomap also gives an embedding with obvious structure. These tests suggest that ST-Isomap is a better dimensionality reduction algorithm for spatio-temporal data.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;I selected this paper because I thought it would give some hints on how to deal with the temporal information embedded in hand gestures. We have already seen some papers using PCA, but PCA is not really suitable for hand gestures, which are not static. Isomap is superior to PCA, so I thought this extension of standard Isomap would help.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;In their results, I did see some advantages that other dimensionality reduction algorithms cannot obtain. After the dimensionality reduction, one can perform recognition in the low-dimensional space more efficiently and easily, since the dimensionality reduction itself clusters similar hand gestures.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;However, ST-Isomap has some obvious disadvantages. For example, there are two very important parameters (C_CTN and C_ATN) that are hard to specify. 
These two parameters actually balance spatial similarity and temporal similarity.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-7969262224106829522?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/7969262224106829522/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=7969262224106829522' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/7969262224106829522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/7969262224106829522'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/spatio-temporal-extension-to-isomap.html' title='A Spatio-temporal Extension to Isomap Nonlinear Dimension Reduction'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6140274465603831814</id><published>2008-05-01T13:07:00.000-07:00</published><updated>2008-05-11T09:51:13.505-07:00</updated><title type='text'>Articulated Hand Tracking by PCA-ICA Approach</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper uses PCA and ICA to model hand gestures. PCA can only be used to represent the global features, so they use ICA (independent component analysis) to represent local features. To build the model, they first capture different hand gestures with a data glove. 
Each frame contains 20 degrees of freedom, and they resample 100 frames to get a 2000-dimensional vector. As in earlier approaches, PCA is performed on these 2000-dimensional vectors to reduce their dimensionality. They choose the first 5 principal components, which preserve 95% of the data energy. Then they perform ICA on each individual principal component in the PCA subspace.&lt;br /&gt;&lt;br /&gt;They tested the learned PCA-ICA model by tracking a hand in video. In this experiment, they use particle filtering to perform the tracking, and the learned model is used as the dynamic model in the particle filter.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;The main idea is that the authors tried to build a model to represent hand gestures. This might be useful, and the learned model might be used to recognize hand gestures, synthesize new hand motions, or perform video compression.&lt;br /&gt;&lt;br /&gt;However, I am not convinced by the single vision-based tracking experiment, since I think the tracking result would be good even without the learned PCA-ICA model. 
They probably could do some cross validation to show that the learned PCA-ICA model actually can cover the hand gesture space.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6140274465603831814?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6140274465603831814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6140274465603831814' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6140274465603831814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6140274465603831814'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/05/articulated-hand-tracking-by-pca-ica.html' title='Articulated Hand Tracking by PCA-ICA Approach'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-5726013651728711045</id><published>2008-04-28T11:37:00.002-07:00</published><updated>2008-05-11T09:51:24.384-07:00</updated><title type='text'>Toward Natural Gesture Speech HCI A Case Study of Weather Narration</title><content type='html'>&lt;p&gt;&lt;br /&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper discusses a case study of weather narration to analyze co-occurrence of different gestures with some spoken keywords. 
It aims to study the interaction between speech and gesture, to demonstrate the power of a gesture- and speech-based HCI, and to show that speech can help the system increase its recognition accuracy for gestures.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The authors use an HMM to recognize a weatherman's gestures. A left-to-right HMM with three main phases is built. The three phases are a preparation phase, a retraction phase, and an actual stroke phase. The actual stroke phase includes four kinds of gestures: pointing, area, contour, and rest. They choose a 10-dimensional vector as the gesture feature set, including the distances between the centers of the hands and the center of the face, the angles between the vertical and the distance vectors, and the velocities of these parameters. The model was trained on a set of 20 well-formed isolated gesture samples and tested on 12 isolated gesture samples and 4 continuous sequences.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;A co-occurrence analysis is performed on certain keywords, such as "here", location-related words, and direction-related words. The results show that gestures and speech are closely correlated. For example, "here" occurred during a gesture phase with a probability of 83%.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;They then use speech to improve the performance of the gesture recognition. Of the four test sequences, three show an increase in recognition accuracy of around 10% with the help of spoken keywords.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;I like this paper's idea, even though it's just a case study. Their co-occurrence results for gestures and speech can help people, to some degree, to design a real gesture/speech-based HCI, although they didn't come up with principles or rules for the design. 
But for the case study itself, I think it would be better if they tested it on more sequences, since four test sequences might not be very convincing.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-5726013651728711045?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/5726013651728711045/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=5726013651728711045' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/5726013651728711045'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/5726013651728711045'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/toward-natural-gesture-speech-hci-case.html' title='Toward Natural Gesture Speech HCI A Case Study of Weather Narration'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6142416240243060579</id><published>2008-04-28T11:37:00.001-07:00</published><updated>2008-05-11T09:51:33.531-07:00</updated><title type='text'>Georgia Tech Gesture Toolkit: Supporting Experiments in Gesture Recognition</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper introduces an HMM-based gesture recognition library called the Georgia Tech Gesture Toolkit (GT2K), which leverages Cambridge University's speech
recognition toolkit, HTK. It abstracts the lower-level details of the HMM process and allows users to focus instead on high-level gesture recognition concepts. Georgia Tech Gesture Toolkit provides users with tools for preparation, training, validation and recognition. It also provides tools allowing novice users to automatically generate models with different topologies.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;All the data put into this library must be annotated by the user, and each gesture is modeled using a separate HMM. In addition, GT2K accepts a rule-based or stochastic grammar to make use of knowledge about the structure of the data. It provides two kinds of training/validation techniques: cross-validation and leave-one-out validation. &lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper also shows four applications of GT2K. The first one is "Gesture Panel", which provides gesture recognition in automobiles to let users control a radio. It employs a black and white camera and a grid of 72 infrared lights. It has a recognition accuracy of 99.20%. The second is for blink pattern recognition, named "Prescott". It aims to use a "blinkprint" as a way to identify people in a restricted area. The next system, "TeleSign", is a sign language recognition system for mobile environments. It achieved an accuracy of 90.48%. The fourth application is recognizing human activities, such as sawing, hammering, drilling, etc., in a workshop. It achieved an accuracy of 93.33%.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;I welcome this kind of library since it hides the low-level details of HMM design, which makes it easy to build prototype systems. I also like the idea of specifying grammars to define data structures, thus helping recognize data streams. However, since it uses HMMs, GT2K may suffer from the same problems as HMMs do.
For example, it may not be able to deal with data sets that have a large number of categories, and may have low accuracy on unsegmented data.&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6142416240243060579?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6142416240243060579/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6142416240243060579' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6142416240243060579'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6142416240243060579'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/georgia-tech-gesture-toolkit-supporting.html' title='Georgia Tech Gesture Toolkit: Supporting Experiments in Gesture Recognition'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-2786561165003029204</id><published>2008-04-28T11:35:00.002-07:00</published><updated>2008-05-11T09:51:41.763-07:00</updated><title type='text'>A Survey of Hand Posture and Gesture Recognition Techniques and Technology</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper summarizes existing common algorithms for hand posture and gesture recognition, and discusses possible applications.
The author classifies the techniques into three categories: Feature extraction, statistics and models; Learning algorithms; and Miscellaneous techniques.&lt;br /&gt;&lt;br /&gt;Template matching can be used in both glove-based and vision-based solutions. It has two parts. The first is to create the templates by collecting data values for each posture in the posture set. The second part is to compare the current sensor readings with the given set of templates to find the posture template most closely matching the current data record. Template matching is simple to implement and accurate for a small set of postures, but it is not suited for hand gestures. Feature extraction analyzes low-level information from the raw data to produce higher-level semantic information, which is then used to recognize postures and gestures. It can recognize both postures and gestures. Performing PCA on images can be used to recognize 25 to 35 kinds of postures, but it requires training by more than one person for accurate results. Neural networks can be used to recognize relatively larger posture and gesture data sets. The data could come either from a data glove or from images. They require adequate training to get high accuracy. One disadvantage of neural networks is that it is hard to determine which configuration is best without implementing them. Hidden Markov Models are trained separately for each gesture class, and then their probabilities are evaluated for each new gesture. They can be used in either a vision-based or glove-based solution, and can recognize relatively larger data sets. Instance-based learning uses k-nearest neighbors to determine the category of a new posture.
It requires more time and memory space, and only works on postures.&lt;br /&gt;&lt;br /&gt;Then the author discusses possible applications: Sign Language, Gesture-to-Speech, Presentations, Virtual Environments, Television Control, 3D Modeling, Multimodal Interaction and Human/Robot Manipulation and Instruction.&lt;br /&gt;&lt;br /&gt;[Discussion]&lt;br /&gt;&lt;br /&gt;First, I think it would be better if the author separated gestures and postures into two parts, because they are really very different in terms of data and algorithms. For example, HMMs can only be used to recognize time-series data, rather than static postures. Second, the accuracy figures the author gives for each algorithm didn't really tell me which algorithm is better, since all the tests are based on different data sets and different classes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-2786561165003029204?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/2786561165003029204/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=2786561165003029204' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2786561165003029204'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2786561165003029204'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/survey-of-hand-posture-and-gesture.html' title='A Survey of Hand Posture and Gesture Recognition Techniques and Technology'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-8444339753465828411</id><published>2008-04-28T11:35:00.001-07:00</published><updated>2008-05-11T09:51:48.469-07:00</updated><title type='text'>A Dynamic Gesture Interface for Virtual Environments Based on Hidden Markov Models</title><content type='html'>&lt;p&gt;[Summary]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper adopts Hidden Markov Models to recognize continuous dynamic gestures. The gesture data are collected by a CyberGlove providing 20 degrees of freedom. They reduce a gesture represented by a 20 x t matrix to a 20 x 1 vector by computing the standard deviation of each dimension over time. They claim this representation will help solve the spotting problem, which is the task of segmenting meaningful gesture patterns from non-gesture parts in a continuous sequence of hand motions.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Three simple gestures are used to control the rotation of a cube. Each type of gesture is trained with one HMM that has 20 hidden states. But no testing accuracy is given in this paper.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;[Discussion]&lt;/p&gt;&lt;p&gt;&lt;br /&gt;First, HMMs are designed for time-series, rather than static, data. If the standard deviation can separate the data set, why not just use a very simple linear classifier for this recognition task? Second, their experiment said almost nothing about their method.
There are no test results, and the test data set is very simple (only three types).&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-8444339753465828411?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/8444339753465828411/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=8444339753465828411' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8444339753465828411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/8444339753465828411'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/dynamic-gesture-interface-for-virtual.html' title='A Dynamic Gesture Interface for Virtual Environments Based on Hidden Markov Models'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-453291718524769926</id><published>2008-04-21T21:46:00.001-07:00</published><updated>2008-05-11T09:51:59.545-07:00</updated><title type='text'>Real-time Locomotion Control by Sensing Gloves</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This paper proposes an intuitive character control method using data gloves. Users are allowed to control humans or animals (e.g., dogs) in real-time. A P5 glove is used to collect hand gestures.
Then the collected hand gestures are mapped to the locomotion of 3D characters at runtime via a mapping function.&lt;br /&gt;&lt;br /&gt;The proposed method can be divided into the calibration stage and the control stage. In the calibration stage, the mapping function that defines the relationship between the motion of the fingers and the character is generated by mimicking the reference character motions using the hand. In the control stage, the player performs a new movement with the hand to generate a new motion.&lt;br /&gt;&lt;br /&gt;They tested this system with two different characters: the human and the dog. A human walking and a dog trotting were used as reference motions to build the mapping functions. After calibration, the user can perform new motions by moving the index and middle fingers. In addition, the hopping motion is tested in a 3D environment which requires the player to control a robot to jump over obstacles.&lt;br /&gt;&lt;br /&gt;Four users performed the same task of controlling a human character to run through the maze and reach the goal as quickly as possible without hitting the walls and obstacles. The results showed that the average time needed to accomplish the task was longer and the number of collisions was lower when using the data glove.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;First, I'd like to say that this hand gesture-based character control is very intuitive. This reminded me that I used this kind of gesture to mimic character motion when I was a child.&lt;br /&gt;&lt;br /&gt;However, it would have very limited controllability, since our hands have far fewer joints and degrees of freedom than the full human body. And that is why it can only generate some coarse locomotions.
In other words, it may generate walking, running and hopping with different step sizes and speeds, but there's no way for this interface to produce more detailed motions, such as upper-body motions.&lt;br /&gt;&lt;br /&gt;Also, I think a keyboard would be a better interface for controlling game characters since this data glove interface would be really tiring to use. Moreover, there would be another issue, which is how to map the physical space to the virtual space.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-453291718524769926?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/453291718524769926/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=453291718524769926' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/453291718524769926'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/453291718524769926'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/real-time-locomotion-control-by-sensing.html' title='Real-time Locomotion Control by Sensing Gloves'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-5113878472235645912</id><published>2008-04-21T10:48:00.001-07:00</published><updated>2008-05-11T09:52:07.539-07:00</updated><title type='text'>Wiizards 3D Gesture Recognition for
Game Play Input</title><content type='html'>&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This paper presents a two-player zero-sum game called Wiizards, using real-time user-performed gestures as input. The game allows users to cast spells of three kinds: Actions, Modifiers and Blockers. The gestures are captured by a Wiimote's accelerometers. A sequence of 3-dimensional accelerometer vectors is collected when the user performs gestures.&lt;br /&gt;&lt;br /&gt;They then feed the collected data into a set of learned Hidden Markov Models to compute the probability of the input data under each HMM, thus classifying the gesture by comparing the probability scores. They gathered training data from 7 different users. Each user was presented with images of the gestures from the game, and performed each gesture over 40 times. The HMMs were created with the data from all of the users. A recognition rate of over 90% was achieved with ten states, and a 93% recognition rate with 15 states. But the learned HMMs can only get 50% accuracy when tested on gestures collected from users who were not among the training performers.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;First, I am wondering how they did the segmentation on a stream of gesture data. Is there any button used to separate them when performing gestures? Second, I don't know how novel this paper was, since HMM-based hand gesture recognition has been widely used.
But, at least, I would like to say this specific game application is sort of interesting.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-5113878472235645912?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/5113878472235645912/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=5113878472235645912' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/5113878472235645912'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/5113878472235645912'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/wiizards-3d-gesture-recognition-for.html' title='Wiizards 3D Gesture Recognition for Game Play Input'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-7835666296007272167</id><published>2008-04-21T00:38:00.001-07:00</published><updated>2008-05-11T09:52:34.083-07:00</updated><title type='text'>The 3D Tractus: A Three-Dimensional Drawing Board</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper tries to build a 3D-drawing system to let users draw 3D curves or models directly in the physical space. The drawing system consists of two components: a tablet PC and a table that can be moved up and down.
The user is allowed to draw with a pen on the tablet PC. The depth of the drawing is controlled by the height of the moving table, which has a height sensor to measure the actual physical depth. On the software side, an overview window is provided to let users know what they've drawn.&lt;br /&gt;&lt;br /&gt;They also tried different visual cues to show the drawn curves: the gray scale intensity and the color scale intensity. But, in the end, they chose not to show the parts above the interaction surface. They also considered different projections: orthographic and perspective. They chose perspective projection. A deletion tool is also provided to users.&lt;br /&gt;&lt;br /&gt;Three people were tested on the 3D Tractus, and 4 drawing results are shown: gum package, Aibo Bone, game controller and stuffed animal.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;It's hard to imagine drawing on a surface while, at the same time, the surface is moved up and down. I don't think that would be easy to use, because it probably needs some "skills" to get the 3D drawing that you really want. Just imagine how you would draw a straight line that is neither perpendicular nor parallel to the drawing surface.
After taking a look at the drawing results, especially the 'stuffed animal', I don't think this system would be practical at all.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-7835666296007272167?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/7835666296007272167/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=7835666296007272167' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/7835666296007272167'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/7835666296007272167'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/3d-tractus-three-dimensional-drawing.html' title='The 3D Tractus: A Three-Dimensional Drawing Board'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6273546467474510989</id><published>2008-04-20T21:50:00.000-07:00</published><updated>2008-04-20T21:51:26.773-07:00</updated><title type='text'>Taiwan sign language (TSL) recognition based on 3D data and neural networks</title><content type='html'>&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This paper performs Taiwan sign language recognition using neural networks. Only 20 static postures are used as reference examples.
The training and testing data are collected by the Vicon system, which is an optical capture system with multiple cameras capturing the 3D positions of reflective markers attached to one of the performer's hands.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Fifteen geometric distances are adopted as the feature representation of different hand gestures. Examples include the distances between the fingertips and the palm, and the distances between fingertips. The posture data are collected from 10 students, each performing the 20 hand gestures 15 times. All the performed gestures start with gesture '0' and end with the assigned gesture.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;A back-propagation neural network was adopted in their recognition. It has 15 neurons in the input layer, 20 neurons in the output layer and two hidden layers. Their results showed that the recognition accuracy rose as the number of hidden neurons increased. They obtained the highest recognition accuracy of 94.65% when using 250 hidden neurons in each hidden layer. And they said their recognition algorithm was robust because the recognition accuracies on the testing data and on the training data are similar, 94.65% and 98.5% respectively.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I don't know how novel this paper is. All they did was put posture data into a neural network and test different numbers of hidden neurons. They did use a different feature set (geometric distances), but they didn't even say why they selected this feature set.
I guess it's only because this is the simplest coordinate-invariant feature set.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6273546467474510989?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6273546467474510989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6273546467474510989' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6273546467474510989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6273546467474510989'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/taiwan-sign-language-tsl-recognition.html' title='Taiwan sign language (TSL) recognition based on 3D data and neural networks'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-377308180695431106</id><published>2008-04-20T17:37:00.001-07:00</published><updated>2008-05-11T09:52:17.749-07:00</updated><title type='text'>Feature selection for grasp recognition from optical markers</title><content type='html'>[Summary]&lt;br /&gt;&lt;br /&gt;This paper presents a method to reduce the number of features for hand postures used in grasp recognition. They use a marker-based method to capture hand postures.
The number of markers is 30, and, after feature selection, this number is reduced to 5 while retaining at least 92% of the prediction accuracy of classifiers trained on the full feature set of thirty markers.&lt;br /&gt;&lt;br /&gt;All thirty markers are first translated to a local coordinate system attached to the back of the hand. The full number of dimensions is 90 without feature selection. The hand postures need to be classified into 6 kinds of grasps.&lt;br /&gt;&lt;br /&gt;They use a linear logistic regression classifier for evaluating candidate marker sets in supervised feature selection and then for predicting the grasp from the final trained model. For the supervised feature selection, the sequential wrapper algorithm is used to evaluate the addition or removal of a single feature at a time for locally-optimal feature selection. Two versions of this algorithm are tested, forward and backward. Their results showed that the full 30-marker set had an accuracy of 91.5%, but with only five markers the model could still correctly predict 86% of the grasp examples.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Instead of using an unsupervised dimensionality reduction method (e.g. PCA), this paper uses supervised dimensionality reduction to select a feature set. I think hand grasp is an interesting topic, although this paper may not be that good. Markers may be hard to attach to the hand, while a data glove is easier to use.
In addition, I would like to see what the result would be if there were more grasp classes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-377308180695431106?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/377308180695431106/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=377308180695431106' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/377308180695431106'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/377308180695431106'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/feature-selection-for-grasp-recognition.html' title='Feature selection for grasp recognition from optical markers'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-795330370100255611</id><published>2008-04-20T17:36:00.001-07:00</published><updated>2008-04-20T17:36:53.880-07:00</updated><title type='text'>Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes</title><content type='html'>&lt;p&gt; &lt;/p&gt;&lt;p&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;strong&gt;&lt;br /&gt;&lt;/strong&gt;This paper proposes a "$1 recognizer" for sketches, which is easy, cheap and only contains around 100 lines of code.
This recognizer can be applied to mouse gestures in web browsers, PDA interfaces and so on.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Basically, this $1 recognizer is a template-based recognizer. It has 4 steps: resampling the point path, rotating once based on the "indicative angle", scaling and translating, and finding the optimal angle for the best score. Because sketches may be drawn at different speeds, every sketch first needs to be resampled into N equidistantly spaced points in order to align templates and new inputs. To match the templates, a new input sketch is rotated based on the "indicative angle", the angle formed between the centroid of the gesture and the gesture's first point, and then rescaled to a reference square. Finally, a new input is compared with each template via the average distance between corresponding points, and a further optimal angle is found.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The authors compare their "$1 recognizer" with dynamic time warping and Rubine's recognizer. Their results on 16 different sketches indicate that the "$1 recognizer" has a similar recognition accuracy to DTW, but outperforms Rubine's classifier. And it's more efficient than DTW.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I like this paper, because they presented the detailed algorithm and gave relatively complete comparisons and analyses. An obvious limitation is that this recognizer cannot distinguish gestures that depend on specific orientations, aspect ratios, or locations. In addition, we have to note that their method is only suitable for simple sketches that can be distinguished by spatial information. For hand gestures that are more complicated and in which temporal information plays an important role, this method may not apply.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-795330370100255611?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/795330370100255611/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=795330370100255611' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/795330370100255611'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/795330370100255611'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/gestures-without-libraries-toolkits-or.html' title='Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-477554139485155922</id><published>2008-04-20T17:35:00.001-07:00</published><updated>2008-05-11T09:52:39.565-07:00</updated><title type='text'>RFID-enabled Target Tracking and Following with a Mobile Robot Using Direction Finding Antennas</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper builds an RFID-based target-following system that allows a robot to follow another mobile robot in real time using RFID and direction-finding antennas. 
Two perpendicular antennas are mounted on the following robot via a rotatable motor. The ratio between the signal strengths from the two antennas is used to adjust the direction of the antennas so that the ratio is always 1, which means the angle between the followed robot and the antennas is 45 degrees. The following robot then rotates accordingly.&lt;br /&gt;&lt;br /&gt;The strength of the RFID signal is related to the distance and the angle between the transponder and the antenna. Taking the ratio of the signals from the two antennas helps to remove signal offsets and the distance factor. This paper also shows the resulting signal ratios when obstacles are present. Finally, a target-following experiment is done to demonstrate the performance of the system.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;From this paper, I got some idea of how accurate RFID signals are. The signal itself seems very noisy, and it gets even worse when obstacles exist in the environment. Although it would work well for their target-following application, there is obviously no way to obtain real orientation data from this RFID system. 
So it's hard to apply it to hand gesture problems.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-477554139485155922?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/477554139485155922/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=477554139485155922' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/477554139485155922'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/477554139485155922'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/rfid-enabled-target-tracking-and.html' title='RFID-enabled Target Tracking and Following with a Mobile Robot Using Direction Finding Antennas'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-9083282366831966181</id><published>2008-04-20T17:33:00.000-07:00</published><updated>2008-04-20T17:34:45.612-07:00</updated><title type='text'>Activity Recognition using Visual Tracking and RFID</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper presents an activity recognition system combining visual tracking and radio frequency identification tags. 
The goal of this system is to handle interactions between humans and objects, because existing tracking techniques fail on small objects.&lt;br /&gt;&lt;br /&gt;The framework contains two tracking components: vision-based human tracking and RFID-based object tracking. The visual tracking part is just a standard particle filtering tracker using a skin-color-based likelihood model. It tracks the positions of a person's two hands and head. Two cameras are used to get their 3D positions.&lt;br /&gt;&lt;br /&gt;The other component is RFID-based object tracking, which is able to detect the presence, movement and orientation of RFID tags. Basically, an RFID tag serves as a small capacitor, containing an antenna and an IC chip. It can be charged through field coils (another antenna) and periodically switches off to emit data back to the field coils. Three antennas are used to estimate the orientation of RFID tags.&lt;br /&gt;&lt;br /&gt;These two components work together with an agent-based recognizer to identify high-level interactions between humans and objects by examining low-level activities detected by the visual tracker and the RFID tracker. Only one simple result is shown in this paper.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This is the first paper I have read about using RFID tags as tracking sensors. At a high level, I think it is a good idea. However, from the only example in the paper, I cannot estimate how accurate the RFID orientation information is. I also wonder how large the working area can be while still ensuring that the data emitted from the RFID tags is detected by the RFID reader. Another concern is that it is probably necessary to synchronize the two trackers for better results and other applications. 
If these concerns are not problems, I think RFID could be used in more areas, such as tracking, motion estimation and game interfaces.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-9083282366831966181?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/9083282366831966181/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=9083282366831966181' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/9083282366831966181'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/9083282366831966181'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/activity-recognition-using-visual.html' title='Activity Recognition using Visual Tracking and RFID'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-3016347720640646305</id><published>2008-04-20T17:32:00.000-07:00</published><updated>2008-04-20T17:33:10.525-07:00</updated><title type='text'>A Dynamic Gesture Recognition System for the Korean Sign Language (KSL)</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper presents a system that recognizes Korean Sign Language (KSL) and translates it into normal Korean text. 
A fuzzy min-max neural network is adopted for pattern recognition in real time.&lt;br /&gt;&lt;br /&gt;They use two VPL Data-Gloves, with a Polhemus sensor system attached to the back of each glove, which measures the x, y, z, yaw, pitch, and roll of the hand relative to a fixed source. So, 10 flex angles, 3 position values (x, y, z), and 3 orientation values (roll, pitch, yaw) are obtained from each of the two Data-Gloves.&lt;br /&gt;&lt;br /&gt;In this paper, they selected 25 reference gestures, which contain 10 basic direction types of motion patterns. In order to classify these motion patterns, they divided the x-axis and y-axis into 8 regions, so that each motion can be reduced to a smaller set of region data. Each hand posture is recognized by applying a Fuzzy Min-Max Neural Network. A fuzzy set hyperbox, a 10-dimensional box in their study, is defined by a min point (V) and a max point (W) with a corresponding membership function. The initial min-max values (V, W) of the network are determined from experimental data of many individuals, whose flex angles may vary over a wide range. They obtain a classification accuracy of around 85%.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;I am wondering why they include only 25 gestures while saying there are 31 basic gestures in KSL. Does that mean it didn't work well for the other 6 gestures? The classification accuracy was also relatively low. 
Another thing is that they didn't mention how they determined the number of nodes in each layer of the Fuzzy Min-Max Neural Network.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-3016347720640646305?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/3016347720640646305/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=3016347720640646305' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3016347720640646305'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/3016347720640646305'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/dynamic-gesture-recognition-system-for.html' title='A Dynamic Gesture Recognition System for the Korean Sign Language (KSL)'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-4230017379063806543</id><published>2008-04-20T17:31:00.000-07:00</published><updated>2008-04-20T17:32:09.509-07:00</updated><title type='text'>Shape Your Imagination: Iconic Gestural-Based Interaction</title><content type='html'>&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;/p&gt;&lt;strong&gt;&lt;/strong&gt;&lt;p&gt;&lt;br /&gt;This paper presents the results of a study on employing iconic hand gestures as a human-computer interaction technique. 
The goal of this study is to establish whether people use iconic hand gestures during the non-verbal communication of shapes and objects.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This study involved 12 subjects, 5 males and 7 females, made up of undergraduates, postgraduates, PhD students, and postdocs. 15 shapes and objects were tested, including 2D and 3D shapes, as follows: square, cube, cylinder, table-lamp, car, chair, table, house, football, French baguette, vase, pyramid, sphere, triangle, circle. These shapes and objects fell into two categories: 1) primitive and 2) complex and compound.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The results showed that, for the primitive group, subjects preferred virtual depiction over substitutive depiction, and preferred two-handed iconic hand gestures over one-handed ones. The same conclusions are drawn for the complex and compound group. Some pantomimic and body gestures are also used to accompany iconic hand gestures in the complex and compound group.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;This is a short and simple paper giving some user-study results on what kinds of gestures people choose to show shapes and objects. However, it lacks the necessary analysis of the results and, more importantly, it doesn't give any suggestion or method for defining a gesture set. 
It only says that "the study will be used to inform future work on iconic gestural-based interaction".&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-4230017379063806543?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/4230017379063806543/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=4230017379063806543' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/4230017379063806543'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/4230017379063806543'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/shape-your-imagination-iconic-gestural.html' title='Shape Your Imagination: Iconic Gestural-Based Interaction'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-632708461494423582</id><published>2008-04-20T17:30:00.000-07:00</published><updated>2008-04-20T17:31:20.766-07:00</updated><title type='text'>Gesture Recognition with a Wii Controller</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper uses the accelerometer data obtained from Wiimotes to recognize hand gestures. The recognizer is an HMM. 
They only tested their recognizer on five very simple gestures: 'Square', 'Circle', 'Roll', 'Z' and 'Tennis'.&lt;br /&gt;&lt;br /&gt;Basically, they first perform k-means-based vector quantization via a codebook so that the data can be fed into a single HMM. An HMM is initialized for every gesture and then optimized by the Baum-Welch algorithm. Left-to-right and ergodic HMMs are tested to determine which model better suits their needs. Their evaluation confirms that neither the number of states nor the concrete HMM instance influences the results very much. In the end, a left-to-right HMM with 8 states is chosen. They also need to filter the data before feeding it into the HMMs.&lt;br /&gt;&lt;br /&gt;The results for the five gestures were Square = 88.8%, Circle = 86.6%, Roll = 84.3%, Z = 94.3%, and Tennis = 94.5%.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;A bad paper. The reference gestures are too simple. Nothing is new; it just feeds accelerometer data into HMMs. 
Results are not very convincing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-632708461494423582?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/632708461494423582/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=632708461494423582' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/632708461494423582'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/632708461494423582'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/gesture-recognition-with-wii-controller.html' title='Gesture Recognition with a Wii Controller'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-9025736931826127295</id><published>2008-04-07T08:41:00.000-07:00</published><updated>2008-04-07T09:22:21.248-07:00</updated><title type='text'>SPIDAR G&amp;G: A Two-Handed Haptic Interface for Bimanual VR Interaction</title><content type='html'>&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;Comments I made:&lt;/strong&gt;&lt;br /&gt;&lt;a href="http://awolin.blogspot.com/2008/03/spidar-g-two-handed-haptic-interface.html"&gt;Aaron Wolin&lt;/a&gt;&lt;br /&gt;&lt;a href="http://grpauli.blogspot.com/2008/03/spidar-g-two-handed-haptic-interface.html"&gt;Paul Taele&lt;/a&gt;&lt;br /&gt;&lt;a 
href="http://pankaj-haptics.blogspot.com/2008/03/spidar-g-two-handed-haptic-interface.html"&gt;Pankaj Rajan&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper proposes a two-handed haptic interface. It allows users to manipulate (translate or rotate) virtual objects with two hands and to feel the force and torque interactions generated between the hands and the virtual objects.&lt;br /&gt;&lt;br /&gt;Two SPIDAR-G devices are used for two-hand interaction, each of which provides 6DOF motion, 6DOF force feedback and 1DOF for spherical grip (grasp or release). The position and orientation are determined by measuring the strings' lengths, and the force feedback is produced by controlling the tension of each string. The system also performs collision detection between virtual objects and the user's hands.&lt;br /&gt;&lt;br /&gt;They designed some experiments to test this two-handed haptic system. The required task is putting a pointer into a hole on a sphere. This task is tested on different subjects ("familiar with VR interfaces", "two men and a woman") with one or two SPIDAR-Gs. They concluded that using two hands is faster than using only one hand for this 3D pointing task, and that the completion time is shorter with haptic feedback than without it.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;First, the user study is obviously insufficient, with only three people, all of whom are familiar with VR interfaces. Second, in this paper, they only showed the case of controlling two objects with two hands. I wonder what it would be like if the user manipulated one virtual object with both hands. 
I guess the control and the feedback would be more difficult to model in this case.&lt;br /&gt;&lt;br /&gt;Building a two-handed haptic interface is usually hard because of the cooperation between the two hands and the challenges of providing realistic force feedback. I think the idea of the two-handed interface in this paper is good, even though it has a lot of limitations.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-9025736931826127295?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/9025736931826127295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=9025736931826127295' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/9025736931826127295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/9025736931826127295'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/spidar-g-two-handed-haptic-interface.html' title='SPIDAR G&amp;G: A Two-Handed Haptic Interface for Bimanual VR Interaction'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-1598994672565337758</id><published>2008-04-06T22:03:00.000-07:00</published><updated>2008-04-07T08:12:25.520-07:00</updated><title type='text'>Hand gesture modelling and recognition involving changing shapes and trajectories, using a 
Predictive EigenTracker</title><content type='html'>&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;Blog comments I made:&lt;/strong&gt;&lt;br /&gt;&lt;a href="http://grpauli.blogspot.com/2008/03/hand-gesture-modeling-and-recognition.html"&gt;Paul T&lt;/a&gt;&lt;br /&gt;&lt;a href="http://pankaj-haptics.blogspot.com/2008/03/hand-gesture-modelling-and-recognition.html"&gt;Pankaj Rajan&lt;/a&gt;&lt;br /&gt;&lt;a href="http://paulsonb.blogspot.com/2008/03/hand-gesture-modelling-and-recognition.html"&gt;Brandon Paulson&lt;/a&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;This paper presents an approach to recognizing hand gestures in video. Both hand shape and hand position information are used in the recognition process. They call their method the "Predictive EigenTracker". In addition, this method allows users to choose a gesture vocabulary so as to maximize recognition accuracy.&lt;br /&gt;&lt;br /&gt;Basically, they first employ a Particle Filtering (condensation) predictive framework to track the hand. The dynamic model used in this tracker is a second-order Markov chain with noise. The tracker is initialized by detecting the hand's skin color.&lt;br /&gt;&lt;br /&gt;After tracking the hand position in video, a shape-trajectory eigenspace is modeled by principal component analysis. Mahalanobis distances between gestures are then computed to help users select a gesture set with the highest accuracy.&lt;br /&gt;&lt;br /&gt;To demonstrate the performance, they showed an application of their tracker and recognizer: controlling an audio player with hand gestures. This application achieved 100% accuracy.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;I think this system will not be robust. 
The only result shown in this paper is for a set of very simple gestures against a background of a totally different color, with the performer wearing a black shirt, which makes the tracking problem very easy. Moreover, I doubt that this system can work on a larger set of gestures. The paper only showed a set of 8 gestures.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-1598994672565337758?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/1598994672565337758/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=1598994672565337758' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/1598994672565337758'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/1598994672565337758'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/04/hand-gesture-modelling-and-recognition.html' title='Hand gesture modelling and recognition involving changing shapes and trajectories, using a Predictive EigenTracker'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-4132998464392453312</id><published>2008-02-11T10:26:00.000-08:00</published><updated>2008-02-11T10:27:06.629-08:00</updated><title type='text'>Simultaneous gesture segmentation and recognition based on forward spotting accumulative HMMs</title><content 
type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This paper presents an approach to segmenting and recognizing upper-body gestures using a competitive differential observation probability, a sliding window and accumulative HMMs. The authors define the competitive differential observation probability as the difference in observation probability between the maximal gesture and the non-gesture. The start and end points of a gesture correspond to the zero crossings from negative to positive and from positive to negative, respectively. Gesture recognition is performed by applying accumulative HMMs that accept all possible partial posture segments. The final gesture of the observed gesture segment is determined by applying majority voting to the gesture type set. They show a fairly high accuracy (95.42%) on segmentation and recognition.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The whole idea of this paper is very simple. But I really doubt whether the 'accumulative HMM' will work. First, they didn't mention how the accumulative HMMs are trained. If they are trained on partial gesture segments, they will lose a lot of discriminative ability, since the partial gesture segments are usually short. If they are only trained on entire gesture segments, how can it be reasonable to accept partial gesture segments? There are also some other problems in this paper. 
For example, they assume the first posture of the whole gesture stream is the start point of a gesture segment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-4132998464392453312?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/4132998464392453312/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=4132998464392453312' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/4132998464392453312'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/4132998464392453312'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/02/simultaneous-gesture-segmentation-and.html' title='Simultaneous gesture segmentation and recognition based on forward spotting accumulative HMMs'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-5568457105670321468</id><published>2008-02-11T10:25:00.001-08:00</published><updated>2008-02-11T10:25:52.783-08:00</updated><title type='text'>A Survey of POMDP Applications</title><content type='html'>&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This paper shows the wide range of applications of partially observable Markov decision processes (POMDPs). 
The author presents the basic form of the POMDP model: a state set S, an action set A, an observation set Z, a transition function, an observation function and an immediate reward function. In a POMDP, the state is not directly observable. Many applications are shown in this paper, including industrial, scientific, business, military and social applications. In particular, the author shows a gesture recognition system using the POMDP model. Finally, the limitations are discussed. One is the assumption of finite sets and discrete time. Another is that all the parameters of the POMDP model must be specified. Also, finding the optimal policy for a general POMDP model is intractable.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The POMDP model involves a Markov decision process, actions and observation uncertainty, so it is a good approximation for many of the processes around us. Naturally, it can also be used in the gesture recognition problem. 
On the other hand, there are still some questions that need to be answered: how to select the proper states for the gesture recognition problem, how to learn the model parameters, and what the action set is if we use a data glove instead of a computer vision method.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-5568457105670321468?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/5568457105670321468/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=5568457105670321468' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/5568457105670321468'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/5568457105670321468'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/02/survey-of-pomdp-applications.html' title='A Survey of POMDP Applications'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-2746341173781524525</id><published>2008-02-11T10:17:00.000-08:00</published><updated>2008-02-11T10:24:37.506-08:00</updated><title type='text'>A Similarity Measure for Motion Stream Segmentation and Recognition</title><content type='html'>&lt;p&gt;&lt;strong&gt;&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;This paper discusses how to use singular value 
decomposition (SVD) as a similarity measure for motion segments, and then use it to segment and recognize motion streams. The authors first compute the covariance matrix for a candidate segment, and then apply SVD to it to get the first k principal components. The similarity measure is simply the sum of the weighted angles between corresponding principal components. Segmentation and recognition are done by enumerating all segments of every possible length. Finally, they claim high accuracy (94.0 and 94.6) in a comparison with other methods.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I don't believe this similarity measure and the segmentation and recognition approach will work well, because SVD can only reveal the characteristics of the posture distribution and totally loses the temporal information. For example, this method will treat a gesture the same as that gesture performed in reverse, even though the two could be totally different.&lt;br /&gt; &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-2746341173781524525?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/2746341173781524525/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=2746341173781524525' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2746341173781524525'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2746341173781524525'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/02/similarity-measure-for-motion-stream.html' title='A Similarity Measure for Motion Stream 
Segmentation and Recognition'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6974399049221530653</id><published>2008-02-04T09:04:00.000-08:00</published><updated>2008-02-04T09:05:38.868-08:00</updated><title type='text'>Hand Tension as a Gesture Segmentation Cue</title><content type='html'>&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;This paper presents a recognition-led approach to hand gesture segmentation. The key idea is to use hand tension as a gesture segmentation cue, because the authors notice that intentional gestures are made with a tense hand position rather than a relaxed one, and that the hand tension usually changes between two hand gestures.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;In this paper, they simply model a finger as a light rigid rod of a fixed length, with two light elastic strings attached to the end of the rod. With this simplified model, the tension of the whole hand can be computed by summing the tension of each finger. They classify hand gestures into four classes: SPSL, DPSL, SPDL and DPDL. 
The hand tension model helps to segment class SPSL; the fingertip acceleration together with the hand tension model makes it possible to segment class SPDL; and with the help of the shape of the tension graph, the system can segment gestures with dynamic finger movements.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;Finally, they demonstrate the performance of the segmentation approach by analyzing two gestures: "My name" and "My name me".&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Although it is somewhat verbose, I think this is a good paper because it uses an 'internal' feature, hand tension, as the segmentation cue instead of relying only on statistical methods. Moreover, the authors successfully approximate hand tension with a very simple model that works well. In my opinion, using this kind of internal feature is one of the right ways to solve the segmentation problem, because gestures are always produced by internal forces, which may be one of the proper feature spaces for describing gestures.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6974399049221530653?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6974399049221530653/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6974399049221530653' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6974399049221530653'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6974399049221530653'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/02/hand-tension-as-gesture-segmentation.html' title='Hand Tension as 
a Gesture Segmentation Cue'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-2358288784214120974</id><published>2008-02-03T21:54:00.001-08:00</published><updated>2008-02-03T21:55:41.690-08:00</updated><title type='text'>A Multi-Class Pattern Recognition System for Practical Finger Spelling Translation</title><content type='html'>&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;[Summary]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;This paper presents a hierarchical classifier, i.e., a decision tree, to recognize the 26 postures of the American Sign Language alphabet. The device used in the system is the Accele Glove, which is cheaper than the CyberGlove. An Accele Glove is composed of 5 accelerometers (one for each finger), sensors that measure acceleration along the x and y axes, so the total number of signals being measured is 10.&lt;br /&gt;&lt;br /&gt;From this 10-dimensional vector, they extract 3 features: the sum of the x positions of all fingers, the sum of the y positions of all fingers, and the y position of the index finger. After this feature extraction, the 26 postures are classified by the third feature into three subclasses: closed, horizontal and open. The members of each subclass are then projected onto the plane defined by the first and second features. Finally, a set of hierarchical linear Bayesian classifiers, which works as a decision tree, is used to recognize the members of each subclass.&lt;br /&gt;&lt;br /&gt;They claim that 21 out of 26 letters reached a 100% recognition rate. 
The worst case is the letter 'U' with a 78% recognition rate.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;[Discussion]&lt;/strong&gt;&lt;br /&gt;&lt;strong&gt;&lt;/strong&gt;&lt;br /&gt;Although 21 out of 26 letters reach a 100% recognition rate in their system, the results for the letters 'I', 'Y', 'R', 'U' and 'V' are not good enough to call the system robust. I think the reason might be that the feature extraction process is not designed well enough to separate all the postures in the feature space.&lt;br /&gt;&lt;br /&gt;Another comment is that this system is designed only for American Sign Language, and the hierarchical classifiers depend heavily on the postures to be recognized. In other words, for another set of postures, the system cannot learn from examples by itself but needs to be designed again. In addition, the Accele Glove may not capture enough information if we need to discriminate a larger set of postures.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-2358288784214120974?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/2358288784214120974/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=2358288784214120974' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2358288784214120974'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/2358288784214120974'/><link rel='alternate' type='text/html' href='http://kevinhaptics.blogspot.com/2008/02/multi-class-pattern-recognition-system_03.html' title='A Multi-Class Pattern Recognition System for Practical Finger Spelling Translation'/><author><name>Kevin 
Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7226341358299927342.post-6859074090993165918</id><published>2008-01-28T10:28:00.000-08:00</published><updated>2008-01-28T10:29:26.248-08:00</updated><title type='text'>An Architecture for Gesture-Based Control of Mobile Robots</title><content type='html'>&lt;strong&gt;Summary:&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This paper presents a way to control robots with hand gestures, based on a hidden Markov model spotter. The system includes the following main components: a mobile robot, a CyberGlove, a Polhemus 6DOF position sensor and a geolocation system that tracks the position and orientation of the mobile robot. The user can control the robot in either a local control mode or a global control mode.&lt;br /&gt;&lt;br /&gt;The system can spot and recognize the following six gestures: opening, opened, closing, pointing, waving left and waving right. In addition, non-gestures can also be recognized by this system.&lt;br /&gt;The gesture recognition process has three main steps. The first step processes the joint angles to obtain a codeword ranging from 0 to 31. Then, with the help of the hidden Markov model, the gesture spotter handles a sequence of codewords and determines the type of the gesture. Finally, the gesture interpreter takes the recognized gestures and the palm position and orientation as inputs to generate commands, in either local control mode or global control mode, that the robot can understand.&lt;br /&gt;&lt;br /&gt;The hidden Markov model used in this paper has two modifications. One is that it limits the observation sequence to the n most recent observations. 
The second is the introduction of the "wait state", which helps to recognize non-gestures.&lt;br /&gt;&lt;br /&gt;In addition, the author set up an experiment to verify the performance of the proposed HMM by comparing it with an HMM with a variable window. The proposed HMM recognized gestures with 96% accuracy and had a false-positive rate of 0.0016%.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Discussion:&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Usually, controlling robots with hand gestures is intuitive, but I think there are still some other issues we should consider. For example, giving commands to a robot that moves in 3D space would need more types of gestures, which requires the user to remember them all and makes the recognition system less robust. Another example is that hand-gesture-oriented controls will be very complex if the robot is not rigid, such as a human-like character.&lt;br /&gt;&lt;br /&gt;I also have doubts about the robustness of the system. Six kinds of gestures are probably easy to spot and recognize, but what if we need 30 kinds of gestures, some of which would be very similar to others?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7226341358299927342-6859074090993165918?l=kevinhaptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kevinhaptics.blogspot.com/feeds/6859074090993165918/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7226341358299927342&amp;postID=6859074090993165918' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6859074090993165918'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7226341358299927342/posts/default/6859074090993165918'/><link rel='alternate' type='text/html' 
href='http://kevinhaptics.blogspot.com/2008/01/architecture-for-gesture-based-control.html' title='An Architecture for Gesture-Based Control of Mobile Robots'/><author><name>Kevin Wei</name><uri>http://www.blogger.com/profile/17730823343016542169</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry></feed>
