[Optimoz] gesture recognition with a hidden markov model
Aaron Lenfestey
lenfestey at gmail.com
Sun Oct 30 07:57:52 EST 2005
Great!
I think starting off in C/C++ is the way to go. It will give us a lot
of headroom and compatibility should really be a non-issue for these
languages. Further, I get to skirt the rather embarrassing issue of
having never written a line of JS. My only remaining qualm is that the
installation process should not become substantially more complicated.
Would we distribute separate packages for each operating system, or
one package with several different binaries and "do the right thing"
from within JavaScript?
Sharp: I'd be interested to know what methods you have already
considered. Another approach (which I'm sure others have considered)
would be to use a discriminative model like neural networks or SVMs.
These are common for tasks like handwriting recognition, but it's not
clear (to me) how best to extend them to label input sequences that they
weren't explicitly trained to recognize. I think using a language to
describe gestures was a great idea. Also, moving away from this
framework would really break compatibility with the current optimoz.
Also, there are subtle differences between what I have proposed, and
how things are currently handled. For example, the current engine
displays to the user the partially completed gesture as it is being
composed. This is great for the current model, because as soon as we
make a mistake we can give up and start over. In the models that I've
suggested, a definitive answer isn't produced until all the data has
been considered. This means that we can display the current guess at
each time step, but it won't be the final word in the sense that the
recognizer may later decide to change its belief about what the user
had intended to input at that time.
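To make the difference concrete, here is a minimal sketch of a live guess being revised once more data arrives. It assumes a toy two-state model: the state names, the "U"/"R" direction symbols, and every probability below are made up for illustration and are not taken from the Optimoz source. The live guess comes from the forward (filtering) pass, and the final answer comes from Viterbi decoding over the whole input.

```python
# Hypothetical two-state gesture model; all numbers are illustrative only.
STATES = ["up", "right"]
START = {"up": 0.5, "right": 0.5}
TRANS = {"up": {"up": 0.8, "right": 0.2},
         "right": {"up": 0.2, "right": 0.8}}
# Observations are quantized mouse directions: "U" or "R".
EMIT = {"up": {"U": 0.7, "R": 0.3},
        "right": {"U": 0.3, "R": 0.7}}


def filtered_guesses(obs):
    """Best guess for the current state after each observation (forward pass)."""
    guesses = []
    belief = dict(START)
    for t, o in enumerate(obs):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = {s: sum(belief[p] * TRANS[p][s] for p in STATES)
                      for s in STATES}
        # Update: weight by how well each state explains the observation.
        belief = {s: belief[s] * EMIT[s][o] for s in STATES}
        total = sum(belief.values())
        belief = {s: v / total for s, v in belief.items()}
        guesses.append(max(STATES, key=belief.get))
    return guesses


def viterbi(obs):
    """Most likely state sequence once all observations have been seen."""
    score = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    back = []
    for o in obs[1:]:
        prev = score
        # For each state, remember the best predecessor.
        pointers = {s: max(STATES, key=lambda p: prev[p] * TRANS[p][s])
                    for s in STATES}
        score = {s: prev[pointers[s]] * TRANS[pointers[s]][s] * EMIT[s][o]
                 for s in STATES}
        back.append(pointers)
    # Walk the back-pointers from the best final state to the start.
    path = [max(STATES, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


observations = ["U", "R", "R", "R"]
live = filtered_guesses(observations)
final = viterbi(observations)
```

With this toy model, the live guess after the first sample is "up", but the final decoding labels every step "right": the recognizer has effectively changed its mind about the first sample.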
So, we have a few options:
1) display the recognizer's actual state, which will be a probability
distribution, or some concise version of it. This is probably way too
technical to be of value to most people.
2) display the model's best current guess, and perhaps some
English/[other natural language] to indicate that the user shouldn't
give up if the guess is wrong
3) display nothing at all/display the guess only when the model is
very confident. If the recognizer actually turns out to work well
(crossing fingers) then maybe there is no need to give the user
real-time notifications.
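Options 2 and 3 differ only in how confident the recognizer must be before it shows anything, so they could share one small policy function. This is a rough sketch, assuming the recognizer exposes its current belief as a dictionary of gesture name to probability; the function name and the 0.9 threshold are hypothetical.

```python
def live_feedback(belief, threshold=0.9):
    """Return the gesture name to display, or None to display nothing.

    belief: dict mapping gesture name -> probability (assumed to sum to 1).
    threshold: minimum probability required before a guess is shown.
    """
    best = max(belief, key=belief.get)
    return best if belief[best] >= threshold else None


# Option 3: stay quiet until the recognizer is very confident.
unsure = live_feedback({"up": 0.55, "right": 0.45})
sure = live_feedback({"up": 0.05, "right": 0.95})
# Option 2: a threshold of 0 always shows the current best guess.
always = live_feedback({"up": 0.55, "right": 0.45}, threshold=0.0)
```

Here `unsure` is None, `sure` is "right", and `always` is "up". Option 1 would skip this function and show the belief dictionary itself.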
Anyway, I still haven't examined the current source in depth. However,
I'm bringing this issue up because it's clear that it will need to be
dealt with, and that it will affect non-engine parts of the code and how
users interact with the software.
-Aaron