A Personal Blog
Ultimate Hiptop Device?
I’ve written several times in the past about the idea of an “ultimate device”, or laptop-type replacement. Over the months I’ve become convinced that the keyboard specifically, and maybe even the mouse, are incredibly unintuitive and user-unfriendly input devices, and that consumers and users would much prefer something geared to how they work.
As such, I’ve been following advances in hardware and software very closely. Not because I necessarily want to do something rash, but because the possibilities are becoming more and more realistic.
The Idea
I had a conversation with an incredibly smart guy here at work, and we just brainstormed a bit.
Over nearly a dozen posts I’ve come down to the fact that we could easily do:
- a hiptop device with a 20GB HDD, P4M 1.5GHz processor, 128MB of RAM
- visual output device (glasses or projection-based) which sits in front of the eyes
- input for flash memory sticks, keyboard and mouse
This would give you a true “mobile desktop”. You could take it anywhere, run any app and interact via keyboard and mouse. Your “desktop” would become a “hub”. The problem is that, ultimately, I want something I can walk down the street with. I want something I can sit in the airport with. I want something that is more user friendly and more intuitive than a standard OS.
The Issues
As such, there are a few key problems:
- we need better, or more context-sensitive, voice recognition
- we need some kind of “mouse replacement”
- we need a new “shell” (at least) or a new OS (at most)
We need voice recognition because if you’re on the go you need to be able to interact with the device without a keyboard (mobile or otherwise). I’ll touch on context-sensitive a bit later, as it was really today’s “breakthrough”.
We need a mouse replacement, because even with the most intuitive and advanced voice recognition, it’s really, really cumbersome to, for instance, bold specified text without some kind of pointing / selecting mechanism.
And we need a new shell…
The Shell
The idea for this was thrown out somehow in our discussion today. If a user is working in Word, for instance, and already has decent voice recognition (doable) and some kind of imaginary mouse replacement, the only thing that is difficult is doing things like interacting with documents, opening programs, etc.
In a “mobile mode” situation, we thought, there is a very limited scope of things a user needs to be able to do. Either they are working in an application OR they are trying to work with other applications and data.
By taking this concept up a level, we realised that if a user isn’t working in an application, they must then be searching for applications or data. As such, they get to “the Windows environment”. But, really, the Windows environment is an entirely point and click environment. That won’t work for our mobile device.
We need a voice environment. Not a vocal environment, or a voice-feedback environment (though that would be cool as well). So, we just bounced back and forth about what kind of other environments there were besides standard desktop environments, and 2 jumped out at us.
First, the old Mac environment was really “single point of entry”. Most Mac users didn’t use their desktops a lot; they used the little Apple symbol to access just about everything. Single point of entry. Just the kind of thing we’d need if users could only give vocal input.
Then we realised that many older PDAs had “single point of entry” type environments for when you weren’t working in an application. They had, in essence, multiple “launchers”.
You could go to your Application Launcher, to your Games Launcher, to your Documents Launcher, etc.
We quickly realised that something like this could be just what we were looking for. If a user was inside Microsoft Word and wanted to send an email, the voice commands could easily be:
- “Open Application Launcher. Open Outlook”
Boom, they’re in. Obviously things like voice shortcuts would be useful as well (“Execute Open Outlook” should work on its own, for instance, or even “Execute O” maybe).
They could easily navigate between preset “Launchers”, as well as virtual ones.
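To make the launcher idea a bit more concrete, here is a rough sketch of how such a voice shell might route commands. Everything in it is made up for illustration: the `Shell` class, the launcher contents and the result strings are assumptions, and it starts from already-recognized text rather than audio.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Shell:
    """Toy voice shell: named launchers, at most one of which is open."""
    launchers: Dict[str, List[str]]
    current: Optional[str] = None                 # launcher currently open
    running: List[str] = field(default_factory=list)

    def handle(self, utterance: str) -> List[str]:
        """Process one utterance; commands are separated by periods."""
        results = []
        for sentence in utterance.split("."):
            text = sentence.strip()
            if text:
                results.append(self._command(text))
        return results

    def _command(self, text: str) -> str:
        lowered = text.lower()
        # "Execute Open X" is the shortcut form: search every launcher.
        if lowered.startswith("execute open "):
            return self._launch(text[len("execute open "):], self.launchers)
        if lowered.startswith("open "):
            target = text[len("open "):]
            # Opening a launcher switches context.
            for name in self.launchers:
                if name.lower() == target.lower():
                    self.current = name
                    return f"launcher:{name}"
            # Otherwise only look inside the launcher that is open.
            if self.current is not None:
                scope = {self.current: self.launchers[self.current]}
                return self._launch(target, scope)
        return f"unrecognized:{text}"

    def _launch(self, target: str, scope: Dict[str, List[str]]) -> str:
        for apps in scope.values():
            for app in apps:
                if app.lower() == target.lower():
                    self.running.append(app)
                    return f"app:{app}"
        return f"unrecognized:{target}"
```

With launchers like `{"Application Launcher": ["Outlook", "Word"]}`, the utterance “Open Application Launcher. Open Outlook” first switches launcher and then starts Outlook, while “Execute Open Outlook” skips the launcher step.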
If you’ve seen the Longhorn Video for Healthcare, you’ve seen the concept of meta, or virtual, filesets. Something like this would be incredibly useful for our “Launchers” environment.
Context Sensitive
One of the biggest reasons this is useful is that if you are in the “desktop” mode (as opposed to “application” mode, I guess), the voice recognition software has a much smaller “vocabulary” it needs to work with. If you’ve only got 20 apps, there are only 100 (ish) words it needs to try and recognize, which greatly increases the chance of it getting the word right.
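Here is a small sketch of why the smaller vocabulary helps. It is only an analogy: real speech engines don’t match on spelling, and the word lists, the `edit_distance` matching and the threshold of 2 are all assumptions picked for illustration. The point is just that the same garbled input has far fewer plausible matches in “desktop” mode.

```python
from typing import List, Set


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def candidates(heard: str, vocabulary: Set[str], max_distance: int = 2) -> List[str]:
    """Words in the active vocabulary the garbled input could plausibly be."""
    return sorted(w for w in vocabulary if edit_distance(heard, w) <= max_distance)


# "Desktop" mode: only launcher commands and app names are valid (hypothetical list).
DESKTOP_VOCAB = {"open", "execute", "launcher", "application", "games",
                 "documents", "outlook", "word"}

# "Application" mode: a tiny stand-in for a full dictation dictionary.
DICTATION_VOCAB = DESKTOP_VOCAB | {"ward", "wood", "work", "worn", "lord", "cord"}

print(candidates("wurd", DESKTOP_VOCAB))    # one plausible match
print(candidates("wurd", DICTATION_VOCAB))  # several competing matches
```

In desktop mode the misheard “wurd” can only be “word”, so the shell can act on it. In dictation mode the same input is ambiguous between several words and the recognizer has to guess.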
A great example of this is the Tablet PC Demo I pointed to earlier today. They show off just this kind of functionality for text recognition.
The Hub
One of the key features of such a device should be (in my mind, anyways) that it could also go into “real desktop” mode: a full OS environment. You sit down, plug in a real keyboard and mouse (and maybe even a monitor) and you use this thing as a desktop replacement.
Summary
I still don’t think we’re at the point where we could realistically build a device like this out of commodity products, but we are much, much closer. The hardware can easily be built for less than $1,000. The biggest hurdles now are the voice recognition, the mouse replacement and the shell.
I’m confident that if the mouse replacement issue could be solved, and if voice recognition improved (maybe Microsoft Speech Server is the key here?), the shell would be easy to do.
This entry was posted by Jeremy Wright on May 18, 2004 at 12:29 pm, and is filed under IT Thoughts.
about 7 years ago
I love playing around with voice recognition stuff. However, I realized long ago, it’s not really an efficient user interface. For one thing, it’s slow. I can click faster than I can speak. Shortcuts are fine, but the shorter you go, the harder it is for a computer to understand you (if you’re speaking English, that is). I remember reading long ago that Cantonese (or perhaps it’s Mandarin, it was one of those Chinese dialects) was much easier for a computer to understand (fewer similar-sounding words, homonyms, stuff like that).
Either way, voice commands can be pretty clunky compared to point and click. If you don’t think so, give it a shot, you can create your own shortcuts with most programs now.
As the programs get to know your voice they can get pretty good at taking dictation, but after that I’ve found they’re not that useful (yet). Another thing: you need a private office, or it’s pretty useless. There are few open-space office environments where you’d be able to actually use your voice for anything and not distract others or interfere with their work (or have them interfere with your own). Public use goes without saying (just think of cell phones, but much worse).
Just some random thoughts.
about 7 years ago
I agree completely, which is why such a device would need to have keyboard / mouse for when you are in a place where you can sit down.
Agreed that voice isn’t the ideal either; however, we have to be able to get away from keyboard / mouse without getting into implants or eye-based input.
about 7 years ago
The ultimate answer would be something to correlate with the video, perhaps some finger sensors that would allow you to point to specific items in the visor and activate them. If enough of them could be implemented, perhaps even a “virtual” keyboard that would allow you to type much the same way, sans the feel.
about 7 years ago
Well, your implant comment just got me thinking about a device I saw in the stores about 4-5 years ago. It was this thing you’d put on your finger and it would “read your mind”. If I remember correctly, it was a few different games. Skiing, pinball, etc. You would think “left” or “right”. I’m not sure if it was able to interpret much else, or how it really did it…
…takes break and Googles…
Ahh, someone asked about it:
http://answers.google.com/answers/threadview?id=302038
Well, if you read towards the bottom it’s not “brain” reading, but skin-reading in a way.
Real thought-reading computer article:
http://www.economist.com/science/tq/displayStory.cfm?story_id=2246298
It still may freak out some, but if you could “think your mouse”, depending on how quickly the thing could work this is a nice little futuristic interface :)
As for today, I’m not quite sure about making an interface that differs from what we have, isn’t eye or implant related and is truly mobile. Perhaps some sort of glove and finger-movement system, but there would probably be a decent learning curve to it.
about 7 years ago
I use a Handspring Treo. It is a cell phone and PDA, and connects to the web for mail or surfing via Blazer at about 70K/sec via Sprint PCS ($10/month).
It has a small QWERTY keyboard (thumb board?) so you can use it wherever to type, take notes, surf, check email (Eudora is FREE for Palm OS, for example).
So, we have a small screen 3.5″ x 2.5″ LCD, keyboard, pointer is a scroll wheel which pushes in for a click – or you can simply touch the screen -finger or stylus, toothpick, or fingernail….
If I could stick an iPod 1.5″ 20 – 40GB HD in it, or even a smaller 1″ micro drive 4,000MB – that would be great…..
You connect a Treo to your computers via USB and it syncs your mail, memos, web bookmarks, photos, MP3s. What’s left to do??? Not $1,000 but for sale @ $499 with camera, phone, web, MP3, Palm OS, available now…
Mac OS “Classic” (Mac OS 8/9) is already a speakable interface. Defaults are “Launch” or “Launch Outlook Express”, “check Mail”, “close window”, “copy text”, “paste text”, etc. Built in, and free. Internet Explorer is also speech enabled (Mac OS X is about to get a BIG improvement in speech synthesis & recognition)…
One last thing, Pentium is a horrible power hog! Treo, Palm, all cell phones use ARM RISC for low power, high performance (battery can last 20 hours on a phone vs. 45 minutes in a P4 laptop!). Intel makes XScale (an ARM-core CPU); TI and everyone else make versions too, so there is price competition (lower cost).
Also, ARM can run big endian or little endian (like PPC or x86). Hitachi 1.5″ or 1″ Micro drives are light, small as credit cards, or smaller… and Palm OS can connect to Outlook servers, etc.
Is there much left to do, let alone make yourself??