Monday 13 February 2012

User Interfaces - An overview

As computers, and electronic devices in general, are ubiquitous in our everyday life, designing a simple, self-explanatory and fun-to-use interface to support the communication between user and machine is crucial. Starting from rather non-intuitive forms of interaction, recent developments show a transition towards more natural and human ways of communication.

Command Line Interface (CLI) 
At the beginning of the computer era, CLIs were the only interface, which led to the term “shell” as a synonym for this kind of interface. Command line interfaces feature a single input line in which users can, with the help of a keyboard, type specific commands to start certain functions like copying, pasting, browsing, and so on. The whole interface is represented exclusively by lines of text and therefore has a steep learning curve and is hard to understand, especially for non-tech-savvy users. Figure 1 shows a typical CLI, as known e.g. from DOS or the Windows Command Prompt.

Figure 1: Windows Command Prompt

Though the main interfaces of almost all operating systems have shifted towards different types (see below), they still offer a Command Line Interface as an additional input form for very specific commands, and hence for more experienced users.
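To illustrate the basic principle of a CLI, here is a minimal, purely illustrative Python sketch of a read-interpret-execute loop: a single input line is split into a command and its arguments and dispatched to the matching function. The command names used here (“list”, “copy”, “exit”) are made up for this sketch and are not real DOS or shell commands.

```python
# Minimal illustrative sketch of how a command line interface works:
# read one line of text, split it into a command and its arguments,
# and dispatch to the matching function.
import os
import shutil

def list_dir(path="."):
    for name in os.listdir(path):      # print directory contents
        print(name)

def copy_file(src, dst):
    shutil.copy(src, dst)              # copy a file from src to dst
    print("Copied %s to %s" % (src, dst))

COMMANDS = {"list": list_dir, "copy": copy_file}

while True:
    line = input("> ").strip()         # the single text input line of a CLI
    if not line:
        continue
    if line == "exit":
        break
    cmd, *args = line.split()
    func = COMMANDS.get(cmd)
    if func:
        func(*args)
    else:
        print("Unknown command: %s" % cmd)
```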

Graphical User Interface (GUI)
The concept of GUIs was invented during the 1970s in California at Xerox PARC, but achieved its breakthrough with the introduction of Apple’s Macintosh in 1984, the first affordable computer on the market to feature a graphical user interface. At first, GUIs were very slow compared to their text-based ancestors and lacked useful applications. Furthermore, the machines that were able to show decent graphics and had enough memory to avoid the annoying constant swapping of disks were still very expensive. But with increasing computing power and display resolutions, more complex concepts could be realized, while at the same time prices decreased. Today, GUIs are the dominant form of representing information and controls in operating systems, on personal computers as well as on smartphones.

Via graphical elements and controls, GUIs help the user to better control the operating system of his computer and its applications. Those graphical elements are often metaphors of real life (e.g. a recycle bin), so that the user intuitively knows what function they serve because they resemble familiar objects. Usually, a computer mouse is used for pointing at, selecting and manipulating visual objects on the screen, whereas the keyboard remains the primary text input device. Figure 2 shows the Windows Explorer of Windows 7 with the typical window layout and icons representing different functions/applications.

Figure 2: Windows Explorer with highlighted Recycle Bin



The applications themselves are usually represented as windows, whose position and size the user can change. As shown in Figure 3, each window features a title bar with certain controls for minimizing the window, switching between windowed and full-screen mode, or closing it.
Figure 3: Window title bar with minimize, change window mode and highlighted close buttons

In those windows, the actual functionality and data of the application are shown. These contents can also be enriched by certain GUI elements like buttons, toolbars, scrollbars, or other symbols. Furthermore, dialog boxes are usually used to ask for user input, and error boxes appear if some error has occurred. While designing those boxes, certain aspects like precision, explicitness and understandability must be kept in mind to enhance the user experience of the interface. For further coverage of Error Message Design, you should have a look at the blog posts of Team 10, Team 17 and Team 20 about this very topic.
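As a small illustration of such a dialog (a sketch added here, not taken from any of the referenced posts), the following Python snippet uses the standard tkinter toolkit to show an error box whose message tries to be precise and understandable by stating what went wrong and what the user can do about it. The file name and wording are invented for the example.

```python
# Illustrative sketch of a GUI error box using Python's built-in tkinter toolkit.
# The message follows the design advice above: it states precisely what went
# wrong and what the user can do next, instead of a vague "An error occurred".
import tkinter as tk
from tkinter import messagebox

root = tk.Tk()
root.withdraw()  # hide the empty main window; only the dialog should appear

messagebox.showerror(
    title="Could not save file",
    message="'report.docx' could not be saved because the disk is full.\n"
            "Free some space or choose another location, then try again."
)
```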

Natural User Interface (NUI) 
Recent developments lead towards a more human form of interacting with a computer, i.e. the user doesn’t have to adapt his behavior so that the computer “understands” his commands. Rather, the user behaves towards the computer like he would when interacting with or talking to another person. We want to pick Touchscreen and Gesture Recognition Interfaces as examples of the former and Voice User Interfaces as an example of the latter.

Touchscreens
Touchscreens make it possible for the user to directly interact with the surface of the displaying device. Whereas former touchscreens used a stylus for input, recent touchscreens have mostly gotten rid of artificial input devices and rely on finger input (for further information about resistive and capacitive touchscreens, see our former blog entry). This not only solves the common problem of losing the input device, but also allows for more intuitive and natural styles of interaction, especially gestures. Gestures for executing certain commands follow gestures known from real life, like swiping over the screen to turn a page, or crossing out a list entry to delete it. Other gestures like pinch-to-zoom on the iPhone’s multi-touch screen don’t have an equivalent in real life but provide a feeling of interacting directly with the target object. Watch the following video for an example of the gestures in the photo app of iOS, the operating system of the iPhone.



With the fingers as input “device”, the graphical user interface has to be designed differently from GUIs that are operated with a computer mouse. Because the finger lacks the precision of a mouse pointer, buttons, checkboxes, etc. have to be bigger, and while interacting with the device, some parts of the screen are covered by the hand and therefore not visible to the user at that moment. These are just two examples of the variety of issues designers have to address when thinking of a touchscreen interface. In turn, using the fingers in combination with a multi-touch screen allows for more flexible, complex and at the same time more intuitive and simpler controls, where e.g. buttons are omitted in favor of gestures.
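To give an idea of how such a gesture can be interpreted (a simplified sketch, not Apple’s actual implementation), the zoom factor of pinch-to-zoom can be derived from the ratio of the current distance between the two finger touch points to their distance when the gesture started:

```python
# Illustrative sketch of the arithmetic behind pinch-to-zoom: the zoom factor
# is the ratio of the current distance between two finger touch points to
# the distance measured when the gesture started.
import math

def distance(p1, p2):
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def pinch_zoom_factor(start_touches, current_touches):
    # each argument is a pair of (x, y) finger positions in screen coordinates
    start_dist = distance(*start_touches)
    current_dist = distance(*current_touches)
    return current_dist / start_dist   # > 1 zooms in, < 1 zooms out

# example: fingers move apart from 100 px to 150 px -> content scales by 1.5
print(pinch_zoom_factor(((100, 300), (200, 300)), ((75, 300), (225, 300))))
```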

Gesture Recognition Interfaces (GRI)
As we already mentioned gestures for executing certain commands on a touchscreen, we now want to get rid of the surfaces on which those gestures are performed and concentrate on spatial gestures.
A very well-known example of that technology is Microsoft’s Kinect together with the Xbox 360.
Kinect features a specific hardware device consisting of a range sensor, a multi-array microphone and a camera (see Figure 4), whose data is interpreted by proprietary software to gain a spatial image of the environment and to track the movements, facial expressions and voices of up to six users. This information is then used to control certain objects or perhaps a digital representation of the user, a so-called avatar, on a displaying device.
Figure 4: Microsoft Kinect

Initially, Kinect was designed solely for Microsoft’s Xbox 360 and games, but a short time after its market launch, users wrote drivers to make it also work on Windows 7 and Linux. Moreover, Kinect enjoys great popularity among scientists and researchers. Microsoft also realized the potential of this technology beyond games, is now heavily researching new gesture interfaces (see the video below) and has even published an SDK for developers to write their own Kinect applications.
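To give an idea of what spatial gesture recognition can look like in code, here is a deliberately simplified Python sketch that turns a sequence of tracked hand positions, as a depth sensor like Kinect delivers them frame by frame, into a “swipe” command. It illustrates the principle only; it is not the actual Kinect SDK, whose real API (skeleton streams in C#/C++) is considerably more involved, and the thresholds are invented for the example.

```python
# Simplified sketch of spatial gesture recognition: given a sequence of tracked
# hand positions (oldest first), detect a horizontal swipe. Thresholds and
# coordinates are hypothetical; this is not the Kinect SDK.

def detect_swipe(hand_positions, min_distance=0.4, max_vertical_drift=0.1):
    """hand_positions: list of (x, y) hand coordinates in metres."""
    if len(hand_positions) < 2:
        return None
    dx = hand_positions[-1][0] - hand_positions[0][0]      # horizontal movement
    dy = abs(hand_positions[-1][1] - hand_positions[0][1])  # vertical drift
    if abs(dx) >= min_distance and dy <= max_vertical_drift:
        return "swipe right" if dx > 0 else "swipe left"
    return None

# hand moves 0.5 m to the right with little vertical movement -> "swipe right"
print(detect_swipe([(0.0, 1.20), (0.2, 1.21), (0.5, 1.22)]))
```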


Voice User Interfaces (VUI) 
Another very interesting development is the emergence of voice user interfaces. The underlying software recognizes the user’s voice coming from a microphone and interprets his commands. That is why VUIs usually come without the hierarchical menus known from GUIs and allow the user to directly express his wishes.

While the voice recognition and interpretation software has to be error-proof and able to deal with different phrases and wordings, the user also must know how to communicate with the device, that is, which commands and languages the VUI can understand and interpret.
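The interpretation step can be illustrated with a very reduced Python sketch, assuming a speech recognizer has already turned the audio into a text transcript; the keyword lists and command names are invented for this example. Different phrasings of the same wish are mapped onto one internal command.

```python
# Sketch of the interpretation step of a voice user interface, assuming the
# speech recognizer has already produced a text transcript. Different
# phrasings of the same wish are mapped onto one internal command.

COMMAND_KEYWORDS = {
    "set_alarm":    ["wake me", "set an alarm", "alarm for"],
    "send_message": ["text", "send a message", "tell"],
    "weather":      ["weather", "rain", "umbrella"],
}

def interpret(transcript):
    text = transcript.lower()
    for command, phrases in COMMAND_KEYWORDS.items():
        if any(phrase in text for phrase in phrases):
            return command
    return "unknown"   # the VUI should then ask the user to rephrase

print(interpret("Wake me tomorrow at 7 am"))       # -> set_alarm
print(interpret("Do I need an umbrella today?"))   # -> weather
```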

In order to get an impression of how voice user interfaces can be implemented, you should check out Siri on Apple’s iPhone 4S, as featured in the following video clip.


Conclusion
In the beginning (cf. CLI), the user had to adapt to the computer in order to communicate with it, e.g. by following strict interaction guidelines, specific input languages and syntax, or predefined input fields. With new forms of user interfaces and input styles like voice and gesture recognition, the communication between user and computer becomes more human and resembles the interaction between people. It should therefore be simpler and more intuitive.

However, there are also drawbacks. Because those interfaces allow the user to interact in an open manner, they are more error-prone than former types of input. The programs and algorithms behind those interfaces must therefore not only follow design guidelines, but must also process and correctly interpret huge amounts of data while at the same time being as error-proof as possible. As all of us have already experienced misunderstandings while communicating and interacting with other people, this part seems to be a tough challenge for future user interfaces.
