Friday 13 April 2012

Seeing the world through augmented eyes

At the end of last year, Seth Weintraub, a blogger for 9to5Google, wrote about Google introducing wearable heads-up display glasses. With these, it should be possible to augment reality without holding a device in your hands and looking at a small screen. Instead, a transparent screen sits right in front of your eyes and overlays reality with additional layers of information.

What sounds like science-fiction Terminator technology is not that much science fiction anymore. Samsung (as covered in an earlier blog post) has come up with a transparent display, and smartphones and their processing units keep getting faster and smaller, so they are capable of complex calculations such as real-time augmentation. As recent news shows, wearable technologies like heads-up display glasses could become reality as early as this year: Google is already testing such devices in the wild under the name Project Glass.
This is what the latest prototype of Google Project Glass looks like

The device processes information gathered from the wearer's surroundings, augments it with useful information in real time and displays this information on a transparent screen alongside the real objects it belongs to. Paired with voice recognition, the wearer can control the different functions and issue commands such as where they want to go or what meeting is up next and when. The possibilities are limited only by one's imagination and, as the following impressive video shows, could be really useful in everyday life.

 

So what do you think about this development and the blurring of the real and virtual worlds? Do you think this is what our future interaction devices will be like?

If you want to stay updated on this matter, visit the Google+ page of Project Glass.

Sources:
[1] http://9to5google.com/2011/12/19/google-xs-wearable-technology-isnt-an-ipod-nano-but-rather-a-heads-up-display-glasses/
[2] https://plus.google.com/111626127367496192147/posts
[3] http://www.engadget.com/2012/04/06/google-project-glass-sergey-brin/
[4] http://www.engadget.com/2012/04/04/google-testing-heads-up-display-glasses-in-public-wont-make-yo/
[5] http://www.theverge.com/2012/4/4/2925237/googles-project-glass-augmented-reality-glasses-begin-testing

Saturday 7 April 2012

Visualization

‘The man who can’t visualize a horse galloping on a tomato is an idiot’ – André Breton
Visualization is one of the most important aspects of data representation. A proper artificial stimulus can produce the same effect as a natural object, and visual stimuli are especially effective. In general, visualization helps humans handle vast amounts of data easily. Its objective is to make complex data easier to read and understand by taking advantage of the high-bandwidth human visual channel. Most of the techniques used in visualization come from the field of computer graphics.

A comprehensive definition of visualization is given below [2], [1]:
‘Visualization is a cognitive process using the powerful information processing and analytical functions of the human vision system. It has always been a major factor in scientific progress, and now, with the assistance of computer graphics, it extends our vision system from sub-atomic to interstellar dimensions and allows geometric representation and simulations of any multidimensional data set. The fundamental objective is to acquire new knowledge rather than generating images’.

A modern technique used for visualization is tone mapping. Tone mapping deals only with scalar data, which may come from different input sources and may span a very large range. Furthermore, the scale in which the raw data are represented is not necessarily compatible with the sensory response curve of human photoreceptors, so a direct linear mapping of the input data to light intensity does not have the desired visual effect. Most tone mapping techniques work in the spatial domain and are categorized as either local or global processes, depending on the nature of the compression.

Transfer functions are commonly used in tone mapping. A transfer function is an algorithm that maps one space to another according to some criterion of interest. A commonly adopted transfer function is based on luminance-chrominance HDR (High Dynamic Range).
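To make the idea of a global transfer function more concrete, here is a minimal sketch in Python. It is not the luminance-chrominance HDR function mentioned above; as a stand-in it uses the simple global curve L / (1 + L), often associated with Reinhard's tone mapping operator. The function names and the sample values are assumptions made up for illustration.

```python
def tone_map(luminance):
    """Global transfer function: compress a luminance in [0, inf) into [0, 1).

    Uses the simple curve L / (1 + L), chosen here only as an illustration.
    """
    if luminance < 0:
        raise ValueError("luminance must be non-negative")
    return luminance / (1.0 + luminance)


def tone_map_all(luminances):
    """Apply the same (global) transfer function to every sample."""
    return [tone_map(value) for value in luminances]


# Hypothetical high-dynamic-range samples spanning several orders of magnitude.
hdr_samples = [0.0, 0.5, 1.0, 10.0, 1000.0]
ldr_samples = tone_map_all(hdr_samples)

for hdr, ldr in zip(hdr_samples, ldr_samples):
    print(f"{hdr:8.1f} -> {ldr:.4f}")
```

The mapping is global because every sample goes through the same curve; a local operator would additionally look at the neighbourhood of each sample before compressing it.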

In designing a visualization, technical issues are not the only thing to take into consideration; the characteristics of the human visual system, that is, matters of perception, count as well. The human eye is known to be sensitive to variations in intensity as well as in chromaticity. Contrast sensitivity research [3] shows that human sensitivity to luminance contrast is very different from human sensitivity to chrominance contrast, and [4] found that the red-green and blue-yellow contrast sensitivity functions have similar spatial bandwidth.

Several color models can serve as an intermediate level for conversion. The RGB color model generally suffers from high channel correlation and from mixing luminance and chrominance. YUV and YCrCb address this problem: the Y channel holds the luminance, and the other two channels hold combinations of red-green and blue-yellow. These models do not solve the correlation problem entirely, because the chrominance channels are still correlated, so another color model is required to decouple the channels. The XYZ and LAB color models meet this requirement. The LAB color model additionally has the nice property that its L, A, and B channels have a Euclidean relationship. Other color models exist for special purposes, such as the HSV and CMYK color models.
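As a small illustration of how luminance is separated from chrominance, the following Python sketch converts an RGB triple to a Y/Cb/Cr triple. It assumes the ITU-R BT.601 luma weights and components normalised to [0, 1]; the post does not say which variant it has in mind, so treat the constants and the function name as assumptions.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalised RGB (each in [0, 1]) to (Y, Cb, Cr).

    Assumes BT.601 luma weights. Y is in [0, 1]; Cb and Cr are in [-0.5, 0.5].
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = (b - y) / 1.772                    # blue-difference chrominance
    cr = (r - y) / 1.402                    # red-difference chrominance
    return y, cb, cr


# A neutral grey carries luminance only; pure red has strong Cr chrominance.
print(rgb_to_ycbcr(0.5, 0.5, 0.5))
print(rgb_to_ycbcr(1.0, 0.0, 0.0))
```

For any grey value the two chrominance channels come out as (approximately) zero, which is exactly the decoupling of brightness from color that plain RGB lacks.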

In conclusion, visualization is the most effective way for humans to analyze and manipulate data. Various visualization principles need to be considered in order to convey information as effectively as possible. Once again, a picture is worth a thousand words!


References:
[1] T. Theoharis, G. Papaioannou, N. Platis, N.M. Patrikalakis. Graphics & Visualization: Principles and Algorithms. pp. 231-365. A K Peters, Ltd. 2008.

[2] University of Edinburgh. Visualization, 2005.

[3] G.J.C. van der Horst, C.M.M. de Weert, and M.A. Bouman, ‘Transfer of spatial chromaticity-contrast at threshold in human eye’, Journal of the Optical Society of America, vol. 57, no. 10, pp. 1260-1266, October 1967.

[4] K.T. Mullen, ‘The contrast sensitivity of human colour vision to red-green and blue-yellow chromatic gratings’, Journal of Physiology, vol. 359, pp. 381-400, February 1985.

Tuesday 3 April 2012

The Battle of Mobile UIs - Android VS iOS

As many of you already know, there has been this ongoing battle between the two leading mobile platforms for smartphones, namely Android and iOS. So which is really better? Is there any real winner in this war of the mobile platforms? There probably isn't, in terms of overall user experience. However, we can do an analysis in the HCI context, and there might just be a winner with regards to HCI.



HCI principles compared for iOS (latest: iOS 5.1) and Android (latest: Android 4.0.4 Ice Cream Sandwich)

Visibility – Users should see what functions are available and what the system is currently doing.
  • iOS: Buttons are generally visible and easy to find. The strength of the OS is in its simplicity and ease of navigation.
  • Android: With the release of Android 4.0 (Ice Cream Sandwich), the “Holo” theme was adopted, replacing the previous design with a sharp, high-contrast UI theme that gives the user great visibility.

Consistency – Components such as buttons, labels, messages, colour scheme, and menus should be consistent on all screens.
  • iOS: iOS always adopts a very consistent UI design and provides very clear design guidelines for application developers to follow.
  • Android: Since Android 4.0, the OS itself is generally consistent. However, there have been some consistency problems among application developers: some iOS developers who moved over to Android tried to create an iOS-like feel, which makes their apps inconsistent with the rest of the OS.

Familiarity – Use language and symbols that the user will be familiar with, or suitable metaphors which help them transfer similar and related knowledge from a familiar domain.
  • iOS: Because iOS is consistent across multiple devices (the different iPhones, iPads, and Apple TV), it creates familiarity for users. Moreover, it is also quite similar to Mac OS X with regard to theme, icons, and general feel.
  • Android: Android 4.0 is made to be multi-platform, for use on both tablets and phones. This creates a sense of familiarity for users of both Android tablets and phones. However, due to the availability of custom ROMs, there might be fragmentation in terms of user experience, resulting in a possible compromise in familiarity. The inconsistent design of some apps might also result in a loss of familiarity.

Affordance – Design things so it is clear what they are for.
  • iOS: iOS follows a standardized conceptual model that is similar across all its devices. Apple also has a policy of enforcing its Human Interface Guidelines, which creates a similar experience across apps. This creates affordance in the UI.
  • Android: Android follows a standardized conceptual model which gives the UI affordance. However, this might vary for different applications and custom ROMs, as strict guidelines are not properly enforced.

Constraints – Provide constraints so people do not try to do things which are inappropriate.
  • iOS: This is a basic rule that was definitely enforced in the UI design of iOS and iOS apps.
  • Android: This is a basic rule that was definitely enforced in the UI design of the stock Android OS. However, due to the availability of custom ROMs and not-so-strict

Navigation – Provide support to enable users to move around parts of the system.
  • iOS: Navigation to apps is done through icons on the home screen and the notification pulldown menu. This is quite intuitive, and as a result the learning curve is very gentle for new users.
  • Android: Navigation to apps can be done through icons in the apps page, icons on the home screen, widgets on the home screen, or the notification pulldown menu. The use of widgets may not be as intuitive for users who are new to Android.

Feedback – Rapidly feed back information from the system to the user.
  • iOS: iOS provides rapid feedback to the user in the form of sound alerts, status bar icons (battery, wifi signal, etc.), popup notifications, progress bar indicators, notification messages via the notification pulldown menu, and colour changes with changing button states, just to name a few. Overall, it provides proper and rapid feedback to users.
  • Android: Android provides rapid feedback to the user in the form of sound alerts, changing states of widgets, status bar icons, popup notifications, progress bar indicators, notification messages via the notification pulldown menu, colour changes with changing button states, etc. Overall it is quite similar to what iOS has, with the addition of widgets.

Recovery – Ability to recover from actions and errors quickly and with ease.
  • iOS: iOS devices have a physical ‘home’ button which also brings up multitasking when double-tapped. This helps with recovery by allowing users to abort apps and switch between apps quickly.
  • Android: The Android UI has ‘back’, ‘home’, and ‘multitask’ buttons, either as on-screen buttons or as physical buttons on the phone. This helps with recovery from actions in a way that is familiar to Android users. The default picture editing app also allows undoing and redoing in case of mistakes. Edited photos are saved separately in another folder instead of overwriting the original file, to allow for easy recovery. Moreover, thanks to the open source nature of Android, custom recoveries such as ClockworkMod Recovery have been developed, which allow easy backup and recovery of the ROM.

Flexibility – Allow multiple ways of doing things, to accommodate users with different levels of experience and preferences.
  • iOS: iOS is quite inflexible because it does not leave much room for customization unless jailbroken.
  • Android: Android is extremely flexible due to the customization it provides and the open source nature of the OS. It accommodates both the power user and the average user. Open source also means that very advanced users can even modify the OS to their liking.

Aesthetically Pleasing – Visually pleasant, appealing, and friendly.
  • iOS: In my personal opinion, iOS is very visually appealing and a pleasant experience to use, with smooth animations and a beautiful design.
  • Android: In my personal opinion, Android is aesthetically pleasing only from version 4.0 onwards, with its improved UI design and smoother, faster animations. Prior to that, I felt Android 2.X had a comparatively less pleasant look and choppy animations.

All in all, I would conclude that there is probably no clear winner between the two mobile OSes in the HCI context (applying the HCI principles). However, it can be noted that while both platforms generally design their UIs well to provide a pleasant and smooth experience for users, Android has higher flexibility (accommodating the more tech-savvy people with the ease of customization and the open source nature of the OS), whereas iOS focuses more on a holistic, intuitive, and familiar experience across all of Apple's products (iPhone, iPad, Apple TV, Mac OS X). Competition between different mobile platforms is always good, because it pushes companies to further improve their products and to strive for a better user experience. Thus, I look forward to the endless pleasant surprises these two tech giants will showcase in the future with iOS and Android.

Sources - CS3240 Lecture 2 notes by Prof Bimlesh.
Author's Note (XC): Comparisons between the 2 mobile OSes are a result of my own personal opinions, experiences, and knowledge of both platforms.

Tuesday 20 March 2012

Evaluation of UI

In this post, I shall be examining the case of role-playing games, which generally place heavy emphasis on statistics and numbers. But before that, let us zoom in on the question of whether we can use usability heuristics as a means of evaluating such a game interface. Let us first look at Nielsen's 10 Usability Heuristics -
  • Visibility of system status
  • Match between system and the real world
  • User control and freedom
  • Consistency and standards
  • Error prevention
  • Recognition rather than recall
  • Flexibility and efficiency of use
  • Aesthetic and minimalist design
  • Help users recognize, diagnose, and recover from errors
  • Help and documentation
These are golden rules that can be applied to any task-driven application. By fulfilling them, users can expect software that is satisfying and easy to use. In other words, they are rules for evaluating usability. However, what is missing from these heuristics is the task itself. It should be fairly obvious that a less tech-savvy person would be less motivated to use software meant for compiling Java code than someone who is tech-savvy, regardless of how usable the program is. For task-driven software, the goal is to make it easy and intuitive to use. In contrast, the goal of a game interface is to attract people to use it, and it does so with a key element missing from these heuristics: fun.

In fact, fun and usability may not be as inter-dependent as we thought they would be. An application that is simple to use may be rated highly when evaluated using Nielsen's heuristics, but may not be fun to use at all. Before we go any further, let us look at GameFlow Heuristics (Sweetser & Wyeth, 2005).
  • Concentration: Games should require concentration and the player should be able to concentrate on the game.
  • Challenge: Games should be sufficiently challenging and match the player’s skill level.
  • Player Skills: Games must support player skill development and mastery.
  • Control: Players should feel a sense of control over their actions in the game.
  • Clear Goals: Games should provide the player with clear goals at appropriate times.
  • Feedback: Players must receive appropriate feedback at appropriate times.
  • Immersion: Players should experience deep but effortless involvement in the game.
  • Social Interaction: Games should support and create opportunities for social interaction. 
In particular, let us consider the second heuristic, 'Challenge'. An interface that is easy to use would probably not be challenging to use. In contrast, an interface with multiple layers of complexity may give users the satisfaction of mastering the game as they move deeper and discover more for themselves. For instance, take the RPG 'Diablo'. A new player who plays the game for the first time would probably be overwhelmed by the amount of text they need to read, most of which would probably not make much sense. However, as they play and experiment with the game, they start to learn what it all means and gain a greater sense of control and mastery of the system, both of which contribute to what makes the game fun.


Malone's Heuristics for Designing Enjoyable User Interfaces (Malone, 1982) also support the element of challenge in game interfaces.

1. Challenge
  • Goal. Is there a clear goal in the activity? Does the interface provide performance feedback about how close the user is to achieving the goal?
  • Uncertain outcome. Is the outcome of reaching the goal uncertain?
  • Does the activity have variable difficulty levels? For example, does the interface have successive layers of complexity?
  • Does the activity have multiple level goals? For example, does the interface include scorekeeping?
2. Fantasy
  • Does the interface embody emotionally appealing fantasies?
  • Does the interface embody metaphors with physical or other systems that the user already understands?
3. Curiosity
  • Does the activity provide an optimal level of informational complexity?
  • Does the interface use audio and visual effects: (a) as decoration, (b) to enhance fantasy, and (c) as a representation system?
  • Does the interface use randomness in a way that adds variety without making tools unreliable?
  • Does the interface use humor appropriately?
  • Does the interface capitalize on the users' desire to have "well-formed" knowledge structures?
  • Does it introduce new information when users see that their existing knowledge is: (1) incomplete, (2) inconsistent, or (3) unparsimonious?

Malone brought up an additional point: the use of randomness to generate fun. According to Nielsen's heuristics, the functions of the system should be visible to the users, which helps narrow the gulf of execution. An element of randomness in task-driven software would probably be counter-intuitive: clicking a search button twice should produce the same results sorted by relevance, rather than a random mess of results. In the interface of an RPG, however, randomising the statistics of a character can increase variety, and repeatedly clicking the 'randomise' button until the desired statistics finally come up can be satisfying.

For a more concrete example of using Malone's heuristics to evaluate a game interface, let us now look at the interface of Torchlight, an action role-playing game that is fairly similar to Diablo. The gameplay heavily emphasises collecting loot (resources dropped by monsters as they are vanquished) to build up the player's character, so some sort of system has to be in place to help players manage all of it. As the game is highly complex with a lot of statistics in play, we will only focus on the interface as presented in this screenshot.


The game keeps track of all of your progress in terms of your level, the skills you have and so on. As your pointer hovers over any of the 'class skills' on the right, a popup appears showing you the current rank of the skill and the level needed to improve it. This motivates gamers to level up in order to improve their character. The left side of the screen shows your current XP (experience points) and a bar that shows how close you are to leveling up.

While it may seem that the screen is cluttered with too much information, a great deal of it is needed when the player is customising his character. For instance, while picking a skill, the player might want to see his existing skills or statistics in order to make the best choice. When he is done, he can close both of the pop-outs at the sides and resume his game. The only HUD element that remains is the lower bar, which is an essential tool for gameplay as it shows when a skill is ready to be used, which key activates it and how many consumables are left.

In conclusion, the game interface as presented in the screenshot shows a good amount of information without being overly cluttered. Most of the information is hidden in pop-outs that only appear when the player hovers over an element. It mostly uses icons rather than words to represent skills. For a new player, probably none of the icons would be familiar, which could be overwhelming. However, this could result in greater fun as the player tries to overcome the system and gain mastery over it.

References - http://beccascollan.com/wp-content/uploads/2009/04/emotionhci.pdf
Screenshots - http://www.torchlight-2.de/bestes-action-rpg-der-letzten-jahre.t3242.html

Monday 19 March 2012

Speeding Up Touch

With touchscreens as the dominant input style on mobile devices, users are able to manipulate the information and graphics directly on the screen using their fingers or sometimes a stylus. Compared to former versions, today’s touchscreens may be very good in terms of color rendering, viewing angles and brightness. However, to give the user the impression of directly manipulating the displayed information, there is another characteristic which has to be considered: latency.

In the world of displays, latency or lag is the “difference between the time a signal is input into a display and the time it is shown by the display.” [1] Combined with touch capabilities, one can say it is the time between touching the display and seeing the result on the screen, including the time needed to compute the input.

Today’s touchscreens, such as those on smartphones, have latencies of about 100 ms, which is already pretty fast but still produces a noticeable lag during rapid movements or gestures. For example, when quickly dragging an icon from one corner to another on any of today’s tablets, the finger reaches its goal a tiny bit earlier than the moving icon, as if the icon were trying to follow and catch up with its manipulator, the finger. This lag undermines the impression of directly manipulating the displayed information and therefore impairs the overall user experience, especially the sense of immediate feedback.

However, recent developments at Microsoft Research and their Applied Sciences Group may put a stop to that issue: they were able to reduce the latency by 99% compared to average touchscreens, giving the display a latency of 1 ms, which is no longer perceivable by the human eye [2].
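To get a feeling for these numbers, the following Python sketch works out how far a dragged icon trails behind the finger at a given latency, and checks the 99% figure. The finger speed of 1 m/s is purely an assumption for illustration, and the function names are made up.

```python
def lag_distance_mm(finger_speed_mm_per_s, latency_ms):
    """Distance the finger travels before the display has caught up."""
    return finger_speed_mm_per_s * latency_ms / 1000.0


def latency_reduction(old_ms, new_ms):
    """Relative reduction in latency, as a fraction of the old value."""
    return (old_ms - new_ms) / old_ms


FINGER_SPEED_MM_PER_S = 1000.0  # assumption: a quick swipe at about 1 m/s

for latency_ms in (100, 1):
    gap = lag_distance_mm(FINGER_SPEED_MM_PER_S, latency_ms)
    print(f"{latency_ms:3d} ms latency -> icon trails the finger by {gap:.0f} mm")

print(f"reduction: {latency_reduction(100, 1):.0%}")
```

Under that assumed speed, the icon trails the finger by about 10 cm at 100 ms but only by about 1 mm at 1 ms, which is why the lag stops being visible.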



The results they achieved in a lab environment are impressive and definitely worth checking out if you have always felt slowed down by your display’s reaction time. However, whether this technology ever sees the light of electronics stores or of your home will depend on the cost of turning it into a mass product and on how much consumers care about a more natural feel, given that today’s touchscreens already do a decent job.

Sources:

Tuesday 13 March 2012

HCI and Multimedia

In past decades, usability was the leading research area in HCI. It looked for a balance in the interaction between human and computer, which in practice meant that the human adapted to the machine. With the emergence of multimedia, that is, multi-modal information from the senses, HCI has changed its focus. It no longer looks for that balance; instead, it is now biased towards the human and human specialties. Furthermore, HCI now puts more weight on simplicity. Simplicity has a very broad sense here and often includes being user-friendly, natural, and so on.

A multimedia system is tightly connected to the human perceptual system. In fact, human beings are amazing multimedia systems. Generally, the human perceptual system is composed of the visual, acoustic, haptic, taste, and smell senses. This forms the basic consideration for designing a multimedia system. A multimedia system is thus defined as a system that can receive and process multi-modal information from those senses and produce the desired multimedia output effortlessly. Multi-modal information contains high-level abstract details produced by humans, such as sound, music, speech, gesture, reading, and writing. Coordinating the interaction therefore becomes an important issue.

For the interaction to be simple and natural for the human, the system must be complex in how it handles information. This forms the trade-off between HCI and the multimedia system. The following sections give an overview of current and future applications that require minimal interaction yet are powerful and desirable systems from the user's point of view.

Information Processing System
With advances in internet and hypertext technology, information is widely available and users interact with it directly. One example is search (information retrieval). The basic categories of such systems are: free text systems, information retrieval systems, information extraction systems, question answering systems, dialog systems and natural content processing systems [3]. The interaction with these systems is often restricted for the sake of simplicity (or we can say that users tend to be ‘lazy’). ‘Relevance’ is what users usually care about when they interact with such a system. Thus, the output of an information processing system must carry a certain level of confidence about its relevance to the particular user requirement. Relevance is sometimes expressed as a rank. This is a primary concern for a search engine, where the interaction is easy and simple: type in a query or even supply an image.
Technically, a text-based content retrieval system consists of a relevance-feedback, term-based analyzer, which in turn consists of a term selection algorithm, a stemming algorithm, a similarity measure, a vector space model and latent semantic analysis [3]. An image-based content retrieval system, in contrast, consists of a single technique or a series of techniques from the disciplines of computer vision and image processing. Such techniques include the color histogram [3], the color coherence vector model [3], the color correlogram [3], saliency detection [3], edge detection models [2][4], mathematical morphology models [2], automatic seeded region growing [2] and many more.
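To show how a vector space model and a similarity measure turn 'relevance' into a rank, here is a minimal Python sketch. It assumes raw term counts and cosine similarity, and it leaves out stemming, term selection and latent semantic analysis; the documents, the query and the function names are made up for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a bag of lower-cased words with their counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs, most relevant first."""
    query_vector = term_vector(query)
    scored = [(cosine_similarity(query_vector, term_vector(doc)), doc)
              for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical mini-collection.
documents = [
    "touch screen latency",
    "color histogram image retrieval",
    "speech recognition with hidden markov model",
]

for score, doc in rank("image retrieval", documents):
    print(f"{score:.3f}  {doc}")
```

The user only types a short query; all the complexity (vectors, norms, sorting) stays inside the system, which is the trade-off between simple interaction and complex processing described in this post.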

Speech Processing
Speech is a natural form of communication between humans, and it reflects the variability and complexity of humans. Speech processing aims at modeling and manipulating the speech signal so that it can be transmitted, produced and recognized [1]. There are many applications involving speech processing, such as information inquiry systems, voice control systems, voice synthesis systems, and audiobooks. Interaction with this kind of system is simpler and more natural.
Technically, a speech processing system is based on the hidden Markov model (HMM). A simple architecture is shown below.




Digital face beautification
Digital face beautification is a newly developing research area that often requires image processing techniques (sometimes it qualifies as computational photography). Nowadays, image processing methods for computational photography are of paramount importance in the research and development community. This field mainly involves the human visual sense, yet it is an interesting application with the potential for commercial success.
Technically, a digital face beautification system involves machine learning, face detection, facial feature detection and image warping [4]. Two common machine learning methodologies are applied in this field: K-nearest neighbor (KNN) based and support vector machine (SVM) based.
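As a rough idea of what the KNN approach looks like, here is a minimal Python sketch of a k-nearest-neighbour classifier. It is not the pipeline of an actual beautification system: the two-dimensional feature vectors (imagine two facial proportions), the 'low'/'high' rating labels and the function name are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(training_set,
                     key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature vector, attractiveness rating label).
training_set = [
    ((0.30, 0.45), "low"),
    ((0.32, 0.47), "low"),
    ((0.35, 0.44), "low"),
    ((0.60, 0.70), "high"),
    ((0.62, 0.68), "high"),
    ((0.58, 0.72), "high"),
]

print(knn_predict(training_set, (0.59, 0.69)))
print(knn_predict(training_set, (0.31, 0.46)))
```

In a real system the feature vectors would come from the face detection and facial feature detection steps, and the result would drive the image warping; the sketch only covers the nearest-neighbour lookup itself.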
Each of the technical terms mentioned above is hard to understand in one go (sometimes it takes months to understand!). You may feel that there is no link to HCI. In fact, these technical details serve the simplicity of the interaction between user and computer by adding more abstraction, or complexity, to the system. They also reflect the transition from single-user to multi-user (social community) interaction. Thus, the trend for HCI in multimedia systems is towards simplicity and naturalness.

References
[1] CS5241 Speech Processing, AY2010/2011 Semester 2, NUS, SOC
[2] CS4243 Computer Vision and Pattern Recognition, AY2011/12 Semester 1, NUS, SOC
[3] CS5342 Multimedia Computing and Applications, AY2011/12 Semester 2, NUS, SOC
[4] CS5341 Computational Photography, AY2011/12 Semester 2, NUS, SOC

Saturday 25 February 2012

Bad Designs

Whether in the pursuit of innovation or aesthetics, or simply through wrong choices, some products fall into the "Bad Designs" category. Many such products become unintuitive for users, and some become totally unusable, regardless of how special or powerful their functions are. Here I'll give a list of examples of bad designs.

Affordance
Things should be designed such that it is clear what they are for. A button is for pressing and a knob is for turning.

What about this door here? Handles like this are designed for pulling. There is nothing unintuitive here for the door in the front. You can pull open the door and proceed down to the next door. You pull that door and what happens? The door cannot be opened! It turns out that the back door must be pushed, even though it looks similar to the one in front. In fact, many people got trapped in the walkway as a result.

What can be done to remedy this situation? The obvious fix is to make both doors open by pulling. Alternatively, a 'Pull' or 'Push' label can be displayed on the door, which I believe is rather commonplace nowadays.


Consistency
This does not refer to just the colour schemes, but can also refer to placement of buttons and so on. This is one product that is inconsistent.

This is a kitchen timer. It should be easy to see how it works. Want to leave your favourite food to simmer while you watch television for 30 minutes? Turn it clockwise to 30. Now what if you wanted to wait for 15 minutes? If you turn it clockwise to 15, you are wrong. This is where the trouble starts. To set a timer of less than 15 minutes, you need to turn the dial one full round first, and then to 15. This clearly shows a lack of consistency in design.





Feedback
Feedback is incredibly important. We need to know the state of the system. We need to know the impact of what we have done. Has the application been processed? Did the system receive my application? The following is an example of bad feedback.

This is a coffee maker. On the menu are 4 buttons, which are nicely designed, and it is clear what pressing them will do. But the problem starts after you press a button. The top red light is fairly clear in the message it gives. Lit means 'On'. Otherwise, 'Off'. The orange light, however, lights up when you press the lower right button, for 3 or fewer cups. The orange light is off if you select more than 3 cups. Generally, we would expect the light to go on when we select more, and off when we select fewer. The feedback given here is exactly the opposite.

And these are just 3 simple examples of bad designs. These examples are taken from http://www.baddesigns.com/, which has many more of them. We can certainly learn what not to do as we look at such examples.