Multimedia systems are tightly connected to the
human perceptual system. In fact, human beings are themselves amazing multimedia
systems. Generally, the human perceptual system is composed
of the visual, acoustic, haptic, taste, and smell senses. This forms the basic
consideration for designing a multimedia system. A multimedia system is thus
defined as a system that can receive and process multi-modal information from
those senses and produce the desired multimedia output effortlessly. Multi-modal
information contains high-level abstract details produced by humans, such as
sound, music, speech, gesture, reading, and writing. Thus, coordinating the
interaction becomes an important issue.
For interaction to be simple and natural for
humans, the system must be complex in how it handles information. This forms the
trade-off between HCI and multimedia systems. The following sections provide an
overview of current and future applications that require minimal interaction
yet deliver a powerful and desirable system from the user's point of view.
Information Processing System
Technically, a text-based content retrieval
system consists of a relevance-feedback, term-based analyzer, which in turn
consists of a term selection algorithm, a stemming algorithm, a similarity
measure, a vector space model, and latent semantic analysis [3]. An image-based
content retrieval system, by contrast, consists of one technique or a series of
techniques from the disciplines of computer vision and image processing. Such
techniques include the color histogram [3], color coherence vector [3], color
correlogram [3], saliency detection [3], edge detection [2][4], mathematical
morphology [2], automatic seeded region growing [2], and many more.
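To make the vector space model and similarity measure a little more concrete, here is a minimal sketch in Python. It assumes a plain bag-of-words term-frequency vector and cosine similarity; the function names (term_vector, cosine_similarity, rank) are made up for illustration, and a real system would add the stemming, term selection, and relevance feedback steps mentioned above.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a bag-of-words term-frequency vector (no stemming here)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    shared_terms = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs, most similar to the query first."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling rank("speech recognition", docs) on a small list of strings returns the documents that share the most query terms first. Documents with no shared terms get a score of zero.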
Speech Processing
Speech is a natural form of communication
between humans, and it reflects the variability and complexity of humans. Speech
processing aims at modeling and manipulating the speech signal in order to
transmit, produce, and recognize it [1]. There are many applications
involving speech processing, such as information inquiry systems, voice control
systems, voice synthesis systems, and audiobooks. Interaction with this
kind of system is simpler and more natural.
Technically, a speech processing system is
commonly based on the hidden Markov model (HMM). A simple architecture is shown below.
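As a complement to the architecture, here is a minimal sketch of one core HMM computation, the forward algorithm, which gives the probability of an observation sequence under a model. The two-state toy model (states "speech" and "silence", observations "loud" and "quiet") and all its probabilities are invented for illustration. A real recognizer works on acoustic feature vectors and far larger models.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations | model) using the forward algorithm."""
    if not observations:
        return 1.0  # the empty sequence has probability 1
    # Initialization: start in state s and emit the first observation.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    # Recursion: sum over every previous state, then emit the next observation.
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    # Termination: sum over the final states.
    return sum(alpha.values())


# Hypothetical toy model, not taken from any real speech system.
STATES = ("speech", "silence")
START_P = {"speech": 0.5, "silence": 0.5}
TRANS_P = {
    "speech": {"speech": 0.7, "silence": 0.3},
    "silence": {"speech": 0.4, "silence": 0.6},
}
EMIT_P = {
    "speech": {"loud": 0.8, "quiet": 0.2},
    "silence": {"loud": 0.1, "quiet": 0.9},
}

likelihood = forward(["loud", "loud", "quiet"], STATES, START_P, TRANS_P, EMIT_P)
```

In a recognizer, one such likelihood would be computed per candidate word or phoneme model, and the candidate with the highest value would be chosen.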
Digital Face Beautification
Digital face beautification is a newly
developing research area, and it often requires image processing techniques
(sometimes it qualifies as computational photography). Nowadays, image
processing methods for computational photography are of paramount importance in
the research and development community. This field mainly involves the human
visual sense, yet it is an interesting and potentially commercially successful
application.
Technically, a digital face beautification
system involves machine learning, face detection, facial feature detection, and
image warping [4]. Two common machine learning methodologies are applied in
this field: K-nearest neighbor (KNN) based and support vector machine (SVM)
based.
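To give a feel for the KNN idea, here is a minimal, generic sketch in Python: a query point receives the majority label of its k closest training points. The two-dimensional feature vectors and the labels in the usage note are hypothetical stand-ins. In a beautification system the features would come from facial feature detection, and this sketch does not cover image warping at all.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

For example, with training points clustered around (0, 0) labeled "a" and around (5, 5) labeled "b", a query near the origin is assigned "a". The choice of k is an assumption here; it is usually tuned on held-out data.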
Each of the technical terms mentioned
above is hard to understand in one shot (sometimes it takes months to
understand!). Perhaps you will perceive that there is no link to HCI. In
fact, those technical details emphasize the simplicity of interaction
between user and computer by adding more abstraction, or complexity, to the system.
They also represent the transition from single-user to multi-user
(social community) interaction. Thus, the trend for HCI in multimedia
systems is toward simplicity and naturalness.
References
[1] CS5241 Speech Processing, AY2010/2011 Semester 2, NUS, SOC
[2] CS4243 Computer Vision and Pattern Recognition, AY2011/12 Semester 1, NUS, SOC
[3] CS5342 Multimedia Computing and Applications, AY2011/12 Semester 2, NUS, SOC
[4] CS5341 Computational Photography, AY2011/12 Semester 2, NUS, SOC
url(www.interaction-design.org/encyclopedia/human_computer_interaction_hci.html), web resource
url(www.cs.cmu.edu/~amulet/papers/uihistory.tr.html), web resource