EYEPHONE TECHNOLOGY
ABSTRACT
EyePhone is a hands-free interfacing system for operating a mobile phone with the eyes. With it, the functions of the phone can be driven easily: phone functions are activated by blinking the eye, and the navigation-key functions are performed by eye movement. The principle behind EyePhone technology is eye tracking. There is no need for any additional device to be placed on the eye to track its movements; tracking is done through the movement of the pupil, which the device senses. Conventional systems use the front camera to sense eye movement, while some modern phones also use dedicated sensors to track the eyes. The details, design, and working of EyePhone are presented in the report below.
TABLE OF CONTENTS
1. Introduction
2. EyePhone Technology
3. Human-Phone Interaction
   What is eye tracking?
   Eye tracking technology
4. EyePhone Design
   Eye detection phase
   Open eye template creation phase
   Eye tracking phase
   Blink detection phase
5. Evaluation
   Daylight exposure analysis for a stationary subject
   Artificial light exposure for a stationary subject
6. Applications
7. Conclusion
8. References
INTRODUCTION
As smartphones evolve, researchers are studying new techniques to ease human-mobile interaction.
We propose EyePhone, a novel “hands-free” interfacing system capable of driving mobile applications/functions using only the user’s eye movements and actions
(e.g., wink). EyePhone tracks the user’s eye movement across the phone’s
display using the camera mounted on the front of the phone; more specifically, machine
learning algorithms are used to: i) track the eye and infer its position on the
mobile phone display as a user views a particular application; and ii) detect
eye blinks that emulate mouse clicks to activate the target application under
view. We present a prototype implementation of EyePhone on a Nokia N810, which
is capable of tracking the position of the eye on the display, mapping this
position to an application that is activated by a wink. At no time does the
user have to physically touch the phone display.
EyePhone Technology
EyePhone is a novel “hands-free” interfacing system capable of driving mobile applications/functions using only the user’s eye movements and actions (e.g., wink). It tracks the user’s eye movement across the phone’s display using the camera mounted on the front of the phone: it tracks the eye and infers its position on the mobile phone display as the user views a particular application, and it detects eye blinks that emulate mouse clicks to activate the target application under view. We present a prototype implementation of EyePhone that is capable of tracking the position of the eye on the display and mapping this position to an application that is activated by a wink. At no time does the user have to physically touch the phone display. EyePhone is the first system capable of tracking a user’s eye and mapping its current position on the display to a function/application on the phone using the phone’s front-facing camera. EyePhone allows the user to activate an application by simply “blinking at the app”, emulating a mouse click. While other interfaces, such as voice recognition, could also be used in a hands-free manner, we focus on exploiting the eye as a driver of the HPI. We believe EyePhone technology is an important alternative to, for example, voice activation systems based on voice recognition, since the performance of a voice recognition system tends to degrade in noisy environments.
The front camera is the only requirement of EyePhone. Most smartphones today are equipped with a front camera, and we expect that many more will be introduced in the future (e.g., Apple iPhone 4G) in support of video conferencing on the phone. The EyePhone system uses machine learning techniques that, after detecting the eye, create a template of the open eye and use template matching for eye tracking. Correlation matching is exploited for eye wink detection. We implement EyePhone on the Nokia N810 tablet and present experimental results in different settings. These initial results demonstrate that EyePhone is capable of driving the mobile phone.
Human Computer Interaction (HCI)
“HCI (human-computer interaction) is the study of how people interact with computers and to what extent computers are or are not developed for successful interaction with human beings.” Most HCI technology addresses the interaction between people and computers in “ideal” environments, i.e., where people sit in front of a desktop machine with specialized sensors and cameras centered on them.
Human-Phone Interaction (HPI)
Human-Computer Interaction (HCI) researchers and phone vendors are continuously searching for new approaches to reduce the effort users exert when accessing applications on limited form factor devices such as mobile phones. Human-phone interaction (HPI) adds challenges not typically found in HCI research, more specifically related to the phone and how we use it. In order to address these goals, HPI technology should be less intrusive; that is,
• it should not rely on any external devices other than the mobile phone itself;
• it should be readily usable with as little user dependency as possible;
• it should be fast in the inference phase;
• it should be lightweight in terms of computation;
• it should preserve the phone user experience, e.g., it should not deplete the phone battery over normal operations.
HUMAN-PHONE INTERACTION
Human-Phone
Interaction represents an extension of the field of HCI since HPI presents new
challenges that need to be addressed specifically driven by issues of mobility,
the form factor of the phone, and its resource limitations (e.g., energy and
computation). More specifically, the distinguishing factors of the mobile phone
environment are mobility and the lack of sophisticated hardware support, i.e.,
specialized headsets, overhead cameras, and dedicated sensors, that are often
required to realize HCI applications. In what follows, we discuss these issues.
Mobility Challenges:
One
of the immediate products of mobility is that a mobile phone is moved around
through unpredicted context, i.e., situations and scenarios that are hard to
see or predict during the design phase of a HPI application. A mobile phone is
subject to uncontrolled movement, i.e., people interact with their mobile
phones while stationary, on the move, etc. It is almost impossible to predict
how and where people are going to use their mobile phones. A HPI application
should be able to operate reliably in any encountered condition. Consider the
following example of two HPI applications, one using the accelerometer and the other relying on the phone’s camera. Imagine exploiting the
accelerometer to infer some simple gestures a person can perform with the phone
in their hands, e.g., shake the phone to initiate a phone call, or tap the
phone to reject a phone call. What is challenging is being able to distinguish
between the gesture itself and any other action the person might be performing.
For example, if a person is running or if a user tosses their phone down on a
sofa, a sudden shake of the phone could produce signatures that could be easily
confused with a gesture. There are many examples where a classifier could be
easily confused. In response, erroneous action could be triggered on the phone.
Similarly, if the phone’s camera is used to infer a user action, it becomes important to make the inference algorithm operating on the video captured by the camera robust against lighting conditions, which can vary from place to place. In addition, video frames blur due to phone movement. Because HPI application developers cannot assume any optimal operating conditions (i.e., users operating in some idealized manner) before detecting gestures in this example (e.g., requiring a user to stop walking or running before initiating a phone call by a shaking movement), the effects of mobility must be taken into account in order for the HPI application to be reliable and scalable.
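To make the confusion risk concrete, the following sketch in Python shows the kind of naive accelerometer-based shake detector described above; the threshold, sample format, and function name are illustrative assumptions, not part of any real HPI system. A jog, or a toss onto a sofa, can produce the same magnitude peaks as a deliberate shake, which is exactly the classifier confusion discussed above.

import math

def looks_like_shake(samples, magnitude_threshold=18.0, min_peaks=3):
    """Naive shake detector: count acceleration-magnitude peaks above a threshold.

    samples: iterable of (ax, ay, az) accelerometer readings in m/s^2.
    The threshold and peak count are illustrative guesses; running, or tossing
    the phone onto a sofa, can trip this check just as easily as a real shake.
    """
    peaks = sum(
        1 for ax, ay, az in samples
        if math.sqrt(ax * ax + ay * ay + az * az) > magnitude_threshold
    )
    return peaks >= min_peaks

# Example: a short burst of hard impacts (e.g., the phone landing on a sofa)
# is classified as a "shake" even though no gesture was intended.
print(looks_like_shake([(0.1, 9.8, 0.2), (15.0, 12.0, 3.0), (18.0, 10.0, 5.0), (16.5, 11.0, 2.0)]))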
Hardware
Challenges:
As opposed to HCI applications, any HPI
implementation should not rely on any external hardware. Asking people to carry
or wear additional hardware in order to use their phone might reduce the
penetration of the technology. Moreover, state-of-the-art HCI hardware, such as glass-mounted cameras or dedicated helmets, is not yet small enough to be comfortably worn for long periods of time by people. Any HPI application should rely as much as possible on just the phone’s on-board sensors. Although modern smartphones are becoming more computationally capable, they are still limited when running complex machine learning algorithms. HPI solutions should adopt
lightweight machine learning techniques to run properly and energy efficiently on mobile phones.
What is
eye tracking?
Eye tracking refers simply to recording eye movements
whilst a participant examines a visual stimulus (Collewijn, 1991). Accurate eye
tracking must account for both the position of the head and the position of the
eyes relative to the head. The earliest eye trackers were mechanical devices
that must have caused participants great discomfort due to their size and invasiveness
(see Collewijn, 1991, and Hayhoe & Ballard, 2005, for descriptions of various
techniques). The technology has progressed significantly since then, such that
systems are now available that do not require headgear or any physical
attachments to be worn by the participant.
Two important aspects of eye tracking are calibrating the
system to specific participants and managing eye drift, via drift correction.
Calibration normally involves participants looking at an image (e.g., a dot or
a fixation cross) in a known location. The eye tracking system compares the
true location of the image to where it detects the participant’s gaze on the
screen, and applies
a
suitable correction for future fixations. Drift correction measures how much
the difference between a participant’s gaze and a central point “drifts” over a
short time period. Drift can occur because of factors such as fatigue and
changes in body (head) position. Moreover, the longer a viewing session, the
more drift that occurs, and thus the less precise the gaze recording.
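The following minimal sketch shows how such a drift correction could be applied in code; the single-offset scheme and the function names are assumptions for illustration, not the procedure of any particular tracker.

def measure_drift(reported_gaze, known_point):
    """Offset between where the tracker says the participant is looking and a
    known fixation point (e.g., a central dot), in screen pixels."""
    return (known_point[0] - reported_gaze[0],
            known_point[1] - reported_gaze[1])

def correct_gaze(reported_gaze, drift_offset):
    """Apply the most recent drift measurement to subsequent gaze samples."""
    return (reported_gaze[0] + drift_offset[0],
            reported_gaze[1] + drift_offset[1])

# Usage: before a trial, show a central dot at (640, 360); if the tracker
# reports (655, 352), carry the (-15, +8) offset forward until the next check.
offset = measure_drift((655, 352), (640, 360))
print(correct_gaze((900, 410), offset))   # -> (885, 418)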
Fixation duration, frequency, location and sequencing are
the primary measures of visual behaviour used to study face processing. Fixation
duration provides an index of the speed with which information is processed.
Increasing fixation duration is associated with tasks that require more
detailed visual analysis (e.g., Xiaohua, C., & Liren, 2007). Frequency of
fixations often serves as a measure of sampling quantity. Fixation location and
sequencing provide information regarding the regions of the face to which
participants are attending and information about the order in which stimulus
properties are sampled.
EYE TRACKING TECHNOLOGY
Eye
tracking is a technique whereby an individual’s eye movements are measured so
that the researcher knows both where a person is looking at any given time and
the sequence in which their eyes are shifting from one location to another.
Tracking people’s eye movements can help HCI researchers understand visual and
display-based information processing and the factors that may impact upon the
usability of system interfaces. In this way, eye movement recordings can provide
an objective source of interface-evaluation data that can inform the design of
improved interfaces. Eye movements can also be captured and used as control
signals to enable people to interact with interfaces directly without the need
for mouse or keyboard input, which can be a major advantage for certain
populations of users such as disabled individuals. We begin this chapter with
an overview of eye-tracking technology, and progress toward a detailed
discussion of the use of eye tracking in HCI and usability research. A key
element of this discussion is to provide a practical guide to inform
researchers of the various eye-movement measures that can be taken, and the way
in which these metrics can address questions about system usability. We
conclude by considering the future prospects for eye-tracking research in HCI
and usability testing.
Most commercial eye-tracking systems
available today measure point-of-regard by the “corneal-reflection/pupil centre”
method. These kinds of trackers usually consist of a standard desktop computer
with an infrared camera mounted beneath (or next to) a display monitor, with
image processing software to locate and identify the features of the eye used
for tracking. In operation, infrared light from an LED embedded in the infrared
camera is first directed into the eye to create strong reflections in target
eye features to make them easier to track (infrared light is used to avoid
dazzling the user with visible light). The light enters the retina and a large
proportion of it is reflected back, making the pupil appear as a bright, well
defined disc (known as the “bright pupil” effect). The corneal reflection (or
first Purkinje image) is also generated by the infrared light, appearing as a
small, but sharp, glint (see Figure 1).
Figure 1. Corneal reflection and bright pupil as
seen in the infrared camera image.
Once the image processing
software has identified the centre of the pupil and the location of the corneal
reflection, the vector between them is measured, and, with further
trigonometric calculations, point-of-regard can be found. Although it is
possible to determine approximate point-of-regard by the corneal reflection
alone (as shown in Figure 2), by tracking both features eye movements can,
critically, be disassociated from head movements
Figure
2. Corneal reflection position changing according to point of regard (cf.
Redline & Lankford, 2001).
Video-based eye trackers
need to be fine-tuned to the particularities of each person’s eye movements by a
“calibration” process. This calibration works by displaying a dot on the
screen, and if the eye fixes for longer than a certain threshold time and
within a certain area, the system records that pupil centre/corneal-reflection
relationship as corresponding to a specific x,y coordinate on the screen.
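One plausible way to turn the recorded pupil-centre/corneal-reflection relationships into screen coordinates is a least-squares fit over the calibration dots, sketched below; the simple linear model is an assumption standing in for whatever mapping a commercial tracker actually uses.

import numpy as np

def fit_calibration(vectors, dot_positions):
    """Fit screen = [vx, vy, 1] @ coeffs from calibration samples.

    vectors:       N x 2 glint-to-pupil vectors recorded while the user fixated dots.
    dot_positions: N x 2 known screen coordinates of those dots.
    Returns a 3 x 2 coefficient matrix usable with point_of_regard() above.
    """
    vectors = np.asarray(vectors, float)
    dot_positions = np.asarray(dot_positions, float)
    design = np.hstack([vectors, np.ones((len(vectors), 1))])   # N x 3 design matrix
    coeffs, *_ = np.linalg.lstsq(design, dot_positions, rcond=None)
    return coeffs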
EyePhone Design
• An eye detection phase
• An open eye template creation phase
• An eye tracking phase
• A blink detection phase
Eye Detection
This phase finds the contour of the eyes by applying a motion analysis technique which operates on consecutive frames. The eye pair is identified by the left and right eye contours. The original algorithm identifies the eye pair with almost no error when running on a desktop computer with a fixed camera (see the left image in Figure 1).
Figure 1: Left figure: example of eye contour pair returned by
the original algorithm running on a desktop with a USB camera. The two white
clusters identify the eye pair. Right figure: example of number of contours
returned by EyePhone on the Nokia N810. The smaller dots are erroneously
interpreted as eye contours.
However, we obtain errors when the algorithm is implemented on the phone, due to the lower quality of the N810 camera compared to
the one on the desktop and the unavoidable movement of the phone while in a
person’s hand (refer to the right image in Figure 1). Based on these
experimental observations, we modify the original algorithm by:
i) reducing the image resolution which, according to the authors of the original algorithm, reduces the eye detection error rate, and
ii) adding two more criteria to the original heuristics that filter out false eye contours. In particular, we retain only the contours whose width and height in pixels satisfy width_min ≤ width ≤ width_max and height_min ≤ height ≤ height_max. The width_min, width_max, height_min, and height_max thresholds, which identify the possible sizes of a true eye contour, are determined under various experimental conditions (e.g., bright, dark, moving, not moving) and with different people.
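A minimal sketch of this modified detection step in Python/OpenCV: motion between consecutive frames yields candidate contours, which are kept only if their bounding boxes fall inside the width/height ranges. The numeric thresholds here are placeholders, not the experimentally determined values.

import cv2

def detect_eye_contours(prev_gray, gray,
                        width_min=15, width_max=60,
                        height_min=8, height_max=40):
    """Candidate eye contours from motion between two consecutive grayscale frames.

    The bounds play the role of the width_min/width_max and height_min/height_max
    thresholds described above; the values here are illustrative placeholders.
    """
    diff = cv2.absdiff(prev_gray, gray)                       # motion analysis step
    _, mask = cv2.threshold(diff, 20, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,   # OpenCV 4.x return convention
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if width_min <= w <= width_max and height_min <= h <= height_max:
            boxes.append((x, y, w, h))                        # keep plausibly eye-sized contours only
    return boxes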
Open Eye Template Creation
While the original algorithm adopts an online open-eye template creation approach, extracting the template every time the eye pair is lost (this could happen because of lighting condition changes or, in the case of a mobile device, movement), EyePhone does not rely on the same strategy. The reduced computation speed compared to a desktop machine and the restricted battery budget imposed by the N810 dictate a different approach. EyePhone creates a template of a user’s open eye once, when a person uses the system for the first time, using the eye detection algorithm described above. The template is saved in the
persistent memory of the device and fetched when EyePhone is invoked. By taking
this simple approach, we drastically reduce the runtime inference delay of
EyePhone, the application memory footprint, and the battery drain. The downside
of this off-line template creation approach is that a template created
in certain lighting conditions might not be perfectly suitable for other
environments. We intend to address this problem as part of our future work. In
the current implementation the system is trained individually, i.e., the eye
template is created by each user when the application is used for the first
time. In the future, we will investigate eye template training by relying on
pre-collected data from multiple individuals. With this supervised learning
approach users can readily use EyePhone without going through the initial eye
template creation phase.
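The sketch below illustrates this one-time template creation and the later fetch from persistent storage (Python/OpenCV); the file path and function names are assumptions.

import os
import cv2

TEMPLATE_PATH = "open_eye_template.png"   # hypothetical location in persistent memory

def create_open_eye_template(gray_frame, eye_box):
    """Crop and store the open-eye template the first time a user runs the system.

    eye_box is an (x, y, w, h) rectangle returned by the eye detection phase.
    """
    x, y, w, h = eye_box
    template = gray_frame[y:y + h, x:x + w]
    cv2.imwrite(TEMPLATE_PATH, template)
    return template

def load_open_eye_template():
    """Fetch the previously saved template when EyePhone is invoked; None means the
    template creation phase still needs to run."""
    if os.path.exists(TEMPLATE_PATH):
        return cv2.imread(TEMPLATE_PATH, cv2.IMREAD_GRAYSCALE)
    return None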
Eye Tracking:
The eye tracking algorithm is based on template matching.
Figure 2: Eye capture using the Nokia N810 front camera
running the EyePhone system. The inner white box surrounding the right eye is
used to discriminate the nine positions of the eye on the phone’s display. The
outer box encloses the template matching region.
The
template matching function calculates a correlation score between the open eye
template, created the first time the application is used, and a search window.
In order to reduce the computation time of the template matching function and
save resources, the search window is limited to a region which is twice the
size of a box enclosing the eye. These regions are shown in Figure 2, where the
outer box around the left eye encloses the region where the correlation score
is calculated. The correlation coefficient we rely on, which is often used in template matching problems, is the normalized correlation coefficient. This coefficient ranges between -1 and 1. From our experiments this coefficient guarantees better performance than the one used by the original algorithm. If the normalized correlation coefficient is at least 0.4, we conclude that there is an eye in the search window. This threshold has been verified to be accurate by means of multiple experiments under different conditions.
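A minimal sketch of this matching step (Python/OpenCV) follows. The normalized correlation coefficient corresponds to OpenCV's TM_CCOEFF_NORMED method; the search window is built as roughly twice the size of the previous eye box, and 0.4 is used as the acceptance threshold described above. The box bookkeeping details are assumptions.

import cv2

def track_eye(gray_frame, template, prev_box, score_threshold=0.4):
    """Locate the eye by normalized-correlation template matching inside a search
    window about twice the size of the box that enclosed the eye last time."""
    x, y, w, h = prev_box
    x0, y0 = max(0, x - w // 2), max(0, y - h // 2)
    x1 = min(gray_frame.shape[1], x + w + w // 2)
    y1 = min(gray_frame.shape[0], y + h + h // 2)
    window = gray_frame[y0:y1, x0:x1]
    if window.shape[0] < template.shape[0] or window.shape[1] < template.shape[1]:
        return None, 0.0                  # search window clipped too small at the frame edge

    scores = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, best_loc = cv2.minMaxLoc(scores)
    if best < score_threshold:
        return None, best                 # no eye found in the search window
    tx, ty = best_loc
    return (x0 + tx, y0 + ty, template.shape[1], template.shape[0]), best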
Blink Detection
To detect blinks we apply a thresholding
technique for the normalized correlation coefficient returned by the template
matching function, as suggested by the original algorithm. However, our algorithm differs from the one proposed there: the original authors introduce a single threshold T, and the eye is deemed to be open if the correlation score is greater than T and closed otherwise. In the EyePhone system, we have two situations to deal with: the quality
of the camera is not the same as a good USB camera, and the phone’s camera is
generally closer to the person’s face than is the case of using a desktop
and USB camera. Because of this latter situation the
camera can pick up iris movements, i.e., the interior
of the eye, due to eyeball rotation.
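Sketched below is the single-threshold scheme used by the original algorithm, which EyePhone then adapts to the phone setting; the threshold value and the idea of counting open-to-closed transitions are illustrative assumptions.

def count_blinks(correlation_scores, open_threshold=0.6):
    """Single-threshold blink detection: the eye is deemed open when the
    template-matching score exceeds the threshold and closed otherwise; each
    open -> closed transition is counted as the onset of a blink."""
    is_open = [score > open_threshold for score in correlation_scores]
    return sum(1 for prev, cur in zip(is_open, is_open[1:]) if prev and not cur)

# Example: the scores dip below the threshold once, i.e., one blink onset.
print(count_blinks([0.82, 0.79, 0.31, 0.28, 0.75, 0.80]))   # -> 1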
Table 1: EyePhone average eye tracking
accuracy for different positions of
the eye in different lighting and
movement conditions and blink detection average accuracy. Legend: DS = eye
tracking accuracy measured in daylight exposure and being steady; AS = eye
tracking accuracy measured in artificial light exposure and being steady; DM =
eye tracking accuracy measured in daylight exposure and walking; BD = blink
detection accuracy in daylight exposure
Figure 3: Eye tracking accuracy for
the middle-center position as a function of different
distances between the phone and the eyes when the person is steady and walking.
EVALUATION
In this section, we
discuss initial results from the evaluation of the EyePhone prototype. We
implement EyePhone on the Nokia N810 [19]. The N810 is equipped with a 400 MHz processor and 128 MB of RAM. The N810 operating system is Maemo 4.1, a Unix based platform on which we can install both the C OpenCV (Open Source Computer Vision) library [20] and our EyePhone algorithms, which are cross compiled on the Maemo scratchbox. To intercept the video frames from the camera we rely on GStreamer [21], the main multimedia framework on Maemo platforms. In what follows, we first present results relating to average accuracy for eye tracking and blink detection for different lighting and user movement conditions to show the performance of EyePhone under different experimental conditions. We also report system measurements, such as CPU and memory usage, battery consumption and computation time when running EyePhone on the N810. All experiments are repeated five times and average results are shown.
Daylight Exposure Analysis for a Stationary Subject
The first experiment
shows the performance of EyePhone when the person is exposed to bright
daylight, i.e., in a bright environment, and the person is stationary. The eye
tracking results are shown in Figure 2. The inner white box
in each picture, which
is a frame taken from the front camera when the person is looking at the N810
display while holding the device in their hand, represents the eye position on
the phone display. It is evident that nine different positions for the eye are identified. These nine positions of the eye can be mapped to nine different
functions and applications as shown in Figure 4. Once the eye locks onto a
position (i.e., the person is looking at one of the nine buttons on the display),
a blink, acting as a mouse click, launches the application corresponding to the
button. The accuracy of the eye tracking and blink detection algorithms are
reported in Table 1. The results show we obtain good tracking accuracy of the
user's eye. However, the blink detection algorithm's accuracy oscillates between approximately 67% and 84%. We are studying further improvements in the blink detection as
part of future work.
Impact of Distance Between Eye and Tablet.
Since in the current
implementation the open eye template is created once at a fixed distance, we
evaluate the eye tracking performance when the distance between the eye and the
tablet is varied while using EyePhone. We carry out the measurements for the
middle-center position in the display (similar results are obtained for the
remaining eight positions) when the person is steady and walking. The results
are shown in Figure 3. As expected, the accuracy degrades for distances larger
than 18-20 cm (which is the distance between the eye and the N810 we currently use during the eye template training phase). The accuracy drop becomes severe when the distance is made larger (e.g., approximately 45 cm). These results indicate that
research is needed in order to design eye template training techniques which
are robust against distance variations between the eyes and the phone.
System Measurements.
In Table 2 we report the average CPU usage,
RAM usage, battery consumption, and computation time of the EyePhone system
when processing one video frame (the N810 camera is able to produce up to 15 frames per second). EyePhone is quite lightweight in terms of CPU and RAM needs.
The computation takes 100 msec/frame, which is the delay between two
consecutive inference results. In addition, the EyePhone runs only when the eye
pair is detected implying that the phone resources
are used only when a
person is looking at the phone's display and remain free otherwise. The battery
drain of the N810 when running the EyePhone continuously for three hours is
shown in the 4th column of Table 2. Although this is not a realistic use case, since a person does not usually interact with their phone continuously for three hours, the result indicates that the EyePhone algorithms need
to be further optimized to extend the battery life as much as possible.
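To make this gating concrete, here is a minimal per-frame loop (Python) that reuses the hypothetical helpers sketched in the design sections above; the heavier tracking and blink stages run only while an eye box is known, so resources stay free when nobody is looking at the display.

def eyephone_step(prev_gray, gray, template, state):
    """One inference step per frame (roughly 100 ms/frame on the N810 prototype).

    state carries the last known eye box and recent matching scores;
    detect_eye_contours, track_eye and count_blinks are the sketches above.
    """
    if state.get("eye_box") is None:
        # Only the cheap motion-based detection runs while no eye pair is visible.
        candidates = detect_eye_contours(prev_gray, gray)
        state["eye_box"] = candidates[0] if candidates else None
        return state

    box, score = track_eye(gray, template, state["eye_box"])
    state["eye_box"] = box                                  # None again if the eye is lost
    state["scores"] = (state.get("scores", []) + [score])[-10:]
    state["blinked"] = count_blinks(state["scores"]) > 0
    return state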
Artificial Light Exposure For A Stationary
Subject
In this experiment, the person is again not
moving but in an artificially lit environment (i.e., a room with very low
daylight penetration from the windows). We want to verify if different lighting
conditions impact the system’s performance. The results, shown in Table 1, are
comparable to the daylight scenario in a number of cases. However, the accuracy
drops. Given the poorer lighting conditions, the eye tracking algorithm fails
to locate the eyes with higher frequency.
Daylight Exposure for a Person Walking
We carried out an experiment where a person walks outdoors in a bright
environment to quantify the impact of the phone’s natural movement; that is,
shaking of the phone in the hand induced by the person’s gait. We anticipate a
drop in the accuracy of the eye tracking algorithm because of the phone
movement. This is confirmed by the results shown in Table 1, column 4. Further
research is required to make the eye tracking algorithm more robust when a
person is using the system on the move.
ADVANTAGES
• Simple to use
• Hands-free interfacing system
DISADVANTAGES
• Light dependent
• Lower accuracy at night
APPLICATIONS
EyeMenu:
An example of an EyePhone
application is EyeMenu, as shown in the figure below. EyeMenu is a way to shortcut
the access to some of the phone’s functions. The set of applications in the
menu can be customized by the user. The idea is the following: the position of
a person’s eye is mapped to one of the nine buttons. A button is highlighted
when EyePhone detects the eye in the position mapped to the button. If a user
blinks their eye, the application associated with the button is launched.
Driving the mobile phone user interface with the eyes can be used as a way to
facilitate the interaction with mobile phones or in support of people with
disabilities.
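A small sketch of this mapping (Python): the tracked eye position is assigned to one of nine cells in a 3 x 3 grid, and a blink launches the application registered for the highlighted cell. The grid-cell scheme and the callback dictionary are illustrative assumptions about how such a menu could be wired up.

def eye_position_to_button(eye_center, display_width, display_height):
    """Map the tracked eye position to a (row, col) cell of the 3 x 3 EyeMenu grid."""
    col = min(2, int(3 * eye_center[0] / display_width))
    row = min(2, int(3 * eye_center[1] / display_height))
    return row, col

def eyemenu_step(eye_center, blinked, display_size, apps):
    """Highlight the button under gaze and launch its application on a blink.

    apps maps (row, col) -> callable; the user customizes it when setting up the menu.
    """
    button = eye_position_to_button(eye_center, *display_size)
    if blinked and button in apps:
        apps[button]()                      # the blink emulates a mouse click
    return button

# Example: gazing at the top-right cell of an 800 x 480 display and blinking.
apps = {(0, 2): lambda: print("launching camera")}
print(eyemenu_step((700, 50), True, (800, 480), apps))   # prints, then -> (0, 2)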
Car Driver Safety:
EyePhone could also be used to detect driver drowsiness and distraction in cars. While car manufacturers are developing technology to improve driver safety by detecting drowsiness and distraction using dedicated sensors and cameras, EyePhone could be readily usable for the
same purpose even on low-end cars by just clipping the phone on the car
dashboard.
Phone face detection (Smart Stay):
The Smart Stay feature in Samsung Galaxy series phones uses the front camera to detect when you are looking at your device, so that the screen stays on regardless of the screen timeout setting and the display adjusts automatically for easier use. Smart Stay works using eye-detection technology similar to EyePhone: it detects whether you are looking at the phone, keeps the screen on while you are, and turns it off when you are not.
Fig: Activating the Smart Stay function on mobile phones.
Smart Stay is available on Samsung Galaxy S4 and S5 series phones.
Conclusion
In this paper, we have focused on
developing a HPI technology solely using one of the phone’s growing number of
onboard sensors, i.e., the front-facing camera. We presented the implementation
and evaluation of the EyePhone prototype. The EyePhone relies on eye tracking
and blink detection to drive a mobile phone user interface and activate
different applications or functions on the phone. Although preliminary, our
results indicate that EyePhone is a promising approach to driving mobile applications in a hands-free manner.