Computer “Vision” for the Visually Impaired: I have a dream

Karim Ouda
7 min read · Dec 12, 2015

Throughout my life I had never dealt with a visually impaired person, never had a friend who can't see, and never knew about their problems or what their lives look like. As a computer scientist, I assumed they surely had enough support from technology; after all, it is 2015, and I was already seeing graduation projects focused on helping blind people ten years ago. Sadly, that turned out to be untrue.

The Shock

This year I had my first experience with a visually impaired friend. She is a highly educated intellectual. She is independent and can do everything by herself, even cooking. In short, she is more productive and valuable than many lazy people out there with two working, yet useless, eyes.

As a curious entrepreneur and a postgraduate student studying Computer Vision at the time, I observed how she manages to do everything, what difficulties she faces, and which apps and tools she uses. I kept asking her, and myself, "How can technology help in your daily life?" One day we ran an experiment to check whether mobile apps could help her navigate and cross the street.

The overall conclusions of my experience were:

  • Unfortunately, technology is not effective in helping them with their basic daily tasks
  • The mobile phone is not the best medium for supporting blind people; it is not practical and was not designed for that purpose
  • Most of the mobile apps I tried were not effective
  • Much assistive software (like JAWS) and many mobile apps are commercial and expensive

Their Needs

As mentioned in the previous section, I kept asking my friend about her expectations of technology and then tried to map them to actual technical solutions. Below are some requirements I gathered from her, along with other ideas and personal observations.

They need

  • To recognize objects around them
    Case: If something falls on the ground, they can't find it easily. They may also forget where they left their belongings (their keys, for example).
  • To recognize people and faces
    Case: when a visually impaired person enters a room, they don't know who is inside and have to wait for the others to greet them and identify themselves, which is really frustrating.
  • A fast text reader and color recognizer
    Case 1: Imagine you are blind and want to buy milk. How would you check the expiry date, and how would you make sure later that you haven't passed it?
    Case 2: You go shopping. You can touch the fabric to judge the quality of the material, but how would you know the color, the size, and the price?
  • Support for navigating, avoiding obstacles, and crossing the street
    This is one of the major problems, even in the world's top cities.
    Technology is needed to help blind people walk quickly and safely to their destination; in particular, warnings are needed when they are about to hit an obstacle or step off a curb. Another problem is knowing when to cross the street; perhaps technology can detect the color of the traffic light and the surrounding movement to support the blind pedestrian's decision. Finally, indoor navigation is needed.
  • A better experience interacting with the mobile phone
    Current accessibility and voice-guidance features are good, but not good enough. I believe navigation should be more intuitive and driven by natural conversation between the human and the mobile system.
  • A scene explainer
    Have you ever wondered how you would "see" images and videos if you were blind? For those who lost their sight in mid-life and know what a sky is and what a river looks like, it would be wonderful to have a system that can explain scenes.

My Dream

My dream is an "effective personal assistant device" for blind people: a supplemental digital eye plus a smart personal assistant which provides all the support they need in their daily lives …

Let’s call it “Project S”

S is an augmented-reality wearable device, specifically Google Glass or a Google-Glass-like device that doesn't make the blind person look odd. It is equipped with a camera, headphones (plus bone conduction), and an internet connection (Wi-Fi and 4G).

S is also connected to a small, very fast processing unit that can be placed in a bag or attached to a belt. A smart computer-vision learning system is installed on that unit to handle, analyze, and respond to the sensory input (camera and microphone). This unit is also what allows the device to work in offline mode.

S can provide the following functionality:

  • Recognize faces and objects using a deep-learning system, i.e. convolutional neural networks (CNNs) such as GoogLeNet. The system recognizes faces and asks for a name to associate with each new face, so that next time it can tell you who is in the room. It is also capable of recognizing objects indoors and outdoors, and can switch modes when triggered by certain objects (a minimal recognition sketch follows this list).
  • Read text and recognize colors
    In a store, the user can read product names and expiration dates and choose their preferred product color (see the text-and-color sketch after this list). Another use case is triggering a "street-crossing mode" after detecting a traffic light and then advising the user to cross when the signal turns green.
  • Navigation (indoor and outdoor) and obstacle warnings
    The system uses GPS, camera input, object detection, and perhaps laser distance measurement to guide both outdoor and indoor navigation. An example: you are walking in a park, the footpath curves, and you don't have a guide dog (an actual case that really happened); the system warns you and produces a stereo sound in the direction of the curve, as well as warnings for obstacles and curb edges.
    Think of it as a Google self-driving system for humans.
  • Scene explainer
    A blind girl is walking down the street and hears noise that sounds like a street show. She asks the device to describe the scene using a voice command, and the system responds with "a man juggling balls". An illustration of this functionality can be found in this Facebook video. You can imagine many other cases (a captioning sketch follows this list).
  • Modelling 3D space using sound
    This is something I have no experience in, and I am not sure it could even work. The first case was illustrated in the "Navigation" feature above: helping the user feel curves through sound. The question now is: can we model the whole 3D space using sound waves, giving the blind person a new way to perceive moving things that do not produce sound themselves?
    If you close your eyes and a bee passes from your left to your right, you can tell its direction even without using your eyes, right?
    Now, can a blind person play table tennis? As illustrated in the following links (Holophonics, Holophonic 3D Sound Compilation), such technology could convert the movement of the ball into sound waves so that the brain can work out its location and the blind player can return the ball! I hope it is possible … (a simple stereo-panning sketch follows this list)
    (UPDATE: https://www.youtube.com/watch?v=I0lmSYP7OcM)
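
To make the recognition building block concrete, here is a minimal sketch of generic object recognition only, assuming Python with the torchvision library and its pretrained GoogLeNet weights (my choice of tools for illustration; the image file name is hypothetical):

```python
# Minimal object-recognition sketch: classify a camera frame with a pretrained
# GoogLeNet from torchvision (an illustration of the CNN building block only).
import torch
from torchvision import models
from torchvision.models import GoogLeNet_Weights
from PIL import Image

weights = GoogLeNet_Weights.DEFAULT
model = models.googlenet(weights=weights)
model.eval()
preprocess = weights.transforms()  # resize, crop and normalize as the model expects

def describe(image_path: str, top_k: int = 3) -> list[str]:
    """Return the top-k ImageNet class names for one camera frame."""
    image = Image.open(image_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)          # shape: (1, 3, H, W)
    with torch.no_grad():
        scores = model(batch).softmax(dim=1)[0]
    return [weights.meta["categories"][i] for i in scores.topk(top_k).indices]

if __name__ == "__main__":
    print(describe("frame.jpg"))  # e.g. a frame captured by the wearable camera
```

Naming specific people would need an extra step the sketch does not show: a face-embedding model plus a small per-user database mapping each stored embedding to the name the user dictated.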
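
For the text-and-color item, a rough sketch could pair an off-the-shelf OCR engine with a crude dominant-color estimate. This assumes pytesseract (a Python wrapper for the Tesseract OCR engine) and Pillow; the tiny color palette and the file names are placeholders of mine, not part of the design:

```python
# Toy text-and-color reader: OCR via Tesseract plus a crude dominant-color guess.
import pytesseract
from PIL import Image

# A tiny palette for illustration; a real assistant would need a much richer one.
PALETTE = {
    "red": (220, 40, 40), "green": (40, 180, 80), "blue": (50, 90, 220),
    "yellow": (230, 210, 60), "white": (240, 240, 240), "black": (20, 20, 20),
}

def read_text(image_path: str) -> str:
    """Read any printed text visible in the image (e.g. an expiry date)."""
    return pytesseract.image_to_string(Image.open(image_path)).strip()

def dominant_color_name(image_path: str) -> str:
    """Shrink the image, average its pixels, and name the nearest palette color."""
    small = Image.open(image_path).convert("RGB").resize((32, 32))
    pixels = list(small.getdata())
    avg = [sum(channel) / len(pixels) for channel in zip(*pixels)]
    return min(PALETTE, key=lambda n: sum((a - b) ** 2 for a, b in zip(avg, PALETTE[n])))

if __name__ == "__main__":
    print("Text:", read_text("milk_carton.jpg"))
    print("Color:", dominant_color_name("shirt.jpg"))
```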
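
The scene explainer is essentially image captioning. As a hedged sketch only: assuming the Hugging Face transformers library and a pretrained captioning checkpoint such as nlpconnect/vit-gpt2-image-captioning (both are my assumptions for illustration, not part of the original idea), the core loop could be as small as this:

```python
# Sketch of a scene explainer: caption one camera frame and return the sentence,
# which would then be handed to a text-to-speech engine.
from transformers import pipeline

captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")

def explain_scene(image_path: str) -> str:
    """Return a one-sentence description of the frame, e.g. a man juggling balls."""
    return captioner(image_path)[0]["generated_text"]

if __name__ == "__main__":
    print(explain_scene("street_show.jpg"))  # hypothetical frame from the camera
```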
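
Finally, for hinting at direction through sound (the curving footpath, the moving ball), the simplest possible cue is a stereo tone whose left/right balance follows the object's bearing. The sketch below just writes such a tone to a WAV file using NumPy and Python's wave module; genuine 3D audio would need head-related transfer functions, which is exactly the open question above:

```python
# Generate a short stereo beep panned toward a given direction
# (-1.0 = hard left, +1.0 = hard right), a crude stand-in for holophonic sound.
import wave
import numpy as np

def directional_beep(pan: float, path: str = "cue.wav", freq: float = 880.0,
                     seconds: float = 0.3, rate: int = 44100) -> None:
    t = np.linspace(0.0, seconds, int(rate * seconds), endpoint=False)
    tone = 0.5 * np.sin(2 * np.pi * freq * t)
    angle = (pan + 1.0) * np.pi / 4.0            # map [-1, 1] onto [0, pi/2]
    left, right = tone * np.cos(angle), tone * np.sin(angle)  # equal-power pan
    samples = (np.stack([left, right], axis=1) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(2)      # stereo
        f.setsampwidth(2)      # 16-bit samples
        f.setframerate(rate)
        f.writeframes(samples.tobytes())

if __name__ == "__main__":
    directional_beep(pan=-0.7)  # the footpath curves left, so cue the left ear
```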

Now it is time for some non-functional requirements:

  • This device shouldn't be commercial; it should be given away for free, or at least sold at production cost. I believe organizations interested in the welfare of blind people and philanthropic foundations such as the Bill & Melinda Gates Foundation and the Chan Zuckerberg Initiative could help with that
  • The device should work both offline and online
  • The device is processing-intensive, so it should be backed by a local processing unit as well as a centralized cloud system
  • A long-lasting battery
  • It should be a personalized learning system
  • Multilingual

Conclusion

That was my dream. I am looking forward to the day when I can see it happening, a day when I can get an early version of that device, give it as a personal gift to my dear friend, and see the effect of technology on her life.

So far I can think of two companies capable of creating such a device: Google and Facebook.

Life is full of false hopes; let us give them something true, something we know for sure will work …

Below are some relevant projects, none of which covers all the functionality listed earlier. The nearest project in terms of vision and functionality is Horus (well done, guys).

Projects

http://horus.tech/en
http://www.openshades.com
http://www.orcam.com
http://www.project-ray.com
https://www.seeingwithsound.com (UPDATE)
http://pages.iu.edu/~hsseth/ (UPDATE)
https://sites.google.com/site/younghoonleehome/projects/computer-vision-for-the-visually-impaired (UPDATE)
http://www.va-st.com (UPDATE)
http://www.wired.com/2016/01/2015-was-the-year-ai-finally-entered-the-everyday-world/ (UPDATE) — Dulight

https://www.youtube.com/watch?v=R2mC-NUAmMk (UPDATE)

Videos

https://www.youtube.com/watch?v=SrpDlGxPJJc
https://www.youtube.com/watch?v=4v8Mm5JotrY
https://www.youtube.com/watch?v=4UGpDAmiVPM
https://www.youtube.com/watch?v=nrWfLAh2T3k
https://www.youtube.com/watch?v=Q07oHm3zh04
https://www.youtube.com/watch?v=I0lmSYP7OcM (UPDATE)
https://www.youtube.com/watch?v=monzuLsqcRc (UPDATE)
https://www.youtube.com/watch?v=Xe5RcJ1JY3c (UPDATE)

Apps

http://www.knfbreader.com

Other links

http://blind.tech

