[Note: The content below *the line* was last updated before Fall 2011, and a lot has changed since then. Since I joined UC Santa Cruz, most of my new projects have been research projects, which means that I try to turn my work into papers and publications, so that is where you can find it. Check out my CV or Google Scholar for more information. I'll try to update this page with more content, figures and graphs as soon as I get time. In the meantime, I am putting up some quick project information below. Most of my current projects are at the intersection of HCI, Crowdsourcing, Human Computation and ICTD. Thanks - Rajan Vaish 11/03/2012]
[i] 3D+2D TV: 3D Displays with No Ghosting for Viewers Without Glasses
3D displays are increasingly popular in consumer and commercial applications. Many such displays show 3D images to viewers wearing special glasses, while showing an incomprehensible double image to viewers without glasses. We demonstrate a simple method that gives viewers with glasses a 3D experience, while viewers without glasses see a 2D image without artifacts. The paper was accepted to ACM Transactions on Graphics in 2012.
[ii] Exploring microwork opportunities at cybercafes
Poster accepted at ACM DEV 2012. The business plan related to this concept reached the semi-finals of UC Berkeley’s Global Social Venture Competition 2012.
[iii] Exploring Employment Opportunities through Microtasks via Cybercafes
Paper accepted at the IEEE Global Humanitarian Technology Conference 2012.
[iv] Using crowdsourcing to generate ground truth data for computer vision training
This project is part of ongoing research with LANL under the category of “Human Assisted Computer Vision”. Computer vision is great, but at times it fails too. To train the algorithms, researchers usually spend several hours annotating images to create ground truths. Why not harness the crowd here? That’s what we’re trying to do: several experiments have been crowdsourced on Mechanical Turk and MobileWorks. See our social networking experiment too. You might want to try out the image annotation interface (via bounding boxes) from here. The experiments were conducted on two types of data sets: pedestrians and bumblebees. The paper is in progress right now.
[v] Digitization of Health Records in Rural Villages
This project was done in collaboration with HealthRecordsForEveryOne. Digitization of medical forms has always been a challenge, especially in developing-world settings. Again, why not use crowdsourcing to tackle the issue? In this project we studied various options among crowdsourcing platforms and compared them with traditional transcription and customized software widgets. Poster paper accepted at ACM DEV 2013; more work is in progress.
[vi] Channelizing crowd via Crowdflower’s API
Crowdflower is a platform where work givers submit tasks. Crowdflower then uses its channels to get those tasks done. I attempted to create one such channel, to test whether those tasks can be done by crowds in developing countries. Link here. The project proposal reached the semi-finals of UC Berkeley’s GSVC 2012.
[vii] Understanding the color perception via a game through crowdsourcing
Link to game. The shape, size, color and speed change as the game goes forward. The players were crowdsourced via MobileWorks, and a study was conducted on how these parameters affect our visual ability. The game was developed using Processing.
———————————————————————————- Older projects [updated Winter 2010] ——————————————————————————–
Following is a list of selected projects (in no particular order) that I pursued purely out of personal interest and a continuous desire to learn something new while staying engaged with something productive.
- I – OSM Direction tool for visually impaired
- II – Atlas America – OLPC Mexico
- III – Mujuntu (WikiStudios International)
- IV – Yahoo! Open Hack Bangalore 2009
- V – LifeCode Health
- VI – OSM navigation application for visually impaired on Android
- VII – LiveGeo
- VIII – Selected ACCESSIBILITY Projects
- IX – Face recognition in video stream
- X – Extensions
I – OSM Direction tool for visually impaired.
Keywords: Accessibility, GSoC, OSM, Geo, GIS, Maps, Open Source, PHP, Python, Google App Engine, OpenRouteService, WebAnywhere, HCI
Abstract/Description: The OSM Direction tool for the visually impaired was developed during Google Summer of Code 2009 (GSoC’09) for the OpenStreetMap Foundation (OSM). Because of their heavy Ajax integration, directions on Google Maps and similar services were not very accessible through simple screen readers. The solution was to deliver the direction information in pure HTML. There was a catch, though: I could not use any API from a service whose routing algorithm was closed. So I began experimenting and developed a few prototypes over open APIs, and also created my own routing algorithm over YOURS by optimizing the Gosmore source code. Developing the application required two important services:
(a) Geocoding/reverse geocoding (converting locations to their latitudes/longitudes and vice versa), for which I experimented primarily with OpenStreetMap’s Namefinder (deprecated since Aug’10) and the University of Heidelberg’s OpenRouteService, and chose the latter.
(b) Routing and POIs (points of interest), for which I experimented primarily with YOURS/Gosmore and OpenRouteService, and chose the latter. For reading the directions aloud, I used the University of Washington’s MIT TR’09 winning WebAnywhere screen reader (where I am now a committer).

Screenshot of the audible map. If the user hovers the mouse over "IBM", "IBM" will be heard; if the mouse is moved left over "National Theater", "National Theater" will be heard. This helps the user get spatial information about the map just by hovering the mouse. Locations across the whole of Europe are searchable.
I completed the majority of the project requirements almost *a month* before the deadline. Loving the R&D in the area, I did not want to freeze the project, and instead began working on advanced features, such as an auditory mapping interface. I came across BATS (Blind Audio Tactile Mapping System) from the University of North Carolina, Chapel Hill, and got in touch with Professor Gary Bishop, Dr. Peter Parente and Thomas Logan. On exploring BATS in depth, I realized it had two constraints:
1- It was very static: it only covered an ancient map of England and a political map of North Carolina.
2- It was desktop based.
To overcome both constraints, I used UNC’s Outfox and OSM’s Static Map APIs to develop a web-based mapping interface where a person could search for any location within Europe (a limit imposed by data constraints) and get the spatial information of the location on mouse hover. This also required developing algorithms to map between image coordinates and lat/longs. Though the majority of the project was developed in PHP, the requirement to deploy it on the cloud (Google App Engine) meant manually translating the routing parts into Python.
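The mapping between image coordinates and lat/longs mentioned above can be sketched as follows. This is a minimal illustration, not the project's original code: it assumes the static map image is a Web Mercator rendering with 256-pixel tiles, centered on a known latitude/longitude at a known zoom level, and all function names are hypothetical.

```python
import math

TILE_SIZE = 256  # assumption: standard 256-pixel Web Mercator tiles


def latlon_to_world(lat, lon, zoom):
    """Project lat/lon (degrees) to global pixel coordinates at a zoom level."""
    world = TILE_SIZE * 2 ** zoom
    x = (lon + 180.0) / 360.0 * world
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * world
    return x, y


def world_to_latlon(x, y, zoom):
    """Inverse of latlon_to_world."""
    world = TILE_SIZE * 2 ** zoom
    lon = x / world * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / world))))
    return lat, lon


def image_pixel_to_latlon(px, py, center_lat, center_lon, zoom, width, height):
    """Convert a pixel inside the static map image to lat/lon."""
    cx, cy = latlon_to_world(center_lat, center_lon, zoom)
    return world_to_latlon(cx + px - width / 2.0, cy + py - height / 2.0, zoom)


def latlon_to_image_pixel(lat, lon, center_lat, center_lon, zoom, width, height):
    """Convert lat/lon to a pixel position inside the static map image."""
    cx, cy = latlon_to_world(center_lat, center_lon, zoom)
    x, y = latlon_to_world(lat, lon, zoom)
    return x - cx + width / 2.0, y - cy + height / 2.0
```

With a mapping like this, a mouse-hover handler can turn the cursor position into a coordinate, look up the nearest named place, and pass that name to the screen reader.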
Award/Recognition: I was invited to State of the Map’09, Amsterdam, in acknowledgement of my efforts in the area. I also gave a talk on “Accessible maps for visually impaired” at the Conference on Assistive Technology, Lucknow, co-hosted by National Association for the blind and Rehabilitation society for the visually impaired.
Other relevant links and mentions: Wiki, Source code, Dr. Artem Dudarev (mentor), Dr. U.N. Sinha and Thomas Logan (unofficial mentor, guide for accessibility aspects).
II – Atlas America – OLPC Mexico.
Keywords: Educational project, Open Source, OLPC, Geo, Maps, OpenLayers, GeoRSS, QGIS, MapServer, PHP, HCI.
Abstract/Description: The “Atlas America” project was developed for One Laptop per Child (OLPC) Mexico in 2008, as part of their ongoing efforts to create a Geography course for children. With my interest in Geography, I got in touch with Walter Bender and Samuel J. Klien (Sj), discussed my ideas and totally loved the concept behind this ongoing project. However, there were two problems: 1- I did not know Spanish (the project was to be developed with Spanish content), 2- I did not know anything about the technologies that were to be used: OpenLayers, MapServer and QGIS. But it took me less than a week to get conversant with every related technical aspect, and I was good to go!
The project was divided into 20 chapters, each teaching various geographical features across North and South America, such as mountain ranges, lakes and rivers. Every chapter was divided into a map and a description. Every map had multiple bubble tags for specific descriptions, and children could upload more than 500 images for each tag. The maps were developed using OpenLayers and QGIS and hosted on MapServer, and the data was fetched through GeoRSS. This was version 1, completed almost *a month* before the deadline. Loving the R&D in the area, I did not want to freeze it, and instead developed other relevant features such as “Atlas America Notes”, built in PHP, which enabled children to make and save notes on the computer itself. I also created a Google gadget for Geographic videos in Spanish, using AOL Truveo’s API. The entire application gave children a three-way learning experience: text, images and video. The project is licensed under the Creative Commons Attribution 2.5 license.
Award/Recognition: I was awarded 1st prize for this project by Professor Sartaj Sahni (Chair, CS dept, University of Florida, Gainesville, USA) at the International Conference on Contemporary Computing 2008, Noida, India. More details are on the OLPC presentation wiki here.
Other relevant links and mentions: Wiki, Source code and executables, Nestor Guerrero (mentor).
III – Mujuntu (WikiStudios International).
Keywords: Entrepreneurship, Opportunity Quest, University of Utah, Image Processing, Computer Vision, Object tracking, HCI.
Abstract/Description: The Mujuntu application was developed under the WikiStudios International entrepreneurial project. The application served as a prototype/POC (proof of concept) during Opportunity Quest, an entrepreneurial competition run by the University of Utah, in which Dominick Perrier-Strand and I participated. As the business plan required, I developed this project, which involved image processing and object tracking.
Award/Recognition: The project won the “Spirit of Entrepreneurism” award at Opportunity Quest and reached the semi-finals of the Utah Entrepreneurship Challenge (top 30 of 440). The project also drew the attention of Charles Mulvehill (Hollywood producer), who then showed interest in serving on the company’s board. To take the project further, we also participated in YCombinator Summer’09, Highland Capital Partners Summer@Highland’09 and LightSpeed Venture Partners Summer Grants’09.
Other relevant links/mentions:
–>As part of other entrepreneurial initiatives, I’ve also worked with Stuart Adams (LSU/MIT), and in 2009 got Joel Bomgar (Founder and CEO of Bomgar Corporation, named Ernst & Young 2009 Entrepreneur Of The Year® Award Finalist) interested in our project on an events search engine.
–>My first entrepreneurial interaction was with Will Deane (CEO, MusicManagement LLC, USA) and Colin Sidoti (MIT) in 2008. I played the role of strategist.
IV – Yahoo! Open Hack Bangalore 2009.
Keywords: Accessibility, Speech Interfaces, Windows 7 Speech Recognition APIs, Educational project, Yahoo! APIs, PayPal APIs, HCI.
Abstract/Description: I was accepted for the program out of 3000+ applicants from around India. The mission was to develop application(s) by hacking around with Yahoo! APIs, studying the Yahoo! Developer Network and creating something “cool” within 24 hours, non-stop. So I coded for 24 hours straight and developed two applications with my team.
1- DirectCab: DirectCab is a unique hack that bridges the gap between passengers and cab providers through directional marking (with information about incremental distances from the source location) and navigation routes on Yahoo! Maps. It helps calculate the distance, estimated travel time and cab fare (including the surcharge percentage) between two locations, to avoid excessive charges by cab drivers. It has a built-in payment facility that lets customers registered with the cab service pay the fare instantly through PayPal’s Adaptive Payments API.
2- Flagged!: Flagged! is an accessible general knowledge game for children aged 5-8, based on a question/answer format. A set of questions and answers about countries and their capitals is stored within the application. Yahoo! Query Language is used to fetch the flag images and the current news for the respective country. Most importantly, speech recognition and voice synthesis are used extensively with accessibility in mind, offering both keyboard and speech based input/output for audible questions and answers. The speech recognition capabilities were developed using the Windows 7 Speech Recognition APIs.
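The DirectCab calculation described above (distance, estimated travel time, and fare including a surcharge percentage) can be sketched roughly as follows. This is an illustrative sketch only: the original hack used routes from Yahoo! Maps, whereas this example substitutes a great-circle (haversine) distance, and the per-kilometre rate, surcharge and average speed are hypothetical parameters.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (a stand-in for a real road route length)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def estimate_trip(distance_km, rate_per_km, surcharge_percent, avg_speed_kmh):
    """Return (fare, minutes); all rates are hypothetical inputs."""
    base_fare = distance_km * rate_per_km
    fare = base_fare * (1 + surcharge_percent / 100.0)
    minutes = distance_km / avg_speed_kmh * 60.0
    return round(fare, 2), round(minutes, 1)
```

For example, `estimate_trip(10, 12, 25, 30)` prices a 10 km trip at 12 per km with a 25% surcharge, assuming an average speed of 30 km/h.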
Other relevant links/mentions: Source code, Shirish Goyal (team mate).
V – LifeCode Health.
Keywords: Entrepreneurship, MIT$100k, Wayne State University 2010 E2 Challenge summer venture, Microsoft Imagine Cup US, Mobile computing, Mobile health, HCI.
Abstract/Description: The LifeCode Health project was born out of brainstorming sessions between students of Jaypee University IT, Wayne State University (WSU) and MIT back in 2009, for participation in the MIT$100k Entrepreneurship Competition. The concept focused on redefining the effectiveness and usability of modern-day medical data, through capabilities to record, organize, access, and securely store health information in digital format.
Award/Recognition: After MIT$100k the ideas kept evolving, and later, with a modified team, the project won funding through the Wayne State University 2010 E2 Challenge summer venture. It was also a finalist at Microsoft Imagine Cup (IC) US in 2009 and 2010, where the concept revolved around a smartphone and wireless sensor based real-time monitoring system for cardiovascular care risk mitigation, and the creation of virtual clinics in developing nations.
Other relevant links/mentions: My major role remains co-founding the project, and creating the founding team online by managing and connecting with students from WSU and MIT.
VI – OSM navigation application for visually impaired on Android.
Keywords: Accessibility, Android platform, Geo, Open Source, OpenStreetMap (OSM), Mobile computing, GPS, Speech Interfaces, HCI.
Abstract/Description: Intrigued by a closed-source project by AT&T and its usability, this year, while serving as co-admin for Google Summer of Code at OpenStreetMap, I planned to mentor an open source implementation of it. The source code of the project’s first version is available for download, while further development is in progress and is being carried out by a student volunteer, Vivek Kumar. It is currently being mentored by me and Birago Jones from the Software Agents group, MIT Media Lab.
The project aims to be an accessible mobile application: a navigation tool that enables visually impaired and low vision users to walk to a destination. Using GPS (the LocationManager class), the application aims to detect the user’s current location and present directions to a known destination. The user enters or speaks the destination, and the directions are displayed or read aloud using TTS. The routing information is fetched using the AOL MapQuest Open API (which now uses OpenStreetMap data). PhoneGap, an open source framework for building cross-platform mobile apps, is used at the back end to implement the core functionality, while jQTouch is used to display a native-looking UI.
Other relevant links/mentions: Source code
VII – LiveGeo
Keywords: Mobile computing, Geo, Google Maps API, Skyhook Wireless, SimpleGeo, SpotRank data, Smartphones, Population trends, Entrepreneurship, HCI.
Abstract/Description: The project was developed after a brainstorming session between me and my undergrad friend Utkarsh Shrivastava (who was then a Masters student at Georgia Tech, USA). Exploring SimpleGeo’s API, we discovered huge potential in harnessing the population ranking data (SpotRank data) from Skyhook Wireless. SpotRank predicts the density of people in predefined urban square-block areas worldwide at any hour, any day of the week. Though several applications can be built on it (a few are in progress), we developed our first application as a comparison tool that evaluates the population density of two chosen places at a given date and time. This helps a user choose the route or place she is heading to. For example, if there are two McDonald’s restaurants, the user can choose the less crowded one and hang out there, while advertisers can study users’ physical traveling behavior and target them appropriately. The evaluation is based on worldwide_rank, local_rank, trending_rank and city_rank, and the results are plotted on a Google map. The application was developed in PHP and JavaScript using SimpleGeo’s PHP Client, which is a PEAR package.
Award/Recognition: Being among the first few developers on the SimpleGeo API caught the attention of its founder, Joe Stump. Utkarsh, being in the US, was interviewed by Christopher Mims, who writes for MIT Technology Review, and our project was featured in MIT TR’s April issue along with a few other ideas we are working on right now.
Other relevant links/mentions: Source code (we plan to take the project to the next level and start an entrepreneurial venture out of it; in that case we will revamp the source code and might deprecate the current version).
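A comparison along the four rank fields named above might look like the following sketch. It is not the original PHP code and does not call the SimpleGeo API: the input dictionaries stand in for data already fetched, the function name is hypothetical, and it assumes that a larger rank value means a busier place, which should be checked against the SpotRank documentation.

```python
RANK_FIELDS = ("worldwide_rank", "local_rank", "trending_rank", "city_rank")


def compare_places(name_a, ranks_a, name_b, ranks_b, fields=RANK_FIELDS):
    """Compare two places field by field.

    Assumption: a larger rank value means a busier place.
    Returns (per_field, quieter): per_field maps each field to the busier
    place's name (None on a tie); quieter is the place that is busier on
    fewer fields (None if the two places tie overall).
    """
    per_field = {}
    busier_counts = {name_a: 0, name_b: 0}
    for field in fields:
        a, b = ranks_a[field], ranks_b[field]
        if a == b:
            per_field[field] = None
            continue
        busier = name_a if a > b else name_b
        per_field[field] = busier
        busier_counts[busier] += 1
    if busier_counts[name_a] == busier_counts[name_b]:
        quieter = None
    else:
        quieter = min(busier_counts, key=busier_counts.get)
    return per_field, quieter
```

A simple majority across fields is only one possible way to combine the ranks; a weighted score would be an equally reasonable design.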
VIII – Selected ACCESSIBILITY Projects.
VIII.a – AOL/TopCoder Sensations Developer Challenge Idea generation contest 2009:
Keywords: Accessibility, AOL, TopCoder, accessible email, visually impaired, old aged, cognitive disability, 3d assistive auditory interface, HCI.
Abstract/Description: This was my first project in accessible technologies. I made two submissions, EMAIL4all and MyAIM; the former focussed on an accessible email interface and the latter on an accessible version of AOL’s AIM chat client. For EMAIL4all, I focussed on a very common problem: there are millions of literate people who cannot access the computer/web because of complex user interfaces, cognitive disability or other technological challenges; there are also visually impaired people who are computer literate but fail to access the computer/web because of accessibility barriers, or who tend to forget keyboard shortcuts and key locations. The solution was to create an application (I chose an email interface) that not only has the simplest possible user interface, but is also accessible and works with a mouse, even for visually impaired people. The idea came from the way visually impaired people navigate within a room. They tend to keep articles along the wall, which makes them easy to find by sliding a hand along it; they find obstacles using a cane, which also gives them spatial information about the room; and they use acoustic cues to sense dynamic changes within a room or to locate a target. Analogously, for the email application, the screen can be seen as a physical room, the cursor as a cane and the menus as the articles in the room. Just as a person turns their head towards the direction of a sound in a room, if the menu operations are on the left edge, the sound effect gives the sense that the sound is actually coming from the left. The layout plays a very important role, as the user’s navigational options are concentrated along the screen edges. This positioning helps users reach the options easily, with positional feedback from the assistive 3-D auditory interface.
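The idea that a sound should appear to come from the side of the screen where the cursor is can be illustrated with a small sketch. This is a simplification, not the proposed 3-D auditory interface: it uses plain stereo constant-power panning, and the function name and parameters are hypothetical.

```python
import math


def pan_gains(cursor_x, screen_width):
    """Map a horizontal cursor position to (left_gain, right_gain).

    Constant-power panning: the gains follow a quarter circle, so the
    perceived loudness stays roughly constant as the cursor moves.
    """
    position = min(max(cursor_x / screen_width, 0.0), 1.0)  # 0 = left edge, 1 = right edge
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)
```

With the cursor on the left edge all of the signal goes to the left channel, and at the center both channels receive the same gain.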
Other relevant links/mentions: download EMAIL4all proposal, download MyAIM proposal for detailed aspects of the concept.
VIII.b – Microsoft Imagine Cup Accessibility Awards 2009:
Keywords: Accessibility, Microsoft, accessible email, 3d assistive sound interface, visually impaired, HCI.
Abstract/Description: The project was an extension of the EMAIL4all proposal, with surveys formally conducted at the School for the blind, Lucknow. The visit to the school gave me an opportunity to study the children’s learning methods and the feasibility of practical deployment. During this time I also connected with Professor Udai Narain Sinha of the University of Lucknow, India (who is himself visually impaired). His guidance, concerns and viewpoints on the project were of great help. Melissa Hui from Wayne State University also joined me in this initiative, further formalizing the concept as per Imagine Cup’s requirements. We participated as team “VisuAccess” and focussed more on machine independence and on a server-side implementation of the EMAIL4all concept. A simple client-side prototype was developed and also tested on OLPC’s XO laptop as a proof of concept.
Award/Recognition: The project reached the finals of the Microsoft Imagine Cup Accessibility Awards 2009, placing in the top 30 globally and the top 2 in India.
Other relevant links/mentions: Document of proposal, Presentation of proposal (storyboard).
VIII.c – ACM ASSETS 2009:
Keywords: Accessibility, ACM ASSETS 2009, 3d sound environment, accent modifiers, accessible email interface, India, Speech interfaces, WebAnywhere, HCI.
Abstract/Description: Wanting to do further research on web accessibility, and building on my experience of participating in programs by AOL and Microsoft, I got in touch with Dr. Shari Trewin (Accessibility Researcher, IBM Watson Research Center, NY, USA), who was serving as ACM ASSETS 2009 Chair. When I described my ideas to her, she readily agreed to mentor me. I participated in the poster submission category with research on “3D sound environment with accent modifiers for support of Web access for people with Visual Impairment in India”. The paper explored the challenges faced by visually impaired novice internet users in India. Problems included lack of access to a dedicated machine, difficulty with the English accent of existing commercial screen readers, and the significant investment of time needed to learn the keyboard skills and commands required to operate a screen reader. Focusing on email access, the paper proposed an alternative, web-based, mouse-based screen reading interface using spatial audio navigation, specifically aimed at beginner visually impaired users including non-native English speakers. In a survey of 15 visually impaired individuals in India, 95% reacted favorably to the proposal.
Other relevant links/mentions: view the ACM paper submitted for further reading.
VIII.d – Microsoft CloudApp() 2009:

Screenshot of the application in the WebAnywhere screen reader. After hitting any of the query buttons, press the down arrow to reach the routing information text to be read.
Keywords: Accessibility, Microsoft Windows Azure, Cloud computing, Bing Maps API, direction routing, visually impaired, WebAnywhere, HCI.
Abstract/Description: Taking accessibility to the clouds, I participated in the Microsoft CloudApp() competition in 2009, for which I developed an accessible directions application for the visually impaired using the Bing Maps API and simple JavaScript. The idea was to present routing information in “pure HTML” that could be read by screen readers (especially web-based ones like WebAnywhere). The application was deployed on the cloud using Windows Azure.
I would also like to share an interesting development experience from this project. I had first developed the entire application on localhost, but when deploying to Windows Azure I found that the required software was not compatible with Windows XP SP2 and needed Windows Vista. This didn’t stop my progress: using TeamViewer and with a friend’s help, I set up the entire environment on his computer and deployed the application to the cloud remotely.
Other relevant links/mentions: The application stayed live for as long as the Windows Azure service was available free of charge to public beta test developers.
IX – Face recognition in video stream

FRVS in action, recognizing two faces.
Keywords: Image processing, Intel OpenCV, Object recognition, Computer vision, Real Time Face Detection, Real Time Face Recognition, Eigenfaces, Adaboost, Subspace LDA, PCA, Cascade Detector, HCI.
Abstract/Description: Face Recognition in Video Stream (FRVS) is a computer vision project that uses OpenCV to locate human faces in a video stream and recognize them by matching against a database of known faces. FRVS is designed as a processing pathway for visual data, which permits studying the collective performance of two or more algorithms when they work in unison. A Viola-Jones based algorithm and an AdaBoost face detection algorithm are used to realize the face extraction stage of FRVS. A comparative study of these two algorithms suggests that the AdaBoost algorithm is better suited for real-time operation. Subspace Linear Discriminant Analysis (LDA) and face recognition using Eigenfaces are the two algorithms implemented in the face recognition stage of FRVS. This stage uses results from both algorithms to improve the overall accuracy of the recognition procedure. The FRVS image database includes faces of each individual under different lighting, postures, etc. FRVS cannot be used as a complete software-based solution for security. However, it can be used to assist security personnel. Anyone who wants to track or recognize the entry/exit of people in an area can use FRVS for security or for other purposes such as people counting, attendance, etc. I developed this pet project in Visual C++ to understand aspects of computer vision and for self-learning.
Other relevant links/mentions: Source code and report (I’ve created a pretty exhaustive report which I believe will be helpful for other developers too).
X – Extensions.
X.a – NASA WorldWind add-on and plugin:
Keywords: NASA WorldWind, Open Source, C#, Geo, Google direction API, HCI.
Abstract/Description: Inspired by an OpenStreetMap add-on for WorldWind, I decided to play with it too. Exploring and studying its source code led to the development of an add-on and a plugin. The add-on is a Point layer based application, very useful for aspiring engineering students across the world. It lets the user browse through the world’s top 50 engineering schools on the interactive earth user interface and explore their CS department websites with just a click.
A plugin in WorldWind is an application that adds extra functionality to the software. I developed a plugin in C# using the Google directions API, which shows the route from a given source to a destination, plotted on the earth user interface. Everything from development to deployment is documented here; use this link to understand WorldWind’s architecture, and play around with the source code available for download below.
Other relevant links/mentions: download source code of plugin (C#), download source code of add-on (XML).
X.b – AOL’s Truveo/TopCoder Google gadgets:
Keywords: AOL, Truveo video search engine, TopCoder, Google gadgets, Localization, Internationalization, HCI.
Abstract/Description: I developed two Google gadgets during the Truveo Developer Challenge, powered by TopCoder. The gadgets used the Truveo API and served as a video search engine, with the flexibility to filter results by channel, language, category and runtime, along with various sorting options. Though most of the functionality was common to both gadgets, they differed in one primary aspect: “Advanced Truveo Video Search” had a special video search for the then-ongoing US Presidential elections’08, while “Advanced International Truveo Search” had automatic support for 10 international languages based on the browser’s language settings. The internationalization and localization aspects relied on the message bundles functionality of the Google gadget API. A modified version of the gadget was later used in the “Atlas America” OLPC project for Geography videos in Spanish.
Other relevant links/mentions: Get Advanced International TRUVEO Search on your Google homepage, Get Advanced TRUVEO Video Search on your Google homepage.