Selected Projects

[-] Twitch Crowdsourcing: Crowd Contributions in Short Bursts of Time [Google Play]

Keywords: HCI, Crowdsourcing, Mobilization, Citizen Crowd, Mobile Computing, Microtasks.

Collaborators: Keith Wyngarden, Brandon Cheung and Michael Bernstein of Stanford University.

Success in crowdsourcing depends critically on motivating many individuals to contribute, but contributors are often discouraged by the non-trivial effort required to make meaningful contributions. To lower the threshold to participation, we present twitch crowdsourcing: quick crowd contributions that can be completed in one or two seconds. In spare moments, we habitually turn to our mobile phones for work, messaging, and entertainment. Taking advantage of this habit, Twitch overrides the Android unlock screen and allows users to make micro-contributions toward crowdsourcing goals each time they unlock their phone. We performed a public field deployment of Twitch with about a hundred users. These users authored a census of local human activity, rated stock photos, and extracted structured data for Wikipedia pages via more than 10,000 unlock activities. The median Twitch activity took just under two seconds, incurring no statistically distinguishable speed or cognitive load costs compared to a standard slide-to-unlock gesture. A poster of the project was presented at the Stanford MobiSocial Retreat ’13.

[-] Internet vs. Enterprise Crowdfunding: Contrasting Motivations and Dynamics

Keywords: Crowdfunding, Enterprise crowdfunding, CSCW, Social media, Organizational computing, Collaboration.

Collaborators: Michael Muller, Werner Geyer, and Todd Soule of IBM Research.

In this project we contrast crowdfunding as it occurs on the Internet with crowdfunding in an enterprise setting, based on a grounded theory analysis of a crowdfunding trial in a major research laboratory. We explore themes of diverse projects, motivations and incentives, strategies and approaches, and collaborations and relationships. Enterprise crowdfunding has its own financial model, social scope, and dynamics, resulting from a heightened sense of collaboration and community. This project helps us learn about the implications for organizations and for future crowdfunding activities. A poster and talk were presented at InternFest ’13 at IBM Research, Cambridge, MA.

[-] The Whodunit? Challenge: Mobilizing the crowd in India.

Keywords: Crowdsourcing, ICTD, Community crowd, Mobilization, Mobile, Social Game.

Collaborators: Bill Thies, Ed Cutrell and Aditya Vashistha of Microsoft Research.

While there has been a surge of interest in mobilizing the crowd to solve large-scale, time-critical challenges, to date such work has focused on high-income countries. In settings such as India, broadband Internet penetration is often lacking, and people connect through face-to-face or mobile communications. In this project, we describe the Whodunit Challenge, the first social mobilization contest to be launched in India. The contest relied only on mobile phones and required the rapid formation of large teams to solve a fictional mystery case. The challenge attracted about 10,000 participants in a single day and was won by a university team in about 5 hours. To understand teams’ strategies and experiences, we conducted about a hundred phone interviews. We found that several teams used only personal networks, without relying on financial incentives or Internet connectivity; however, several teams also used the Internet to their advantage. We synthesize these findings and offer recommendations for future crowd mobilization challenges targeting low-income environments. The project was inspired by similar experiments in developed nations, such as DARPA’s Network Challenge, the Tag Challenge and the MyHeartMap Challenge.

[-] Exploring employment opportunities via cybercafes

Keywords: HCI, Crowdsourcing, Mobilization, Citizen Crowd, ICTD, Microtasks

Collaborators: James Davis and Mrunal Gawade of UC Santa Cruz; and Mercy Nduta Waihumbu.

Microwork in cybercafés is a promising tool for poverty alleviation. For those who cannot afford a computer, cybercafés can serve as a simple payment channel and as a platform to work. However, there are open questions: are workers interested in working in cybercafés, are cybercafé owners willing to host such a setup, and are workers skilled enough to earn an acceptable pay rate? We designed experiments in cybercafés in India and Kenya to investigate these issues. We also investigated whether computers make workers more productive than mobile platforms. In surveys, we found that 99% of users wanted to continue the experiment in the cybercafé, and 8 of 9 cybercafé owners showed interest in hosting it. User typing speed was adequate to earn a pay rate comparable to their existing wages, and the fastest workers were approximately twice as productive on a computer platform as on a mobile one. The project has been accepted as an ACM DEV ’12 poster and an IEEE GHTC ’12 paper, reached the semi-finals of the UC Berkeley Global Social Venture Competition ’12, and was presented as a UC Berkeley CITRIS poster ’13 (find the PDFs in the links).

[-] Digitization of Health Records in Rural Villages

Keywords: Crowdsourcing, ICTD, Medical Health Records, User Interfaces, MobileWorks, Captricity, Microtasks.

Collaborators: James Davis, Sascha Ishikawa and Jing Liu of UC Santa Cruz; and Philip Strong of Palo Alto Medical Foundation; and Stephanie Berkey.

In this project, we present a study that reviews currently available methods for obtaining electronic health records (EHRs) to facilitate the provision of health services to patients in rural villages in developing countries. The study compares processes of digitizing health records by means of manual transcription, both by hiring a professional transcriptionist and by using online crowdsourcing platforms. Finally, a cost-benefit analysis compares the studied transcription methods to an alternate technology-based solution that was developed to support in-the-field direct data entry (the sketch below illustrates the shape of such a comparison). A poster paper was accepted at ACM DEV 2013, and a full paper at IEEE GHTC ’13, San Jose, CA.
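The heart of such a cost-benefit comparison is amortizing one-time setup costs against per-record marginal costs. Below is a minimal Python sketch of that arithmetic; every rate is a made-up placeholder, not a figure from the study.

def cost_per_record(setup_cost, marginal_cost, n_records):
    # One-time setup amortized over the batch, plus the per-record cost.
    return setup_cost / n_records + marginal_cost

# (one-time setup in $, marginal cost per record in $) -- all hypothetical
methods = {
    "professional_transcriptionist": (0.0, 0.50),
    "crowdsourced_transcription": (0.0, 0.10),
    "direct_field_data_entry": (500.0, 0.02),  # e.g., devices + training
}

for n in (1000, 10000, 100000):
    cheapest = min(methods, key=lambda m: cost_per_record(*methods[m], n))
    print(n, "records -> cheapest method:", cheapest)

At small volumes the zero-setup methods win; as volume grows, the low marginal cost of direct data entry dominates, which is the kind of crossover a cost-benefit analysis like this probes.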

[-] 3D+2D TV: 3D Displays with No Ghosting for Viewers Without Glasses

Keywords: HCI, 3D TV, Large displays, 3D+2D, Computer Graphics, User Interfaces, Display Technology.

Collaborators: James Davis, Jing Liu, Steve Scher, Prabath Gunawardane of UC Santa Cruz; and Tom Malzbender of HP Labs.

3D displays are increasingly popular in consumer and commercial applications. Many such displays show 3D images to viewers wearing special glasses, while showing an incomprehensible double image to viewers without glasses. We demonstrate a simple method that provides viewers with glasses a 3D experience, while viewers without glasses see a 2D image without artifacts (one way such cancellation can work is sketched below). The paper was accepted to ACM Transactions on Graphics in 2013 and presented at SIGGRAPH 2013.
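To give a flavor of the idea, here is a minimal Python/NumPy sketch of one way a display could cancel the second eye's image for glasses-free viewers via temporal multiplexing: a third subframe is chosen so the time-average of all subframes matches a target 2D image. This is my own illustration of the cancellation principle, not the paper's exact method.

import numpy as np

def compensation_frame(left, right, target2d):
    # Viewers without glasses temporally average the subframes, so pick
    # the third subframe c such that (left + right + c) / 3 ~= target2d.
    # Shutter glasses would pass only `left`/`right` to the proper eyes.
    c = 3.0 * target2d - (left + right)
    # Displays cannot emit negative light, so clamp; residual ghosting
    # remains wherever the clamp is active.
    return np.clip(c, 0.0, 1.0)

One natural refinement is to brighten the 2D target and dim the stereo pair so the required compensation stays within the displayable range.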

[-] Using crowdsourcing to generate ground truth data for computer vision training

Keywords: HCI, Crowdsourcing, Microtasks, Image annotations, Computer vision, bounding boxes

Collaborators: James Davis and Sascha Ishikawa of UC Santa Cruz; and Reid Porter of Los Alamos National Laboratory.

This project is part of ongoing research with LANL under the category of “Human Assisted Computer Vision”. Computer vision works well, but it still fails at times. To train the algorithms, researchers usually spend hours annotating images to create ground truth. Why not harness the crowd here? That’s what we’re trying to do: several experiments have been crowdsourced on Mechanical Turk and MobileWorks. See our social networking experiment too. You might want to try out the image annotation interface (via bounding boxes) from here; a sketch of how redundant worker boxes can be merged appears below. The paper is in progress right now. The experiment was conducted on two types of data sets: pedestrians and bumblebees.
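When several workers draw boxes around the same object, their answers must be merged into one ground-truth box. Below is a minimal sketch of one common approach, IoU-based clustering with a per-coordinate median; the thresholds and box format are illustrative assumptions, not the project's actual parameters.

import statistics

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def consensus_boxes(worker_boxes, min_votes=2, iou_thresh=0.5):
    # Greedily cluster boxes that overlap an existing cluster's seed,
    # then keep the coordinate-wise median of clusters with enough votes.
    clusters = []
    for box in worker_boxes:
        for cluster in clusters:
            if iou(box, cluster[0]) >= iou_thresh:
                cluster.append(box)
                break
        else:
            clusters.append([box])
    return [tuple(statistics.median(b[i] for b in c) for i in range(4))
            for c in clusters if len(c) >= min_votes]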

[-] Channeling the crowd via CrowdFlower’s API [Class Project]

Keywords: HCI, Microtasking platforms, Crowdsourcing, Business plan.

CrowdFlower is a platform where requesters submit tasks, which CrowdFlower then distributes through its worker channels. An attempt was made to create one of those channels, to test whether such tasks can be done by crowds in developing countries. Link here. The project proposal reached the semi-finals of UC Berkeley’s GSVC 2012.

[-] Understanding color perception via a crowdsourced game [Class Project]

Keywords: HCI, Visual perception, Colors, Game, Crowdsourced evaluation.

Color Game

Link to the game. The shape, size, color and speed of the bubbles change as the game progresses. The game has ten levels in total. The goal is to shoot, as quickly as possible, the bubble whose color changes relative to the background. Care was taken to keep the hue of that bubble as close to the background as possible, yet distinct from the other bubbles (a sketch of this color choice is below). Evaluation was done by crowdsourcing players via MobileWorks, and a study was conducted on how these parameters affect our visual ability. The game was developed using Processing.
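The game was written in Processing, but the color trick translates to a few lines of Python: nudge the target bubble's hue a small, level-dependent offset away from the background's hue while keeping saturation and value identical. The offset values are illustrative assumptions, not the game's actual tuning.

import colorsys
import random

def target_bubble_color(bg_rgb, hue_offset):
    # bg_rgb has components in [0, 1]; shift only the hue, keeping
    # saturation and value equal to the background's.
    h, s, v = colorsys.rgb_to_hsv(*bg_rgb)
    h_target = (h + random.choice((-1, 1)) * hue_offset) % 1.0
    return colorsys.hsv_to_rgb(h_target, s, v)

# Shrink the offset as levels advance, so the target gets harder to spot.
for level in range(1, 11):
    print(level, target_bubble_color((0.2, 0.5, 0.8), 0.10 / level))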

=============================  Older projects [updated Winter 2010, before grad school] ==========================

The following is a list of selected projects (in no particular order) that I pursued out of personal interest and a continuous desire to learn something new while staying engaged with something productive.

I – OSM Direction tool for the visually impaired.

Screenshot of directions shown in the WebAnywhere frame.

Keywords: Accessibility, GSoC, OSM, Geo, GIS, Maps, Open Source, PHP, Python, Google App Engine, OpenRouteService, WebAnywhere, HCI

Abstract/Description: The OSM Direction tool for the visually impaired was developed during Google Summer of Code 2009 (GSoC ’09) for the OpenStreetMap Foundation (OSM). Because of heavy AJAX integration, accessing directions on Google Maps or similar services was not practical through simple screen readers. The solution was to present direction information in pure HTML. There was a catch, though: I could not use APIs from any service whose routing algorithm was closed. So I began experimenting, developed a few prototypes over open APIs, and created my own routing algorithm over YOURS by optimizing the Gosmore source code (a minimal routing sketch follows after the list below). Development of the application required two important services:

(a) Geocoding/reverse geocoding (to convert locations to their respective latitudes/longitudes and vice versa), for which I experimented primarily with OpenStreetMap’s Namefinder (deprecated since Aug ’10) and the University of Heidelberg’s OpenRouteService, and chose the latter.

(b) Routing and POIs (points of interest), for which I experimented primarily with YOURS/Gosmore and OpenRouteService, and chose the latter. For reading the directions aloud, I used the University of Washington’s MIT TR ’09-winning WebAnywhere screen reader (where I am now a committer).
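At its core, the custom routing layer is shortest-path search over the OSM road graph, with the resulting node sequence rendered as plain-HTML steps. Here is a minimal Dijkstra sketch in Python; it is my own illustration, not the Gosmore-derived code, and it assumes the destination is reachable.

import heapq

def dijkstra(graph, src, dst):
    # graph[node] -> iterable of (neighbor, edge_length_in_meters)
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    # Walk predecessors back from the destination to recover the route.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return list(reversed(path))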

Screenshot of the audible map. If a user hovers the mouse over "IBM", "IBM" is heard; if the mouse is moved left over "National Theater", "National Theater" is heard. This helps the user get spatial information from the map just by mouse hover. Locations across all of Europe are searchable.

I completed the majority of the project requirements almost *a month* before the deadline. Loving the R&D in the area, I did not want to freeze the project; instead I began working on advanced features, like an auditory mapping interface. I came across the University of North Carolina, Chapel Hill's BATS (Blind Audio Tactile Mapping System) and got in touch with Professor Gary Bishop, Dr. Peter Parente and Thomas Logan. On exploring BATS in depth, I realized it had two constraints:

1- It was very static, covering only an ancient map of England and a political map of North Carolina.

2- It was desktop-based.

Overcoming both these constraints, I used UNC's Outfox and OSM's static map APIs and developed a web-based mapping interface, where a person could search for any location within Europe (a data constraint) and, on mouse hover, get the spatial information of the location. This also required developing algorithms to map between image coordinates and latitudes/longitudes (a minimal sketch is below). Though the majority of the project was developed in PHP, the requirement to deploy it on the cloud – Google App Engine – also required manually porting the routing aspects to Python.
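Under the usual Web Mercator assumption for static map imagery, the image-coordinate/latitude-longitude mapping reduces to the standard projection formulas below (a Python sketch; the original PHP details may have differed).

import math

TILE = 256  # world size in pixels at zoom level 0

def latlon_to_pixel(lat, lon, zoom):
    # Forward Web Mercator projection into global pixel coordinates.
    scale = TILE * 2 ** zoom
    x = (lon + 180.0) / 360.0 * scale
    siny = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + siny) / (1 - siny)) / (4 * math.pi)) * scale
    return x, y

def pixel_to_latlon(x, y, zoom):
    # Inverse projection: what lies under a given global pixel.
    scale = TILE * 2 ** zoom
    lon = x / scale * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / scale))))
    return lat, lon

For a static map centered at (clat, clon), add the mouse offset from the image center to latlon_to_pixel(clat, clon, zoom), then run the result through pixel_to_latlon to find the location under the cursor.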

Award/Recognition: I was invited to State of the Map ’09, Amsterdam, in acknowledgement of my efforts in the area. I also gave a talk on “Accessible maps for the visually impaired” at the Conference on Assistive Technology, Lucknow, co-hosted by the National Association for the Blind and the Rehabilitation Society for the Visually Impaired.

Other relevant links and mentions: Wiki, Source code, Dr. Artem Dudarev (mentor), Dr. U.N. Sinha and Thomas Logan (unofficial mentor, guide for accessibility aspects).

Back to project index


II – Atlas America – OLPC Mexico.

Atlas America home page with index to all chapters.

Keywords: Educational project, Open Source, OLPC, Geo, Maps, OpenLayers, GeoRSS, QGIS, MapServer, PHP, HCI.

Abstract/Description: The “Atlas America” project was developed for One Laptop per Child (OLPC) Mexico in 2008, as part of their ongoing efforts to create a geography course for children. With an interest in geography, I got in touch with Walter Bender and Samuel J. Klein (SJ), discussed my ideas, and totally loved the concept behind this ongoing project. However, there were two problems: 1- I did not know Spanish (the project was to be developed with Spanish content), and 2- I did not know anything about the technologies to be used – OpenLayers, MapServer and QGIS. But it took me less than a week to get conversant with every relevant technical aspect, and I was good to go!

Screenshot of a chapter showing geographic features.

The project was divided into 20 chapters, each teaching various geographic features across North and South America, such as mountain ranges, lakes and rivers. Every chapter was divided into a map and a description. Every map had multiple bubble tags for specific descriptions, and children could upload more than 500 images for each associated tag. The maps were developed using OpenLayers and the QGIS IDE, hosted on MapServer, with data fetched through GeoRSS (a minimal parsing sketch is below). This was version 1, completed almost *a month* before the deadline. Loving the R&D in the area, I did not want to freeze it; instead I developed other relevant features like “Atlas America Notes”, which enabled children to make and save notes on the computer itself; this was done using PHP. I also created a Google gadget for geography videos in Spanish, using AOL Truveo’s API. The entire application gave children a three-way learning experience – text, images and video. The project is protected by a Creative Commons Attribution 2.5 license.
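A GeoRSS feed of the simple "point" flavor is easy to consume; the sketch below shows how such a feed could drive the bubble tags on a chapter map. The feed layout and the example item are illustrative assumptions, not the project's actual data.

import xml.etree.ElementTree as ET

GEORSS = "{http://www.georss.org/georss}"

def parse_georss_points(feed_xml):
    # Yield (title, lat, lon) for each item carrying a georss:point.
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        point = item.find(GEORSS + "point")
        if point is not None and point.text:
            lat, lon = map(float, point.text.split())
            yield item.findtext("title", default=""), lat, lon

feed = """<rss xmlns:georss="http://www.georss.org/georss"><channel>
<item><title>Lago Titicaca</title><georss:point>-15.9 -69.3</georss:point></item>
</channel></rss>"""
print(list(parse_georss_points(feed)))  # [('Lago Titicaca', -15.9, -69.3)]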

Award/Recognition: I was awarded 1st prize for this project by Professor Sartaj Sahni (Chair, CS Dept., University of Florida, Gainesville, USA) at the International Conference on Contemporary Computing 2008, Noida, India. More details are on the OLPC presentation wiki here.


Other relevant links and mentions: Wiki, Source code and executables, Nestor Guerrero (mentor).

Back to project index


III- Mujuntu (WikiStudios International).

WikiStudios International homepage in 2008.

Keywords: Entrepreneurship, Opportunity Quest, University of Utah, Image Processing, Computer Vision, Object tracking, HCI.

Abstract/Description: The Mujuntu application was developed under the WikiStudios International entrepreneurial project. The application served as a prototype/proof of concept (POC) during Opportunity Quest, an entrepreneurial competition at the University of Utah in which Dominick Perrier-Strand and I participated. As required by the business plan, I developed this project, which involved image processing and object tracking.

The image processing component was written in C++, which also required developing some math libraries.
The program could track an object in a low-resolution video even as the object passed behind other objects. The main emphasis was on tracking a moving object and making sure the tracker did not lose focus. To do this, each frame was first pulled into a matrix. The image was then broken into segments by taking the gradient of the image and the surrounding pixels. In particular, a simple gradient kernel like this was used:
 0  1  0
-1  0  1
 0 -1  0
A threshold filter was then applied to each pixel's response: values greater than a particular threshold returned 1, else 0. Running the filter over the image produced a second, binary image holding the filter's output for each pixel. Next, all pixels in the resulting image were scanned, and blobs were pulled out of areas with values greater than 0.99. The area the user initially selected in the image was then taken, and all blobs within that area were saved into a second data structure. For each subsequent frame of the video, the process was repeated and the blobs' areas were compared with those of the blobs in the previous frame or frames; finally, the blob that looked most similar within a nearby area was taken (a compact sketch of this pipeline follows).
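Here is a compact Python/NumPy re-rendering of that pipeline. The original was C++ with hand-rolled math libraries; the kernel and the 0.99 threshold come from the description above, while everything else is an illustrative assumption.

import numpy as np
from scipy.ndimage import convolve, label

KERNEL = np.array([[ 0,  1, 0],
                   [-1,  0, 1],
                   [ 0, -1, 0]], dtype=float)

def frame_blobs(gray, thresh=0.99):
    # Filter a grayscale frame (values in [0, 1]), binarize the response,
    # and return (centroid, area) for each connected blob.
    response = np.abs(convolve(gray, KERNEL, mode="nearest"))
    labels, n = label(response > thresh)
    blobs = []
    for i in range(1, n + 1):
        pts = np.argwhere(labels == i)
        blobs.append((pts.mean(axis=0), len(pts)))
    return blobs

def match_blob(target, candidates, max_dist=20.0):
    # The "most similar in a near area" step: among current-frame blobs
    # near the tracked blob's last centroid, pick the closest in area.
    centroid, area = target
    near = [b for b in candidates
            if np.linalg.norm(b[0] - centroid) <= max_dist]
    return min(near, key=lambda b: abs(b[1] - area), default=None)

Here's a final demo: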

Award/Recognition: The project won the “Spirit of Entrepreneurism” award at Opportunity Quest and reached the semi-finals of the Utah Entrepreneurship Challenge (top 30 of 440). The project also drew the attention of Charles Mulvehill (Hollywood producer), who then expressed interest in serving on the company's board. To take the project further, we also applied to Y Combinator Summer ’09, Highland Capital Partners' Summer@Highland ’09 and Lightspeed Venture Partners' Summer Grants ’09.

Other relevant links/mentions:

–> As part of other entrepreneurial initiatives, I have also worked with Stuart Adams (LSU/MIT) and, in 2009, got Joel Bomgar (Founder and CEO of Bomgar Corporation, named an Ernst & Young 2009 Entrepreneur Of The Year® Award finalist) interested in our project, an events search engine.

–> My first entrepreneurial-level interaction was with Will Deane (CEO, MusicManagement LLC, USA) and Colin Sidoti (MIT) in 2008, where I played the role of strategist.

Back to project index


IV- Yahoo! Open Hack Bangalore 2009.

Keywords: Accessibility, Speech Interfaces, Windows 7 Speech Recognition APIs, Educational project, Yahoo! APIs, PayPal APIs, HCI.

Abstract/Description: I was accepted to the program out of 3000+ applicants from around India. The mission was to hack around with Yahoo! APIs, study the Yahoo! Developer Network, and create something “cool” within 24 hours, non-stop. So I coded for 24 hours non-stop and developed two applications with my team.

DirectCab

1- DirectCab: DirectCab is a unique hack that bridges the gap between passengers and cab providers via directional markers (with information about incremental distances from the source location) and navigation routes on Yahoo! Maps. It helps calculate the distance, estimated travel time and cab fare (including surcharge percentage) between two locations, to avoid excessive charges by cab drivers (a toy fare calculation is sketched below). It has a built-in merchant facility for instant payment of the cab fare by customers registered with the cab service, which happens through PayPal's Adaptive Payments API.
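The fare estimate itself is simple arithmetic: a base fare plus a per-kilometer rate, scaled by the surcharge percentage. A toy Python sketch follows; the rates are made-up placeholders, not DirectCab's actual tariff.

def estimate_fare(distance_km, base_fare=50.0, per_km=12.0, surcharge_pct=10.0):
    # Fare = base + distance * rate, then apply the percentage surcharge.
    return (base_fare + distance_km * per_km) * (1 + surcharge_pct / 100.0)

print("Estimated fare for 8.5 km: Rs %.2f" % estimate_fare(8.5))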

Flagged!

2- Flagged!: Flagged! is an accessible general-knowledge game for children aged 5–8, based on a question/answer methodology. A set of questions and answers about countries and their capitals is stored within the application. Yahoo! Query Language is used to fetch the flag images and the current news for each country. Most importantly, with accessibility in mind, speech recognition and voice synthesis are used extensively, providing both keyboard- and speech-based input/output for audible questions and answers. The speech recognition capabilities were built using the Windows 7 Speech Recognition APIs.

Other relevant links/mentions: Source code, Shirish Goyal (teammate).

Back to project index


V- LifeCode Health.

Keywords: Entrepreneurship, MIT$100k, Wayne State University 2010 E2 Challenge summer venture, Microsoft Imagine Cup US, Mobile computing, Mobile health, HCI.

LifeCode Health

Abstract/Description: The LifeCode Health project was born out of brainstorming sessions between students of Jaypee University IT, Wayne State University (WSU) and MIT back in 2009, for participation in the MIT $100K Entrepreneurship Competition. The concept focused on redefining the effectiveness and usability of modern-day medical data through capabilities to record, organize, access, and securely store health information in digital format.

Award/Recognition: Since the MIT $100K, the ideas have kept evolving; later, under a modified team, the project won funding through the Wayne State University 2010 E2 Challenge summer venture, and was a finalist at Microsoft Imagine Cup (IC) US in 2009 and 2010, where the concept revolved around a smartphone- and wireless-sensor-based real-time monitoring system for cardiovascular risk mitigation and for creating virtual clinics in developing nations.

LifeCode Health project at IC.

Other relevant links/mentions: My major role was co-founding the project and building the founding team online by connecting with and managing students from WSU and MIT.

Back to project index


VI – OSM navigation application for the visually impaired on Android.

Directions displayed from source to destination on the Android emulator.

Keywords: Accessibility, Android platform, Geo, Open Source, OpenStreetMap (OSM), Mobile computing, GPS, Speech Interfaces, HCI.

Abstract/Description: Intrigued by a closed-source project by AT&T and its usability, this year, while serving as co-admin for Google Summer of Code at OpenStreetMap, I planned to mentor an open-source implementation of it. The source code of the project's first version is available for download, while further development is in progress by a student volunteer, Vivek Kumar. It is currently being mentored by me and Birago Jones of the Software Agents group, MIT Media Lab.

The project aims to be an accessible mobile application: a navigation tool enabling visually impaired/low-vision users to walk to a destination. Using GPS (the LocationManager class), the application detects the current user location and presents directions to the chosen destination. The user enters or speaks the destination, and direction routes are displayed or read aloud using TTS. The routing information is fetched using the AOL MapQuest Open API, which now uses OpenStreetMap data (a sketch of such a request is below). PhoneGap, an open-source framework for building cross-platform mobile apps, is used on the back end to implement core functionality, while jQTouch is used to display a native-style UI.
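For flavor, here is a hedged Python sketch of fetching pedestrian directions from the MapQuest Open (OSM-data) directions service and flattening them into plain-text steps suitable for TTS. The endpoint, parameters and response fields are reconstructed from memory and may have changed; treat them as assumptions.

import requests

URL = "http://open.mapquestapi.com/directions/v2/route"

def walking_directions(api_key, origin, destination):
    params = {"key": api_key, "from": origin, "to": destination,
              "routeType": "pedestrian"}
    route = requests.get(URL, params=params, timeout=10).json()["route"]
    # One narrative string per maneuver, ready to hand to a TTS engine.
    return [m["narrative"]
            for leg in route["legs"] for m in leg["maneuvers"]]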

Other relevant links/mentions: Source code

Back to project index


VII – LiveGeo

LiveGeo at work, showing Stanford more populated than Georgia Tech at a given time.

Keywords: Mobile computing, Geo, Google Maps API, Skyhook Wireless, SimpleGeo, SpotRank data, Smartphones, Population trends, Entrepreneurship, HCI.

Abstract/Description: The project was developed after a brainstorming session between me and my undergrad friend Utkarsh Shrivastava (then a Master's student at Georgia Tech, USA). Exploring SimpleGeo's API, we discovered huge potential in harnessing population-ranking data – SpotRank data – from Skyhook Wireless. SpotRank predicts the density of people in predefined urban square-block areas worldwide at any hour of any day of the week. Though several applications can be built on it (a few are in progress), we developed our first application as a comparison tool to evaluate population density between two chosen places at a given date and time. This generally helps a user choose the route or place she is heading to. For example, given two McDonald's restaurants, a user can choose the less populated one and hang out there, while advertisers can study users' physical traveling behavior and target them appropriately. The evaluation happens on the basis of worldwide_rank, local_rank, trending_rank and city_rank, and the results are plotted on a Google map (a minimal comparison sketch is below). The application was developed in PHP and JavaScript using SimpleGeo's PHP client, which is a PEAR package.
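The comparison step itself is small. Below is a minimal Python sketch: the four rank fields come straight from the description above, but the record layout and the equal weighting are illustrative assumptions, not SimpleGeo's exact response schema.

RANK_FIELDS = ("worldwide_rank", "local_rank", "trending_rank", "city_rank")

def density_score(spot, weights=(1.0, 1.0, 1.0, 1.0)):
    # Combine the four SpotRank ranks into a single comparable score.
    return sum(w * spot.get(f, 0) for w, f in zip(weights, RANK_FIELDS))

def less_crowded(place_a, place_b):
    # Return whichever place currently has the lower combined density.
    return min((place_a, place_b), key=density_score)

mcd_1 = {"name": "McDonald's, 1st St", "local_rank": 8, "city_rank": 9}
mcd_2 = {"name": "McDonald's, Main St", "local_rank": 3, "city_rank": 4}
print(less_crowded(mcd_1, mcd_2)["name"])  # -> McDonald's, Main St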

Award/Recognition: Being among the first few developers on the SimpleGeo API caught the attention of its founder, Joe Stump. Utkarsh, being in the US, was interviewed by Christopher Mims, who writes for MIT Technology Review, and our project was featured in MIT TR's April issue along with a few other ideas we are working on right now.

Other relevant links/mentions: Source code (we plan to take the project to the next level and build an entrepreneurial venture out of it; in that case we will revamp the source code and might deprecate the current version).

Back to project index


VIII – Selected ACCESSIBILITY Projects.

VIII.a – AOL/TopCoder Sensations Developer Challenge Idea generation contest 2009:

Keywords: Accessibility, AOL, TopCoder, accessible email, visually impaired, old aged, cognitive disability, 3d assistive auditory interface,  HCI.

EMAIL4all scribble on paper, formalizing thoughts.

Abstract/Description: This was my first project in accessible technologies. I made two submissions, EMAIL4all and MyAIM: the former focused on an accessible email interface, the latter on an accessible AIM chat client from AOL. For EMAIL4all, I focused on a very common problem: there are millions of literate people who cannot access the computer/web due to complex user interfaces, cognitive disability or other technological challenges; there are also visually impaired people who are computer literate but fail to access the computer/web due to accessibility barriers, or who tend to forget keyboard shortcuts and key locations. The solution was to create an application (I chose an email interface) with not only the simplest possible user interface, but one that is also accessible and works with a mouse, even for visually impaired people. The idea came from the way visually impaired people navigate within a room. They tend to keep articles along the wall, which makes them easy to find by sliding along it; they find obstacles using a cane, which also gives them spatial information about the room; and they use acoustic cues to sense dynamic changes within a room or to locate a target. Analogously, for the email application, the screen can be seen as a physical room, the cursor as a cane, and the articles of the room as menus. Just as a person turns their head toward the direction of a sound in a room, if the sound effect places menu operations on the left edge, the user gets a sense that the sound is actually coming from the left. The interface plays a very important role, as navigational options are concentrated along the screen edges; this positioning, with positional feedback from the assistive 3D auditory interface, helps users reach options easily (a minimal panning sketch is below).
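The core of that positional feedback is mapping the cursor's horizontal screen position to a stereo pan, so a menu on the left edge literally sounds as if it were to the user's left. Here is a minimal equal-power panning sketch in Python; it illustrates the concept only and is not EMAIL4all's implementation.

import math

def pan_gains(cursor_x, screen_width):
    # 0 = left edge, 1 = right edge; equal-power pan keeps loudness steady.
    p = max(0.0, min(1.0, cursor_x / screen_width))
    angle = p * math.pi / 2
    return math.cos(angle), math.sin(angle)  # (left_gain, right_gain)

print(pan_gains(0, 1280))    # hard left -> (1.0, 0.0)
print(pan_gains(640, 1280))  # center    -> (~0.707, ~0.707)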

Other relevant links/mentions: download EMAIL4all proposal, download MyAIM proposal for detailed aspects of the concept.

VIII.b – Microsoft Imagine Cup Accessibility Awards 2009:

Keywords: Accessibility, Microsoft, accessible email, 3d assistive sound interface, visually impaired, HCI.

Abstract/Description: The project was an extension of the EMAIL4all proposal, with surveys formally conducted at the School for the Blind, Lucknow. The visit to the school gave me an opportunity to study children's learning methods and the feasibility of a practical deployment. During this time I also connected with Professor Udai Narain Sinha of the University of Lucknow, India (himself a visually impaired professor), whose guidance, concerns and viewpoints on the project were of great help. Melissa Hui from Wayne State University also joined me in this initiative, further formalizing the concept per Imagine Cup's requirements. We participated as team “VisuAccess” and focused more on machine-independence aspects and a server-side implementation of the EMAIL4all concept. A simple client-side prototype was developed and also tested on OLPC's XO laptop as a proof of concept.

Award/Recognition: The project reached the finals of the Microsoft Imagine Cup Accessibility Awards 2009 – top 30 globally and top 2 in India.

Snapshots from the project storyboard.

Other relevant links/mentions: Document of proposal, Presentation of proposal (storyboard).

VIII.c – ACM ASSETS 2009:

Keywords: Accessibility, ACM ASSETS 2009, 3d sound environment, accent modifiers, accessible email interface, India, Speech interfaces, WebAnywhere, HCI.

Abstract/Description: Wishing to further my research on web accessibility, and with past experience participating in programs by AOL and Microsoft, I got in touch with Dr. Shari Trewin (Accessibility Researcher, IBM Watson Research Center, NY, USA), who was serving as ACM ASSETS 2009 chair. On hearing my ideas, she readily agreed to mentor me. I participated in the poster submission category with “3D sound environment with accent modifiers for support of Web access for people with Visual Impairment in India”. The paper explored the challenges faced by visually impaired novice Internet users in India. Problems included lack of access to a dedicated machine, difficulty with the English accent of existing commercial screen readers, and the significant investment of time necessary to learn the keyboard skills and commands required to operate a screen reader. Focusing on email access, the paper proposed an alternative web-based, mouse-based screen reading interface using spatial audio navigation, specifically aimed at beginner visually impaired users, including non-native English speakers. In a survey of 15 visually impaired individuals in India, 95% reacted favorably to the proposal.

Other relevant links/mentions:  view the ACM paper submitted for further reading.

VIII.d – Microsoft CloudApp() 2009:

Screenshot of the application in the WebAnywhere screen reader: after hitting any of the query buttons, press the down arrow to reach the routing information text to be read.

Keywords: Accessibility, Microsoft Windows Azure, Cloud computing, Bing Maps API, direction routing, visually impaired, WebAnywhere, HCI.

Abstract/Description: Taking accessibility to the cloud, I participated in the Microsoft CloudApp() competition in 2009, for which I developed an accessible directions application for the visually impaired using the Bing Maps API and simple JavaScript. The motive was to present routing information in “pure HTML” that could be read by screen readers (especially web-based ones like WebAnywhere). The application was deployed on the cloud using Windows Azure.

I would also like to share an interesting associated development experience. I had developed the entire application on localhost first, but when deploying to Windows Azure I found that the necessary software was not compatible with Windows XP SP2 and required Windows Vista. This did not stop my progress: using TeamViewer and a friend's help, I set up the entire environment on his computer and deployed the application to the cloud remotely.

Other relevant links/mentions: The application remained live as long as the Windows Azure service was available free to public beta-test developers.

Back to project index


IX – Face recognition in video stream

FRVS in action, recognizing two faces.

Keywords: Image processing, Intel OpenCV, Object recognition, Computer vision, Real Time Face Detection, Real Time Face Recognition, Eigenfaces, Adaboost, Subspace LDA, PCA, Cascade Detector, HCI.

Abstract/Description: Face Recognition in Video Stream (FRVS) is a computer vision project that uses OpenCV to locate human faces in a video stream and recognize them by matching against a database of known faces. A processing pathway for visual data was used in the design of FRVS, which permits studying the collective performance of two or more algorithms working in unison. A Viola-Jones-based algorithm and an AdaBoost face detection algorithm are used to realize the face extraction stage of FRVS; a comparative study of the two suggests that the AdaBoost algorithm is better suited to real-time operation. Subspace Linear Discriminant Analysis (LDA) and face recognition using Eigenfaces are the two algorithms implemented in the face recognition stage, which uses results from both to improve the overall accuracy of the recognition procedure (a modern-OpenCV sketch of the detect-then-recognize loop is below). The FRVS image database includes faces of each individual in different lighting, postures, etc. FRVS cannot be used as a complete software-based security solution; however, it can assist security personnel. Anyone who wants to track or recognize the entry/exit of people in an area can use FRVS for security or for other applications such as people counting, attendance, etc. I developed this pet project in Visual C++ to understand aspects of computer vision and for self-learning.
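For readers who want to try the same detect-then-recognize loop today, here is a hedged sketch in modern OpenCV Python (the original was Visual C++): Haar-cascade (Viola-Jones) detection feeding an Eigenfaces recognizer. It assumes opencv-contrib-python for the cv2.face module and a trained recognizer before predict is called.

import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
recognizer = cv2.face.EigenFaceRecognizer_create()
# recognizer.train(face_crops, labels)  # equal-sized grayscale crops

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
        face = cv2.resize(gray[y:y + h, x:x + w], (100, 100))
        # label, confidence = recognizer.predict(face)  # after training
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("FRVS sketch", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()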

Other relevant links/mentions: Source code and report (I have written a fairly exhaustive report which I believe will be helpful for other developers too).

Back to project index


X- Extensions.

X.a – NASA WorldWind add-on and plugin:

WorldWind add-on displaying the top 50 engineering schools globally.

Keywords: NASA WorldWind, Open Source, C#, Geo, Google direction API, HCI.

Abstract/Description: Inspired by an OpenStreetMap add-on for WorldWind, I decided to play with it too. Exploring and studying its source code led to the development of an add-on and a plugin. The add-on is a point-layer-based application, very useful for engineering-aspirant students across the world: it lets users browse the world's top 50 engineering schools on the interactive earth interface and explore their CS department websites with a single click.

A WorldWind plugin is an application that adds extra functionality to the software. I developed a plugin in C# using the Google Directions API, which shows the direction route from a given source to a destination, plotted on the earth interface. Everything from development to deployment is covered here; use this link to understand WorldWind's architecture and play around with the source code, available for download below.

Other relevant links/mentions: download source code of plugin (C#), download source code of add-on (XML).

X.b – AOL’s Truveo/TopCoder Google gadgets:

Advanced International Truveo Search

Keywords: AOL, Truveo video search engine, TopCoder, Google gadgets, Localization, Internationalization, HCI.

Abstract/Description: I developed two Google gadgets during the Truveo Developer Challenge, powered by TopCoder. The gadgets used the Truveo API and served as a video search engine, with the flexibility to filter results by channel, language, category and runtime, plus various sorting options. Though most functionality was common to both gadgets, they differed in one main aspect: “Advanced Truveo Video Search” had a special video search for the then-current US presidential elections of ’08, while “Advanced International Truveo Search” automatically supported 10 international languages based on the browser's language settings. The internationalization and localization aspects required the message-bundle functionality of the Google gadgets API. A modified version of the gadget was later used in the “Atlas America” OLPC project for geography videos in Spanish.

Other relevant links/mentions:  Get Advanced International TRUVEO Search on your Google homepage, Get Advanced TRUVEO Video Search on your Google homepage.

Back to project index

