Between 2017 and 2018, as a volunteer and core team member of the Bangalore chapter of DataKind, I had the opportunity to work with several non-profit organizations in India (Pollinate Energy, Karnataka Learning Partnership, Pratham Books) and help them with their data-related needs.

The following are two long-term projects (3+ months each) that I contributed to.

Identifying Urban Makeshift Communities using satellite imagery and geocoded data

Pollinate Energy is a social business that improves the quality of life in urban poor communities through innovative, affordable products such as solar-powered lamps and efficient cooking appliances that run on clean fuels. Pollinate Energy had been using Google Maps to identify urban poor settlements, a manual and time-consuming process that lacked any validation mechanism. Pollinate Energy partnered with DataKind Bangalore to address the challenge of detecting urban poor communities in Bangalore using two machine learning approaches: one based on satellite imagery and one based on geocoded data.

I worked on the second approach. Using geocoded data (collected via the Google Places and Overpass APIs) covering the whole of Bangalore, we developed multiple machine learning models to predict the locations of urban poor settlements. This approach could reduce the man-hours and resources Pollinate Energy spends on finding settlements.
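To give a rough idea of how geocoded points can be turned into model inputs, here is a minimal sketch that bins places into a latitude/longitude grid and counts place categories per cell. It is an illustration under assumptions, not the project's actual pipeline: the function names, the 0.01-degree cell size, the category names, and the sample coordinates are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to an integer (row, col) grid cell.

    cell_deg is a hypothetical cell size in degrees.
    """
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def cell_features(places, categories, cell_deg=0.01):
    """Count places of each category per grid cell.

    places: iterable of (lat, lon, category) tuples.
    Returns {cell: [count for each entry in categories]}.
    """
    counts = defaultdict(Counter)
    for lat, lon, category in places:
        counts[grid_cell(lat, lon, cell_deg)][category] += 1
    return {
        cell: [counter[cat] for cat in categories]
        for cell, counter in counts.items()
    }


# Hypothetical sample points, for illustration only.
sample_places = [
    (12.9716, 77.5946, "construction_site"),
    (12.9719, 77.5941, "construction_site"),
    (12.9712, 77.5949, "school"),
    (12.9352, 77.6245, "shopping_mall"),
]
sample_categories = ["construction_site", "school", "shopping_mall"]
sample_features = cell_features(sample_places, sample_categories)
```

Each grid cell then becomes one row of a feature table, which could be paired with known settlement locations as labels to train a classifier.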

The complete work was presented by DataKind at the Anthill Inside conference in 2017.

Auto-generation of tags for children's stories on StoryWeaver

As of mid-2018, Pratham Books' StoryWeaver platform had 8000+ books in 113 languages (70+ international languages and 30+ Indian languages), distributed across 25+ categories with hundreds of tags. As StoryWeaver scaled up to host more stories, the discoverability of books (inside the StoryWeaver platform as well as across the web via search engines) was becoming a challenge. Through this project, DataKind set out to address this problem using available Natural Language Processing tools.

As a core team member of DataKind Bangalore, I worked with StoryWeaver leadership to understand the business need, define the data science problem statement (“Build a tool to generate relevant tags for each story to improve the searchability of content”), collect relevant data, decide on the technical approaches (GitHub issues here), and finally enable the volunteers to contribute to the project.
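One simple baseline for the problem statement above is to rank the words of each story by TF-IDF and propose the top-scoring ones as candidate tags. The sketch below assumes that baseline purely for illustration; it is not necessarily what the project used, and the function names, the small stopword list, and the sample stories are hypothetical.

```python
import math
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real one would be much larger.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "it", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def suggest_tags(stories, top_k=3):
    """Return the top_k TF-IDF words per story as candidate tags.

    stories: {title: text}. Ties are broken alphabetically.
    """
    tokenized = {title: tokenize(text) for title, text in stories.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many stories does each word appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    tags = {}
    for title, tokens in tokenized.items():
        term_freq = Counter(tokens)
        scores = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in term_freq.items()
        }
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        tags[title] = ranked[:top_k]
    return tags


# Hypothetical stories, for illustration only.
sample_stories = {
    "A": "The elephant splashed in the river. The elephant loved the river.",
    "B": "The kite flew above the hill and the kite danced near the hill.",
    "C": "The girl read a book and the girl shared the book.",
}
sample_tags = suggest_tags(sample_stories, top_k=2)
```

A baseline like this only handles one language at a time and ignores word forms, so a multilingual catalogue would need language-specific tokenization and stopword lists.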

Forecasting Book Club

The Forecasting Book Club, hosted by Bahman Rostami-Tabar (Associate Professor, Cardiff University), is an online learning community for reading and discussing forecasting-related books. As a member of this book club, I presented Chapter 3 of the book “Forecasting: Principles and Practice” by Rob J Hyndman and George Athanasopoulos.