
HackRice 2017 and HarveyTrack: How social media can help in disaster management

Each year Rice University holds a hackathon where everyone can participate and code away their weekend. The students who take part trade their sleep for a weekend to build something cool in 36 hours and showcase it. That's how hackathons are, and that's the thrill of it.

And I love that thrill. I love taking part in hackathons. They are a way for me to work on the ideas and hobby projects I always wanted to pursue but never could, because of the classic time crunch of being a graduate student at Rice. Classes, research, and Valhalla take away most of your time, and you hardly get enough left over for side projects.

My affair with HackRice goes back to 2014, when I first participated in HackRice from Dallas and won the Mastery of Computer Science award from the Department of Computer Science. That kickstarted a snowball effect which is responsible for much of what I am today. But that story is for another day and demands a blog post all by itself.

This year, HackRice 2017 was my fourth time participating in the hackathon. It was also probably the last time I will take part as a student, so it was special for me. HackRice had five tracks this year, and one of them was the Hurricane Harvey track, which dealt with designing any application related to Hurricane Harvey. Being one of the many people affected by the flooding from Harvey, and having friends who were stuck in their houses for days, I decided it was the perfect opportunity to do something about it at the hackathon. On a side note, you can see what happened to our apartment during the flooding in the video.
The prize from JPMorgan Chase for the best hack for Harvey disaster management using social media also encouraged me :)


The motivation behind the concept was that people post a lot of updates to Facebook and Twitter during a disaster. This Reddit thread was my own source of information. The hack was supposed to make it easier to dynamically encompass all this kind of coverage and then use keywords, filters, and different statistical methods to make sense of the data, in order to create alerts and help victims and early responders with real-time data.
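To make the keyword-and-alert idea concrete, here is a minimal sketch of the kind of filtering involved. The keyword set and the ZIP-code regex are my own illustrative assumptions, not the actual filters from the hack:

```python
import re

# Hypothetical keyword set; the real hack used configurable filters.
SOS_KEYWORDS = {"rescue", "stranded", "trapped", "sos", "help"}
ZIP_PATTERN = re.compile(r"\b\d{5}\b")  # a ZIP code as a crude location hint

def flag_post(text):
    """Return a simple alert dict if the post matches any SOS keyword, else None."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    hits = words & SOS_KEYWORDS
    if not hits:
        return None
    return {"keywords": sorted(hits), "zip_codes": ZIP_PATTERN.findall(text)}

print(flag_post("Family of 4 trapped on roof near 77004, please send rescue"))
# → {'keywords': ['rescue', 'trapped'], 'zip_codes': ['77004']}
```

A real pipeline would layer statistical ranking on top of this, but even a plain keyword pass is enough to start triaging a firehose of posts.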

The Team
I am by nature a very lazy person and have a curious relationship with working in teams. I have been told (by my advisor, even...) that I function better in a team, but also when I have more autonomy. In short, a team works best if it is a team of my friends. Having realized that, I started the hackathon alone and kept bugging my friend Avisha until she agreed to form a team. She is also a graduate student, at the University of Houston, so the commute was not a problem for us.

The Hack

What we initially envisioned was a system that monitors social media websites in real time based on certain parameters. This would essentially be our dashboard, suggesting what to monitor; then, with the help of a set of crafted rules, we would drill further down. While designing the system, we decided to keep the monitoring parameters and "watchers" configurable, so that everything can be changed from the frontend. My previous encounters with publicly available social media crawlers made me appreciate how blissful a good UI and configurability can be for a crawler.

And eventually, we made HarveyTrack. HarveyTrack is a web application that uses social media and other resources to track incidents around the world in real time. Events such as natural disasters, elections, or crime (almost anything that people are talking about) can be tracked.
HarveyTrack can retrieve data from several sources:
  • Twitter (tweets matching a keyword search)
  • Facebook (comments from publicly accessible groups and pages)
  • RSS (article titles and descriptions)
  • ELMO (answers to survey questions)
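The "configurable watchers" idea above can be sketched very simply: each source is a small record the frontend can create, toggle, or edit, rather than something hard-coded in the crawler. The class and field names below are hypothetical, not HarveyTrack's actual schema:

```python
from dataclasses import dataclass

# Hypothetical sketch of a configurable source record.
@dataclass
class Source:
    kind: str          # "twitter" | "facebook" | "rss" | "elmo"
    query: str         # keyword search, page id, feed URL, or survey id
    enabled: bool = True

class Watchers:
    """Registry of sources the monitoring loop polls or streams from."""
    def __init__(self):
        self.sources = []

    def add(self, kind, query):
        self.sources.append(Source(kind, query))

    def active(self, kind=None):
        return [s for s in self.sources
                if s.enabled and (kind is None or s.kind == kind)]

w = Watchers()
w.add("twitter", "harvey rescue")
w.add("rss", "https://example.com/feed.xml")
print([s.kind for s in w.active()])  # → ['twitter', 'rss']
```

Because the registry is plain data, persisting it (e.g. in MongoDB) and editing it from a web UI is straightforward, which is exactly what makes this kind of crawler pleasant to operate.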

Items (called reports) from all sources are streamed into the application. Monitors can quickly triage incoming reports by marking them as relevant or irrelevant. Relevant reports can be grouped into incidents for further monitoring and follow-up. Reports are fully searchable and filterable via a fast web interface. Report queries can be saved and tracked over time via a series of visual analytics. 

Users can be assigned to admin, manager, monitor, and viewer roles, each with appropriate permissions.
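A role system like this usually boils down to a role-to-permission table. The permission names below are illustrative assumptions; only the four role names come from the application itself:

```python
# Hypothetical role -> permission mapping for the four roles above.
PERMISSIONS = {
    "viewer":  {"read"},
    "monitor": {"read", "triage"},
    "manager": {"read", "triage", "manage_incidents"},
    "admin":   {"read", "triage", "manage_incidents", "manage_users"},
}

def can(role, action):
    """Check whether a role is allowed to perform an action."""
    return action in PERMISSIONS.get(role, set())

print(can("monitor", "triage"))       # → True
print(can("viewer", "manage_users"))  # → False
```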

And we had the perfect testbed to see it in action: all of this was happening just as Hurricanes Irma and Jose were leaving us. In our live demo, we could show the judges in real time how monitoring global tweets within geo-boundaries helped us identify victims and early responders. Using our filters and keyword-based heuristics, we could drill down further and categorize priority and SOS tweets, as well as tweets offering help. Pairing that information with Facebook posts and feeds helped us triangulate positions and keep a crowdsourced tab on the status of a neighborhood. This felt especially important because, when we were stranded in the flood, we constantly monitored Twitter and Reddit feeds to know the status. This tool made that more comprehensive and easier to do, and also helped us create automated alerts. The other aspect was pairing early responders with victims, and that too was a must for us. At the end, we had nice analytics that showed all this information in a bar chart and overlaid on Google Maps. One useful thing we were able to pull off was differentiating between real tweets and retweets, which helped prune a lot of false positives.
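Two of the heuristics from the demo, geo-boundary filtering and retweet pruning, can be sketched as below. The bounding box is an approximate Houston-area box I am assuming for illustration, and the tweet dicts only mimic the shape of Twitter API tweet objects:

```python
# Assumed Houston-area bounding box: (west, south, east, north) in degrees.
HOUSTON_BBOX = (-95.8, 29.5, -95.0, 30.1)

def in_bbox(lon, lat, bbox=HOUSTON_BBOX):
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north

def keep_tweet(tweet):
    """Keep only original, geo-tagged tweets inside the bounding box."""
    if "retweeted_status" in tweet:        # retweets cause false positives
        return False
    coords = tweet.get("coordinates")
    if not coords:
        return False
    lon, lat = coords
    return in_bbox(lon, lat)

tweets = [
    {"text": "Water rising fast", "coordinates": (-95.37, 29.76)},
    {"text": "RT: Water rising fast", "coordinates": (-95.37, 29.76),
     "retweeted_status": {}},
    {"text": "Sunny here", "coordinates": (-118.24, 34.05)},
]
print([t["text"] for t in tweets if keep_tweet(t)])  # → ['Water rising fast']
```

In practice most tweets have no coordinates at all, so a real system also falls back to profile locations and place tags; the sketch only covers the happy path.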

We managed to run it from my local server and got it live using localtunnel, which really was a pain.
You can still have a look at it here: 
And our submission on DevPost with more details:

What it can do

  • Monitor Facebook, Twitter, any website, and SMS channels for specific events (based on Source and Incident)
  • Classify "victims" and "responders" based on tweets. So if somebody posts "My basement is full of water! What should I do!" and someone else posts "My basement is full of water! But I have a boat! Take that, Harvey", it will classify the first as a victim and the second as able to help
  • Utilize IBM Watson for small magic tricks like stress analysis on text, to rank which victims and which areas are the most stressed
  • Provide a complete tracking system with the option to assign different volunteers to different incidents
  • Handle huge amounts of data. Our "Trump" filter let us analyze more than 12k tweets per minute, and it didn't crash!
  • Of course, everything is stored (in MongoDB) for later analysis
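The victim-vs-responder classification above can be sketched as a simple cue-based heuristic. The cue phrases are hypothetical stand-ins for the real rules, which were configurable:

```python
# Hypothetical cue phrases; offers of help take precedence over need cues,
# since an offer often also mentions the disaster itself.
NEED_CUES = {"help", "stranded", "trapped", "what should i do", "sos"}
OFFER_CUES = {"i have a boat", "can help", "offering", "volunteer"}

def classify(text):
    """Label a post as 'victim', 'responder', or 'unknown'."""
    t = text.lower()
    if any(cue in t for cue in OFFER_CUES):
        return "responder"
    if any(cue in t for cue in NEED_CUES):
        return "victim"
    return "unknown"

print(classify("My basement is full of water! What should I do!"))   # → victim
print(classify("My basement is full of water! But I have a boat!"))  # → responder
```

Checking offer cues first is the important design choice: both example tweets describe the same flooded basement, and only the presence of an offer distinguishes them.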

The Result

The two teams who won the two challenges in the Harvey track

We ended up winning the "Best Hack for Disaster Response Using Social Media Data" award from JPMC. HarveyTrack essentially became a real-time disaster tracking and response application, or in fact a tracker for any incident you want to monitor on social media.

Avisha was already back home by the time the results were announced, so I headed home with a Bose SoundLink 2 as a prize for the hack, happy overall that the all-nighter paid off.

Also, this was an important lesson for me. When a natural or man-made disaster strikes, we don't always have to be victims; we can use our skills to actually do something about it. I will be happy if this effort can contribute even 0.1% toward that. If anyone wants to take it up, improve it further, or use it in a similar scenario, the code is open source, and you can get in touch with me for help setting it up.

