
Big Data + Social Good


The Gates Foundation recently hosted a webcast on digital strategy for nonprofits. The Director of Social Strategy for the American Red Cross had a slide in her presentation titled “How Americans use Social Tools in Emergencies.” The statistic that I couldn’t get out of my head is that more than 76% of people who posted a distress message to a social media site expected to be rescued within 3 hours. She suggested that the public’s expectations may need some tempering, since it is not clear that such a response time is plausible.

My immediate thought was: maybe, it depends. Was she basing her statement on the personnel and resources available, or on the information available? If the constraint is personnel and resources, and it simply isn’t possible to rescue everyone who posts a distress message to a social media site within 3 hours, that makes sense to me. However, if the constraint is that social media information cannot be processed fast enough to perform a rescue within 3 hours, even with personnel and resources readily available, then big data analytics may be able to analyze distress messages quickly enough to make a 3 hour rescue possible.

I will come back to the theoretical 3 hour rescue time frame, but first I want to give you a powerful example of how big data analytics is being used right now in business and industry. Recently, the Danish wind turbine manufacturer Vestas Wind Systems and IBM won the 2012 Big Data Award from the jury at the Computerwoche Big Data Congress in Germany. Lars Christensen, Vice President of Plant Siting & Forecasting, estimates that Vestas will soon have between 18 and 24 petabytes of data. To put that into perspective, that is like watching 70 years of HD video. Of course, all of that data is worthless without analytics. Using a customized IBM big data analytics solution, Vestas now gets answers in 15 minutes to queries that used to take 3 weeks.
http://www.ibmbigdatahub.com/blog/lords-data-storm-vestas-and-ibm-win-big-data-award
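To make the size of that improvement concrete, here is a minimal sketch of the arithmetic. It uses only the two durations quoted above (3 weeks and 15 minutes); the function name `speedup_factor` is my own, chosen for illustration, and has nothing to do with the Vestas or IBM systems.

```python
from datetime import timedelta


def speedup_factor(before: timedelta, after: timedelta) -> float:
    """Return how many times faster `after` is than `before`."""
    if after <= timedelta(0):
        raise ValueError("the 'after' duration must be positive")
    # Dividing one timedelta by another yields a plain float ratio.
    return before / after


# Durations quoted in the post: queries went from 3 weeks to 15 minutes.
old_query_time = timedelta(weeks=3)
new_query_time = timedelta(minutes=15)

print(speedup_factor(old_query_time, new_query_time))  # 2016.0
```

In other words, the quoted figures imply queries running roughly two thousand times faster. That ratio says nothing about how the speedup was achieved.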

Now, let’s go back to the theoretical 3 hour rescue time frame and think about it in the context of big data analytics. Consider an analogy between the data Vestas is analyzing and the volume of social media distress messages. If Vestas can answer queries on petabytes of data in 15 minutes that used to take 3 weeks, I posit that it is theoretically possible to analyze social media distress messages fast enough to make a 3 hour rescue time frame plausible. I am not taking personnel or resource constraints into account; I am focusing only on how social media distress messages could be analyzed in the context of big data.

Although it may appear that I am solely making the case that someone could be rescued within 3 hours of posting a distress message to a social media site, what I’d like to do is use this specific theoretical scenario to do some inductive reasoning and ask: how can big data be applied for social good?

I have seen big data analytics being discussed in business, academia, and government. I haven’t seen much, if any, discussion and/or application of big data analytics in the humanitarian sector. If you are reading this post and are aware of such discussion and/or application, I welcome the information, as I think that big data analytics could have profound effects on the humanitarian sector.


9 comments on “Big Data + Social Good”

  1. I liked your post. I think you are right that there is a lot that can be done with Big Data in the humanitarian sector. In fact, I think that will be one of Big Data’s greatest successes. Understanding poverty, immunization, education, water contamination, human trafficking, agricultural needs, and governance are all areas where Big Data can be used for good. In fact, with the amount of sensory data being collected at the moment, we will be able, if we are not already, to collect info on just about any issue we choose to approach. The worry I see is not that Big Data has no role in humanitarian causes; it’s more that it lacks the ROI that investors tend to look for. I am very interested in looking into more humanitarian uses for Big Data myself. Please email if you would like to discuss further.

    Thanks!

    Fizzy

  2. Fizzy,

    Thanks for your comment. Will send you an email to start a discussion about using Big Data for humanitarian purposes.

    Lucy

  3. Scalable technologies and analytics are already core to humanitarian work. Every time you are using a search engine or browsing stories on social media, you are seeing data that has been managed and filtered at a massive scale by some of the smartest algorithms out there. 99% of humanitarian work is conducted by the crisis-affected populations themselves, and these are the tools that they are using: search engines and social media – the technologies that they are already familiar with.

    This is the right way to do it, too: crisis-affected communities bootstrapping their own recovery with Silicon Valley technologies. That is, by using existing solutions that are already robust and well-known, rather than developing new ‘humanitarian-branded’ technology solutions that are not tested until a disaster occurs. There are some technologies that NGOs and humanitarian organizations can build in-house (blogs, websites, and very simple content management systems), but the complicated statistics and data management skills that are required to manage data at scale are too far outside of the skill-sets possessed by software engineers in humanitarian organizations.

    Unfortunately, ‘big data’ is a catchphrase right now, so it is being used to brand technology among the NGO community to attract grant funding. The same happened a few years ago when ‘crowdsourcing’ was a similar catchphrase. The NGOs that tried to use volunteer crowdsourcing all ended up using more resources than they saved, because they didn’t know what they were doing – it was a terrible failure.

    This is not to say that there weren’t exceptions. For example, I ran a crowdsourced translation and mapping platform in response to the 2010 earthquake in Haiti:
    mission4636.org
    It was a net benefit to the relief efforts, mainly for the translation aspects but also for structuring data about missing people. However, I drew on my experience of professional crowdsourcing more than my social development experience; we used a commercial crowdsourcing platform, CrowdFlower; and the primary reason it was a net benefit was not because of us international folk but because of the Haitians themselves who completed the actual information processing, drawing on their local knowledge. So it is probably fair to say that it was a success for precisely the reasons above: it was the crisis-affected community bootstrapping its own recovery with Silicon Valley technologies.

    I founded Idibon with this in mind. We need scalable language technologies, and we need to know that they work robustly at scale, in any context, before we start using them to respond to a disaster. I first met my Chief Technology Officer, Schuyler Erle, through humanitarian work and he is also a founder of the Humanitarian Open Street Map Team. We both continue to work in social development, advising humanitarian organizations and working directly with populations, and I think we stand a good chance of being able to dodge the negative aspects of the ‘big-data-bandwagon’ to continue supporting humanitarian response with proven technologies.

    • Rob,

      I am glad to hear that you are assisting humanitarian efforts as an academic and I will definitely take a look at Idibon.

      I appreciate the information your comment contained and I largely agree with some of the points you made. Specifically, these:

      1. Big data is a catchphrase right now.
      A friend of mine brought up big data a few months ago in a conversation. When she gave me a brief definition, I asked her, “And how is this different from large data sets in general?” In other words, big data seems like a catchphrase for something that has existed for a very long time and is just being repackaged under the term big data.

      2. Existing solutions that are already robust and well-known should be utilized rather than humanitarian-branded technology solutions that are not tested until a disaster occurs.
      I have seen several examples of viable solutions in one sector, being reinvented by another sector. I will add your example of the humanitarian community trying to reinvent solutions to handle big data sets, when Silicon Valley has already developed robust and well-known solutions, to my mental list. However, the question that comes to mind is, are Silicon Valley companies willing to have their technology utilized without much return on investment? I understand that some businesses may participate under the corporate responsibility umbrella, but at the end of the day, the goal of businesses is to make money.

      In a broader context, I would like to see academics, humanitarians, business, government, artists, musicians, etc. working together to solve the world’s complex problems, but I am not naïve. I understand that along with having different goals, these various organizations/sectors/people also have different motivations, have detrimental egos, along with a host of other issues that make this type of collaboration close to impossible. That said, I still think this type of collaboration might be theoretically possible, if the group contained a few thought leaders from each sector and was incredibly small.

      Thank you again for your thoughtful comment and engagement in this conversation. I am going to take a look at Idibon and then engage you directly via email.

      Lucy

      • It has been my experience that Silicon Valley companies, and tech companies more broadly, are very interested in having a positive impact on the world. Some of this is simply corporate responsibility, but most of it is just a fundamental part of engineering culture: we are building technologies that we want people to use. The more important the context, the more we would like to see our technology used there.

        The biggest barriers are currently from within the NGO and Humanitarian sectors themselves. The not-for-profit world is happy to pay for cars, flights, food, accommodation, and even for data storage, but for some reason many organizations balk at the idea of paying for information processing. When you run the numbers, many of the volunteer crowdsourced initiatives would have been cheaper if they had paid professional workers rather than dedicating the resources to managing and training volunteers. Some for-profit companies will work for free and many more for cost-recovery, but even turning a profit is not necessarily a bad thing if it allows companies to direct more resources that are specifically tailored to social-development markets.

        This bias is from funders and practitioners alike. Funders are less likely to pay for additions to for-profit technology, and practitioners are less likely to want to dilute their brands by using existing solutions from the for-profit world. The crowdsourcing technology used by humanitarian and social development organizations in 2012 is *less* sophisticated than the technology they were using in 2010, even though the same for-profit crowdsourcing companies are offering their services for free. This is because many not-for-profits are now trying to host their own information processing services and falling short. If organizations like the Red Cross and UN try to develop their own ‘big data’ technologies in-house, we’ll see the same thing again.

        • It is unfortunate to hear that some NGOs and humanitarian organizations still choose costlier and less effective homegrown solutions, even in cases where readily available commercial solutions are more cost effective. One would logically expect an organization to choose the more cost effective, readily available alternative. That said, choosing the more expensive homegrown alternative may be indicative of the cross-sector collaboration hindrances I alluded to in my previous comment. These hindrances prevent real cross-sector collaboration and progress, which is regrettable.

  4. Jill Finlayson (@jfinlayson) said:

    Hi, you may want to take a look at Markets for Good (http://www.marketsforgood.org/), which is taking on standards and the “who will process the data” questions.
    Cheers,
    Jill

  5. On a related topic, there are also some good points about peer production and crowdsourcing of data during emergencies, including information about local ownership, building capacity, quality control, and standards. Take a look at the World Bank’s new Striking Poverty discussion on Mapping and Disaster Management: https://strikingpoverty.worldbank.org/c121017. As well, the contract transparency discussion is about making contract data transparent and accessible to citizens in resource rich countries… https://strikingpoverty.worldbank.org/c121031

    • Jill,

      Thank you for your comments. I think Markets for Good started a few days after I posted this! I will also take a look at the Striking Poverty website; thank you for the suggestion. Also, this morning on Twitter I came across an article in the Stanford Social Innovation Review, The Math of Social Change. So it would seem this post is definitely timely, as the social sector is starting to get serious about big data analytics. Fabulous!
