Putting the Face on the IMMENSE WORLD of BIG DATA


As I start this blog I have to tell you I am reminded of what Steve Jobs purportedly said to John Sculley, an executive at Pepsi, to get him to come run Apple.

“Do you want to sell sugar water for the rest of your life or come with me and change the world?”  

You know, it’s rare to be part of something so new and innovative that it’s hard to explain. What I’m trying to describe is the “fresh” before the fad. I’m not talking about being part of a flash mob, but about joining in on the very first one. I’m not talking about the 20,000th wedding party dance down the aisle-a-thon, but a vulnerable moment…where you don’t know the outcome of your effort. Few of us are lucky enough to join a startup that takes an idea and grows it into an empire, or to be there at the moment when something newsworthy is created. These special moments are hard to find. They become salient memories when you are part of them.

Well I have one of these moments I can let you in on…

Like reading tomorrow’s newspaper today, I can prognosticate on an upcoming event, and I’m going to let you in on the ground floor. Today’s blog is to let you know about an organized series of events that will measure our world in new ways and change the ways we see ourselves, and you can be right in the freaking middle of it all.

“The Human Face of Big Data” (HFOBD)

EMC Corporation and Rick Smolan, in conjunction with Rick’s production company “Against-All-Odds” Productions, are launching one of the most aggressive, creative, and newsworthy projects the Information Technology (IT) industry has ever seen. It’s so audacious that it’s not even IT anymore. Out of what many would have traditionally called the marketing department comes a project that is more akin to the benevolence of the Medici family funding Michelangelo’s Sistine masterpiece than to what your average Fortune 100 corporation spearheads. EMC, its social network of partners and a diverse set of stakeholders have come together behind this vision, which employs the ancient art of storytelling. The story they will tell will introduce us all to the global face of big data.

Don’t get distracted… I know you’ve read a blog on global big data, but this is no blog, ma’am. When you combine a visionary from photojournalism with the market’s think tank of big data idea people, you come up with something much more profound. This is an orchestrated process of multiple global events on a grand scale that will, by its very action, deliver new insights on the emerging data-rich world we inhabit. It will be a fractal analog of data analytics itself. We create, we explore and we learn.

I’ll stop and get to the details.  The HFOBD is a multi-point plan. I want to share with you some of the high and low level details so you can get excited and join in, while I’ll leave a fair amount for you to explore as you engage in the process.

So is it BIG?

Well, here are some of the initial estimated data points on the project’s scope:

  • 100 Photographers
  • 1,000 node Greenplum device
  • 10,000 data detectives
  • 10,000,000 human sensors
  • 1,000,000,000 media impressions

Will I learn anything?

  • Did you know that depression can be predicted a couple days before it happens? Find out more.
  • Do you realize scatter plots can prevent future crime? I predict a causative relationship!
  • Ever feel like you could do more to reduce your energy bill, but don’t have an outlet? (pun intended…)
  • “CONTEXT IS KING!” We can do more if we can bundle silos of knowledge via context. Open-minded, data-driven solutions.

Are there some Mileposts along the Journey?

  • From September 25th until October 1st, millions of people will interact with their mobile devices (w/ HFOBD app) and the world around them. For just a few days, devices will collect both passive and active human interaction from millions across the planet.
  • Additionally as part of the experience children will be given devices to collect data about their world.
  • There will be a control center in the ballroom of the New York Stock Exchange where Big Data Big Shots will gather to manage the process.
  • During and after the event, thousands of data detectives will process and explore the data for new levels of knowing.
  • The experience will culminate in a handful of artifacts:
    • a website where you can explore the experience and you can additionally connect with people that are like you, in profile, across the globe. 
    • And a large book of data and photos telling the story. The book will be distributed to 10,000 of the world’s leaders and influencers in November.
    • A CNN Special documentary and other public events.

Where do I go?  What do I do?

  • TODAY – Get Connected
    • Go register your email and share the link with everyone. http://www.humanfaceofbigdata.com/
    • Start following on Twitter: @FaceOfBigData  ; #HFOBD
    • Vimeo Video: https://twitter.com/#!/search/face%20of%20big%20data/slideshow/videos
  • September 25th thru Oct 1st – Download the App & Engage the App!
    • Each person who loads the app changes the outcome.
    • Get involved, and inspire your network of friends to join.
    • Carve off enough time to engage the device, and carry on with who you are for 24 hours.
    • I’ll be at Oracle OpenWorld in San Francisco with 40,000 other people, hopefully all engaged in the big experiment.
  • October 2nd – Keep Tabs on the Progress!
    • Keep an eye out for information that begins to post on the progress.
    • Check the website routinely; what is planned sounds really cool.
  • November 20th – Enjoy the Outcome
    • The book will be available in electronic form and, once published, also in a free iPad app and in PDF form.
    • Watch for the CNN Special.

I can’t wait to share this memory with you.

Analytic-Driven Action, one step into the uncharted abyss… What Now?


Hello everyone. A topic that has been running through my mind is what impact “analytic-driven action” has on our analytic, reporting and operational business systems. First, there is plenty of stochastic prose about the terminology used in the BI/BA space today. I think it’s important that you are aware of my interpretation of the terminology, so I put a write-up at the end of the blog.

Otherwise we’ll continue.

My interests today are around predictive analytics and real-time analytics. Let’s assume we’re using super-fast equipment returning data in milliseconds, or we’re leveraging predictive tools that let us take proactive steps in engaging customers, partners or employees in real time. If we can leverage newfound intelligence to effect action (aka analytic-driven manipulation of the user experience), my question is: what do you do with it, and how do you set up your systems so that you can do not just one data-driven action, but essentially blend analytics and operational systems in an iterative social activity?

What I am talking about here is a perfect puree of business, sales, support, and analytics systems. In other words, we’re talking hundreds of millions if not billions of dollars of transformation for your average Fortune 100 company to put this into action. So we have to make an enormous, profitable market while we’re at it. And, hopefully, change the world.

There are a few ways I’d like to evolve the thinking on this topic to make it worth the effort. In this blog I have laid out a few areas to consider in forming an approach. First topic: if we’re going to do “Analytic-Driven Action” we need to include additional functionality beyond our CRM and ecommerce toolsets. Let’s also consider:

–          True Engagement – Sending a coupon based on my GPS co-ords is shallow and intrusive. Engage me, know me, interact with me, and provide more than I know about myself and my surroundings. As Guy Kawasaki likes to put it, “Enchant me.”

–          Cross System Integration – If you want to immerse and manipulate me, then you’ll need to bring multiple sources of information together to create data that expands my accessible knowledge about items in my locale, and about the other people around me who resemble me in a demographic way.

–          Process Management – We’ve all been on that phone call where 30 people join a weekly cadence and no one’s done anything… Our systems are similar: we can’t shepherd important customer/client/constituent interactions without being able to take an action and follow through. We’ll need workflow technology to deliver on our promises.

–          Authorized Personal Profile – The privacy answer can’t be “no systems” or “no privacy”. We need concentric circles of privacy, where we allow access to our general profile, our personal relationships, and our financial/health data. Allow us to let people into our circles and establish a social contract of privacy that is democratic and individually controlled, yet universally leveraged.

–          Minimum Collective Knowledge – Recently I did an informal survey as I ran on the treadmill at my local YMCA. The Y had 4 choices in 24/7 news running (CNN, Fox, MSNBC, CNBC). That day I ran about an hour, and in that time, I would tell you unofficially, these 4 stations only overlapped on about 10% of the news they covered (overlap of all 4 on the same topic was about 4%). There were interesting stories across the spectrum, but they were really quite different. If you didn’t invest in churning between channels, you would really miss part of the contemporary experience. Ultimately, we’re living different lives side-by-side. If the polarization of our politics is any example of that, then I am concerned that the specialization of information is creating more distance between us while our commonalities wane. Yes, this is a societal issue, but it has relevance in this discussion. We can’t get so specialized that we forget the value in a common dialog and the structures that compose it. If we get out of “broadcasting” activities, how is that need, even if lessened, supported and/or replaced?

When we talk about Analytic-Driven Action, we often reference the use cases of “fast trades on the trading floor” or “proximity offers to mobile users” as examples. These are two early versions in this field of thought, but they don’t test out the conceptual model to my satisfaction. Let’s find something more interesting. Life’s two most important activities are to “live” and “reproduce”. I want to keep my “G rating”… so I’m going to choose “live” as the verb for my use case. We all want to live, and we are all motivated to take action that will ensure we live while shopping, exercising, taking a business trip, etc. In my mind, living is made up of primarily 4 things:

  • Historic vices & virtues – what you’ve done right & wrong in the past. This has little to do with your future, but you may already have been impacted by it. I put stress in the “vice” category…
  • Current vices & virtues – what you do today really matters for your current health
  • Environmental factors – Is there a piano hanging over your head, or a hurricane off the coast?
  • Defects – Genetics… Some of us have defects that will appear, and some will be fatal.

So what could we do to improve a person or group’s chances of living?  A quick brain dump of marginal ideas rendered the following list:

  • Remind someone they haven’t exercised in two days, and provide suggestions on optimal time to exercise with close locations based on weather and schedule
  • Point them to 5 people within a mile that also would like to play a pick-up game of basketball
  • Report that your current hydration, heart rate and temperature, for people your age, have led to ___% instances of hypovolemic shock in the last year.
  • Offer a note that someone similar to them has just sat down for coffee alone and wouldn’t mind a conversation.
  • Identify other close-proximity people in a traumatic situation, so that the team can group to survive.
  • ID that someone 1 mile away is looking for an expert in your expertise. (service is healthy…)
  • Verbal “play by play” steps to perform CPR, merged with a launched 911 call with GPS co-ords auto supplied, notifying a doctor that is 200 yds away.
  • See that you are late for your meeting, then automatically send an email to the attendees with an estimated arrival and/or launch a bridge call to connect everyone to get started.
  • Suggest products that are located within walking distance (when you have a free 30 mins in your schedule) that people of similar profile have purchased for health reasons, and accept alternative requests from the user. “No I don’t want a Buns of Steel Video, but my wife did need a cartridge for the Brita filter, is there one close?”
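
Several of the ideas above ("5 people within a mile", "a doctor that is 200 yds away") boil down to one small building block: given my GPS co-ords and a list of other people, who is close enough to matter? Here is a minimal sketch of that step in Python. The names (`haversine_miles`, `nearby`), the sample coordinates and the dictionary layout are all my own assumptions for illustration, not part of any real system, and it treats the Earth as a perfect sphere.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def nearby(me, people, max_miles=1.0, limit=5):
    """Return the names of up to `limit` people within `max_miles`, nearest first."""
    scored = [
        (haversine_miles(me[0], me[1], p["lat"], p["lon"]), p["name"])
        for p in people
    ]
    close = sorted((dist, name) for dist, name in scored if dist <= max_miles)
    return [name for _, name in close[:limit]]


# Hypothetical data: my position and three other people (made-up coordinates).
me = (40.0, -75.0)
people = [
    {"name": "A", "lat": 40.005, "lon": -75.0},   # roughly a third of a mile away
    {"name": "B", "lat": 40.1, "lon": -75.0},     # several miles away
    {"name": "C", "lat": 40.0, "lon": -75.001},   # a few hundred feet away
]
matches = nearby(me, people)
print(matches)
```

The interesting part is not the math, it is everything around it: who has agreed to be findable, and for what purpose. That is where the privacy circles mentioned earlier would have to plug in.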

How do systems have to change?

So maybe I went on too long about ideas, and I am sure there are tons of other, better ideas. My point is: let’s not do thinly veiled commerce with “did you know there’s a Tony Romo around the corner…” type activity, but let’s actually impact people’s lives in a positive way. Analytics can’t just crunch through numbers in real-time, then feed little digital billboards to us and call it innovation. Instead, engage with the ability to inspire and interact. This requires something more than real-time analytics; it requires our operational systems to respond in similar ways. Not that everything will run in real-time, but the systems need to interact on demand, kicking processes off that then come back with answers or start execution of other events and deliverables. Additionally, we need the ability to integrate with other sources of data and operational systems that are owned by others. The “supply chain” revolutionized the retail store; likewise, this model will revolutionize our economy and ultimately society, all by surrounding the individual and his/her tribes. We need to apply various flavors of analytics in conjunction with operational systems with the individual in mind, instead of focusing on short-money targets. Otherwise we will miss a grand opportunity to change the way we do business in the future.

=Terminology Reference Section=

Here’s my little reference section on terminology. Is there a difference between all the current buzzwords? I believe the answer is yes and there is a difference worth noting. Here’s a reasonable definition I support (http://bit.ly/Of0tXt). I have also provided my 2 cents on the differences of BI, BA, RTA, & PA in this section.

Business Intelligence (BI) was, is and will be a function within corporations that generates reports on internal operations, sales tickets and possibly soft targets like customer feedback. It is the reporting function that sits primarily upon relational databases today and has been mostly rear facing. The primary data management functions within BI are to aggregate and slice views out of the big chunk of data (i.e. June Revenue, Inventory Turns, Customer wait time, etc.). Companies read this intelligence and react to change their business. I do believe this term will evolve to incorporate the entire world of business analytics, if for no other reason than that companies like SAP, with their BusinessObjects, Sybase, and HANA products, will continue to use the term BI in their product suite. That’s ok, but in most companies today it’s their reporting system and different than analytics.

Business Analytics (BA) is different. I am NOT saying it’s new and that some companies haven’t been deploying business analytics/data analytics in their overall business intelligence investments, but most haven’t considered its impact at the same level that they do today. Business analytics is the leverage of statistical analysis to explore business interactions and data correlations. These tools were once relegated to the engineering or science lab, but now are being employed in BA. BA uses linear regression, logarithmic regression, k-means, hypothesis testing, and scatter plots to iteratively push millions of bits of data through analytic pipes, hoping to get new insight at the end. “We can say with 95% certainty that customers with yellow cars will likely… in this scenario…” It’s more like archeology than a hardened business process.
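
To make “data correlations” a little less abstract, here is a minimal sketch of the simplest tool in that kit, the Pearson correlation coefficient, written in plain Python. The function name `pearson_r` and the sample numbers are made up for illustration; a real BA pipeline would run something like this over millions of rows and then test whether the result is statistically significant.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Returns a value between -1 (perfect inverse relationship) and
    +1 (perfect direct relationship); values near 0 mean no linear relationship.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up example: store visits per month vs. monthly spend.
visits = [1, 2, 3, 4, 5]
spend = [2, 4, 6, 8, 10]          # rises in lockstep with visits
wait_minutes = [10, 8, 6, 4, 2]   # falls in lockstep with visits

r_spend = pearson_r(visits, spend)
r_wait = pearson_r(visits, wait_minutes)
print(r_spend, r_wait)
```

A strong correlation is only the start of the dig, of course; it tells you where to look, not why the relationship exists.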

Real-time Analytics (RTA) So what is the difference between real-time analytics and predictive analytics? These two are arbitrarily interchanged in the press today, probably because it is not profitable for companies to split hairs over such definitions and thus the lines blur, but in my mind they are distinctly definable and different. The term real-time comes from real-time systems. R-T systems are required to not only compute, but compute at a given speed. While getting my masters, I worked at a company writing code for real-time systems, in my case NTSC (roughly 30 frames/sec). It is difficult to ensure that code written on one system runs consistently at the same speed on another system. As you can guess, it doesn’t; you have to come up with algorithms that create the consistency. So to me, real-time analytics not only refers to systems that can deliver fast results, but also means that the analytics has some form of temporal requirement.
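
Here is a small Python sketch of what I mean by a “temporal requirement”: the answer only counts if it arrives inside a time budget. The helper names (`frame_budget_seconds`, `run_with_deadline`) are hypothetical, and the 30-per-second rate is just an illustrative number. A genuine real-time system would enforce the deadline in the scheduler or hardware rather than measure it after the fact, so treat this as a way to think about the idea, not a way to build one.

```python
import time


def frame_budget_seconds(frames_per_second):
    """Time available to produce one result at the given rate."""
    return 1.0 / frames_per_second


def run_with_deadline(task, budget_seconds, clock=time.perf_counter):
    """Run `task` and report whether it finished within `budget_seconds`.

    Returns (result, elapsed_seconds, met_deadline). The clock is injectable
    so the timing logic can be checked without depending on real hardware speed.
    """
    start = clock()
    result = task()
    elapsed = clock() - start
    return result, elapsed, elapsed <= budget_seconds


# Illustrative use: a tiny "analytic" that must finish within one frame.
budget = frame_budget_seconds(30)
result, elapsed, on_time = run_with_deadline(lambda: sum(range(1000)), budget)
print(f"result={result} elapsed={elapsed:.6f}s budget={budget:.6f}s on_time={on_time}")
```

The same calculation can be “fast” on one machine and late on another, which is exactly why the deadline, not the raw speed, is the thing that defines real-time.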

Predictive Analytics (PA) is not a time-based activity; it primarily leverages tools like linear regression to determine how “y” changes as you plot x. If you have thousands of examples of x,y plotted out and you know your “x”, you can predict your “y”. For example: “If you are 40 years old, then you have a 50% or better likelihood that you will ___ your ____”. We’re using vast historical knowledge to predict behavior.
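
For the curious, the “know your x, predict your y” step can be sketched in a few lines of Python using ordinary least squares. The function names (`fit_line`, `predict`) and the tiny data set are invented for illustration; real predictive work uses thousands of points, more than one input variable, and some measure of how much to trust the prediction.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


# Made-up history: age (x) against some measured score (y).
ages = [20, 30, 40, 50, 60]
scores = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_line(ages, scores)
estimate = predict(45, slope, intercept)
print(slope, intercept, estimate)
```

Notice there is no clock anywhere in this code. The prediction can take a millisecond or a week to compute; what makes it predictive is the use of history, not the speed.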