Tagged: big data

Forty three (iii)

“Have you heard of the Quantified Self?” he asked us all. Nope.

“It’s about using technology to record data about your personal life in terms of inputs such as what you eat and drink, states such as mood or heart rate, and mental and physical performance. Only by recording the data can we spot correlations that might prove useful if they turn out to be cause and effect. Only by sharing personal data sets with others can we attempt to find wider patterns.”

“Ah yes,” said Yvonne, “one of my friends wears a wristband that records stuff and uploads it to her phone, I believe.”

“So that might tell her how well she’s sleeping for example, and she can ascertain what factors in the way she lives affect her quality of sleep,” Saket replied. “Then, if she’s having trouble sleeping, instead of masking the problem with sleeping pills, she can adjust the way she lives to sleep better naturally.”

“And you’re saying you can treat an organization just the same?” I asked.

“Yes. This takes us back to big data and your conversations with John and your friend Dom. Only by gathering as much data together as possible can we spot, or rather have software spot, patterns we wouldn’t otherwise know about.”

“So is this contradicting the idea of organizational synapses?” I asked.

“It’s not yet entirely clear to me, but I think there are ways in which they complement rather than contradict.”


Big data

John said “big data” is the phrase describing today’s facility to harvest, store and analyze large quantities of data. In fact, he went so far as to say that the phrase implies such facility wasn’t widely available during the first decade of the century; it demands a whole new set of tech.

He described another way in which “big data” may be considered to be different to “the normal sized stuff” – data helps us answer questions; big data also helps us conceive new questions.

Calling big data “big” is sort of underplaying it. It’s really really REALLY big! It’s often measured in terms of petabytes, where a petabyte is a thousand terabytes, or a billion megabytes. John put that into perspective: a 1-petabyte mp3 music track (128kbps) would play for 1,980 years.
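John's mp3 figure can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes decimal units (a petabyte as 10^15 bytes, 128kbps as 128,000 bits per second) and a 365.25-day year, none of which John spelled out. Under those assumptions the result lands within a year of his 1,980.

```python
# Assumptions (not stated in the text): decimal units and a 365.25-day year.
PETABYTE_BYTES = 10**15             # 1 PB = 1,000 TB = 10^15 bytes
BITRATE_BITS_PER_SECOND = 128_000   # 128kbps, constant bitrate
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def playback_years(size_bytes, bitrate_bps):
    """Return how long a constant-bitrate track of the given size plays, in years."""
    seconds = size_bytes * 8 / bitrate_bps  # bytes -> bits, then bits / (bits per second)
    return seconds / SECONDS_PER_YEAR


years = playback_years(PETABYTE_BYTES, BITRATE_BITS_PER_SECOND)
print(f"{years:,.1f} years")
```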

I asked John to repeat the statement he’d confronted me with a few weeks earlier. He did. “Data paucity was the problem of the 20th Century. Having too much of the stuff is rapidly becoming the challenge and the opportunity of the 21st.”

I’d begun to think about the skills aspects of this. We weren’t overflowing with people competent in statistics or research methodologies, let alone people who understood the vagaries of big data collection, storage and analysis. Perhaps the heavy tech can be outsourced – per my conversation with John about the ‘T’ in IT – but we still need to understand the analytical insight it gives us. And as John says, perhaps the challenge isn’t so much understanding the answers as knowing what to ask.

I didn’t raise this concern because it seemed more to do with the ‘how’ when we were still focusing on the ‘what’.

Future sources feeding our big data include the social web, test data, and performance data kicked off by our products in the field.

John said if every one of our commercial products fed back a couple of kilobytes of data an hour, this could add up to more than 300 gigabytes a day, a tenth of a petabyte a year. He was at pains to point out that this was an estimate because right now we don’t even have the data to tell us how many of our products are in use, and this simple observation alone underlined the stark contrast of the transition we’re confronting.
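John's estimate can be worked backwards to see what fleet size it implies. The sketch below is an illustration only: it takes "a couple of kilobytes" to mean 2 KB, uses decimal units, and treats 300 gigabytes a day as the target. The resulting product count is purely what those numbers imply, not a figure Attenzi actually has.

```python
# Assumptions (hypothetical readings of John's estimate, decimal units throughout).
BYTES_PER_PRODUCT_PER_HOUR = 2 * 10**3   # "a couple of kilobytes" taken as 2 KB
DAILY_TOTAL_BYTES = 300 * 10**9          # "more than 300 gigabytes a day"
PETABYTE_BYTES = 10**15


def implied_fleet_size(daily_total_bytes, bytes_per_product_per_hour):
    """Number of reporting products needed to produce the daily total."""
    bytes_per_product_per_day = bytes_per_product_per_hour * 24
    return daily_total_bytes // bytes_per_product_per_day


def petabytes_per_year(daily_total_bytes, days=365):
    """Scale a daily volume up to a yearly volume in petabytes."""
    return daily_total_bytes * days / PETABYTE_BYTES


fleet = implied_fleet_size(DAILY_TOTAL_BYTES, BYTES_PER_PRODUCT_PER_HOUR)
yearly_pb = petabytes_per_year(DAILY_TOTAL_BYTES)
print(f"Implied products in the field: {fleet:,}")
print(f"Yearly volume: {yearly_pb:.2f} PB")
```

On these assumptions the yearly total comes out at roughly a tenth of a petabyte, in line with John's figure, and the daily total implies several million products reporting in.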

Thirty nine (ii)

In particular, not everyone yet appreciated the difference between “complexity” and “complicated”, especially as they’re often used synonymously in everyday language.

ACTION: Eli to arrange a complexity training session.

This training aside, the consensus is that navigating complexity – in the age of big data (see below) – represents a source of considerable competitive advantage.

With the exception of news of Georgio’s home remodeling, the conversation over lunch was dominated by the morning’s work. More specifically, it was dominated by everyone retelling in their own words what they’d heard and learned, a good sign that we were truly knowledge building.

Fifteen (ii)

“A collection of information is not knowledge. We must build knowledge from such information by identifying and interpreting patterns. So for this example, we identify the process causing the occasional test failures and develop an appreciation for how it might be fixed.

“Putting some important security and legal issues to one side for the moment, we don’t have to care too much about the underlying technology any more. Rather, we need to focus on getting the right information to the right people at the right time in the right format, and help them translate the information into knowledge in order that they can do their jobs better.

“Data paucity was the problem of the 20th Century. Having too much of the stuff is rapidly becoming the challenge and the opportunity of the 21st.”

I checked with myself to make sure I knew John was an important cog in the Attenzi machine. I did.

“Big data?” I asked.

“Are we playing buzzword bingo?” said John, deadpan.

“Hey, you said ‘cloud’ not me!” I countered.

John smiled, “Yes, so-called big data. This idea that we can digitize almost anything, including all our parts and products and services and processes by the way, and collate all those terabytes of data, and store it cheaply and easily and forever and use it for all sorts of analyses.”

I chipped in: “But only if that analysis translates data into information and knowledge, right? Makes it useful to us?”

“Well, how do you determine the value of the information and knowledge prior to the translation of data into information and information into knowledge?”

“Er, ask someone in IT?!” I offered.

John took the compliment with a caveat, “And the domain experts. Humans are quite capable of digesting four dimensions of data when presented in an appropriate way, simply because we inhabit a four-dimensional world – three dimensions of space and one of time. But the data we have, and could harvest in the future, spans many dimensions. We need therefore to work together to develop intelligent software that identifies and extracts the most interesting, useful, valuable four dimensions for visual presentation to the domain experts.

“A primary challenge isn’t trying to find answers to questions but determining good questions in the first place.”

I put my empty cup down and began kneading the blue cushion with the white bird on it. You know the one.

I was thinking out loud now. “Divining exact usefulness or attributing precise value to an insight is incredibly difficult to do with hindsight, let alone in advance. Perhaps the best we can achieve then is simply to have a good guess at whether the potential information and knowledge we might unearth will make anyone act on it.”

John clarified the thought, “Or perhaps, more precisely, will anyone change what they would have done otherwise?”

“What’s the difference?” I asked.

“Well, deciding not to do something isn’t often recognized as an action.”