
The Big Data Conundrum: Garbage In, Garbage Out and Other Challenges

If you haven’t heard by now, IoT is a growing web of connected sensors and “things” that may dramatically improve our lives through the sheer magnitude of data captured. From our discussions with industry end users over the past two years, the value of IoT lies in this data and in the data analytics that can unlock the most business potential.

As EMC and IDC pointed out in their 2014 Digital Universe report, organisations need to home in on high-value, ‘target-rich’ data that (1) is easy to access; (2) is available in real time; (3) has a large footprint (affecting major parts of the organisation or its customer base); and/or (4) can effect meaningful change, given the appropriate analysis and follow-up action.

The term “garbage in, garbage out” (GIGO) was coined in the early 1960s, and in the age of Big Data the GIGO problem may be exacerbated by the speed and volume of data being collected.

From Data Collection to Delivering Business Value – 5 Key Challenges

1. Accuracy and Usefulness of Data

Veracity is one of the 4 Vs of Big Data and, for me, the most important one. (The other 3 Vs are Volume, Velocity and Variety.) In their article “Big Data’s Disparate Impact”, which deals with social discrimination, Solon Barocas and Andrew Selbst note that an algorithm is only as good as the data it works with. Bad data not only produces inaccurate information; it can also lead to catastrophic results.
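
To make the “garbage in” point concrete, here is a minimal Python sketch of screening sensor readings before they reach any analytics. Everything in it is an assumption for illustration: the Reading record, the VALID_RANGES limits and the sample values are hypothetical, and real plausibility limits would depend on the actual sensors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    sensor_id: str
    value: Optional[float]  # None models a dropped or unparseable sample
    unit: str


# Hypothetical plausibility limits per unit; real limits depend on the sensor.
VALID_RANGES = {"celsius": (-50.0, 150.0), "percent": (0.0, 100.0)}


def is_valid(reading: Reading) -> bool:
    """Reject missing values, unknown units and out-of-range values."""
    if reading.value is None:
        return False
    limits = VALID_RANGES.get(reading.unit)
    if limits is None:
        return False
    low, high = limits
    return low <= reading.value <= high


def split_readings(readings):
    """Separate usable readings from those that should be quarantined."""
    good = [r for r in readings if is_valid(r)]
    bad = [r for r in readings if not is_valid(r)]
    return good, bad


readings = [
    Reading("t-1", 21.5, "celsius"),
    Reading("t-2", 999.0, "celsius"),  # implausible spike
    Reading("h-1", None, "percent"),   # missing sample
    Reading("h-2", 45.0, "percent"),
    Reading("p-1", 3.2, "bar"),        # unit with no known limits
]
good, bad = split_readings(readings)
print(len(good), "usable,", len(bad), "rejected")
```

Keeping the rejected readings in their own list, rather than silently dropping them, is a deliberate choice in this sketch: a rising rejection rate is itself a useful signal that a sensor or feed needs attention.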

Beyond the accuracy of data, the challenge is identifying the most useful data. McKinsey recently found that only 1% of the data from an oil rig’s 30,000 sensors is examined for detecting and controlling anomalies. The other 99% is not being analysed for performance optimization or predictive maintenance. The best data is actionable, so it is always useful to start with the specific problems you would like your data to solve.

“We’ll collect the data now and figure out what to do with it later” should not be relied upon as a strategy. Holding too much (useless) data is not only expensive but also creates a legal liability. Case in point: the Ashley Madison saga.

2. The BIG data

The vast amounts of data that will be generated by IoT devices will put enormous pressure on network and data centre infrastructure.

As Gartner’s Research Director Fabrizio Biscotti points out, “Processing large quantities of IoT data in real time will increase the workloads of data centers, leaving providers facing new security, capacity and analytics challenges.”

“Data center managers will need to deploy more forward-looking capacity management platforms that can include a data center infrastructure management (DCIM) system approach of aligning IT and operational technology (OT) standards and communications protocols to be able to proactively provide the production facility to process the IoT data points based on the priorities and the business needs.”

3. Security and Privacy: Who Should Know What, and When

A constant topic at all our conferences in the region, security and privacy is a tricky issue because different stakeholders in the IoT ecosystem have different needs. To compound the issue, the huge diversity of device types, their different capabilities and the range of deployment scenarios make security a unique challenge.

However, the IoT can only reach its full potential if we have strong security and privacy safeguards in place, especially in industries like healthcare, finance and critical infrastructure. Security protocols should be built into the entire data chain, from sensors to data centres to applications.

With data as the new currency, it will become important to know who has ownership over what types of data. Open data sources would be important for rapid IoT development, but what sorts of information should be made available?

How should one sort the varying levels of data access at different periods of time? Does the subject of data collection have a say over what they share?

4. When diversity may not be a good thing

Gartner analyst Doug Laney in his original paper in 2001 on data management wrote that “the variety of incompatible data formats, non-aligned data structures, and inconsistent data semantics” was the principal barrier to effective data management.

The challenge of dealing with multiple data sources that differ in accuracy, format and standard is especially prevalent for large enterprises with legacy systems and an abundance of collected data serving different purposes.

These enterprises find that they need to harness this variety of data from different departments and sources to maximize the return from their analytics and to apply insights to as many areas of the enterprise as they can.
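
One common way to harness data from differing sources is to map every format onto a single shared schema before analysis. The Python sketch below illustrates the idea under stated assumptions: both input layouts (a legacy CSV line in Fahrenheit and a newer JSON payload in Celsius), the field names and the device names are hypothetical, not taken from any real system.

```python
import json
from datetime import datetime, timezone


def from_legacy_csv(line: str) -> dict:
    """Hypothetical legacy layout: 'device,epoch_seconds,temp_fahrenheit'."""
    device, epoch, temp_f = line.strip().split(",")
    return {
        "device_id": device,
        "timestamp": datetime.fromtimestamp(int(epoch), tz=timezone.utc).isoformat(),
        "temperature_c": round((float(temp_f) - 32.0) * 5.0 / 9.0, 2),
    }


def from_json_payload(payload: str) -> dict:
    """Hypothetical newer layout: {"id": ..., "ts": ISO-8601 with offset, "tempC": ...}."""
    data = json.loads(payload)
    return {
        "device_id": data["id"],
        # Convert whatever offset the device used into UTC.
        "timestamp": datetime.fromisoformat(data["ts"]).astimezone(timezone.utc).isoformat(),
        "temperature_c": round(float(data["tempC"]), 2),
    }


# Two sources, two formats, one common record shape.
records = [
    from_legacy_csv("pump-7,1448841600,98.6"),
    from_json_payload('{"id": "pump-9", "ts": "2015-11-30T08:00:00+08:00", "tempC": 36.5}'),
]
for record in records:
    print(record)
```

Once every source lands in the same shape (same keys, same units, same time zone), analytics written once can be applied across departments instead of being rebuilt per format.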

The diverse data also presents a security and scalability challenge that the industry is trying to solve with IoT standards. With more industry leaders trying to set their own unique standards, the final goal is still a distance away.

5. Competing for the talent

Every big new technology advancement brings the inevitable talent shortage. In IoT that key hire is often the data scientist or data architect.

“The data scientist has become the unicorn of the big data world,” said Rob Patterson, VP of corporate strategy at ColdLight, a PTC business. “It has been extremely difficult finding those people with programming skills, mathematical expertise, and business acumen.” 

There just aren’t enough people with the skills required to analyze data and transform it into actionable insights. Gartner found that more than half of the business leaders it interviewed felt that their big data efforts were constrained by their ability to find the right talent.

More companies are now teaming up with universities to address this issue. At our recent 6th Asia IoT Business Platform event in KL, Prof. Dr. Sharin bin Sahib from UTeM said it is important to have a talent pipeline in which potential candidates can be continuously nurtured and approached when vacancies arise. To meet this need, UTeM has opened an IoT academy with Samsung.

In the meantime, take a page out of Walmart’s book. The company launched an analytics competition on Kaggle, asking “armchair data scientists” to solve real-world problems from given data sets, and hired the designers of the best solutions. The approach led to some interesting hires who might not have been considered for interviews based on their past experience.

For more on market trends and outlook of IoT, join us for our series of Asia IoT Business Platform events in 2016.


This post was contributed by Yue Yeng Fong, Vice President Business Development at Industry Platform and was first published here.


Yue Yeng Fong

Nov 30, 2015
