The unfortunate truth is that big data has a lot of enemies, and the largest group of opponents might just be those of us who have simply been force-fed too much media spin on the subject in the first place. Of course, there is no real 'big data': all the data out there is still just data. As we know, big data is just another name for a large body of data (often unstructured, such as video and emails) in the new massive ‘webscale’ world of the Internet.
Data is the plural (of datum) anyway, so why all the ‘big’ data fuss in the first place?
The term big data has come to mean something like this: an amount of data that will not practically fit into a standard (relational) database for analysis and processing. The cause is the huge volume of data created by the Internet of Things, as well as by machine-generated and transactional processes.
So if big data is so hard, who or what are the big data enemies that make dealing with it so tough?
Enemy #1 – IT architecture
Sumit Nijhawan, CEO & President of Infogix, argues that technology itself is the first major big data enemy. More specifically, Nijhawan is referring to the architectural challenge of integrating ‘elements’ and data models for big data, which requires proper planning of design and architecture.
“Data veracity and data silos are the biggest technical challenges, architecturally. Diverse data sources and repositories make it difficult to keep data consistent and accurate all the time. To fix the issue we must implement means to identify data redundancies and inconsistencies by appropriately planning data management and governance strategies,” he said.
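As a rough illustration of what identifying "data redundancies and inconsistencies" across silos could involve, here is a minimal Python sketch. It is not Infogix's method: the `audit_records` function, the `customer_id` key and the two sample sources are all hypothetical, and a real governance process would need fuzzy matching, schema mapping and far more besides.

```python
from collections import defaultdict


def audit_records(sources, key="customer_id"):
    """Group records from several sources by a shared key (an assumed
    field name) and report redundant copies and conflicting values."""
    seen = defaultdict(list)  # key value -> list of (source name, record)
    for source_name, records in sources.items():
        for record in records:
            seen[record[key]].append((source_name, record))

    redundant, inconsistent = [], []
    for key_value, copies in seen.items():
        if len(copies) < 2:
            continue  # held in one place only: nothing to compare
        first = copies[0][1]
        if all(rec == first for _, rec in copies[1:]):
            redundant.append(key_value)      # identical copies in several silos
        else:
            inconsistent.append(key_value)   # same key, conflicting values
    return sorted(redundant), sorted(inconsistent)


# Hypothetical sample data standing in for two separate repositories.
crm = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.com"},
]
billing = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.org"},
    {"customer_id": 3, "email": "c@example.com"},
]

redundant, inconsistent = audit_records({"crm": crm, "billing": billing})
print("redundant:", redundant)
print("inconsistent:", inconsistent)
```

Here customer 1 is stored identically in both places (a redundancy), while customer 2 has two different email addresses (an inconsistency). Deciding which copy is correct is the governance question Nijhawan is pointing at.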
Enemy #2 – Amateur data science
Pentaho director of enterprise solutions Wael Elrifai argues that big data has given rise to many people who now want to call themselves data scientists — and that this trend has led to a group of people who derive all manner of bizarre, unreliable and incorrect conclusions.
“That’s what happens when you apply statistical techniques without understanding how they work,” said Elrifai. “If I’m interviewing someone who can’t explain BLUE, selection bias, or tell me the difference between Bayesian and Frequentist inference, then the interview is over (surprisingly few can). The power of big data must come with responsibility. You wouldn’t put a 14 year-old in the driver’s seat of a Scuderia Ferrari… they’d have incredible power at their disposal but the odds of a good outcome would be statistically slim!”
Enemy #3 – Resources
Nijhawan of Infogix also tables point three in this list: the resources problem, meaning the shortage of people who are able to analyze data, draw conclusions and help organizations make better business decisions based on that data.
Echoing point #2, Nijhawan says that as analytics experiences exponential growth, most organizations face a lack of resources or analysts to handle and process big data elements and draw insights by observing the underlying patterns in data.
“It’s for this reason that many colleges and universities are starting specialized programs for analytics. However, it will take a few more years to fill this gap. As an alternative, those organizations seeking analytics resources can cross-train some of their people on analytics functions that are critical to the business. Hire the right talent that can develop an analytics framework that addresses business problems through automated data solutions,” he said.
Enemy #4 – Culture
Nijhawan also points to culture and says that when we consider culture, we think about the ways an organization makes decisions based on what has previously been done successfully, or sometimes, unsuccessfully.
“However, by leveraging analytics, organizations can evolve their existing classical decision making process based on past learnings or intuitions, into a logic-based decision support system that is based on evidence. The biggest complication in adopting analytics is the unbending mindset that is cautious to change and is complacent with existing legacy systems. It’s for this reason that change in culture is one of the key factors in successful adoptions of analytics. Leadership teams need to encourage their organization to make analytics-driven decisions. This helps develop a culture that leverages analytics for better decision making,” said Nijhawan.
Enemy #5 – Classification
Bob Plumridge, EMEA CTO at Hitachi Data Systems, provides point #5 in this list and says that managing the huge increases in the volume of data flowing through a business each day is a major challenge for IT teams.
“As the guardians of information, businesses are looking to IT departments to navigate where their data currently resides and interpret how it can be used. The issue for IT teams is that they are going in data blind. Often data isn’t classified at the point of creation, leaving businesses with no way of knowing whether they are looking at HR, sales or customer data,” said Plumridge.
The Hitachi Data Systems man reinforced the point by saying that, with the majority of data holding little to no value, classification is paramount to ensure businesses retain the crucial 20 per cent. “Finding the right balance in predicting what might be deemed important in the future and today’s regulatory and storage requirements is a hurdle that organisations are increasingly tackling,” he said.
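To make the idea of classifying data "at the point of creation" concrete, here is a minimal Python sketch. It is an illustration only, not Hitachi Data Systems' approach: the `RULES` keyword table, the `classify` function and the `Document` class are hypothetical, and a real scheme would be driven by governance policy rather than simple keyword matching.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical keyword rules; real categories would come from policy.
RULES = {
    "hr": ("salary", "payroll", "employee"),
    "sales": ("invoice", "quote", "order"),
    "customer": ("complaint", "support", "account"),
}


def classify(text):
    """Return the first category whose keywords appear in the text."""
    lowered = text.lower()
    for label, keywords in RULES.items():
        if any(word in lowered for word in keywords):
            return label
    return "unclassified"


@dataclass
class Document:
    text: str
    category: str = field(init=False)
    created_at: datetime = field(init=False)

    def __post_init__(self):
        # The label is attached as the record is created, so nobody
        # has to go in "data blind" later.
        self.category = classify(self.text)
        self.created_at = datetime.now(timezone.utc)


docs = [
    Document("Payroll run for March"),
    Document("Invoice 1042 for Acme"),
    Document("Meeting notes"),
]
categories = [d.category for d in docs]
print(categories)
```

Anything the rules cannot place is labelled `unclassified` rather than guessed at, which at least makes the unknown portion of the data visible and measurable.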
Enemy #6 – Translation
Alastair Sorbie, CEO of high-end ERP vendor IFS, thinks that enemy #6 is translation. His argument is that accurate data is not only crucial for informed decision-making; it also needs to be effectively translated so that you can do something useful with it.
“It’s not just a case of mapping data to business processes, but rather to points in a process where decisions can be made. To combat the enemy of ‘translating data into something useful’, visual insights can help decision makers turn data into an easily understandable format so they can make the right decisions at the right time,” said Sorbie.
Enemy #7 – Scope
As we asked before on Forbes when questioning when big data analytics is a waste of time, the matter of scope (which big data we choose to assess) makes it hard to know just how big our big data universe should be.
Big data clearly has a lot of negative factors (enemies) affecting how successfully we are able to deal with it in real-world use cases. This list is intended as an introduction to what may, in reality, be a much more complex world of enemy factors.
As Datastax CEO Billy Bosworth pointed out this week, there are software developers… and then there is a new breed of employee that we are calling ‘data engineers’. These are the good guys who can help us fight the big data enemies on the front line.