David Stephenson, Chief Data Officer and Managing Director, Data Science Innovation
Can you give a basic outline of the relationship between structured and unstructured data, and how that may create challenges for companies across the Middle East?
Structured data is data with nicely pre-defined fields, such as name, address, size, etc. It’s the type of data that typical data systems have worked with over the past decades. Unstructured data, such as free text or video streams, does not come in a specific format. Being less restricted, it has the potential to include a great amount of useful information that we may not expect to receive. However, the lack of structure makes it that much more difficult to process and more challenging to draw insights from.
Can you identify some of the greatest technical challenges facing organisations which are looking to adopt a big data strategy?
I believe the biggest challenge is not the technology itself so much as linking technology to the business case. Many people get excited about products and vendors, but far fewer people seem to see the real strengths and weaknesses of placing the products within their organizations or know how to decide between a small-data solution and a big-data solution.
What are the current big data analytics technical solutions available to companies to overcome these challenges?
Big data analytics solutions have been progressing for several years, including implementations of R on Hadoop, Apache Mahout, and the in-database analytics that major analytics vendors offer within their MPP (massively parallel processing) databases. These provide solutions within certain contexts, but the underlying challenge remains to develop a business-appropriate analytics framework rather than trying to replace brainpower with horsepower.
What upcoming analytical innovations do you feel will be able to solve the current big data limitations on the market?
I believe cloud computing is helping in terms of lowering the entrance barrier to big data analytics. If nothing else, it allows companies to experiment with the possibilities without the large initial investment of time and money that would otherwise be required.
Beyond delivering customer insights, in which areas of business can big data improve operational efficiencies and strategies?
It varies per industry, but marketing and product improvements are two big ones (beyond customer insights). A good thought exercise to show this is to identify something you’d like to observe in detail in order to gain some business insight, be it a customer journey, the impact of a product change, an individual’s response to a marketing effort, a set of video footage, web logs, etc. Now imagine being able to carry out that analysis simultaneously for every single customer journey, response, video or event.
How is new technology affecting worldwide big data strategy and influencing modern business models, with respect to the Middle East and the wider market in general?
The ecosystem around Apache products such as Hadoop and Spark continues to mature. Open source and proprietary analytics solutions are being further developed to run as MapReduce jobs, and companies such as Cloudera and Hortonworks are producing tools that increase the usability of Big Data solutions in an enterprise setting.
What are the most important recent developments in big data and analytics that you believe will have the biggest impact on the industry? What developments should the market be aware of that they aren’t already?
I would say that the biggest potential is from solutions with the lowest threshold of implementation. In that sense, it’s interesting to see the product developments from the larger providers of cloud solutions, such as Amazon's recently expanded product offerings and Microsoft's acquisition of Revolution Analytics. The low threshold of cloud solutions means that Big Data developments in this area have perhaps the greatest potential for impacting the market.