Data lakes may be more like data dumps today

Dr Ryad Soobhany, Director of MSc Projects, School of Mathematical and Computer Sciences, Heriot-Watt University Dubai.
Due to the interconnectivity and tech-savviness of the modern world, a massive amount of big data is being generated. Research firm International Data Corporation (IDC) has forecast that the data generated yearly worldwide will grow from 33 ZB, or 33 trillion gigabytes, in 2018, through 59 ZB in 2020, to 175 ZB by 2025. The data can be generated from media, web, IoT, financial and medical sources.
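As a quick check on these figures, the short Python sketch below converts zettabytes to gigabytes and works out the average yearly growth the forecast implies. It assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes). The growth-rate calculation and the function names are illustrative additions, not part of the IDC forecast.

```python
# Assumption: decimal (SI) units, 1 ZB = 10**21 bytes and 1 GB = 10**9 bytes.
BYTES_PER_ZB = 10**21
BYTES_PER_GB = 10**9


def zb_to_gb(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


gb_2018 = zb_to_gb(33)                                # 33 ZB expressed in GB
rate = annual_growth_rate(33, 175, 2025 - 2018)       # implied yearly growth

print(f"2018 volume: {gb_2018:,} GB")
print(f"Implied average yearly growth 2018-2025: {rate:.1%}")
```

The conversion confirms that 33 ZB is 33 trillion gigabytes, and the implied growth works out to a little over a quarter more data each year.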

The data can be stored in data lakes in unstructured raw format. Organisations need effective storage management to hold and handle their growing data. Data analytics and visualisation techniques are used to extract knowledge from the data and present useful information to end users.

Healthcare and life sciences, banking and finance, IT and telecommunication, manufacturing, energy and utilities, media and entertainment, and government are some of the key sectors currently investing in big data technologies and data lakes. 

The investment by the healthcare and life sciences segment is likely to grow exponentially during the next five years in an effort to collate patient data more effectively, improve overall patient experience, and enable data-driven, actionable analytics. 

In the banking sector, predictive data analytics is already transforming institutions, helping them strengthen their risk assessment and optimise their operations.

The data analysis needs to concentrate on extracting and presenting relevant information so that organisations can focus on their main business intelligence goals and projects. The data in the data lake is not pre-cleaned, and users could be receiving data that triggers more questions than answers. 

Moreover, the lack of conformed dimensions in data lakes means that data scientists must apply advanced analytics, machine learning and visualisation to make sense of the data. Since the data dumped into a data lake is undefined, it is not known whether the data is actually useful until big data analytics are applied.
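To illustrate why raw lake data has to be examined before its usefulness is known, here is a minimal Python sketch that profiles a handful of records. The records, the field names and the `profile_fields` helper are all hypothetical, invented for this illustration; real data lake tooling would work at a much larger scale.

```python
from collections import Counter

# Hypothetical raw records as they might land in a lake: no fixed schema,
# inconsistent field names and placeholder values.
raw_records = [
    {"customer_id": "C1", "amount": "42.50", "country": "AE"},
    {"customer_id": "C2", "amount": None},
    {"customer_id": "C3", "amount": "n/a", "country": "UK"},
    {"cust": "C4", "amount": "13.00"},
]

# Values treated as "missing" in this sketch (an assumption, not a standard).
MISSING = (None, "", "n/a")


def profile_fields(records):
    """Return, per field name, the share of records carrying a usable value."""
    filled = Counter()
    for record in records:
        for field, value in record.items():
            if value not in MISSING:
                filled[field] += 1
    return {field: count / len(records) for field, count in filled.items()}


coverage = profile_fields(raw_records)
for field, share in sorted(coverage.items()):
    print(f"{field}: usable in {share:.0%} of records")
```

Even on four records, the profile shows that the customer identifier is split across two field names and that half of the amounts are unusable, which is the kind of finding that only emerges once the data is analysed.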

Millions of ad dollars are lost in running inefficient marketing campaigns, but many brands and agencies have now started to leverage big data to derive useful insights on customer behaviour that help them run more targeted and impactful campaigns.

Netflix is one example of a large global brand that uses big data analytics for targeted advertising, which has helped it cement a top position in the online, on-demand entertainment space.

Healthcare providers are collaborating with manufacturers of smartwatches and other IoT devices to monitor patients' health and securely share their data. Examples include Apple’s ResearchKit and CareKit frameworks.
