Companies today face incredible challenges around compliance, security and analytics, as their data lakes fill with invaluable information from ever more sensors. And tomorrow’s challenges will be no easier. As the digital age expands to cover all facets of our lives, more and more computing power will be necessary to process all of the data created.
Take the explosion of the Internet of Things (IoT) as an example. We have only begun to sample the benefits that the IoT can provide. In the words of Dan Mitchell, a retail analytics industry expert with SAS, IoT can be fundamentally described as “a network of connected physical objects embedded with sensors. IoT allows these devices to communicate, analyze, and share data about the physical world around us via networks and cloud-based software platforms.”
The concept of devices collecting data and sending it somewhere to be analyzed is not new. Heart monitors have been used for decades to collect and send data to cardiologists about the health of their patients’ hearts. But in a data-driven world, collecting data is only one part of the equation; the capability of analyzing this data in real time takes IoT to an exciting level. At the same time, the amount of data the IoT creates already, and will create in the future, is truly staggering.
Luckily, the mainframe’s speed and capacity let it analyze data across your entire enterprise in real time, which means it can process your IoT data without a hiccup. I’ll explain what I mean.
The internet of where we buy our things
In the brick-and-mortar retail industry, tools such as mobile apps, Wi-Fi tracking, RFID inventory tracking, and traditional in-store infrared foot-traffic counters all contribute to a retailer’s data lake for the purpose of understanding their consumers and providing a customized shopping experience. But providing the connected consumer with a frictionless shopping experience means analyzing the data in your lake in real time to target your consumers with features such as in-store sales ads.
Some of the data in your lake is perishable and should be leveraged immediately while the consumer is still in the store, while other data is timeless and will provide insight for future use. But the question becomes this: How do you make sense of the data being collected by so many connected devices?
All your water in different lakes
Specific devices often store their data in specific databases. With so many devices collecting data, it’s not unusual for a retailer to end up with multiple databases, each warehousing different types of data, stored both on the mainframe and on a myriad of distributed platforms. The challenge then becomes how to design a process that enables data scientists to access IoT data in a manner that facilitates real-time analytics.
Extract, Transform, Load (ETL) processes, along with the related Extract, Load, Transform (ELT) variant in which the transformation runs after the data has been loaded into the target, are often used to move data from one database or platform to another and transform it into one homogeneous set of data on which to perform analytics. It makes sense to leverage the power and speed of the mainframe to process IoT data by ETL-ing all relevant data from distributed platforms to the mainframe. But the ETL process often involves multiple pieces of software and multiple steps, so what is meant to be real-time analysis ends up as near-real-time analysis at best.
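To make the extra steps concrete, here is a minimal sketch of an ETL pass in Python. It is an illustration only: the in-memory SQLite databases stand in for a distributed source and a mainframe target, and the table names, columns, and cleanup rules (`foot_traffic`, `foot_traffic_clean`, dropping negative counts) are hypothetical, not taken from any real retail system.

```python
import sqlite3


def extract(source):
    """Step 1: copy the raw rows out of the source database."""
    return source.execute(
        "SELECT sensor_id, entered_at, visitors FROM foot_traffic"
    ).fetchall()


def transform(rows):
    """Step 2: normalize sensor ids and drop impossible (negative) counts."""
    return [
        (sensor_id.strip().upper(), entered_at, visitors)
        for sensor_id, entered_at, visitors in rows
        if visitors >= 0
    ]


def load(target, rows):
    """Step 3: write the cleaned rows into the analytics database."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS foot_traffic_clean "
        "(sensor_id TEXT, entered_at TEXT, visitors INTEGER)"
    )
    target.executemany("INSERT INTO foot_traffic_clean VALUES (?, ?, ?)", rows)
    target.commit()


def run_etl():
    """Run all three steps against hypothetical sample data."""
    source = sqlite3.connect(":memory:")  # stand-in for a distributed platform
    target = sqlite3.connect(":memory:")  # stand-in for the analytics platform
    source.execute(
        "CREATE TABLE foot_traffic "
        "(sensor_id TEXT, entered_at TEXT, visitors INTEGER)"
    )
    source.executemany(
        "INSERT INTO foot_traffic VALUES (?, ?, ?)",
        [
            (" door-1 ", "2017-02-01T10:00", 12),
            ("door-2", "2017-02-01T10:00", 7),
            ("door-3", "2017-02-01T10:00", -1),  # faulty reading, dropped
        ],
    )
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT sensor_id, entered_at, visitors "
        "FROM foot_traffic_clean ORDER BY sensor_id"
    ).fetchall()


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

Even in this toy form, the data is copied twice and nothing can be queried until the last step finishes, which is where the latency comes from.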
Data virtualization brings it all together
Thankfully, innovations on the mainframe make it possible to leave IoT data in the databases where it resides and join the different data sources to perform your analytics. After all, with so many IoT devices collecting data and storing it in different locations, eliminating the ETL/ELT process altogether certainly sounds like a more efficient means of analyzing data.
The concept of data virtualization allows users to define the structure of a data source in a relational format so that SQL can be run against that data. This means disparate hierarchical data sources can be joined via SQL, just like relational data sources, creating an aggregate, enterprise-wide view of the available data. Reading IoT data in situ using SQL can be performed on the mainframe using its enhanced capabilities and can eliminate the ETL/ELT process entirely. By using the workhorse platform retailers already have within their infrastructure, they can gain real-time access to their IoT data and provide their customers with a frictionless and customized shopping experience. All from the mainframe.
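The idea of joining sources where they live can be sketched in a few lines of Python. This is a rough analogy rather than a mainframe data virtualization product: SQLite's `ATTACH DATABASE` is used as a stand-in so that two separate databases can be queried with one SQL statement and no copy step. The schema names (`rfid`, `traffic`), tables, and sample values are all hypothetical.

```python
import sqlite3


def build_federated_connection():
    """Open one connection that can see two separate databases in place."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent database; in practice these would
    # be existing stores owned by different device systems.
    conn.execute("ATTACH DATABASE ':memory:' AS rfid")
    conn.execute("ATTACH DATABASE ':memory:' AS traffic")
    conn.execute(
        "CREATE TABLE rfid.inventory (zone TEXT, sku TEXT, on_shelf INTEGER)"
    )
    conn.execute("CREATE TABLE traffic.counts (zone TEXT, visitors INTEGER)")
    conn.executemany(
        "INSERT INTO rfid.inventory VALUES (?, ?, ?)",
        [("shoes", "SKU-1", 2), ("shoes", "SKU-2", 40), ("toys", "SKU-3", 1)],
    )
    conn.executemany(
        "INSERT INTO traffic.counts VALUES (?, ?)",
        [("shoes", 120), ("toys", 15)],
    )
    return conn


def busy_zones_low_stock(conn, min_visitors, max_on_shelf):
    """Join both sources with a single SQL query; no data is moved first."""
    query = """
        SELECT i.zone, i.sku, i.on_shelf, c.visitors
        FROM rfid.inventory AS i
        JOIN traffic.counts AS c ON c.zone = i.zone
        WHERE c.visitors >= ? AND i.on_shelf <= ?
        ORDER BY i.zone, i.sku
    """
    return conn.execute(query, (min_visitors, max_on_shelf)).fetchall()


if __name__ == "__main__":
    connection = build_federated_connection()
    for row in busy_zones_low_stock(connection, min_visitors=100, max_on_shelf=5):
        print(row)
```

The query answers a question that spans both sources (which busy zones are running low on stock) while the foot-traffic and inventory data stay in their own databases.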
Collecting data from sensors and devices will only increase over the coming decade. A February 2017 report from Gartner states that there are already more devices on the internet than people. So, in an age with ever-growing data collection happening across all industries, ETL-ing data for analysis will become a thing of the past.
Eliminating latency will become as critical to a company’s success as any traditional business-critical process. Leveraging the power of the mainframe to virtualize your IoT data in real time will become standard practice for putting critical pieces of information in one place for real-time analysis.