5 top data challenges that are changing the face of data centers

Data is clearly not what it used to be! Organizations of all types are finding new uses for data as part of their digital transformations. Examples abound in every industry, from jet engines to grocery stores, of data becoming key to competitive advantage. I call this new data because it is very different from the financial and ERP data we are most familiar with. That old data was mostly transactional and privately captured from internal sources, and it drove the client/server revolution.

New data is both transactional and unstructured, publicly available and privately collected, and its value is derived from the ability to aggregate and analyze it. Loosely speaking, we can divide this new data into two categories: big data – large aggregated data sets used for batch analytics – and fast data – data collected from many sources and used to drive immediate decision making. The big data–fast data paradigm is driving a completely new architecture for data centers, both public and private.

Over the next series of blog posts, I will cover each of the top five data challenges presented by new data center architectures:

  1. Data capture is driving edge-to-core data center architectures: New data is captured at the source. That source might be beneath the ocean in the case of oil and gas exploration, satellites in orbit in the case of weather applications, your phone in the case of pictures, video and tweets, or the set of a movie. The volume of data collected at the source will be several orders of magnitude higher than we are familiar with today.
  2. Data scale is driving data center automation: The scale of the large cloud providers is already such that they must invest heavily in automation and intelligence to manage their infrastructures. Manual management is simply cost-prohibitive at the scale at which they operate.
  3. Data mobility is changing global networks: If data is everywhere, then it must be moved in order to be aggregated and analyzed. Just when we thought (or hoped) that 40 to 100 Gbps networks were finally outpacing bandwidth demand, data movement is likely to increase 100x to 1,000x; a back-of-envelope sketch of what that means follows this list.
  4. Data value is revolutionizing storage: In a previous blog post, “Measuring the economic value of data,” I introduced a way of thinking about and measuring data value. There is no question that data is becoming more valuable to organizations, and that the usefulness of data over longer periods of time is growing as a result of machine learning and artificial intelligence (AI) based analytics. This means that more data needs to be stored for longer periods of time, and that the data must be addressable in aggregate for analytics to be effective.
  5. Data analytics is the driver for compute-intensive architectures in the future: The nature of analytics, and of machine learning in particular, drives organizations to keep more data and aggregate it into big data repositories. These types of analytics provide better answers when applied against multiple, larger data sources. Analytics and machine learning are compute-intensive operations, so analytics on large datasets drives a large amount of high-speed processing. At the same time, the compute-intensive nature of analytics is driving many new ways to store and access data, from in-memory databases to 100-petabyte-scale object stores.
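To make the data mobility challenge (No. 3) concrete, here is a minimal back-of-envelope sketch in Python. The 1 PB dataset size and the 70% usable-link-efficiency figure are illustrative assumptions, not numbers from this post; only the 40 and 100 Gbps link speeds come from the list above.

```python
# Back-of-envelope transfer times for bulk data movement.
# Dataset size (1 PB) and link efficiency (70%) are illustrative assumptions;
# the 40 and 100 Gbps link speeds are the ones mentioned in the list above.

def transfer_time_hours(dataset_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Hours needed to move dataset_tb terabytes over a link_gbps link,
    assuming only `efficiency` of the raw line rate is usable."""
    dataset_bits = dataset_tb * 1e12 * 8       # terabytes -> bits (decimal units)
    usable_bps = link_gbps * 1e9 * efficiency  # usable bits per second
    return dataset_bits / usable_bps / 3600

for link_gbps in (40, 100):
    hours = transfer_time_hours(1000, link_gbps)  # 1,000 TB = 1 PB
    print(f"1 PB over {link_gbps} Gbps: ~{hours:.0f} hours")

# Roughly 79 hours at 40 Gbps and 32 hours at 100 Gbps,
# before any 100x to 1,000x growth in data movement.
```

Even a single petabyte ties up a fast link for a day or more, which is why a 100x to 1,000x increase in data movement forces a rethink of global networks.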

Challenge No. 1: Data capture is driving edge-to-core data center architectures

New data is captured at the source, and the volume collected there will be several orders of magnitude higher than we are familiar with today. For example, an autonomous car will generate up to 4 terabytes of data per day. Scale that to millions – or even billions – of cars, and we must prepare for a new data onslaught.
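As a rough illustration, the sketch below scales the 4 TB-per-car-per-day figure cited above across hypothetical fleet sizes; the fleet counts are assumptions chosen only to show the arithmetic, not projections.

```python
# Scaling the 4 TB/day-per-car figure cited above.
# Fleet sizes are illustrative assumptions, not projections.

TB_PER_CAR_PER_DAY = 4
TB_PER_EB = 1_000_000  # 1 exabyte = 1,000,000 TB (decimal units)

for fleet_size in (1_000_000, 100_000_000, 1_000_000_000):
    total_eb_per_day = fleet_size * TB_PER_CAR_PER_DAY / TB_PER_EB
    print(f"{fleet_size:>13,} cars -> {total_eb_per_day:,.0f} EB of new data per day")

# Roughly: 1 million cars -> 4 EB/day; 100 million -> 400 EB/day;
# 1 billion -> 4,000 EB/day (about 4 zettabytes).
```

At even one million cars, a single day's capture dwarfs what most organizations store today, which is exactly the edge-to-core pressure this challenge describes.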
