To get a self-service big-data analytics platform up and running efficiently, many data-center and network managers these days would likely turn first to a cloud service. But not so fast – there is some debate about whether the public cloud is the way to go for certain big-data analytics.
For some big-data applications, the public cloud may be more expensive in the long run and, because of latency issues, slower than on-site private cloud solutions. In addition, keeping data storage on premises often makes sense for regulatory and security reasons.
With all this in mind, Dell EMC has teamed up with BlueData, the provider of a container-based software platform for AI and big-data workloads, to offer Ready Solutions for Big Data, a big data as a service (BDaaS) package for on-premises data centers. The offering brings together Dell EMC servers, storage, networking and services along with BlueData software, all optimized for big-data analytics.
“There are tradeoffs to cloud versus on-premise,” said Harold Kreitzman, a vice president of Strategic Advisory Services at the Edison Group who is doing a TCO analysis on Ready Solutions for Big Data for Dell EMC.
To be sure, public-cloud services give smaller companies access to the sort of compute power wielded by large enterprises with deep pockets, without all the upfront infrastructure costs, Kreitzman said. However, “cloud services charge for things that you normally don’t charge for on-premise, like data transfer,” Kreitzman said. When your analytics applications are constantly pulling data from the cloud, those costs can add up.
In addition, public cloud services are typically offered by geographical region, and adding a region can involve a cost jump of more than 20 percent, Kreitzman added. “Ironically you’d think that cloud solutions would be cheaper, but it turns out that the more regions you put into place the more expensive it gets,” he said.
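To make Kreitzman's two cost drivers concrete, here is a rough back-of-the-envelope sketch in Python. Every figure in it is hypothetical and chosen only for illustration; none comes from Dell EMC, BlueData, or the Edison Group analysis. It also assumes that each region beyond the first adds a flat uplift to the base platform cost, which is just one possible reading of the "more than 20 percent" jump.

```python
def monthly_cloud_cost(base_cost, regions, region_uplift,
                       egress_tb, egress_price_per_tb):
    """Estimate monthly spend: platform cost scaled by region count,
    plus data-transfer (egress) charges."""
    # Assumption: each region beyond the first adds `region_uplift`
    # (for example 0.20 for 20 percent) of the single-region base cost.
    platform = base_cost * (1 + region_uplift * (regions - 1))
    # Data pulled out of the cloud is billed per terabyte.
    transfer = egress_tb * egress_price_per_tb
    return platform + transfer


# Hypothetical inputs: $10,000 base cost, 20 percent uplift per extra
# region, 50 TB of egress per month at $90 per TB.
for regions in (1, 2, 3):
    total = monthly_cloud_cost(10_000, regions, 0.20, 50, 90)
    print(f"{regions} region(s): ${total:,.0f} per month")
```

Under these made-up inputs, both the transfer fee and each added region push the bill above the single-region base price, which is the pattern Kreitzman describes.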
With an $800,000 entry price point, though, Dell EMC's Ready Solutions for Big Data is likely to appeal mainly to larger companies. Depending on the configuration ordered, the enabling software and hardware technologies include:
- BlueData EPIC software: BlueData’s EPIC (short for “Elastic Private Instant Clusters”) platform uses Docker container technology to let users spin up virtual Hadoop or Spark clusters within minutes, the company says, giving analysts and data scientists on-demand access to the applications, data, and infrastructure. Related software includes Spark, Kafka and Cassandra.
- Servers: Dell EMC PowerEdge R640 or R740xd servers running on Intel Xeon processors and Red Hat Enterprise Linux. The package also offers Nvidia Tesla V100 GPU accelerators.
- Switches: The S5048-ON multirate 25GbE ToR switch supports either 48 ports of 25GbE plus six ports of 100GbE, or 72 ports of 25GbE. The S3048-ON switch features 48x 1GbE and 4x 10GbE ports.
The BDaaS package also includes various services in the base price, and optional services for ongoing support. “We provide an integral, holistic experience,” said Kevin Gray, director of product marketing for Dell EMC, speaking at the Strata Data conference in New York Tuesday. “We do the installation, and it’s more than that – we have our Accelerate services, which not only help with the installation but helps customers with their first use case and understand what the roadmap is for use cases over time; so we’re helping them understand how to use their analytics better in their environment.”
Among other things, the Dell EMC BDaaS package helps solve a typical problem in enterprises, Gray said. Often, a company will put together a big-data analytics solution for a group of analysts or data scientists, only to then turn around and do another project for another group of users.
“Over time, the company finds itself doing multiple projects and replicating data over and over again,” Gray said. The Ready Solutions offering serves to consolidate software, data and hardware, cutting costs and application development efforts, he said.