From on-premises to cloud analytics
Even in industries like semiconductor manufacturing that have been relatively slow to embrace a cloud solution for analytics, cloud computing trends are reshaping data architectures. Traditional reliance upon on-premises data warehouses is gradually giving way to cloud infrastructures and to hybrid architectures, which blend on-premises company data center resources with public cloud services.
Companies are shifting toward cloud analytics to deal with exponential growth in data volumes, as well as increasing complexity of data types and operations across their supply chains, according to a 2021 Semiconductor Engineering article. At the same time, on-premises options aren’t disappearing completely for these firms and their peers, due to considerations about:
- On-demand costs of cloud compute and cloud storage
- Data reliability and performance over cloud connections
- Security and control of proprietary company data
The situation underscores the broader question of how on-premises and cloud architectures compare, and when and why to use each one. Although it’s common to frame the issue as an either/or “on-premises vs. cloud” situation, in reality many organizations need, or can at least benefit from using, both architectures.
On-premises vs. cloud for analytics: The key differences
An on-premises environment is a single-tenant setup, supported by purpose-built hardware like Teradata IntelliFlex or a company’s existing data center infrastructure. It is located on-site, hence the name. VMware virtual machines can also serve as the foundation for running on-premises apps.
Public cloud computing is a cloud service model through which a cloud provider shares some combination of infrastructure, platforms, and applications with multiple organizations over a network. The provider’s data center resources are partitioned customer by customer, with each customer subscribing or paying as they go for the right to use those services.
One or more public cloud services can be connected to a single-tenant data platform running in a customer’s environment. Combining these public clouds with on-premises infrastructure creates a hybrid cloud.
Beyond these basic distinctions in deployment location and responsibility, on-premises and cloud computing differ in more specific ways:
Cost
- On-premises: Most on-premises costs come from purchasing, managing, and maintaining data center infrastructure, such as a server or an onsite data warehouse. Capital expenditures, especially for hardware, dominate.
- Cloud: Cloud costs, for data warehousing at least, are predominantly subscription-based. Cloud server capacity may be reserved in advance from a cloud provider. Operating expenditures dominate.
Performance
- On-premises: Because it’s so physically close to an organization’s data sources and data architectures, on-premises infrastructure can have low latency, assuming it’s well-maintained, properly sized, and kept up-to-date.
- Cloud: Cloud infrastructure providers offer access to virtually unlimited resources, which provide the performance necessary for advanced data analytics when paired with a modern data platform.
Scalability and elasticity
- On-premises: Scaling an on-premises environment means adding more physical or virtual infrastructure, often at considerable cost and without the ability to closely match capacity to actual need, so overprovisioning is common.
- Cloud: Public clouds make it easier to scale up and down as analytics workload requirements evolve. Clouds also provide elasticity: the ability to resize resources rapidly as dynamic workloads change.
Security
- On-premises: Securing an on-premises data center is the sole responsibility of the enterprise. The company maintains complete control over security practices and tooling, with all of the advantages and disadvantages that come with it.
- Cloud: Cloud security is a shared responsibility between service provider and customer. The former oversees different layers of the security stack depending on the type of cloud service in use (infrastructure as a service, software as a service, etc.), with the rest falling to the customer. Data security will always fit into the latter bucket.
Control and innovation
- On-premises: The enterprise controls how its on-premises environment works, along with its long-term roadmap, primarily at the hardware level. It buys and maintains infrastructure to support desired levels of performance, storage needs, the specific data architectures (data warehouse, data lakes, etc.) in use, and all data loading and querying.
- Cloud: Cloud service providers (CSPs) and independent software vendors (ISVs) offer an ever-evolving set of software-based services for storing, processing, and scaling enterprise data. Cloud-delivered AI and machine learning innovations in particular are important capabilities for advanced analytics workloads, and may be bundled with cloud service offerings. Overall, the pace of innovation and availability is much faster in the cloud than on-premises.
The pros and cons of on-premises vs. cloud analytics
These key differences between on-premises and cloud resources don’t necessarily favor one option over the other. The ideal choice depends on specific business requirements — and, in fact, could be a combination of both.
Even though cloud is the future, there is still a role for on-premises analytics in certain instances, such as when cloud might not offer the optimal mix of cost and performance for a given workload, or when existing investments need to be optimized. The most important pros and cons to consider when deciding where to run analytics workloads are:
On-premises pros
- Local performance for low latency
- Full control of the technology stack
- Physical access to infrastructure
- Preservation of data center assets
On-premises cons
- High hardware costs and depreciation
- Relatively limited elasticity
- Total responsibility for failures and security incidents
- Less access to cutting-edge analytics capabilities
- Complications with end of support and replacement
On-premises summary and use cases
On-premises data analytics implementations have been the norm at many companies, which rely on the combination of speed, control, and familiarity that on-premises infrastructure offers to load and query business data.
On-premises analytics can also be folded into a hybrid cloud architecture that uses public cloud resources for bursting, or for specific uses like disaster recovery. Performance-sensitive workloads may be better suited to on-premises and hybrid cloud deployments than to cloud-only data warehouses.
Reasons to migrate away from on-premises architectures include the mounting costs of the hardware they require, along with outdated functionality and difficulties with scalability and elasticity.
For example, an aging on-premises warehouse was the impetus for Australian private insurance provider Medibank to migrate to a multi-cloud data platform. It shifted to a data platform deployed on a public cloud to access and extract insights from its data more quickly.
Cloud pros
- Immense scalability and elasticity
- Pricing model is "pay for what you use"
- Provider responsible for uptime, availability, and some aspects of security
- Rapidly expanding feature set with innovations like AI and machine learning
Cloud cons
- Latency that may be excessive for some workloads
- Unpredictable pricing if billed exclusively on-demand
- Not the most cost-effective for workloads with constant, non-variable utilization
- Hybrid architectures require tight compatibility and interoperability
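The pricing trade-offs in the lists above can be made concrete with a rough sketch. The example below compares capacity sized for peak demand (the on-premises pattern) against pure pay-for-what-you-use billing for a steady and a spiky workload. Every rate and demand figure is a hypothetical placeholder chosen for illustration, not a quote from any provider.

```python
# Illustrative only: all rates and demand figures are hypothetical.

def fixed_capacity_cost(demand, rate_per_unit_hour):
    """Cost when capacity is sized for peak demand and paid for every hour."""
    peak = max(demand)
    return peak * rate_per_unit_hour * len(demand)


def pay_per_use_cost(demand, rate_per_unit_hour):
    """Cost when each hour is billed only for the units actually used."""
    return sum(demand) * rate_per_unit_hour


# Hypothetical hourly demand (compute units) over one day.
steady = [10] * 24             # constant utilization
spiky = [2] * 20 + [10] * 4    # low baseline with a four-hour peak

FIXED_RATE = 1.0       # assumed cost per unit-hour of fixed capacity
ON_DEMAND_RATE = 2.5   # assumed premium per unit-hour for on-demand billing

for name, demand in (("steady", steady), ("spiky", spiky)):
    fixed = fixed_capacity_cost(demand, FIXED_RATE)
    usage = pay_per_use_cost(demand, ON_DEMAND_RATE)
    cheaper = "fixed capacity" if fixed < usage else "pay-per-use"
    print(f"{name}: fixed={fixed:.0f}, pay-per-use={usage:.0f} -> {cheaper}")
```

Under these assumed numbers, the steady workload is cheaper on fixed capacity while the spiky one is cheaper on pay-per-use, which is the pattern the pros and cons describe. Real comparisons would also need to account for staffing, depreciation, and storage costs.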
Cloud use cases
Cloud data warehouses, data lakes, and data lakehouses have become viable alternatives to on-premises infrastructure. They offer superior elasticity, plus a broad and expanding set of capabilities for advanced analytics via AI and machine learning processing.
As data volumes increase and the number of data sources proliferates, cloud computing services provide the vast resources and flexibility to adapt. A 2021 report from Seagate and IDC found that enterprise data growth is expected to exceed 40% over the next two years, due in large part to data analytics projects.
But making the most of cloud analytics — and avoiding excessive costs from on-demand pricing — requires a coherent strategy, supported by a modern data platform that can connect to multiple clouds and to on-premises resources as needed. Seagate and IDC respondents cited concerns with the complexity of managing all of these different environments as a top challenge.
For casual restaurant company Brinker, a connected multi-cloud platform delivered the precise combination of low latency and easy, consistent access to data that had been missing from both its aging legacy environment and a cloud-only data warehouse it had previously tried. The organization eliminated silos and gained the ability to reliably run and scale any type of workload.
Connecting on-premises and cloud: Hybrid cloud considerations
Brinker’s case shows that although “on-premises vs. cloud” is a popular and useful comparative frame, “on-premises and cloud” is frequently the reality. A hybrid cloud connecting on-premises environments to one or more public clouds is often the most practical solution for organizations that want to protect their existing investments, minimize latency, and still have access to the latest innovations in the cloud.
The concept of data gravity is a major driver of these hybrid cloud environments that strive to deliver the best of both worlds from the on-premises and cloud spheres. Data gravity is the idea that large, growing masses of data will inevitably attract more applications and services to them, becoming more complicated to move over time. Shifting components from one environment to another when setting up a hybrid cloud environment thus requires careful deliberation about:
- The locations of data gravity clusters and the data sources that they encompass.
- The potential cloud egress charges associated with moving large amounts of data in these clusters.
- The latency involved in these movements and whether it’s acceptable for the intended use case.
- Whether application components from one environment will work in another.
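As a back-of-the-envelope aid for the egress and latency questions above, the following sketch estimates what moving a data cluster might cost and how long the transfer could take. The per-gigabyte price, link speed, and efficiency factor are hypothetical inputs; actual egress pricing varies by provider, region, and volume tier.

```python
# Illustrative only: prices, link speed, and efficiency are assumptions.

def egress_cost_usd(data_tb, price_per_gb_usd):
    """Estimated egress charge, using 1 TB = 1,000 GB."""
    return data_tb * 1000 * price_per_gb_usd


def transfer_hours(data_tb, link_gbps, efficiency=0.8):
    """Estimated transfer time in hours over a link of the given speed.

    `efficiency` is an assumed fraction of nominal bandwidth that is
    actually achieved after protocol overhead and contention.
    """
    bits = data_tb * 8e12                      # 1 TB = 1e12 bytes = 8e12 bits
    effective_bps = link_gbps * 1e9 * efficiency
    return bits / effective_bps / 3600


if __name__ == "__main__":
    cluster_tb = 50          # hypothetical size of one data gravity cluster
    price = 0.09             # hypothetical egress price in USD per GB
    link = 10                # hypothetical dedicated link speed in Gbps
    print(f"cost:  ${egress_cost_usd(cluster_tb, price):,.0f}")
    print(f"time:  {transfer_hours(cluster_tb, link):.1f} hours")
```

Running the same estimate for each candidate cluster gives a quick way to rank which data is cheap to move and which is better left where it is.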
Hybrid cloud inevitably involves these types of complex considerations, but the overall journey can still be simplified with the right data platform.
How to make analytics work anywhere
In Medibank’s case discussed earlier, the fact that Teradata Vantage was the exact same software on-premises and in the cloud streamlined its analytics modernization project. Migration was straightforward, with no need to extensively recode its critical applications.
A modern data platform like Vantage provides flexible deployment options that deliver the same advanced functionality on-premises or in a public cloud of choice. Key capabilities include ingestion of any data source and format, sophisticated workload management, and enterprise-wide data access with no silos. In the cloud, it also offers blended pricing — reserved capacity plus on-demand, matched to customer needs and queries — to optimize costs.
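To illustrate why a blend of reserved capacity and on-demand usage can lower costs, here is a minimal sketch that searches for the reserved level with the lowest total cost for a given hourly demand profile. It is a generic model, not a description of Vantage's actual pricing; the function names, rates, and demand figures are all hypothetical.

```python
# Illustrative only: a generic blended-pricing model with assumed rates.

def blended_cost(demand, reserved_units, reserved_rate, on_demand_rate):
    """Total cost when `reserved_units` are paid for every hour and any
    demand above that level is billed at the on-demand rate."""
    reserved = reserved_units * reserved_rate * len(demand)
    overflow = sum(max(0, d - reserved_units) for d in demand)
    return reserved + overflow * on_demand_rate


def best_reserved_level(demand, reserved_rate, on_demand_rate):
    """Return the whole-unit reserved level with the lowest blended cost."""
    candidates = range(0, max(demand) + 1)
    return min(
        candidates,
        key=lambda r: blended_cost(demand, r, reserved_rate, on_demand_rate),
    )


if __name__ == "__main__":
    # Hypothetical day: 16 quiet hours at 4 units, 8 busy hours at 12 units.
    demand = [4] * 16 + [12] * 8
    reserved_rate, on_demand_rate = 1.0, 2.5   # assumed cost per unit-hour

    best = best_reserved_level(demand, reserved_rate, on_demand_rate)
    for level in (0, best, max(demand)):
        cost = blended_cost(demand, level, reserved_rate, on_demand_rate)
        print(f"reserved={level:>2} units -> total cost {cost:.0f}")
```

With these assumed inputs, reserving only the baseline and bursting on demand for the busy hours costs less than either pure on-demand or reserving for the peak.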
Whether an enterprise is looking to migrate its on-premises data to multiple clouds, simply connect it to a single public cloud, or pursue another deployment model, Vantage can get it done.