Everything seems to run in cycles, and we are hitting one now with the cloud journey. Companies around the world have already embarked upon this journey, while others are in the planning phase. It is only natural that when looking to move to the cloud, a company would take that opportunity to re-evaluate other parts of their current analytic systems. With a wide range of options available these days, the question of running a Proof of Concept (POC) type benchmark invariably pops up.
For those of us who have been around long enough to have seen this cycle before, there are typically two questions that need to be asked when performing a POC. Very simply: what concepts are being proven, and what proof will prove them?
Unfortunately, a lot of people cannot answer even the first question, let alone the second. Most will say things like "prove the performance" or "prove the technology." These answers are ambiguous and open to a wide array of interpretations. Others will simply take the "hardest queries" to see if they can be run, which is not a true representation of the needed workload or workload mix. When undertaking the expense and time of a proof of concept, make sure the efforts are well defined and drive an actionable outcome.
What is your ultimate end goal with analytics?
The real goal, and challenge, of an analytic environment should be the ability to adapt to constant change. The platform is used to gain insight, which allows an understanding of how to change the business and its processes for better outcomes. At the same time, that constant change in the business will create ever-growing needs in your data and information.
When designing a POC, look for ways to ensure the environment will be resilient to change. But what type of change? No matter what happens in your business, you can be assured of the following changes: data volume and arrival rates, workload mix and priorities, data types and value, and a user community that spans the whole spectrum from dashboards to full self-service activities.
It is helpful if the POC is geared around a real business use case. It can be an existing use case that is being migrated, a new use case that will leverage new data and analytics, or a mixture of both. The key is that the POC proves out the technical capability to provide end users with actionable information that drives profitable outcomes.
What concepts do you need to prove?
When putting the above into practice, it does not make sense to only perform a "canned" benchmark with a set number of queries simply run at different scale and concurrency. Yes, those tests are important and must be done, but they do not really prove the ability to adapt to change. From a simple perspective, you need test situations that prove out the flexibility, simplicity, and accessibility of the total environment. Let's look at each of these in a bit more detail.
- Flexibility – The ability to choose the most appropriate software resources (e.g., tools, languages, and libraries) to accelerate the user's time to insight and minimize operationalization efforts. It should also enable new user communities to natively join the environment in exploration and discovery efforts.
- Simplicity – The ability to quickly provision and decommission analytic resources (e.g., compute, storage, and network) in a simplified, manageable, and cost-effective manner for both business users and IT. Take this a step further by testing how easily new compute or storage integrates into the analytics. Do queries need to be stopped to take advantage of new compute? Do users need access rights, or do paths need to be built, explicitly for new data repositories?
- Accessibility – The ability to efficiently find, secure, and govern information and analytics within the entire analytic ecosystem without slowing down the business users or jeopardizing production. Access does not just mean "query" but encompasses the management and monitoring of that access as well. Can reports be generated easily to understand who is accessing what data and how frequently data is accessed?

Who gets to say it was proved?
Simply stated, it’s the stakeholders. This should include executive, business, and IT stakeholders. Executive involvement would likely include the CIO, CTO, CFO, and/or CDO as they are the ones with the budget authority, and they understand company strategy and critical initiatives.
Business leadership includes the personas of those striving to do analytics (i.e., data analyst, statistician, and data scientist). They would also represent the voice of the lines of business and of the data management group concerned with data governance and the stewardship of metadata, master data, reference data, data quality, and security/regulatory compliance.
IT personas would likely include those representing the architecture disciplines (i.e., information, application, security, network, and operations management). The proposed solution needs to complement the environment and integrate with existing corporate processes.
A proven concept leads to successful, low risk execution
A proof of concept is a large undertaking and needs to be approached seriously. Create a realistic test that proves not only the technology but also the value of the analytics it enables, both today and well into tomorrow.
At the most basic level, that means taking the time to build a consortium across business, IT, and executives that works together to paint a vision of success, with a roadmap that has a multi-year outlook. This allows the POC creators to understand the users and the future needs of the company. From there they can plot a course that embraces the challenges and provides confidence that those obstacles can be overcome. The POC should prove that the analytic environment enables flexibility, simplicity, and accessibility for both the business and IT, which ultimately will accelerate company success!
Dwayne Johnson is a principal ecosystem architect at Teradata, with over 20 years' experience in designing and implementing enterprise architecture for large analytic ecosystems. He has worked with many Fortune 500 companies in the management of data architecture, master data, metadata, data quality, security and privacy, and data integration. He takes a pragmatic, business-led and architecture-driven approach to solving the business needs of an organization.
Starting with Teradata in 1987, Rob Armstrong has contributed to virtually every aspect of the data warehouse and analytical processing arenas. Rob's work in the computer industry has been dedicated to data-driven business improvement and more effective business decisions and execution. His roles have encompassed the design, justification, implementation, and evolution of enterprise data warehouses.
In his current role, Rob continues the Teradata tradition of integrating data and enabling end-user access for true self-driven analysis and data-driven actions. Increasingly, he incorporates the world of non-traditional “big data” into the analytical process. He also has expanded the technology environment beyond the on-premises data center to include the world of public and private clouds to create a total analytic ecosystem.
Rob earned a B.A. degree in Management Science with an emphasis in mathematics and relational theory at the University of California, San Diego. He resides and works from San Diego.