
Ten Things I’ve Learned in 20 Years in Data and Analytics

I recently passed 17 years at Teradata and a quarter of a century in the industry. In no particular order, here are ten things I’ve learned in those 20-odd years.
 
#1: Data-driven organisations are out-competing their peers and eating the world (witness Apple, Amazon, eBay, Facebook, Google, PayPal, etc., etc.).

#2: Connecting, integrating and sharing data is (mostly) a virtuous circle; managing them in silos is (almost always) a vicious cycle. Putting detailed sales, order and inventory data together and sharing it with partners and suppliers enabled Wal-Mart to dominate grocery retail in the 90s, by creating a demand-driven supply chain that simultaneously improved sales and customer experience whilst crushing costs. And Amazon similarly dominates retail today by combining purchase data with behavioural data to understand what customers want better than its competitors do - and by enabling partners to leverage the platform that it has created, generating even more data about even more customers. I have learned that if we are not optimising an end-to-end value chain, we are mostly doing it wrong.

#3: Managing, connecting, integrating and sharing data is often hard and never comes for free. That effort and expense need to be aligned with company strategy and cost-justified, because whilst all data have value, some data are more valuable than others - and the value of many datasets varies over time. I have learned that there is always, always, always (at least) one schema - but equally that it is a mistake to over-model data, especially where the value of that data and the extent to which it will be re-used are unclear. I have also learned that integration is not the goal – and to leave well alone when the cost of integration exceeds the benefit.

#4: The only constant in modern business is change - and data and data products delayed are business and societal value denied. I have learned that the simplest way to deliver value faster is to: start with the end in mind and build what is necessary, avoiding over-engineering; re-use and extend existing data assets and services wherever practicable, avoiding the creation of expensive and difficult-to-maintain data silos through repeated reimplementation; and automate wherever possible.

#5: Taking processing to data scales and performs better than shipping data to processing nine times out of ten – and optimising for acquisition and loading in a read-intensive environment is wrong, wrong, wrong. Design for access!
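
To make the contrast in #5 concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for a real analytic database. The sales table, its columns and the generated rows are hypothetical and exist only for illustration. Both options compute the same per-store totals; the difference is how many rows have to leave the database to get there.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large sales dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store_id INTEGER, amount REAL)")
rows = [(i % 10, float(i % 7)) for i in range(10_000)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Option A: ship the data to the processing.
# Every detail row crosses the database boundary and is summed client-side.
shipped = conn.execute("SELECT store_id, amount FROM sales").fetchall()
client_totals = {}
for store_id, amount in shipped:
    client_totals[store_id] = client_totals.get(store_id, 0.0) + amount

# Option B: take the processing to the data.
# The database aggregates; only one row per store is returned.
pushed = conn.execute(
    "SELECT store_id, SUM(amount) FROM sales GROUP BY store_id"
).fetchall()
db_totals = dict(pushed)

rows_shipped = len(shipped)  # detail rows moved in option A
rows_pushed = len(pushed)    # summary rows moved in option B

print(f"Option A moved {rows_shipped} rows; option B moved {rows_pushed}.")
print("Same answer:", client_totals == db_totals)
```

On a laptop with ten thousand rows the difference is invisible. The point of the sketch is only that option B moves a result set proportional to the number of stores rather than to the number of transactions.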

#6: Large and complex organisations are just that: large, complex and diverse – so a successful data platform is an open data platform that supports multiple tools, technologies, and languages. That said, tools and technologies that are simple to deploy, use, manage, maintain and optimise should often be preferred. Good old SQL may have its limitations, but there’s an awful lot to be said for a simple, declarative language when it comes to the optimisation of complex queries that feature join, merge, aggregation and sort processing. And all worthwhile analytics features a lot of join, merge, aggregation and sort processing.

#7: Data are a force for both good and ill - and ethical considerations should underpin the way that data are collected, managed, exploited and, crucially, protected and secured. 

#8: Machine Learning will be ubiquitous and the basis of competitive advantage in many industries in the very near future – and Machine Learning is first and foremost a data problem. At the same time, organisations can’t Machine Learn everything – and whilst Machine Learning relies upon good data, good data underpin more than just Machine Learning. John Snow did not need a Convolutional Neural Network to change how we think about cholera - and simple A/B testing remains a powerful tool for even the most sophisticated e-Commerce platforms.
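
As a reminder of just how simple "simple A/B testing" can be, here is a minimal sketch of a pooled two-proportion z-test in plain Python. This is one common way to compare two conversion rates, not the only one. The function name and the visitor and conversion counts are hypothetical, chosen purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns the z statistic and a two-sided p-value, using the pooled
    standard error and a normal approximation (reasonable only for
    fairly large samples).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1,000 visitors per variant.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

No model training is involved: the test needs only four counts, which is the point being made about good data underpinning more than just Machine Learning.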

#9: Data are moving to the Cloud.  And we should increasingly think of the Cloud less as a place, more as a next-generation computing paradigm that provides: a rich ecosystem of composable services; on-demand infrastructure; API-driven everything; automated operations; usability and simplicity. In particular, object storage technologies have the potential to succeed where Hadoop failed and to provide Enterprises with a “data operating system” that will enable radical architectural simplification.

#10: Data architecture and data management aren’t cool right now – and since digitalisation will never achieve its full potential without them, as an industry we need to fix that.

(Author):
Martin Willcox

Martin leads Teradata’s EMEA technology pre-sales function and organisation and is jointly responsible for driving sales and consumption of Teradata solutions and services throughout Europe, the Middle East and Africa. Prior to taking up his current appointment, Martin ran Teradata’s Global Data Foundation practice and led efforts to modernise Teradata’s delivery methodology and associated tool-sets. In this position, Martin also led Teradata’s International Practices organisation and was charged with supporting the delivery of the full suite of consulting engagements delivered by Teradata Consulting – from Data Integration and Management to Data Science, via Business Intelligence, Cognitive Design and Software Development.

Martin was formerly responsible for leading Teradata’s Big Data Centre of Excellence – a team of data scientists, technologists and architecture consultants charged with supporting Field teams in enabling Teradata customers to realise value from their Analytic data assets. In this role Martin was also responsible for articulating Teradata’s Big Data strategy to prospective customers, analysts and media organisations outside of the Americas. During his tenure in this position, Martin was listed in dataIQ’s “Big Data 100” as one of the most influential people in UK data-driven business in 2016. His Strata (UK) 2016 keynote can be found at: www.oreilly.com/ideas/the-internet-of-things-its-the-sensor-data-stupid; a selection of his Teradata Voice Forbes blogs can be found online; and more recently, Martin co-authored a series of blogs on Data Science and Machine Learning – see, for example, Discovery, Truth and Utility: Defining ‘Data Science’.

Martin holds a BSc (Hons) in Physics & Astronomy from the University of Sheffield and a Postgraduate Certificate in Computing for Commerce and Industry from the Open University. He is married with three children and is a solo glider pilot, supporter of Sheffield Wednesday Football Club, very amateur photographer – and an even more amateur guitarist.
