You may recall the childhood tongue twister: “How much wood would a woodchuck chuck if a woodchuck could chuck wood?” As a child, you most likely never looked for an answer, even though woodchucks are real animals and apparently could chuck wood if they needed to. Check out the answer here.

This childhood game resembles our data strategies of the past. We thought collecting any and all data was superior to understanding why we were collecting the data or if the data we were collecting was relevant. We then went on to believe what the data was telling us even though we had not explored whether these were the answers we needed for our business to thrive.

Our current strategies are starting to resemble the burrowing animal’s habits. We are moving towards chucking less data and are focusing more on what is required for our business. This starts with better understanding the V’s of data (Volume, Velocity, Variety, Veracity, and Value) when designing systems and repositories. Yet even then, we are still capturing ever-increasing amounts of data from many sources, and not all of it is neatly structured.

This is where data administration and data engineering meet data science. Whereas Database Administrators and Data Engineers focus on how to best store and administer data, the Data Scientist brings the data to the business for analysis. So, what is it exactly that Data Scientists do?

Data Scientists’ core responsibilities are to collect, cleanse, and analyze data, producing business insights that can improve organizational decision making. This involves taking data from various sources across the organization; cleaning and validating the data to ensure accuracy, completeness, and uniformity; modeling the data (often using machine learning); creating visualizations of the data; and communicating findings to the business with recommendations.

You may have someone doing this in your organization today, whether they hold a Data Scientist title or not. They may be combining data from your Customer Relationship Management (CRM) system, Marketing Automation platform, ecommerce platform, social media channels, and website. In pulling all this data together, they are likely to provide a more complete view of existing or prospective customer behaviors.
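To make the idea of pulling sources together a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample records, the field names (`email`, `purchases`, `event`), and the helper functions `normalize_email` and `build_customer_view` are hypothetical and do not come from any particular CRM or Marketing Automation product. It simply shows one way a light cleansing step and a combine step might look.

```python
from collections import defaultdict

# Hypothetical sample data; real systems would export far richer records.
crm_customers = [
    {"email": "Ana@Example.com ", "name": "Ana", "purchases": 3},
    {"email": "ben@example.com", "name": "Ben", "purchases": 1},
    {"email": "", "name": "Unknown", "purchases": 0},  # incomplete record
]

email_events = [
    {"email": "ana@example.com", "event": "open"},
    {"email": "ana@example.com", "event": "click"},
    {"email": "ben@example.com", "event": "open"},
    {"email": "cara@example.com", "event": "click"},  # prospect, not in the CRM
]


def normalize_email(value):
    """Trim whitespace and lowercase so one address matches across systems."""
    return value.strip().lower()


def build_customer_view(customers, events):
    """Combine CRM records with email engagement counts, keyed by email."""
    # Count opens and clicks per normalized email address.
    engagement = defaultdict(lambda: {"open": 0, "click": 0})
    for ev in events:
        if ev["event"] in ("open", "click"):
            engagement[normalize_email(ev["email"])][ev["event"]] += 1

    view = {}
    for customer in customers:
        key = normalize_email(customer["email"])
        if not key:
            continue  # cleansing step: skip records with no usable identifier
        counts = engagement.get(key, {"open": 0, "click": 0})
        view[key] = {
            "name": customer["name"],
            "purchases": customer["purchases"],
            "opens": counts["open"],
            "clicks": counts["click"],
        }
    return view


customer_view = build_customer_view(crm_customers, email_events)
for email, profile in customer_view.items():
    print(email, profile)
```

In this sketch the combined view is limited to customers already in the CRM, so the prospect who only appears in the email data is left out. Whether to include such prospects is a business decision rather than a technical one.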

For example, if a customer has already purchased, they are likely in your CRM and ecommerce platform. But how are they engaging with your email campaigns, on your website, or on social media? What can this behavior tell you about how satisfied they are with what they have purchased, or what they are likely to purchase next? By modeling this cross-platform data, a Data Scientist can help you answer these and many more questions. In the end, it is not about how much wood we can chuck, but about the value we create from the wood we do chuck!


CertNexus is a vendor-neutral certification body, providing emerging technology certifications and micro-credentials for Business, Data, Development, IT, and Security professionals. CertNexus’ mission is to assist in closing the emerging tech global skills gap while providing individuals with a path towards establishing rewarding careers in Cybersecurity, Data Science, Internet of Things, and Artificial Intelligence (AI)/Machine Learning.