A ground-breaking study says that 90% of the world’s data was created within the previous two years. In other words, in just two years we have collected and processed nine times as much information as humankind had gathered in all the time before (90% versus the remaining 10%). It’s projected that we’ve already created an astounding 44 zettabytes of data.
What do we do with all this data? How do we make it useful to us? What are its real-world applications? These questions are the domain of data science.
Pillars of Data Science
While data scientists come from many different educational and professional backgrounds, most should be strong in, or ideally expert in, four fundamental areas:
Data Science Goals and Deliverables
To understand the importance of these pillars, one must first understand the typical goals and deliverables associated with data science initiatives, as well as the data science process. Let’s start with the goals and deliverables. Here is a short list of common data science deliverables:
The Data Science Process
Data scientists usually follow a process similar to this, especially when creating models using machine learning and related techniques.
In IntelloGrit, the process model consists of five phases: goals, acquire, build, deliver, and optimize. Each phase is iterative because any phase can loop back to one or more earlier phases.
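To make the idea of looping phases concrete, here is a minimal Python sketch of the five-phase model. It is only an illustration, not IntelloGrit's actual tooling: the names `Phase`, `next_phase`, and `walk` are hypothetical, and the sketch assumes that "looping back" means returning to an earlier phase and then moving forward again from there.

```python
from enum import Enum


class Phase(Enum):
    """The five phases named in the text, in their forward order."""
    GOALS = 1
    ACQUIRE = 2
    BUILD = 3
    DELIVER = 4
    OPTIMIZE = 5


def next_phase(current, loop_back_to=None):
    """Return the phase that follows `current`.

    If `loop_back_to` is given, return to that earlier phase instead of
    moving forward (assumption: loops only go backwards).
    Returns None once the last phase has been completed.
    """
    if loop_back_to is not None:
        if loop_back_to.value >= current.value:
            raise ValueError("can only loop back to an earlier phase")
        return loop_back_to
    if current is Phase.OPTIMIZE:
        return None  # nothing left to do
    return Phase(current.value + 1)


def walk(loop_backs):
    """Return the list of phases visited from GOALS to the end.

    `loop_backs` is a hypothetical input: it maps a phase to the earlier
    phase that should be revisited the first time that phase is reached.
    """
    pending = dict(loop_backs)  # copy so the caller's dict is untouched
    path = []
    current = Phase.GOALS
    while current is not None:
        path.append(current)
        target = pending.pop(current, None)  # each loop-back fires once
        current = next_phase(current, target)
    return path


if __name__ == "__main__":
    # Example: while building, the team discovers it needs more data.
    for phase in walk({Phase.BUILD: Phase.ACQUIRE}):
        print(phase.name)
```

In this example the build phase sends the project back to acquire once, so acquire and build each appear twice in the path before the project moves on to deliver and optimize.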