Comparison
Why is a comparison needed?
Adding "Ops" as a suffix has become a trend. Many such terms have appeared since the word DevOps was introduced, for example DevSecOps, DataOps, SecOps, and GitOps. Two existing terms already use "Ops" and "Data" together:
- DataOps
- Data Driven DevOps
On top of that, yet another term is being introduced: Data First DevOps, also known as DFD. For most readers this can be confusing, or even irritating, so it makes sense to avoid adding to the confusion. This page describes the clear-cut differences between these terms.
Concepts and differences (if any)
DataOps
"DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting and recognizes the interconnected nature of the data analytics team and information technology operations." (Source: Wikipedia, as of 16 March 2020)
In simpler words, what DevOps does for application development, DataOps does for analytics and other data-based activities and projects. The data involved is business data used for analytics. It is dedicated entirely to the business and has nothing to do with anything else.
In contrast, DFD is about the data used for, by, or in the DevOps process, and has nothing to do with business data. So far, there is no known practical case where business data affects DevOps-related data.
Data Driven DevOps
Data Driven DevOps is about the metrics produced by the DevOps process. Here are a few examples of the metrics covered by Data Driven DevOps:
- Build duration
- Build duration trend
- Build status/trend
- Deployment status/trend
- Mean time to recovery
- Commit trends
- Code quality status/trend
- Bug trends
There can be many more metrics. The whole concern of Data Driven DevOps is to investigate these trends and:
- Improve the areas that are alarming
- Try to predict upcoming failures
- Continuously monitor the trends
- Use the trends, backed by data, to continuously improve the DevOps processes
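As a rough illustration of two of the metrics listed above, the following sketch computes a build duration trend and the mean time to recovery from a handful of build records. The record fields (`finished`, `duration_s`, `ok`), the sample values, and both function names are assumptions made for this example and do not come from any particular CI tool. The trend function assumes there are more records than the window size.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical build records; field names and values are illustrative only.
builds = [
    {"finished": datetime(2020, 3, 1, 10, 0), "duration_s": 300, "ok": True},
    {"finished": datetime(2020, 3, 2, 10, 0), "duration_s": 320, "ok": False},
    {"finished": datetime(2020, 3, 2, 12, 0), "duration_s": 340, "ok": True},
    {"finished": datetime(2020, 3, 3, 10, 0), "duration_s": 400, "ok": False},
    {"finished": datetime(2020, 3, 3, 11, 0), "duration_s": 420, "ok": True},
]


def duration_trend(records, window=2):
    """Mean duration of the last `window` builds minus the mean of the earlier ones.

    A positive value means builds are getting slower.
    Assumes len(records) > window.
    """
    durations = [r["duration_s"] for r in records]
    return mean(durations[-window:]) - mean(durations[:-window])


def mean_time_to_recovery(records):
    """Average time from the first failed build to the next successful build."""
    recoveries = []
    failed_at = None
    for r in sorted(records, key=lambda rec: rec["finished"]):
        if not r["ok"] and failed_at is None:
            failed_at = r["finished"]          # start of an outage
        elif r["ok"] and failed_at is not None:
            recoveries.append(r["finished"] - failed_at)
            failed_at = None                   # outage is over
    if not recoveries:
        return None
    return sum(recoveries, timedelta()) / len(recoveries)


trend = duration_trend(builds)           # seconds; positive means slower
mttr = mean_time_to_recovery(builds)     # a timedelta, or None if no recovery yet
```

A dashboard or alert could then flag the build as "alarming" when `trend` or `mttr` crosses a threshold chosen by the team.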
If you read the definition of DFD carefully, you can see that Data Driven DevOps corresponds to a subset of DFD, namely the output data. There is much more to DFD, however. Here are the differentiating factors:
Output Data:
Data Driven DevOps takes the output data of the DevOps process and builds trends and visualizations from it in order to understand and improve the DevOps processes. DFD uses the output data for similar purposes, but also uses the same output data as input data for other DevOps processes.
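The two uses of output data can be sketched as follows. This is only an illustration under assumed names: the `build_output` record, its fields, the `min_coverage` threshold, and both functions are hypothetical and not part of any DFD specification.

```python
# Hypothetical output record produced by a build stage.
build_output = {"status": "passed", "coverage": 82.5, "artifact": "app-1.4.2.tar.gz"}


def report_line(output):
    """Reporting use: turn output data into something a person can read or chart."""
    return f"{output['artifact']}: {output['status']}, coverage {output['coverage']}%"


def deploy_decision(output, min_coverage=80.0):
    """Input use: the same output record becomes the input of a deploy process."""
    if output["status"] == "passed" and output["coverage"] >= min_coverage:
        return {"deploy": True, "artifact": output["artifact"]}
    return {"deploy": False, "artifact": None}


summary = report_line(build_output)
decision = deploy_decision(build_output)
```

The first function is what a purely metrics-oriented approach would stop at. The second shows the same record steering a later process.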
Input Data:
DFD is mostly about the data used to carry out the DevOps process, i.e. the input data. In DFD, data is what makes the DevOps pipeline work; refer to the "smart data, dumb pipelines" principle. DFD also encourages treating DevOps not as an automation toolset but as full-fledged application development.