Streaming data architectures
If you are a data engineer handling a lot of databases, Debezium is a great tool for capturing their changes and transactions. In this article I’m going to explain step by step how to do it, and guess what, folks: you can try it yourself, because everything is deployed in your local environment.
At our company, Ancud IT, it’s our daily business to help our customers build highly scalable data infrastructure so they get the maximum value out of their data. But which technologies should you use, and how do you start?
To answer these questions, we want to share our experience with you.
Before heading deeper into the tutorial, let’s define CDC and see where it fits in modern projects.
CDC (change data capture) is a strong option for moving data across different networks and systems in real time, and because changes arrive as they happen, it also supports processing and analytics on that real-time data. Let me explain this further with this architecture:
Figure 1: Streaming data architectures
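To make the idea of CDC a bit more concrete, here is a minimal sketch in Python of what "moving changes" means: a stream of change events is replayed against a replica so that it ends up in the same state as the source. The event shape is a simplified take on Debezium's change event envelope (`op`, `before`, `after`). The `customers` rows, the `apply_change` helper and the in-memory `replica` are made up for illustration and are not part of Debezium.

```python
# Minimal sketch of change data capture (CDC): replaying change events
# against a replica. The event shape is a simplified version of Debezium's
# envelope ("op", "before", "after"); all names and rows here are made up.

def apply_change(replica: dict, event: dict) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):      # create, snapshot read, update
        row = event["after"]
        replica[row["id"]] = row
    elif op == "d":                # delete
        replica.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


# Hypothetical events captured from a source "customers" table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Anna"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Ben"}},
    {"op": "u", "before": {"id": 1, "name": "Anna"},
     "after": {"id": 1, "name": "Anne"}},
    {"op": "d", "before": {"id": 2, "name": "Ben"}, "after": None},
]

replica = {}
for event in events:
    apply_change(replica, event)

print(replica)  # {1: {'id': 1, 'name': 'Anne'}}
```

In a real setup the events would come from a Kafka topic rather than a Python list, but the principle is the same: every insert, update and delete on the source becomes an event that downstream systems can replay.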
In the architecture above, I tried to break down event and streaming architectures. Here are the things to remember as engineers:
Architecture of the concept:
Figure 2: Architecture of the concept
Requirements:
Steps in brief:
Note: each step should be executed in a separate terminal.
Enough talking! Let’s start the tutorial:
The first step is to start ZooKeeper, but first let’s understand what ZooKeeper is.
ZooKeeper: a top-level Apache project, used to maintain naming and configuration data and to provide flexible and robust synchronization within distributed systems [2].
docker run -it --rm --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 quay.io/debezium/zookeeper:2.0
Don’t worry, I’ve got you. You want to understand the arguments? Sure, no problem.