Getting started with Apache Kafka. Ch. 0 - Introduction.

This is my attempt to create a beginner-friendly tutorial series on Apache Kafka.
In this series, we will look at Kafka theory, its architecture, and code examples of systems built on messaging queues.

Skip to the first chapter

Why was Kafka needed?
Imagine an architecture from a Microservice universe where there are multiple data-producing services and multiple data-consuming services.
You can always integrate these systems by hard-coupling them to each other as source and target systems.
But for N data-producing services and M data-consuming services, that takes N*M integrations before they can all talk to each other.
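
To make the count concrete, here is a small Python sketch that lists every point-to-point link for a handful of services. The service names are made up purely for illustration; only the N*M counting argument comes from the text above.

```python
from itertools import product

# Hypothetical service names, used only to illustrate the counting argument.
producers = ["orders", "payments", "inventory"]          # N = 3
consumers = ["analytics", "billing", "search", "email"]  # M = 4

# With hard coupling, every producer needs its own link to every consumer.
integrations = list(product(producers, consumers))

for source, target in integrations:
    print(f"{source} -> {target}")

print(len(integrations))  # N * M = 3 * 4 = 12
```

Adding one more producer to this list adds four new links, one per consumer, which is why the approach gets painful as the number of services grows.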

Every one of these integrations comes with its own challenges, as any software developer will know. ;)

To solve this hard-coupling problem and the numerous challenges bundled with it, we can use a messaging queue or data streaming platform. Kafka is an open-source, highly scalable, widely used, real-time data streaming platform, originally built at LinkedIn and now maintained by the Apache Software Foundation.

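Apache Kafka decouples the data-producing and data-consuming services: producers write their data to Kafka, and consumers read it from the Kafka stream instead of from each producer directly. Each service then integrates only with Kafka, so the N*M integrations shrink to roughly N+M.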
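
The decoupling idea can be sketched with a toy in-memory broker. This is not the Kafka API: `ToyBroker`, `publish`, `read`, `integration_count`, and the `page-views` topic are hypothetical names invented for this example, and a real Kafka cluster is far more involved. It only shows that producers and consumers need to know the broker, not each other.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a streaming platform. Not the Kafka API."""

    def __init__(self):
        # topic name -> append-only list of messages
        self._topics = defaultdict(list)

    def publish(self, topic, message):
        self._topics[topic].append(message)

    def read(self, topic, start=0):
        # Reading does not remove anything, so several consumers
        # can read the same data independently.
        return list(self._topics[topic][start:])


broker = ToyBroker()

# Producers only know the broker and a topic name.
broker.publish("page-views", {"user": "alice", "page": "/home"})
broker.publish("page-views", {"user": "bob", "page": "/pricing"})

# Two independent consumers read the same topic without knowing
# which service produced the data.
analytics_view = broker.read("page-views")
search_view = broker.read("page-views")
print(len(analytics_view), len(search_view))  # 2 2


def integration_count(n_producers, n_consumers, use_broker):
    # Each service connects once to the broker, or once to every counterpart.
    if use_broker:
        return n_producers + n_consumers
    return n_producers * n_consumers


print(integration_count(10, 20, use_broker=False))  # 200
print(integration_count(10, 20, use_broker=True))   # 30
```

With 10 producers and 20 consumers, the direct approach needs 200 integrations, while the brokered one needs 30, and adding a new consumer costs exactly one more.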

Almost any kind of data can go into Kafka: user interactions with a front-end client, application logs, events for event sourcing, or requests for actions that need to be performed asynchronously.

Kafka clusters can scale to hundreds of brokers and handle millions of messages per second. Kafka is considered a real-time messaging system because its latency can be around 10 ms or less. It is also distributed, fault-tolerant, and reliable: data written to the stream is persisted and replicated across brokers, and it stays available for the configured retention period rather than disappearing as soon as it has been read.

More than 80% of all Fortune 100 companies trust and use Kafka.

LinkedIn uses Apache Kafka for activity stream data and operational metrics. This powers various products like LinkedIn Newsfeed and recommendations in real-time.
Twitter uses Kafka to handle 5B+ sessions every day and to power real-time search features.

That was a brief introduction to Kafka. In the next part, we will start with the theory of Kafka. In later parts of this series, we will use Kafka from the command line and from the Java programming language.

Read the first chapter of the series here: Kafka Theory.
