Apache Kafka

IEEE Brainwaves
Mar 26, 2021

Introduction:

1) Apache Kafka is a distributed data store for ingesting and processing streaming data in real time. Streaming data is data that is continuously generated by thousands of data sources, which typically send their records in simultaneously.

2) It stores streams of records in the order in which they were generated. It is used to build real-time streaming data pipelines and applications that adapt to the data streams. It combines messaging, storage, and stream processing to allow the storage and analysis of both historical and real-time data.

3) Kafka combines two messaging models, queuing and publish-subscribe, to provide the key advantages of each to consumers. Queuing allows data processing to be distributed across many consumer instances, making it highly scalable.

4) Each topic has a partitioned log that keeps track of all records in order and appends new ones as they arrive.
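
For instance, a topic can be created with several partitions up front. The snippet below is a minimal sketch using Kafka's Java AdminClient; the topic name "page-views", the partition count, and the broker address localhost:9092 are illustrative assumptions, not part of Kafka itself.

```java
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        // Broker address is an assumption for a local test cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // A topic named "page-views" split into 3 partitions, replication factor 1.
            NewTopic topic = new NewTopic("page-views", 3, (short) 1);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```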

Producer API: used to publish a stream of records to a topic.
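
A minimal sketch of the Producer API in Java is shown below; the topic name, key, value, and broker address are illustrative assumptions.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records with the same key always land in the same partition of the topic.
            producer.send(new ProducerRecord<>("page-views", "user-42", "clicked home page"));
            producer.flush();
        }
    }
}
```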

Consumer API: used to subscribe to topics and read the records published to them.
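
The sketch below is a minimal Consumer API example; the group id and topic name are assumptions. This is also where the two messaging models mentioned above meet: all instances sharing a group id split a topic's partitions between them (queuing), while separate groups each receive every record (publish-subscribe).

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "page-view-analytics");     // assumed group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("page-views"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```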

Streams API: allows applications to act as stream processors, which consume an input stream from one or more topics and produce an output stream.
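
A minimal Kafka Streams sketch is shown below, assuming an input topic "page-views" and an output topic "page-view-counts"; it counts records per key and writes the running counts back to Kafka.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class PageViewCounter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "page-view-counter"); // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Consume the input stream, count records per key, and write the result out.
        KStream<String, String> views = builder.stream("page-views");
        KTable<String, Long> counts = views.groupByKey().count();
        counts.toStream().to("page-view-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```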

Connector API: allows users to seamlessly connect Kafka topics to existing applications and data systems through reusable connectors.
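
As an illustrative sketch, a connector is usually added by posting its configuration to a Kafka Connect worker's REST API rather than by writing client code. The worker URL, connector name, file path, and topic below are assumptions; FileStreamSource is one of the sample connectors bundled with Kafka.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterFileConnector {
    public static void main(String[] args) throws Exception {
        // Configuration for the bundled FileStreamSource sample connector.
        // Connector name, file path, and topic are assumptions for this sketch.
        String config = """
                {
                  "name": "demo-file-source",
                  "config": {
                    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
                    "tasks.max": "1",
                    "file": "/tmp/demo-input.txt",
                    "topic": "page-views"
                  }
                }
                """;

        // Assumes a Connect worker running locally on its default REST port 8083.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```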

History:

Apache Kafka is developed by the Apache Software Foundation and is written in Scala and Java.

Kafka was originally developed at LinkedIn and was subsequently open-sourced in early 2011. Jay Kreps chose to name the software after the author Franz Kafka because it is “a system optimized for writing”, and he liked Kafka’s work.

Application:

Apache Kafka is based on the commit log, and it allows users to subscribe to it and publish data to any number of systems or real-time applications. Example applications include managing passenger and driver matching at Uber, providing real-time analytics and predictive maintenance for British Gas smart home, and performing numerous real-time services across all of LinkedIn.

Performance:

Monitoring end-to-end performance requires tracking metrics from brokers, consumers, and producers, in addition to monitoring ZooKeeper, which Kafka uses for coordination among consumers. There are currently several monitoring platforms for tracking Kafka’s performance. In addition to these platforms, Kafka metrics can also be collected using tools typically bundled with Java, such as JConsole.
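
As a rough sketch, the same JMX metrics that JConsole displays can also be read programmatically. The example below assumes a broker started with JMX enabled on port 9999 (an assumption for this sketch) and reads the broker-wide MessagesInPerSec meter.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerThroughputProbe {
    public static void main(String[] args) throws Exception {
        // Assumes the broker was started with JMX exposed on localhost:9999.
        JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");

        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            // Broker-wide incoming message rate, as exposed by Kafka's metrics.
            ObjectName messagesIn =
                    new ObjectName("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec");
            Object oneMinuteRate = connection.getAttribute(messagesIn, "OneMinuteRate");
            System.out.println("Messages in per second (1 min avg): " + oneMinuteRate);
        }
    }
}
```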

Article By: Yash Upadhyay

