blog.allegro.tech/2024/03/kafka-performance-analysis.html

Preview meta tags from the blog.allegro.tech website.

Linked Hostnames

14

Search Engine Appearance

Google

https://blog.allegro.tech/2024/03/kafka-performance-analysis.html

Unlocking Kafka’s Potential: Tackling Tail Latency with eBPF

At Allegro, we use Kafka as a backbone for asynchronous communication between microservices. With up to 300k messages published and 1M messages consumed every second, it is a key part of our infrastructure. A few months ago, in our main Kafka cluster, we noticed the following discrepancy: while median response times for produce requests were in single-digit milliseconds, the tail latency was much worse. Namely, the p99 latency was up to 1 second, and the p999 latency was up to 3 seconds. This was unacceptable for a new project that we were about to start, so we decided to look into this issue. In this blog post, we would like to describe our journey — how we used Kafka protocol sniffing and eBPF to identify and remove the performance bottleneck.




  • General Meta Tags

    7
    • title
      Unlocking Kafka’s Potential: Tackling Tail Latency with eBPF | blog.allegro.tech
    • charset
      UTF-8
    • viewport
      width=device-width, initial-scale=1.0
    • generator
      Jekyll v4.4.1
    • description
      At Allegro, we use Kafka as a backbone for asynchronous communication between microservices. With up to 300k messages published and 1M messages consumed every second, it is a key part of our infrastructure. A few months ago, in our main Kafka cluster, we noticed the following discrepancy: while median response times for produce requests were in single-digit milliseconds, the tail latency was much worse. Namely, the p99 latency was up to 1 second, and the p999 latency was up to 3 seconds. This was unacceptable for a new project that we were about to start, so we decided to look into this issue. In this blog post, we would like to describe our journey — how we used Kafka protocol sniffing and eBPF to identify and remove the performance bottleneck.
  • Open Graph Meta Tags

    7
    • og:title
      Unlocking Kafka’s Potential: Tackling Tail Latency with eBPF
    • og:locale
      en_US
    • og:description
      At Allegro, we use Kafka as a backbone for asynchronous communication between microservices. With up to 300k messages published and 1M messages consumed every second, it is a key part of our infrastructure. A few months ago, in our main Kafka cluster, we noticed the following discrepancy: while median response times for produce requests were in single-digit milliseconds, the tail latency was much worse. Namely, the p99 latency was up to 1 second, and the p999 latency was up to 3 seconds. This was unacceptable for a new project that we were about to start, so we decided to look into this issue. In this blog post, we would like to describe our journey — how we used Kafka protocol sniffing and eBPF to identify and remove the performance bottleneck.
    • og:url
      https://blog.allegro.tech/2024/03/kafka-performance-analysis.html
    • og:site_name
      blog.allegro.tech
  • Twitter Meta Tags

    1
    • twitter:card
      summary
  • Link Tags

    5
    • alternate
      /feed.xml
    • canonical
      https://blog.allegro.tech/2024/03/kafka-performance-analysis.html
    • icon
      /favicon.ico
    • stylesheet
      /assets/css/main.css?v=
    • stylesheet
      /assets/css/highlights.css?v=

Links

44