A step-by-step walkthrough with Kubernetes deployment script

Image Courtesy of athree23 on Pixabay.com

Recently included in Apache Kafka and introduced in my previous blog, the new MirrorMaker has become the officially certified open-source tool for replicating data between two Kafka instances across datacenters.

To get first-hand experience with the new MirrorMaker, this article walks through an end-to-end deployment on a local Kubernetes cluster.

As a prerequisite, Minikube and a virtual machine monitor (e.g. VirtualBox, VMware Fusion…) must be installed locally before following the steps below.

Note: the scripts used below can be run on a Kubernetes cluster, but they do not guarantee a production-quality deployment.

minikube start --driver=<driver_name> --kubernetes-version=v1.15.12 --cpus 4 --memory…


Introduction of a new cross-datacenter replication tool for Apache Kafka

Image Courtesy of sumanley on Pixabay.com

Apache Kafka is the de facto data streaming platform for high-performance data pipelines, streaming analytics and mission-critical applications. As an enterprise's business grows, many scenarios require evolving from one Kafka instance to multiple instances. For example, critical services can be migrated to dedicated instances to achieve better performance and isolation, and so satisfy a Service Level Agreement or Objective.

Another example is Disaster Recovery (DR): the instance in a primary datacenter is continuously mirrored to the backup datacenter. …


https://pixabay.com/users/mohamed_hassan-5229782/

This is the first note on tools and tips for migrating from other cross-cluster Kafka replication tools to the new MirrorMaker (or “MirrorMaker 2”).

The new MirrorMaker has several advantages. To name a few:

  • always open source, as part of the Apache Kafka ecosystem
  • supported by an open community with a large and diverse user base
  • critical features not supported by other tools, e.g. exactly-once semantics (see my previous blog)

Recently, users from the community who were migrating from Confluent Replicator (an enterprise commercial “cross-cluster” replication tool) to MirrorMaker faced the following problem:

messages which are already replicated by Confluent Replicator is getting replicated again when the Mirror Maker is started on the same topic. This should not happen as messages are getting duplicated at the target cluster. …
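To see why such duplicates can arise, here is a minimal, self-contained sketch in Python. It is a toy simulation, not MirrorMaker or Confluent Replicator code: the function `replicate` and the variable `progress_of_old_tool` are hypothetical names, and the sketch assumes the new tool has no record of how far the previous tool got, so it starts reading the source topic from the beginning.

```python
def replicate(source, target, start_offset):
    """Copy source[start_offset:] onto the end of target.

    Returns the next source offset to read (hypothetical helper,
    standing in for a replication tool's main loop).
    """
    for record in source[start_offset:]:
        target.append(record)
    return len(source)


source = ["m0", "m1", "m2", "m3", "m4"]

# The old tool copied the first three records before it was shut down.
progress_of_old_tool = 3
already_replicated = source[:progress_of_old_tool]

# Case 1: the new tool knows nothing about that progress and starts at 0,
# so the first three records land on the target a second time.
duplicated = list(already_replicated)
replicate(source, duplicated, start_offset=0)

# Case 2: the new tool is seeded with the old tool's progress,
# so it only copies what is still missing.
clean = list(already_replicated)
replicate(source, clean, start_offset=progress_of_old_tool)
```

In this toy model the fix is simply to hand the old tool's progress to the new one; how that progress is obtained and translated in a real migration depends on the tools involved.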


As introduced in our previous posts (link 1, link 2), many applications behind Walmart.com are powered by the highly scalable, distributed streaming platform Apache Kafka. With its rapid evolution, Kafka has reached a new milestone: release 0.10. With this release, Kafka and its ecosystem have reached a new level of maturity. In this post, I would like to share our recent results with the Kafka 0.10 release. The next post will focus more on the streaming, big data and Hadoop ecosystem around Kafka.

Kafka 0.10 and the Downstream Consumers around it

In earlier Kafka releases (before 0.8.2), consumers committed their offsets to ZooKeeper. During the last holiday season, our ZooKeeper cluster experienced a very high volume of writes caused by many consumers committing offsets very frequently. Beyond increasing ZooKeeper capacity (e.g. SSD storage), Kafka now provides an alternative: storing consumer offsets in a special, separate Kafka topic that is replicated and highly available. …
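The idea of keeping offsets in a topic can be illustrated with a small sketch. The Python below is a toy in-memory model, not Kafka's implementation: the class `OffsetLog` and its methods are hypothetical names. It assumes only what is generally true of Kafka's internal offsets topic, namely that commits are appended as records keyed by consumer group, topic and partition, and that compaction keeps the latest record per key.

```python
from typing import Dict, List, Optional, Tuple

OffsetKey = Tuple[str, str, int]  # (consumer group, topic, partition)


class OffsetLog:
    """Toy append-only log of offset commits (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self.records: List[Tuple[OffsetKey, int]] = []

    def commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        # A commit is just an append, which is cheap compared with
        # updating a coordination service on every commit.
        self.records.append(((group, topic, partition), offset))

    def compact(self) -> None:
        # Keep only the most recent offset for each key.
        latest: Dict[OffsetKey, int] = {}
        for key, offset in self.records:
            latest[key] = offset
        self.records = list(latest.items())

    def fetch(self, group: str, topic: str, partition: int) -> Optional[int]:
        # The last record written for a key wins.
        result = None
        for key, offset in self.records:
            if key == (group, topic, partition):
                result = offset
        return result


log = OffsetLog()
log.commit("orders-app", "orders", 0, 10)
log.commit("orders-app", "orders", 0, 20)
log.commit("billing-app", "orders", 0, 5)

before = log.fetch("orders-app", "orders", 0)
log.compact()
after = log.fetch("orders-app", "orders", 0)
```

Compaction shrinks the log from three records to two without changing what any consumer group reads back, which is why frequent commits do not make the log grow without bound.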


In our previous blog, we explained why we migrated the Kafka service at Walmart from shared bare-metal machines to the new “self-serving” Kafka deployment powered by OpenStack and OneOps. Today, I would like to show what the Kafka ecosystem looks like “under the hood”.

Kafka Ecosystem at Walmart

The picture above does not capture all of the real-time pipelines; it aims to highlight the key components and the relationships among them.

Core Services

  • Kafka Brokers: we are currently rolling out Kafka version 0.10.1.0 with the suggested JVM parameters, to take advantage of better stability and reliability compared to the 0.8 family.
  • MirrorMaker: it replicates certain topics from one Kafka cluster to another, typically across data centers. We created an internal code patch to support complete topic renaming. …
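As a rough illustration of what topic renaming during mirroring means, here is a minimal Python sketch. It is not the internal patch mentioned above and does not use MirrorMaker's API: `rename_topic`, `mirror` and the example topic names are all hypothetical, and records are modelled as plain `(topic, value)` pairs.

```python
def rename_topic(source_topic, rename_map):
    """Return the target topic name, or the source name if no rule applies."""
    return rename_map.get(source_topic, source_topic)


def mirror(records, rename_map):
    """Re-address (topic, value) records for the target cluster."""
    return [(rename_topic(topic, rename_map), value) for topic, value in records]


# Hypothetical rule: 'orders' is published as 'dc1.orders' on the target.
rename_map = {"orders": "dc1.orders"}
records = [("orders", "o1"), ("clicks", "c1")]

mirrored = mirror(records, rename_map)
```

Topics without a rule keep their original name, so a mapping like this can be introduced for a few topics at a time.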


Photo Credit: Apache Kafka

Many top organizations are reported to have benefited from Service Oriented Architecture (SOA), and Walmart also rebuilt its eCommerce website (walmart.com) on SOA and an elastic cloud. An important subset of SOA is message-driven architecture, which provides a channel for asynchronous communication to decouple bundled components. The result is a more scalable and efficient architecture in which each component or service can be independently built and scaled out, communicating with others through the messaging platform.

The traditional message queue used to be the standard solution for message-driven architecture, but it is inherently limited when it comes to handling large scale. …



OneOps is a multi-cloud and open-source orchestration platform for DevOps that has the following major advantages:

  • DevOps orchestration: integrates popular open-source or free DevOps tools and orchestrates them through a web UI.
  • Model-driven application template: once a template is created, the “best practice” can be reused and deployed any number of times.
  • Cloud agnostic: supports deployment on major public and private clouds.
  • Operational excellence: auto-pilots the application by scaling, repairing and even replacing unhealthy instances. It is also fully integrated with the monitoring and alerting functions in the web UI.
  • Promotes DevOps culture: brings developers, QA and Operations together, accelerating product delivery and reducing operating costs. …

About

Ning.Zhang

Use less words to make bigger impact
