AWS Docker-Compose Kafka Cluster
A while ago I tried to build a Kafka cluster across three instances on a GCP free-tier account, but it kept getting delayed for one reason or another. So this time I'm going to stand up a Kafka cluster on a single AWS EC2 instance using Docker-Compose.
Pre-Setup
Create an AWS EC2 instance
- Select Ubuntu Server 20.04 LTS (HVM), SSD Volume Type
- Select t2.medium (2 vCPU, 4GB), then click Review and Launch
Install Docker
Switch to the root account, then run apt update && apt upgrade first (note the double ampersand; a single & would background the first command instead of chaining them).
$ apt update && apt upgrade
Install the required packages
$ sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
Add Docker's official GPG key
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Register the Docker repository
$ sudo add-apt-repository \
> "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
> $(lsb_release -cs) \
> stable"
Hit:1 http://ap-northeast-2.ec2.archive.ubuntu.com/ubuntu focal InRelease
Get:2 http://ap-northeast-2.ec2.archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:3 http://ap-northeast-2.ec2.archive.ubuntu.com/ubuntu focal-backports InRelease [101 kB]
Get:4 https://download.docker.com/linux/ubuntu focal InRelease [52.1 kB]
Get:5 https://download.docker.com/linux/ubuntu focal/stable amd64 Packages [10.7 kB]
Get:6 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Fetched 391 kB in 2s (225 kB/s)
Reading package lists... Done
Install Docker via apt
$ sudo apt-get update && sudo apt-get install docker-ce docker-ce-cli containerd.io
Start Docker at system boot
$ sudo systemctl enable docker && service docker start
Install docker-compose
$ apt install docker-compose
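To confirm both tools installed correctly, you can check their versions (the exact numbers will differ by release):
$ docker --version
$ docker-compose --version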
Write the Docker-Compose YAML (saved here as kafka-cluster.yaml, which the run command below references)
version: '2.1'
services:
  # Three-node ZooKeeper ensemble
  zoo1:
    image: zookeeper:3.4.9
    hostname: zoo1
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: 1
      ZOO_PORT: 2181
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
    volumes:
      - ./zk-multiple-kafka-multiple/zoo1/data:/data
      - ./zk-multiple-kafka-multiple/zoo1/datalog:/datalog

  zoo2:
    image: zookeeper:3.4.9
    hostname: zoo2
    ports:
      - "2182:2182"
    environment:
      ZOO_MY_ID: 2
      ZOO_PORT: 2182
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
    volumes:
      - ./zk-multiple-kafka-multiple/zoo2/data:/data
      - ./zk-multiple-kafka-multiple/zoo2/datalog:/datalog

  zoo3:
    image: zookeeper:3.4.9
    hostname: zoo3
    ports:
      - "2183:2183"
    environment:
      ZOO_MY_ID: 3
      ZOO_PORT: 2183
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
    volumes:
      - ./zk-multiple-kafka-multiple/zoo3/data:/data
      - ./zk-multiple-kafka-multiple/zoo3/datalog:/datalog

  # Three Kafka brokers; each advertises an internal listener for
  # broker-to-broker traffic and an external one for host clients
  kafka1:
    image: confluentinc/cp-kafka:5.5.1
    hostname: kafka1
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER_INTERNAL://kafka1:19092,LISTENER_DOCKER_EXTERNAL://${DOCKER_HOST_IP:-127.0.0.1}:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER_INTERNAL:PLAINTEXT,LISTENER_DOCKER_EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER_INTERNAL
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2182,zoo3:2183"
      KAFKA_BROKER_ID: 1
      KAFKA_LOG4J_LOGGERS: "kafka.controller=INFO,kafka.producer.async.DefaultEventHandler=INFO,state.change.logger=INFO"
    volumes:
      - ./zk-multiple-kafka-multiple/kafka1/data:/var/lib/kafka/data
    depends_on:
      - zoo1
      - zoo2
      - zoo3

  kafka2:
    image: confluentinc/cp-kafka:5.5.1
    hostname: kafka2
    ports:
      - "9093:9093"
    environment:
      KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER_INTERNAL://kafka2:19093,LISTENER_DOCKER_EXTERNAL://${DOCKER_HOST_IP:-127.0.0.1}:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER_INTERNAL:PLAINTEXT,LISTENER_DOCKER_EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER_INTERNAL
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2182,zoo3:2183"
      KAFKA_BROKER_ID: 2
      KAFKA_LOG4J_LOGGERS: "kafka.controller=INFO,kafka.producer.async.DefaultEventHandler=INFO,state.change.logger=INFO"
    volumes:
      - ./zk-multiple-kafka-multiple/kafka2/data:/var/lib/kafka/data
    depends_on:
      - zoo1
      - zoo2
      - zoo3

  kafka3:
    image: confluentinc/cp-kafka:5.5.1
    hostname: kafka3
    ports:
      - "9094:9094"
    environment:
      KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER_INTERNAL://kafka3:19094,LISTENER_DOCKER_EXTERNAL://${DOCKER_HOST_IP:-127.0.0.1}:9094
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER_INTERNAL:PLAINTEXT,LISTENER_DOCKER_EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER_INTERNAL
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2182,zoo3:2183"
      KAFKA_BROKER_ID: 3
      KAFKA_LOG4J_LOGGERS: "kafka.controller=INFO,kafka.producer.async.DefaultEventHandler=INFO,state.change.logger=INFO"
    volumes:
      - ./zk-multiple-kafka-multiple/kafka3/data:/var/lib/kafka/data
    depends_on:
      - zoo1
      - zoo2
      - zoo3

  # Kafka Manager web UI, exposed on port 9000
  kafka-manager:
    container_name: kafka-manager
    image: hlebalbau/kafka-manager:2.0.0.2
    restart: on-failure
    depends_on:
      - kafka1
      - kafka2
      - kafka3
      - zoo1
      - zoo2
      - zoo3
    environment:
      ZK_HOSTS: zoo1:2181,zoo2:2182,zoo3:2183
      APPLICATION_SECRET: "random-secret"
      KM_ARGS: -Djava.net.preferIPv4Stack=true
    ports:
      - "9000:9000"
Start the cluster
# Run in the foreground to watch the logs
$ docker-compose -f kafka-cluster.yaml up
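Once the logs look healthy, the same stack can be run detached instead:
$ docker-compose -f kafka-cluster.yaml up -d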
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bdf222c1905f hlebalbau/kafka-manager:2.0.0.2 "/kafka-manager/bin/…" 56 seconds ago Up 54 seconds 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp kafka-manager
7522a122ea8c confluentinc/cp-kafka:5.5.1 "/etc/confluent/dock…" 10 hours ago Up 56 seconds 9092/tcp, 0.0.0.0:9093->9093/tcp, :::9093->9093/tcp ubuntu_kafka2_1
7b23cf081c6c confluentinc/cp-kafka:5.5.1 "/etc/confluent/dock…" 10 hours ago Up 2 hours 0.0.0.0:9092->9092/tcp, :::9092->9092/tcp ubuntu_kafka1_1
16182f6dd6e1 confluentinc/cp-kafka:5.5.1 "/etc/confluent/dock…" 10 hours ago Up 2 hours 9092/tcp, 0.0.0.0:9094->9094/tcp, :::9094->9094/tcp ubuntu_kafka3_1
f31dbbae7c3b zookeeper:3.4.9 "/docker-entrypoint.…" 10 hours ago Up 2 hours 2888/tcp, 0.0.0.0:2181->2181/tcp, :::2181->2181/tcp, 3888/tcp ubuntu_zoo1_1
4215c2e87d3c zookeeper:3.4.9 "/docker-entrypoint.…" 10 hours ago Up 2 hours 2181/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2182->2182/tcp, :::2182->2182/tcp ubuntu_zoo2_1
d69b5b0b608d zookeeper:3.4.9 "/docker-entrypoint.…" 10 hours ago Up 2 hours 2181/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2183->2183/tcp, :::2183->2183/tcp ubuntu_zoo3_1
Once Docker-Compose is up, the logs confirm that Kafka, ZooKeeper, and Kafka Manager are all running normally. The volumes entries in the YAML can be customized to whatever host paths suit you. In my case, I started from the YAML in "Apache kafka Installation by docker - DaddyProgrammer" and pulled in only the kafka-manager settings from "docker-compose 를 사용하여 kafka Cluster 및 Kafka Manger 세팅하기".
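As an extra sanity check, each ZooKeeper node can report its role via the srvr four-letter command (assuming netcat is installed on the host); one node should answer Mode: leader and the other two Mode: follower:
$ echo srvr | nc localhost 2181 | grep Mode
$ echo srvr | nc localhost 2182 | grep Mode
$ echo srvr | nc localhost 2183 | grep Mode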
Accessing Kafka Manager (AWS security group settings)
- When you create an AWS instance, only port 22 is open by default, so to reach Kafka Manager from your local PC you need to open port 9000.
- In the AWS console: Services > Network & Security > Security Groups, then the Actions dropdown at the top right > Edit inbound rules (a CLI alternative is sketched below).
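If you prefer the AWS CLI, a minimal sketch of the same rule follows; the security group ID and source IP are placeholders you must fill in:
$ aws ec2 authorize-security-group-ingress \
    --group-id <your-security-group-id> \
    --protocol tcp --port 9000 \
    --cidr <your-ip>/32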
Kafka Manager could be used to create, delete, and manage topics, but this time, following what DaddyProgrammer's blog lays out, I'll test the Kafka cluster using the shell scripts that ship inside the Kafka distribution.
Testing the Kafka Cluster
Download Kafka
$ wget https://mirror.navercorp.com/apache/kafka/2.8.0/kafka_2.12-2.8.0.tgz
Extract the archive and move into Kafka's bin directory
$ tar xvf kafka_2.12-2.8.0.tgz
$ cd kafka_2.12-2.8.0/bin
$ ls
connect-distributed.sh kafka-console-producer.sh kafka-log-dirs.sh kafka-server-start.sh windows
connect-mirror-maker.sh kafka-consumer-groups.sh kafka-metadata-shell.sh kafka-server-stop.sh zookeeper-security-migration.sh
connect-standalone.sh kafka-consumer-perf-test.sh kafka-mirror-maker.sh kafka-storage.sh zookeeper-server-start.sh
kafka-acls.sh kafka-delegation-tokens.sh kafka-preferred-replica-election.sh kafka-streams-application-reset.sh zookeeper-server-stop.sh
kafka-broker-api-versions.sh kafka-delete-records.sh kafka-producer-perf-test.sh kafka-topics.sh zookeeper-shell.sh
kafka-cluster.sh kafka-dump-log.sh kafka-reassign-partitions.sh kafka-verifiable-consumer.sh
kafka-configs.sh kafka-features.sh kafka-replica-verification.sh kafka-verifiable-producer.sh
kafka-console-consumer.sh kafka-leader-election.sh kafka-run-class.sh trogdor.sh
If the JDK is not installed:
$ sudo apt-get install openjdk-11-jdk
Create a new topic
$ ./kafka-topics.sh --create --zookeeper localhost:2181,localhost:2182,localhost:2183 --replication-factor 3 --partitions 1 --topic news
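The --zookeeper flag is deprecated in Kafka 2.x and removed in 3.0, so the same topic can equivalently be created by pointing at the brokers directly:
$ ./kafka-topics.sh --create --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --replication-factor 3 --partitions 1 --topic news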
List all topics in the cluster
$ ./kafka-topics.sh --list --bootstrap-server localhost:9092,localhost:9093,localhost:9094
__confluent.support.metrics
__consumer_offsets
news
Describe a topic
$ ./kafka-topics.sh --describe --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic news
Topic: news PartitionCount: 1 ReplicationFactor: 3 Configs:
Topic: news Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
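Here broker 3 is the partition leader and all three replicas are in the ISR (in-sync replica set). As a quick failover test, you could stop the leader's container and describe the topic again; a new leader should be elected from the remaining ISR. The container name below is taken from the docker ps output above and may differ on your machine:
$ docker stop ubuntu_kafka3_1
$ ./kafka-topics.sh --describe --bootstrap-server localhost:9092,localhost:9093 --topic news
$ docker start ubuntu_kafka3_1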
Produce messages to the topic
$ ./kafka-console-producer.sh --broker-list localhost:9092,localhost:9093,localhost:9094 --topic news
>news message-1
>news message-2
>news message-3
Consume messages from the topic
Passing the --from-beginning option makes the consumer read every message in the topic, starting from the very first one.
$ ./kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic news --from-beginning
news message-1
news message-2
news message-3
Consume messages from a specific partition
$ ./kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic news --from-beginning --partition 0
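You can also consume as part of a named consumer group and then inspect that group's offsets and lag with kafka-consumer-groups.sh; the group name news-group here is just an example:
$ ./kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic news --group news-group
$ ./kafka-consumer-groups.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --describe --group news-group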
The basic cluster setup needed to integrate the Kafka cluster with Spring Boot is now complete. Next up, I should write up the Spring Boot integration.
References
Ubuntu 20.04 Docker 설치하기
Apache kafka Installation by docker - DaddyProgrammer
(docker-compose) zookeeper/kafka 클러스터 구성
docker-compose 를 사용하여 kafka Cluster 및 Kafka Manger 세팅하기