
Confluent Platform Deployment using Ansible Playbook

By Aman Patel - September 22, 2021

This post walks through installing and deploying Confluent Platform 6.2.0 on AWS EC2 instances using Ansible playbooks.

We will use the cp-ansible repository, which provides the playbooks and templates needed to provision the Confluent Platform in your environment.

These playbooks help you get a proof-of-concept cluster off the ground quickly and give you a starting point for a production deployment.

Specifically, this repository:

  • Installs Confluent Platform packages or archives.

  • Starts services using systemd scripts.

  • Provides configuration options for many security features, including encryption, authentication, and authorization.

The services that can be installed from this repository are:

  • ZooKeeper

  • Kafka

  • Schema Registry

  • REST Proxy

  • Confluent Control Center

  • Kafka Connect (distributed mode)

  • KSQL Server

  • Replicator

This post walks you through the steps to set up a three-node Confluent cluster using Ansible.

Prepare Environment

Environment: AWS
No. of nodes: 3
Linux: Ubuntu 18.04 LTS Bionic

1. Capture the Private IPv4 DNS

node1.confluent.com
node2.confluent.com
node3.confluent.com

In this post, we use these simpler node aliases so that the service distribution and the commands are easier to follow.

In an AWS environment, your node names will look something like this: ip-172-31-42-254.us-east-2.compute.internal

2. Create a HOSTLIST file on the host where we plan to install Ansible.

In our case, we have selected the host node1.confluent.com:

cat HOSTLIST

node1.confluent.com
node2.confluent.com
node3.confluent.com
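Every later loop reads this file line by line, so a blank line or a duplicate entry will cause confusing errors further on. A minimal sketch of a sanity check is shown below; validate_hostlist is a hypothetical helper name, not part of any tool used in this post.

```shell
# Sketch: sanity-check a HOSTLIST file before looping over it.
# validate_hostlist is a hypothetical helper introduced for illustration.
validate_hostlist() {
  local file="$1"
  # The file must exist and be non-empty.
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  # Reject blank lines and lines that contain whitespace.
  if grep -qE '^$|[[:space:]]' "$file"; then
    echo "blank line or whitespace in $file" >&2
    return 1
  fi
  # Reject host names that appear more than once.
  if [ -n "$(sort "$file" | uniq -d)" ]; then
    echo "duplicate host in $file" >&2
    return 1
  fi
}
```

Running validate_hostlist HOSTLIST returns a non-zero status and prints the reason when the file needs fixing.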

3. Set up password-less SSH for the ubuntu user

- Run the below to create the ssh key:

ssh-keygen -t rsa


- Copy the id_rsa.pub entry and append it in the authorized_keys of all the hosts part of the Confluent cluster. This will enable password-less ssh from the Ansible host to the other hosts.
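Appending the key by hand makes it easy to add the same line twice. The sketch below shows one way to make the append idempotent. append_pubkey is a hypothetical helper, and the paths are passed in as arguments rather than assumed.

```shell
# Sketch: append a public key to an authorized_keys file only if it is missing.
# append_pubkey is a hypothetical helper introduced for illustration.
append_pubkey() {
  local pubkey_file="$1" auth_file="$2"
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # sshd ignores authorized_keys files that are writable by others.
  chmod 600 "$auth_file"
  # -x matches whole lines, -F treats the key as a fixed string,
  # -f reads the pattern (the key line) from the public key file.
  grep -qxF -f "$pubkey_file" "$auth_file" || cat "$pubkey_file" >> "$auth_file"
}
```

On each cluster host this would be called with a copy of the Ansible host's id_rsa.pub and the ubuntu user's ~/.ssh/authorized_keys. Calling it a second time leaves the file unchanged.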

- Verify the password-less ssh using the command below from the host node1.confluent.com:

for HOST in `cat HOSTLIST`; do echo "*** $HOST"; ssh -qt $HOST 'hostname -i'; done

(Type 'yes' if prompted)
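Once the host keys have been accepted, a non-interactive check can confirm that no host still asks for a password. The sketch below relies on the OpenSSH BatchMode option, which makes ssh fail instead of prompting. check_passwordless and the SSH_BIN override are hypothetical names added for illustration.

```shell
# Sketch: report which hosts accept key-based login without any prompt.
# check_passwordless and SSH_BIN are hypothetical names, not standard tools.
check_passwordless() {
  local hostlist="$1" host failed=0
  while IFS= read -r host; do
    [ -n "$host" ] || continue
    # BatchMode=yes: fail rather than ask for a password or passphrase.
    # </dev/null keeps ssh from consuming the rest of the host list on stdin.
    if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done < "$hostlist"
  return "$failed"
}
```

check_passwordless HOSTLIST prints one line per host and returns non-zero if any host failed, which makes it easy to use as a gate before the later steps.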

4. Install git and OS dependencies on the Cluster nodes

- Log in to the Ansible host (node1.confluent.com) and install git using the below command (if git is not installed):

sudo apt install git

- Clone the hadoop-deployment-bash repository:

git clone https://github.com/teamclairvoyant/hadoop-deployment-bash

- Run the below for setting up variables:

cd hadoop-deployment-bash
ln ../HOSTLIST
GITREPO=`pwd`

- Run the below to set up the OS dependencies and prerequisites. The first loop securely copies the required script files to every node listed in HOSTLIST, and the second loop runs the scripts on each of those nodes:

for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  scp -p \
    ${GITREPO}/install_tools.sh \
    ${GITREPO}/change_swappiness.sh \
    ${GITREPO}/disable_iptables.sh \
    ${GITREPO}/disable_selinux.sh \
    ${GITREPO}/disable_thp.sh \
    ${GITREPO}/install_ntp.sh \
    ${GITREPO}/install_nscd.sh \
    ${GITREPO}/install_jdk.sh \
    ${GITREPO}/configure_javahome.sh \
    ${GITREPO}/install_jce.sh \
    ${GITREPO}/link_openssl.sh \
    $HOST:
done

BOPT="-x"
for HOST in `cat HOSTLIST`; do
  echo "*** $HOST"
  ssh -t $HOST " \
    sudo bash $BOPT ./install_tools.sh; \
    sudo bash $BOPT ./change_swappiness.sh; \
    sudo bash $BOPT ./disable_iptables.sh; \
    sudo bash $BOPT ./disable_selinux.sh; \
    sudo bash $BOPT ./disable_thp.sh; \
    sudo bash $BOPT ./install_ntp.sh; \
    sudo bash $BOPT ./install_nscd.sh; \
    sudo bash $BOPT ./install_jdk.sh --jdktype openjdk --jdkversion 8; \
    sudo bash $BOPT ./configure_javahome.sh; \
    sudo bash $BOPT ./install_jce.sh; \
    sudo bash $BOPT ./link_openssl.sh"
done
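The loop above separates the scripts with a semicolon, so a failing script does not stop the ones after it. If you would rather stop at the first failure, one option is to join the commands with && instead. The sketch below builds such a command string. build_remote_cmd is a hypothetical helper, and this is a variation on the approach above, not the method the repository prescribes.

```shell
# Sketch: build a fail-fast remote command from a list of script names.
# build_remote_cmd is a hypothetical helper introduced for illustration.
# First argument: the bash option (for example -x); remaining arguments: scripts.
build_remote_cmd() {
  local opt="$1" cmd="" script
  shift
  for script in "$@"; do
    # Every script after the first is chained with "&&",
    # so the chain stops as soon as one script exits non-zero.
    cmd="${cmd:+$cmd && }sudo bash $opt ./$script"
  done
  printf '%s\n' "$cmd"
}
```

The result can be passed to ssh in place of the quoted block, for example ssh -t $HOST "$(build_remote_cmd -x install_tools.sh change_swappiness.sh)". A script that takes arguments is passed as one quoted string, such as "install_jdk.sh --jdktype openjdk --jdkversion 8".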

Installing Ansible on Ubuntu

- Log in to the Ansible host node1.confluent.com and run the below commands to install Ansible:

sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install ansible

- After a successful install, you can verify the Ansible installation using the below commands:

ansible --version
ansible all -i HOSTLIST -m ping


Download Ansible Playbooks for Confluent Platform

- Clone the Ansible Playbooks for Confluent Platform repo:

Change to the home directory of the ubuntu user and run the below command:

git clone https://github.com/confluentinc/cp-ansible

- Check out the branch:

cd cp-ansible

git checkout 6.2.0-post

Note: We are installing version 6.2.0, so we need to check out the matching branch.

Generate an inventory file

- Create the hosts.yml file under the cp-ansible directory as below (here SSL is enabled):

all:
  vars:
    ansible_connection: ssh
    ansible_user: ubuntu
    ansible_become: true
    ssl_enabled: true

zookeeper:
  hosts:
    node1.confluent.com:
    node2.confluent.com:
    node3.confluent.com:

kafka_broker:
  hosts:
    node1.confluent.com:
      broker_id: 0
    node2.confluent.com:
      broker_id: 1
    node3.confluent.com:
      broker_id: 3

schema_registry:
  hosts:
    node1.confluent.com:

kafka_rest:
  hosts:
    node1.confluent.com:

ksql:
  hosts:
    node2.confluent.com:
    node3.confluent.com:

kafka_connect:
  hosts:
    node3.confluent.com:

control_center:
  hosts:
    node2.confluent.com:
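Typing every group by hand gets tedious as the cluster grows. As a rough sketch, a group that has no per-host variables can be generated from a list of host names. inventory_group is a hypothetical helper, and the two-space indentation is an assumption (any consistent YAML indentation works).

```shell
# Sketch: print one inventory group in YAML from a list of host names.
# inventory_group is a hypothetical helper introduced for illustration.
inventory_group() {
  local group="$1" host
  shift
  # Group name at column 0, "hosts:" nested one level below it.
  printf '%s:\n  hosts:\n' "$group"
  for host in "$@"; do
    # Each host is a key with an empty value, nested under "hosts:".
    printf '    %s:\n' "$host"
  done
}
```

For example, inventory_group zookeeper $(cat HOSTLIST) prints the zookeeper group for all three nodes. Groups that need per-host variables such as broker_id still have to be edited by hand.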

- Test the connectivity:
cd cp-ansible
ansible -i hosts.yml all -m ping


After your hosts.yml file is ready, to install or update all Confluent Platform components, run the all.yml playbook:

ansible-playbook -i hosts.yml all.yml
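If a full run fails partway through, it can help to provision one component at a time. The sketch below only prints the per-component commands so they can be reviewed before being run. It assumes that all.yml tags each play with the same name as its inventory group, which is worth confirming in the checked-out branch. component_commands is a hypothetical helper.

```shell
# Sketch: print one ansible-playbook command per component.
# Assumption: the plays in all.yml carry tags named after the inventory groups.
# component_commands is a hypothetical helper introduced for illustration.
component_commands() {
  local inventory="$1" tag
  # ZooKeeper and the brokers come first because the other services depend on them.
  for tag in zookeeper kafka_broker schema_registry kafka_rest ksql kafka_connect control_center; do
    echo "ansible-playbook -i $inventory all.yml --tags=$tag"
  done
}
```

component_commands hosts.yml lists the seven commands. Nothing is executed until you copy a line into the shell.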

This takes a few minutes to complete. If all goes well, the play recap printed at the end shows no failed tasks for any host.

Now that the installation is complete, open the Control Center UI in your favorite browser (make sure the AWS security group allows inbound traffic on port 9021):

https://<Public IPv4 DNS address of node2.confluent.com>:9021/
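Before opening a browser, a quick command-line probe can tell you whether the port is reachable at all. The sketch below is one way to do that with curl. It assumes the SSL setup uses self-signed certificates, hence the -k flag. c3_url and c3_status are hypothetical helper names.

```shell
# Sketch: probe the Control Center endpoint from the command line.
# c3_url and c3_status are hypothetical helpers introduced for illustration.
c3_url() {
  # Build the HTTPS URL for the given public DNS name on port 9021.
  printf 'https://%s:9021/\n' "$1"
}

c3_status() {
  # -k: skip certificate verification (assumes a self-signed certificate)
  # -s: silent, -o /dev/null: discard the body, -w: print only the HTTP status code
  curl -k -s -o /dev/null -w '%{http_code}\n' "$(c3_url "$1")"
}
```

Calling c3_status with the public DNS name of node2.confluent.com prints the HTTP status code. A timeout or a code of 000 usually points to the security group or to the service not running.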


Your Confluent Platform 6.2.0 is now running!

Note: Since we have not used any license in the installation, the default installation will enable a Trial (Community) license which is valid for 30 days. A trial (evaluation) license allows a free trial of commercial features in a production setting.

More details on licensing are available at:

https://docs.confluent.io/platform/current/installation/license.html

The setup above is a proof-of-concept architecture. Running the Confluent Platform in production involves specific additional considerations, which are covered in the system requirements documentation listed in the references.

Reach out to us at Clairvoyant to get the best data engineering solutions for your business.

References

  1. https://docs.confluent.io/ansible/current/overview.html

  2. https://github.com/confluentinc/cp-ansible

  3. https://docs.confluent.io/platform/current/installation/license.html

  4. https://docs.confluent.io/platform/current/installation/system-requirements.html


Tags: Data Engineering
