For the rest of the course, I'll be doing the labs in an AWS EC2 instance.
Looking back at this script, which I wrote four months ago, I feel I've gained a new perspective since then, so I've modified and optimized the script below.
I went back and updated this in August 2021. This is the script for a single-node Kafka cluster.
#!/bin/bash
#----------------------------------------------------------------------------------------------------------------------------------------#
# 02-Startup_script-Kafka_Install
# Maintainer: Jose Eden ([email protected])
# 2021-01-05 04:41:20
#-------------------------------------------------START OF SCRIPT------------------------------------------------------#
# Update instance
yum update -y
# Install java
yum install -y java-1.8.0-openjdk.x86_64
# Install wget, in case it is not installed.
yum install -y wget
# We'll keep all installs in /usr/local/bin
cd /usr/local/bin
# Download Scala 2.13.3 and untar file
# You can check the other scala binaries at
# https://www.scala-lang.org/files/archive/
wget https://www.scala-lang.org/files/archive/scala-2.13.3.tgz
tar -xvzf scala-2.13.3.tgz
# Download kafka and untar file
wget https://downloads.apache.org/kafka/2.7.0/kafka_2.13-2.7.0.tgz -v 2> ./wget_output.log
tar -xvf kafka_2.13-2.7.0.tgz
# Remove tgz files
rm -f kafka_2.13-2.7.0.tgz scala-2.13.3.tgz
# Renames the kafka folder
sudo mv kafka_2.13-2.7.0/ kafka
# Make 2 folders for data - one for kafka, one for zookeeper
cd kafka
mkdir -p data/kafka
mkdir -p data/zookeeper
# Edit /etc/profile and .bashrc - set JAVA_HOME and JRE_HOME, and put the kafka and scala paths in $PATH
echo "export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk" >> /etc/profile
echo "export JRE_HOME=/usr/lib/jvm/jre" >> /etc/profile
echo "export PATH=/usr/local/bin/scala-2.13.3/bin:/usr/local/bin/kafka/bin:$PATH" >> /root/.bashrc
# sudo su -
# source /etc/profile
# Update zookeeper properties and kafka properties
cd /usr/local/bin/kafka
sudo sed -i -e "s/dataDir=\/tmp\/zookeeper/dataDir=\/usr\/local\/bin\/kafka\/data\/zookeeper/g" config/zookeeper.properties
sudo sed -i -e "s/log\.dirs=\/tmp\/kafka-logs/log\.dirs=\/usr\/local\/bin\/kafka\/data\/kafka/g" config/server.properties
# Check that .bashrc and /etc/profile were edited, and write the result to a log file
# Check that the properties files were edited, and write the result to the same log file
tail -5 /etc/profile > /usr/local/bin/edit-properties.log
tail -5 /root/.bashrc >> /usr/local/bin/edit-properties.log
grep dataDir /usr/local/bin/kafka/config/zookeeper.properties >> /usr/local/bin/edit-properties.log
grep log.dirs /usr/local/bin/kafka/config/server.properties >> /usr/local/bin/edit-properties.log
# exit
# OPTIONAL:
# Create my user and add it to the root group
sudo useradd -m -G root eden
#
# Changes the hostname to hcptstkafka1
sudo sed -i "s/.*/hcptstkafka1/" /etc/hostname
sudo sed -i "s/localhost/hcptstkafka1/" /etc/hosts
sudo hostname hcptstkafka1
#
# Updates db for the locate command to immediately work
sudo updatedb
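The two sed commands in the script escape every slash in the paths. As a minimal sketch of the same kind of edit, sed also accepts a different delimiter (here |), which keeps the paths readable. The set_property helper and the scratch file are hypothetical and only for illustration; the script above edits the real config files.

```shell
#!/bin/bash
# Hypothetical helper: set KEY=VALUE in a properties file.
# Uses '|' as the sed delimiter so paths need no escaped slashes.
# Assumes VALUE contains no '|' or '&' characters.
set_property() {
  local file="$1" key="$2" value="$3"
  local key_re
  # Escape dots so that a key like "log.dirs" is matched literally
  key_re="$(printf '%s' "$key" | sed 's/\./\\./g')"
  sed -i -e "s|^${key_re}=.*|${key}=${value}|" "$file"
}

# Demonstrate on a scratch file, not on the real config
demo_file="$(mktemp)"
printf 'dataDir=/tmp/zookeeper\nclientPort=2181\n' > "$demo_file"
set_property "$demo_file" dataDir /usr/local/bin/kafka/data/zookeeper
cat "$demo_file"
```

Only the line that starts with the given key is rewritten; every other line in the file is left alone.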
ssh -i "my-key.pem" [email protected]
sudo su
# Returns the version of the installed Java.
java -version
# Starts the scala REPL shell. Type :quit or press Ctrl-D to exit (Ctrl-Z only suspends it)
scala
# Run command to return man page/options for the command.
# You should be able to run this from any directory.
kafka-topics.sh
# it should return something like this:
# OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
# Create, delete, describe, or change a topic.
# Run command to return the modified profile, .bashrc, and properties file
cat /usr/local/bin/edit-properties.log
# it should return something like this:
# export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
# export JRE_HOME=/usr/lib/jvm/jre
# export PATH=/usr/local/bin/scala-2.13.3/bin:/usr/local/bin/kafka/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
# dataDir=/usr/local/bin/kafka/data/zookeeper
# log.dirs=/usr/local/bin/kafka/data/kafka
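The PATH line in the log is fully expanded because the script wrote it with double quotes, so $PATH was substituted when the script ran rather than when .bashrc is read. Here is a small sketch of the difference, written against a scratch file (demo_rc is a hypothetical name) instead of the real /root/.bashrc:

```shell
#!/bin/bash
demo_rc="$(mktemp)"

# Double quotes: $PATH is expanded right now, so its current value is frozen into the file
echo "export PATH=/usr/local/bin/kafka/bin:$PATH" >> "$demo_rc"

# Single quotes: the literal text $PATH is written and expanded each time the file is sourced
echo 'export PATH=/usr/local/bin/kafka/bin:$PATH' >> "$demo_rc"

cat "$demo_rc"
```

Both forms work for this lab. The single-quoted form is the one that keeps following later changes to PATH.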
The script created a data folder in the Kafka directory and, inside it, two more folders. Both are empty for now, but they'll fill up with files once we start ZooKeeper and Kafka later.
ec2-user:data $ pwd
/usr/local/bin/kafka/data
ec2-user:data $
ec2-user:data $ ll
total 0
drwxr-xr-x 2 root root 187 Jun 27 18:21 kafka
drwxr-xr-x 3 root root 23 Jun 27 18:09 zookeeper
ec2-user:data $
ec2-user:data $ ll kafka
total 0
ec2-user:data $ ll zookeeper
total 0
The script also pointed dataDir in the zookeeper properties file to our data/zookeeper folder, so whatever ZooKeeper writes to its data directory (snapshots and transaction logs) will be stored in our folder. We did the same with the kafka properties file and set log.dirs to our data/kafka folder.
ec2-user:config $ pwd
/usr/local/bin/kafka/config
ec2-user:config $
ec2-user:config $ grep dataDir zookeeper.properties
dataDir=/usr/local/bin/kafka/data/zookeeper
ec2-user:config $
ec2-user:config $ grep log.dirs server.properties
log.dirs=/usr/local/bin/kafka/data/kafka
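The same two checks can be wrapped in a small function that fails when a setting is missing. This is only a sketch: check_property is a hypothetical helper, and the example runs against a scratch file. To use it for real, point it at the properties files under /usr/local/bin/kafka/config.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if FILE contains the exact line KEY=VALUE
check_property() {
  local file="$1" key="$2" expected="$3"
  # -F: fixed string, -x: match the whole line, -q: print nothing
  if grep -Fxq "${key}=${expected}" "$file"; then
    echo "OK   ${key}=${expected}"
  else
    echo "FAIL ${key} is not set to ${expected} in ${file}"
    return 1
  fi
}

# Scratch file standing in for server.properties
demo_props="$(mktemp)"
echo "log.dirs=/usr/local/bin/kafka/data/kafka" > "$demo_props"
check_property "$demo_props" log.dirs /usr/local/bin/kafka/data/kafka
```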
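Start ZooKeeper first, then Kafka. Either run the commands from the config directory, as shown here, or give the full path under /usr/local/bin/kafka/ when you reference properties files.
root:config $ zookeeper-server-start.sh zookeeper.properties
# Some parts of the output are omitted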
[2021-06-27 18:31:52,028] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2021-06-27 18:31:52,042] INFO zookeeper.snapshotSizeFactor = 0.33 (org.apache.zookeeper.server.ZKDatabase)
root:config $ kafka-server-start.sh server.properties
# Some parts of the output are omitted
[2021-06-27 18:36:27,975] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
[2021-06-27 18:36:28,004] INFO [broker-0-to-controller-send-thread]: Recorded new controller, from now on will use broker 0 (kafka.server.BrokerToControllerRequestThread)
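Kafka needs ZooKeeper to be listening on port 2181 (the port in the ZooKeeper output above) before it starts. Below is a minimal sketch of a wait loop. It assumes bash (the check relies on bash's /dev/tcp redirection) and the coreutils timeout command; wait_for_port is a hypothetical helper and not part of Kafka.

```shell
#!/bin/bash
# Hypothetical helper: wait until HOST:PORT accepts TCP connections.
# Returns 0 once a connection succeeds, 1 after TRIES failed attempts.
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT
    if timeout 1 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Example: wait for ZooKeeper, then start Kafka
# wait_for_port localhost 2181 && kafka-server-start.sh server.properties
```

The example call is left commented out so the snippet can be pasted without starting anything.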
If we go back to the data directory, we'll see that both kafka and zookeeper now have files inside.
root:data $ pwd
/usr/local/bin/kafka/data
root:data $
root:data $ ll kafka/
total 12
-rw-r--r-- 1 root root 0 Jun 27 18:11 cleaner-offset-checkpoint
-rw-r--r-- 1 root root 4 Jun 27 18:40 log-start-offset-checkpoint
-rw-r--r-- 1 root root 88 Jun 27 18:36 meta.properties
-rw-r--r-- 1 root root 4 Jun 27 18:40 recovery-point-offset-checkpoint
-rw-r--r-- 1 root root 0 Jun 27 18:11 replication-offset-checkpoint
root:data $
root:data $ ll zookeeper/
total 0
drwxr-xr-x 2 root root 70 Jun 27 18:31 version-2
You can also check meta.properties in the data/kafka/ directory.
[root@hcptstkafka1 data]# cat kafka/meta.properties
#
#Thu Aug 12 13:21:00 UTC 2021
cluster.id=UqNaEvX8RW2Wn9A4fwuS8g
version=0
broker.id=0
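meta.properties is a plain key=value file, so its fields are easy to read from a script. Here is a sketch that uses a scratch copy with the values shown above; read_property is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: print the value of KEY from a key=value file,
# skipping blank lines and comment lines that start with '#'.
read_property() {
  local file="$1" wanted="$2" key value
  while IFS='=' read -r key value; do
    case "$key" in
      '#'*|'') continue ;;
    esac
    if [ "$key" = "$wanted" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done < "$file"
  return 1
}

# Scratch copy of the file shown above
demo_meta="$(mktemp)"
cat > "$demo_meta" <<'EOF'
#
#Thu Aug 12 13:21:00 UTC 2021
cluster.id=UqNaEvX8RW2Wn9A4fwuS8g
version=0
broker.id=0
EOF

echo "broker.id is $(read_property "$demo_meta" broker.id)"
```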
[root@hcptstkafka1 ~]# locate server.log
/usr/local/bin/kafka/logs/server.log
[root@hcptstkafka1 ~]# cd /usr/local/bin/kafka/logs
[root@hcptstkafka1 logs]# tail -10 server.log
[2021-08-12 18:40:38,180] INFO Socket error occurred: localhost/127.0.0.1:2181: Network is unreachable (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:39,605] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:39,605] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2021-08-12 18:40:39,605] INFO Socket error occurred: localhost/127.0.0.1:2181: Network is unreachable (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:40,714] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:40,714] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2021-08-12 18:40:40,714] INFO Socket error occurred: localhost/127.0.0.1:2181: Network is unreachable (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:42,615] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:42,615] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2021-08-12 18:40:42,616] INFO Socket error occurred: localhost/127.0.0.1:2181: Network is unreachable (org.apache.zookeeper.ClientCnxn)
[root@hcptstkafka1 logs]# tail -10 server.log | grep ERROR
[2021-08-12 18:40:39,605] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2021-08-12 18:40:40,714] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2021-08-12 18:40:42,615] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[root@hcptstkafka1 logs]#
[root@hcptstkafka1 logs]# tail -10 server.log | grep Exception
[root@hcptstkafka1 logs]#
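When the log is longer than ten lines, it helps to strip the timestamps and count how often each ERROR repeats. This sketch uses only standard tools; summarize_errors is a hypothetical helper, and the scratch file reuses lines from the output above.

```shell
#!/bin/bash
# Hypothetical helper: list the distinct ERROR messages in a log with their counts.
summarize_errors() {
  local logfile="$1"
  # Keep ERROR lines, drop the leading [timestamp], then count identical messages
  grep ' ERROR ' "$logfile" |
    sed 's/^\[[^]]*] //' |
    sort | uniq -c | sort -rn
}

# Scratch log built from lines shown above
demo_log="$(mktemp)"
cat > "$demo_log" <<'EOF'
[2021-08-12 18:40:39,605] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2021-08-12 18:40:39,605] INFO Socket error occurred: localhost/127.0.0.1:2181: Network is unreachable (org.apache.zookeeper.ClientCnxn)
[2021-08-12 18:40:40,714] ERROR Unable to open socket to localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
EOF

summarize_errors "$demo_log"
```

On the real machine you would call it as summarize_errors /usr/local/bin/kafka/logs/server.log.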
Your machine is now set up. You're good to proceed to the next chapter! 😃👍 To finish the Kafka theory, proceed to the next two articles in this series. You could also skip ahead to the Kafka CLI section.