Monitoring ZooKeeper with Exhibitor

Datetime: 2016-08-23 00:31:06          Topic: ZooKeeper, Java

Introduction

ZooKeeper:

ZooKeeper is a distributed, open source coordination service for distributed applications.

It was started by Yahoo developers to solve coordination problems in their distributed applications, and was later adopted and developed by the Apache Software Foundation. You can read about the making of ZooKeeper at https://developer.yahoo.com/blogs/hadoop/apache-zookeeper-making-417.html

Exhibitor:

To supervise the ZooKeeper instances, take periodic backups, check node status and automatically restart a ZooKeeper instance when it fails, we use a project called Exhibitor, which was open sourced by Netflix.

Features of Exhibitor

Instance Monitoring

Each Exhibitor instance monitors the ZooKeeper server running on the same server. If ZooKeeper is not running, Exhibitor will write the zoo.cfg file (see Cluster-wide Configuration below) and start it. If ZooKeeper crashes for some reason, Exhibitor will restart it.
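As a rough illustration of what checking a local ZooKeeper server can look like, the sketch below sends ZooKeeper's "ruok" four-letter command and expects "imok" back. This is not Exhibitor's own implementation; the function name zookeeper_is_ok and the injectable connect parameter are assumptions made for this example, and on newer ZooKeeper releases the command may need to be enabled in the server's four-letter-word whitelist.

```python
import socket
from typing import Any, Callable


def zookeeper_is_ok(
    host: str,
    port: int = 2181,
    timeout: float = 2.0,
    connect: Callable[..., Any] = socket.create_connection,
) -> bool:
    """Return True if the server answers the 'ruok' command with 'imok'.

    `connect` is a hypothetical hook (not part of any library) so the
    function can be exercised without a real ZooKeeper server.
    """
    try:
        with connect((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = b""
            # The reply is exactly four bytes; stop early if the peer closes.
            while len(reply) < 4:
                data = conn.recv(4 - len(reply))
                if not data:
                    break
                reply += data
            return reply == b"imok"
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False
```

A supervisor could call zookeeper_is_ok("zk1") at a fixed interval and restart the server when it returns False several times in a row.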

Backup/Restore

Backups in a ZooKeeper ensemble are more complicated than for a traditional data store (e.g. an RDBMS). Generally, most of the data in ZooKeeper is ephemeral. It would be harmful to blindly restore an entire ZooKeeper data set. What is needed is selective restoration to prevent accidental damage to a subset of the data set. Exhibitor enables this.

Exhibitor will periodically back up the ZooKeeper transaction files. Once backed up, you can index any of these transaction files. Once indexed, you can search for individual transactions and “replay” them to restore a given ZNode to ZooKeeper.

Log Cleanup

ZooKeeper accumulates old snapshot and transaction log files that need to be purged periodically. Exhibitor does this maintenance automatically.

Installation

Prerequisites

  • Java 1.6 or later, since Exhibitor is a Java-based application
  • The jps CLI tool, to check whether the Java processes/instances are running
  • maven or gradle, to build Exhibitor
    • maven builds the exhibitor.jar file using pom.xml
    • gradle builds the exhibitor.jar file using build.gradle

Here I already have a ZooKeeper cluster installed with five nodes. They are as follows.

172.16.20.138 zk1
172.16.20.127 zk2
172.16.20.64  zk3
172.16.20.75  zk4
172.16.20.74  zk5

Building Exhibitor jar

This needs to be done only once, on any one of the nodes, as the same jar can be used on the other nodes. There are two methods to build the jar file.

  • Maven
  • Gradle

Using Maven

Install maven package

root@zk1:~# apt-get install maven

Create a new directory and download the pom.xml file into it (this file lists the sources from which the jar's dependencies are downloaded).

root@zk1:~/maven# wget https://raw.githubusercontent.com/Netflix/exhibitor/master/exhibitor-standalone/src/main/resources/buildscripts/standalone/maven/pom.xml

Run the following command to download necessary code using pom.xml

root@zk1:~/maven# mvn clean package
Warning: JAVA_HOME environment variable is not set.
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building exhibitor 1.5.6
[INFO] ------------------------------------------------------------------------
.
.
.
[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing /root/target/exhibitor-1.5.6.jar with /root/target/exhibitor-1.5.6-shaded.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:05.463s
[INFO] Finished at: Tue Apr 26 12:03:14 IST 2016
[INFO] Final Memory: 11M/60M
[INFO] ------------------------------------------------------------

If the build prints warnings but ends with BUILD SUCCESS, you can safely ignore them.

The build is now done and you will be able to see the following files.

root@zk1:~/target# ls
 exhibitor-1.5.6.jar  maven-archiver  original-exhibitor-1.5.6.jar  surefire

Using Gradle

Install gradle package

root@zk1:~# apt-get install gradle

Create a new directory and download the build.gradle file into it.

root@zk1:~/gradle# wget https://raw.github.com/Netflix/exhibitor/master/exhibitor-standalone/src/main/resources/buildscripts/standalone/gradle/build.gradle

Run the following command. It fetches all the required libraries from the repositories listed in build.gradle and builds the project.

root@zk1:~/gradle# gradle build
.
.
.
.
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:jar
:assemble
:compileTestJava UP-TO-DATE
:processTestResources UP-TO-DATE
:testClasses UP-TO-DATE
:test
:check
:build

BUILD SUCCESSFUL
Total time: 1 mins 31.02 secs

A common problem experienced while building and its solution

FAILURE: Build failed with an exception.
* Where:
Build file '/root/gradle/build.gradle' line: 3
* What went wrong:
A problem occurred evaluating root project 'gradle'.
> Could not find method jcenter() for arguments [] on repository container.
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
BUILD FAILED

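The error means that the installed Gradle version does not recognize the jcenter() repository used in the downloaded build.gradle. The solution is to edit build.gradle so that the repositories block uses mavenCentral() instead, as in the following file.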

root@zk1:~/gradle# cat build.gradle
apply plugin: 'java'
apply plugin: 'maven'
group = 'exhibitor'
version = '1.5.1'
repositories {
  mavenCentral()
  maven {
    url "https://repository.jboss.org/nexus/content/groups/public/"
  }
}
dependencies {
  compile 'com.netflix.exhibitor:exhibitor-standalone:1.5.1'
}
jar {
  from { configurations.compile.collect { it.isDirectory() ? it : zipTree(it) } }
  manifest {
    attributes (
      'Main-Class': 'com.netflix.exhibitor.application.ExhibitorMain',
      'Implementation-Version': project.version
    )
  }
}

Once the build is successful, build the jar

root@zk1:~/gradle# gradle jar
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:jar UP-TO-DATE
BUILD SUCCESSFUL
Total time: 21.414 secs

Gradle creates a directory “build/libs” as shown below

root@zk1:~/gradle/build/libs# ls
 gradle-1.5.1.jar
 root@zk1:~/gradle/build/libs#

Rename the gradle-1.5.1.jar to Exhibitor-1.5.1.jar.

Monitoring with Exhibitor

Accessing Exhibitor

Open the Exhibitor UI of any node in the ZooKeeper ensemble in a web browser, using the node's hostname or IP address in a URL of the form http://zk1:8080/exhibitor/v1/ui/index.html

Integrate Exhibitor with single node ZooKeeper

Make sure your ZooKeeper node is up and running.

In the Exhibitor web console, switch Editing on and set the ZooKeeper Install dir to /usr/local/* (the parent directory of the ZooKeeper installation). If you put * at the end of the value, Exhibitor will search for the latest version of ZooKeeper in that directory. It does this by choosing the directory with the highest version number in its name, e.g. ‘zookeeper-3.4.3’ will be chosen over ‘zookeeper-3.3.5’.
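The "highest version number wins" rule can be sketched as follows. This is only an illustration of the idea, not Exhibitor's actual code; the helper names version_key and pick_latest are made up for this example, and it assumes the version is the first dotted run of digits in the directory name.

```python
import re
from typing import Iterable, Optional, Tuple


def version_key(name: str) -> Tuple[int, ...]:
    """Extract the numeric version from a name such as 'zookeeper-3.4.3'."""
    match = re.search(r"(\d+(?:\.\d+)*)", name)
    if match is None:
        return ()
    # Compare numerically so that 3.4.10 sorts above 3.4.9.
    return tuple(int(part) for part in match.group(1).split("."))


def pick_latest(names: Iterable[str]) -> Optional[str]:
    """Return the name with the highest version, or None if none has one."""
    versioned = [n for n in names if version_key(n)]
    return max(versioned, key=version_key) if versioned else None


if __name__ == "__main__":
    candidates = ["zookeeper-3.3.5", "zookeeper-3.4.3", "exhibitor"]
    print(pick_latest(candidates))
```

To apply it to a real directory, the names could come from something like os.listdir("/usr/local").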

ZooKeeper snapshot Dir: the directory where ZooKeeper stores its snapshots and, unless a separate log directory is configured, its transaction log files. ZooKeeper preallocates transaction log files in blocks of 64 MB by default.

The following options are the paths where we need to save our backups.

Commit the changes. This will cause Exhibitor to stop and start the ZooKeeper instance. If successful, you should see the following.

Integrate Exhibitor with a ZooKeeper cluster

  • To monitor this cluster, copy the exhibitor-1.5.6.jar file built previously to all the nodes and run it on each of them.
  • First, run the basic Exhibitor command below with default options.
root@zk1:/usr/local/zookeeper# java -jar /usr/local/exhibitor/maven/target/exhibitor-1.5.6.jar -c file > /var/log/exhibitor/exhi.log 2>&1 &

This command only lets us monitor the node on which it was run. The configuration will not be shared with the other nodes, even if you add all the host details under the Ensemble section of the Config tab in the Exhibitor GUI.

We want to be able to monitor all the nodes in the cluster. To do this, all the Exhibitor nodes have to share configuration. This sharing can be done in 3 ways – shared file-system, Amazon S3 or a ZooKeeper cluster itself (which may be different from the ensemble we are monitoring).

The -c option selects the configuration type; with the value zookeeper (or file pointing at a shared directory) it enables shared configuration. Shared configuration keeps the configuration entries in sync between all hosts in the cluster.

Keeping the Configuration in a Shared Filesystem

For the syntax of the command-line options, refer to the Exhibitor help output:

root@zk1:/usr/local/exhibitor/maven/target# java -jar exhibitor-1.5.6.jar --help

I ran my ZooKeeper instances in VirtualBox VMs. To create a shared folder, I used the VirtualBox shared folder feature across all guests with the Automount option. It is then mounted at /media/sf_ex-shared on all the nodes in the ZooKeeper cluster.

Now run the following command in all the cluster nodes.

root@zk1:~# java -jar /usr/local/exhibitor/maven/target/exhibitor-1.5.6.jar -c file --fsconfigdir /media/sf_ex-shared --filesystembackup true --servo true > /var/log/exhibitor/exhi.log 2>&1 &

After running this, the Config section of the Exhibitor GUI will be empty at first, and you will not yet be able to monitor the cluster. To monitor the cluster, edit the Config section in Exhibitor on any one of the nodes, add the required entries and commit.

This creates an exhibitor.properties file in the shared location. Remember that Exhibitor does not create exhibitor.properties by default; it appears only after you commit. The file looks like this:

root@zk1:~# cat /media/sf_ex-shared/exhibitor.properties
#Auto-generated by Exhibitor
#Thu Apr 28 11:33:51 IST 2016
com.netflix.exhibitor-rolling-hostnames=
com.netflix.exhibitor-rolling.zookeeper-data-directory=/usr/local/zookeeper/datadir
com.netflix.exhibitor-rolling.servers-spec=9\:zk1,8\:zk2,3\:zk3,7\:zk4,5\:zk5
com.netflix.exhibitor.java-environment=
com.netflix.exhibitor.zookeeper-data-directory=/usr/local/zookeeper/datadir
com.netflix.exhibitor-rolling-hostnames-index=0
com.netflix.exhibitor-rolling.java-environment=
com.netflix.exhibitor-rolling.observer-threshold=999
com.netflix.exhibitor.servers-spec=9\:zk1,8\:zk2,3\:zk3,7\:zk4,5\:zk5
com.netflix.exhibitor.cleanup-period-ms=43200000
com.netflix.exhibitor.auto-manage-instances-fixed-ensemble-size=0
com.netflix.exhibitor.zookeeper-install-directory=/usr/local/*
com.netflix.exhibitor.check-ms=30000
com.netflix.exhibitor.zookeeper-log-directory=
com.netflix.exhibitor-rolling.auto-manage-instances=1
com.netflix.exhibitor-rolling.cleanup-period-ms=43200000
com.netflix.exhibitor-rolling.auto-manage-instances-settling-period-ms=180000
com.netflix.exhibitor-rolling.check-ms=30000
com.netflix.exhibitor.log-index-directory=/usr/local/zookeeper/logdir
com.netflix.exhibitor-rolling.log-index-directory=/usr/local/zookeeper/logdir
com.netflix.exhibitor.backup-period-ms=60000
com.netflix.exhibitor-rolling.connect-port=2888
com.netflix.exhibitor-rolling.election-port=3888
com.netflix.exhibitor-rolling.backup-extra=directory\=%2Fusr%2Flocal%2Fzookeeper%2FBackup
com.netflix.exhibitor.client-port=2181
com.netflix.exhibitor-rolling.zoo-cfg-extra=syncLimit\=5&tickTime\=2000&initLimit\=10
com.netflix.exhibitor-rolling.zookeeper-install-directory=/usr/local/*
com.netflix.exhibitor.cleanup-max-files=3
com.netflix.exhibitor-rolling.auto-manage-instances-fixed-ensemble-size=0
com.netflix.exhibitor-rolling.backup-period-ms=60000
com.netflix.exhibitor-rolling.client-port=2181
com.netflix.exhibitor.backup-max-store-ms=86400000
com.netflix.exhibitor-rolling.cleanup-max-files=3
com.netflix.exhibitor-rolling.backup-max-store-ms=86400000
com.netflix.exhibitor.connect-port=2888
com.netflix.exhibitor.backup-extra=directory\=%2Fusr%2Flocal%2Fzookeeper%2FBackup
com.netflix.exhibitor.observer-threshold=999
com.netflix.exhibitor.log4j-properties=
com.netflix.exhibitor.auto-manage-instances-apply-all-at-once=1
com.netflix.exhibitor.election-port=3888
com.netflix.exhibitor-rolling.auto-manage-instances-apply-all-at-once=1
com.netflix.exhibitor.zoo-cfg-extra=syncLimit\=5&tickTime\=2000&initLimit\=10
com.netflix.exhibitor-rolling.zookeeper-log-directory=
com.netflix.exhibitor.auto-manage-instances-settling-period-ms=180000
com.netflix.exhibitor-rolling.log4j-properties=
com.netflix.exhibitor.auto-manage-instances=1
root@zk1:~#
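To make the servers-spec entry easier to read, here is a small sketch that turns it into the kind of server.N lines found in a zoo.cfg file, using the connect and election ports from the listing above. The function names are hypothetical, the parser only handles the id:hostname form shown in this file, and the exact zoo.cfg that Exhibitor writes is not shown in this article, so treat the output as an approximation.

```python
from typing import Dict, List


def parse_servers_spec(raw: str) -> Dict[int, str]:
    """Parse a servers-spec value into a {server_id: hostname} mapping.

    The properties file escapes ':' with a backslash, so undo that first.
    """
    servers: Dict[int, str] = {}
    for entry in raw.replace("\\:", ":").split(","):
        entry = entry.strip()
        if not entry:
            continue
        server_id, hostname = entry.split(":", 1)
        servers[int(server_id)] = hostname
    return servers


def zoo_cfg_server_lines(
    servers: Dict[int, str], connect_port: int = 2888, election_port: int = 3888
) -> List[str]:
    """Render standard zoo.cfg 'server.N=host:port:port' lines, sorted by id."""
    return [
        f"server.{server_id}={host}:{connect_port}:{election_port}"
        for server_id, host in sorted(servers.items())
    ]


if __name__ == "__main__":
    spec = r"9\:zk1,8\:zk2,3\:zk3,7\:zk4,5\:zk5"  # value copied from the file above
    for line in zoo_cfg_server_lines(parse_servers_spec(spec)):
        print(line)
```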

Your Exhibitor GUI should now look like this -

Keeping the Configuration in ZooKeeper

We can also use a separate ZooKeeper ensemble to achieve the same thing as the shared file system. The command for that is –

root@zk1:/usr/local/exhibitor/maven/target# java -jar exhibitor-1.5.6.jar -c zookeeper --zkconfigconnect "zk2:2181","zk3:2181","zk4:2181","zk5:2181" --zkconfigzpath /exhibitor/config --filesystembackup true --servo true > kafka.log 2>&1 &

Repeat the above command on all nodes in the cluster. If you are using the same ensemble for storing the configuration, exclude the current host and list all the other hosts, with their ports, in --zkconfigconnect. If it is a different ensemble, include all of its hosts. This initiates the connections between all nodes in the cluster; once that is done, wait a while for all nodes to become available in the Exhibitor console.
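Since the --zkconfigconnect value differs per node when the same ensemble stores the configuration, it can help to generate it. The sketch below follows the rule just described; the function name zk_config_connect and its parameters are invented for this example.

```python
from typing import Iterable


def zk_config_connect(
    hosts: Iterable[str],
    current_host: str,
    port: int = 2181,
    same_ensemble: bool = True,
) -> str:
    """Build a comma-separated host:port list for --zkconfigconnect.

    When the monitored ensemble also stores the configuration
    (same_ensemble=True), the current host is left out.
    """
    selected = [h for h in hosts if not (same_ensemble and h == current_host)]
    return ",".join(f"{h}:{port}" for h in selected)


if __name__ == "__main__":
    ensemble = ["zk1", "zk2", "zk3", "zk4", "zk5"]
    # Value to use on zk1 when the same ensemble stores the configuration.
    print(zk_config_connect(ensemble, "zk1"))
```

On zk1 this yields the same host list as the command above; on zk2 it would list zk1, zk3, zk4 and zk5 instead.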

The --filesystembackup option enables the Backup and Restore option in the Exhibitor GUI.

Key Points

  • The paths configured in Exhibitor should not be under /tmp.
  • Make sure Exhibitor is started on all the nodes.
  • ZooKeeper must be in a running state before Exhibitor starts.
  • Each Exhibitor instance monitors the ZooKeeper server running on the same server. If ZooKeeper is not running, Exhibitor will write the zoo.cfg file and start it. If ZooKeeper crashes for some reason, Exhibitor will restart it.

References

https://github.com/Netflix/exhibitor/wiki




