Collecting VM Metrics from vCloud using Cassandra and KairosDB

Hi again,

Now that Debian Linux is installed and configured, we are ready to install a Cassandra cluster to collect our VM metrics from vCloud Director 8.11 and 8.20.

As described on the VMware blog, this is quite useful, and the design should look like this:

[image: cassandra_cluster_installation_1.png]

In my setup, however, KairosDB and Cassandra will run on separate VMs.

  • The minimum cluster size is three nodes (it must be equal to or larger than the replication factor). Scale out rather than up, because Cassandra performance scales linearly with the number of nodes.
  • Estimate I/O requirements based on the expected number of VMs, and correctly size the Cassandra cluster and its storage.

n … expected number of VMs
m … number of metrics per VM (currently 8)
t … retention (days)
r … replication factor

Write I/O per second = n × m × r / 10
Storage = n × m × t × r × 114 kB

For 30,000 VMs, the I/O estimate is 72,000 write IOPS and 3288 GB of storage (worst-case scenario if data retention is 6 weeks and replication factor is 3).
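The two formulas can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are mine, and the storage figure matches 3288 GB only if kB is converted to GB in binary units (dividing by 1024 twice), which is my assumption about how the number was derived.

```shell
#!/bin/bash
# Sizing sketch for the formulas above (variable names are illustrative).
n=30000   # expected number of VMs
m=8       # metrics per VM
t=42      # retention in days (6 weeks)
r=3       # replication factor

# Write I/O per second = n * m * r / 10
write_iops=$(( n * m * r / 10 ))

# Storage = n * m * t * r * 114 kB
storage_kb=$(( n * m * t * r * 114 ))

# Convert kB to GB, assuming 1 GB = 1024 * 1024 kB, rounded to the nearest GB.
storage_gb=$(( (storage_kb + 524288) / 1048576 ))

echo "Write IOPS: $write_iops"
echo "Storage:    $storage_gb GB"
```

With the values above, this prints 72000 write IOPS and 3288 GB, the same numbers as in the worst-case example. Change n, t and r to size your own cluster.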

After installation, Cassandra is laid out as follows:

Configuration Files Locations
cassandra.yaml /etc/cassandra /etc/cassandra /etc/cassandra /etc/cassandra /usr/share/cassandra

The packaged releases install into these directories:

Directories                 Description
/var/lib/cassandra          Data directories
/var/log/cassandra          Log directory
/var/run/cassandra          Runtime files
/usr/share/cassandra        Environment settings
/usr/share/cassandra/lib    JAR files
/usr/bin                    Binary files
/etc/cassandra              Configuration files
/etc/init.d                 Service startup script
/etc/security/limits.d      Cassandra user limits


Let's start installing Cassandra 3.0, KairosDB, and their prerequisites!

We need to install Java 8 for Cassandra 3.0 and KairosDB first:

  1. Create a repository list file for Java: [image: cassandra_cluster_installation_2.JPG]
  2. Add the following repos to the list: [image: cassandra_cluster_installation_3.JPG]
  3. “apt-get update” – to update the repositories
  4. “apt-get install oracle-java8-installer -y” – wait for the installation to start and accept the licenses.
  5. After installation, check the Java version by running “java -version”; the result should look like this: [image: cassandra_cluster_installation_4.JPG]
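If you prepare several nodes, the version check can be scripted. The sketch below is an assumption on my side, not part of the original procedure: the function name is_java8 is made up, and it only looks for the `version "1.8.` prefix that Java 8 builds print in the first line of their banner.

```shell
#!/bin/bash
# Returns 0 if the given `java -version` banner line reports Java 8 (1.8.x).
# is_java8 is an illustrative helper name, not a standard tool.
is_java8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# `java -version` writes its banner to stderr, so redirect it before capturing.
if command -v java >/dev/null 2>&1; then
    banner=$(java -version 2>&1 | head -n 1)
else
    banner=""
fi

if is_java8 "$banner"; then
    echo "Java 8 found: $banner"
else
    echo "Java 8 not found, install it before continuing"
fi
```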

After Java installation we can install Cassandra:

  1. Create a repository list file for Cassandra and add the repositories to it:
    touch /etc/apt/sources.list.d/cassandra.sources.list
    echo "deb stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
    echo "deb stable main" | tee -a /etc/apt/sources.list.d/cassandra.sources.list
  2. apt-get update – to update repositories
  3. apt-get install dsc30  – to install cassandra
  4. systemctl status cassandra – to check the service: [image: cassandra_cluster_installation_5.JPG]
  5. systemctl stop cassandra – to stop the service

To keep the database on its own storage, we mount a separate disk at /var/lib/cassandra (all of the data is located there):

  1. fdisk -l – to list all of the installed volumes and look for /dev/sdb (c…): [image: cassandra_cluster_installation_6.JPG]
  2. Now that we are sure the device is called sdb and has the correct size, run the following commands:
    fdisk /dev/sdb
    Command (m for help): o
    Command (m for help): n
    Select (default p): p
    Partition number (1-4, default 1): 1
    Command (m for help): w
    mkfs.ext4 /dev/sdb1
  3. Now we have to map this disk to the Cassandra lib folder, /var/lib/cassandra. We will use cp and fstab:
    cd /var/lib
    mkdir /mnt/cassandra
    mount /dev/sdb1 /mnt/cassandra/
    cp -ax cassandra/. /mnt/cassandra/
    mv cassandra/ cassandra.old
    mkdir cassandra
    umount /dev/sdb1
    mount /dev/sdb1 cassandra
    chown cassandra:cassandra cassandra
    vim /etc/fstab

    And add the following line to the list:

    /dev/sdb1 /var/lib/cassandra ext4 defaults 0 1
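Before rebooting, it is worth checking that the fstab entry really is there. Here is a small sketch for that; has_fstab_entry is a helper name I made up, and it only checks that some non-commented line uses the given mount point.

```shell
#!/bin/bash
# has_fstab_entry FSTAB_FILE MOUNTPOINT
# Succeeds if a non-comment line in FSTAB_FILE has MOUNTPOINT as its
# second field. Illustrative helper, not a standard command.
has_fstab_entry() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit !found }
    ' "$1" 2>/dev/null
}

if has_fstab_entry /etc/fstab /var/lib/cassandra; then
    echo "fstab entry for /var/lib/cassandra found"
else
    echo "no fstab entry for /var/lib/cassandra yet"
fi
```

After that, "mount -a" followed by "findmnt /var/lib/cassandra" shows whether the entry actually mounts without a reboot.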

Now that one Cassandra node is prepared, we can clone it to two more (as you remember, the minimum requirement is 3 nodes).

  1. To achieve this, run a simple command in VMware PowerCLI:
    New-VM -Name Cassandra03.demo.lab -VM $cassa -ResourcePool $RP
  2. Don’t forget to change the hostnames and IP addresses in:
    vim /etc/hosts
    vim /etc/hostname
    vim /etc/network/interfaces
    vim /etc/ssh/sshd_config
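Editing four files by hand on every clone gets boring quickly, so here is a possible shortcut. It is a sketch under assumptions: replace_in_file is my own helper, the host names and IP addresses in the usage comment are made-up examples, and it relies on GNU sed (the default on Debian) for the -i option.

```shell
#!/bin/bash
# replace_in_file FILE OLD NEW
# Replaces every literal occurrence of OLD with NEW in FILE, in place.
# Illustrative helper; OLD is escaped so dots in IPs are not wildcards.
replace_in_file() {
    local old_escaped new_escaped
    old_escaped=$(printf '%s' "$2" | sed 's|[][\.*^$/]|\\&|g')
    new_escaped=$(printf '%s' "$3" | sed 's|[\&/]|\\&|g')
    sed -i "s/$old_escaped/$new_escaped/g" "$1"
}

# Example usage on a clone (hypothetical names and addresses):
# for f in /etc/hosts /etc/hostname /etc/network/interfaces; do
#     replace_in_file "$f" cassandra01 cassandra03
#     replace_in_file "$f" 10.0.0.11 10.0.0.13
# done
```

Note that this is a plain substring replace, so 10.0.0.11 would also match inside 10.0.0.110. Review the files afterwards, and reboot or restart networking so the new hostname and address take effect.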


Now it is time to configure Cassandra itself:

  1. Make sure that Cassandra is not running on your nodes by entering the commands:
    systemctl status cassandra
    systemctl stop cassandra
  2. Because we cloned our VMs, there will be token conflicts inside the cluster, so remove all the data with the following commands:
    rm -rf /var/lib/cassandra/commitlog/*
    rm -rf /var/lib/cassandra/data/*
    rm -rf /var/lib/cassandra/saved_caches/*
  3. Edit /etc/cassandra/cassandra.yaml
    cluster_name: 'VM_Metrics'
    num_tokens: 256
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    - seeds: ",,"
    start_rpc: true
    rpc_port: 9160
    endpoint_snitch: GossipingPropertyFileSnitch
    auto_bootstrap: false
  4. Edit /etc/cassandra/
  5. Edit /etc/cassandra/
  6. Do the same on other nodes (don’t forget to change the IP-addresses where required)
  7. After that, start the Cassandra service on the nodes (one by one):
    systemctl start cassandra
  8. You should see the following picture when you run the command:
    nodetool status
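In the nodetool status output, every healthy node is listed on a line starting with UN (Up/Normal). If you would rather not eyeball the output, the check can be scripted roughly like this; count_up_nodes is a name I made up, and the expected count of 3 is simply our cluster size.

```shell
#!/bin/bash
# count_up_nodes: reads `nodetool status` output on stdin and prints the
# number of nodes reported as UN (Up/Normal). Illustrative helper.
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

expected=3   # our cluster size

if command -v nodetool >/dev/null 2>&1; then
    up=$(nodetool status 2>/dev/null | count_up_nodes)
    if [ "$up" -ge "$expected" ]; then
        echo "All $up nodes are Up/Normal"
    else
        echo "Only $up of $expected nodes are Up/Normal"
    fi
fi
```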

Our Cassandra cluster is ready to receive data from KairosDB, which we are going to install now:

  1. Prepare a Linux VM (we use Debian as the base system – installation, configuration)
  2. Download and unpack KairosDB
    wget --no-check-certificate
    tar -xzf kairosdb-1.1.3-1.tar.gz
  3. Attach additional vHDD and mount it (same as we did with cassandra)
    mkdir /var/lib/kairosdb
    mount /dev/sdb1 /var/lib/kairosdb/
    mv kairosdb/* /var/lib/kairosdb/
    vim /etc/fstab

    Add the following:

    /dev/sdb1 /var/lib/kairosdb ext4 defaults 0 1
  4. Edit /var/lib/kairosdb/conf/
    # kairosdb.service.datastore=org.kairosdb.datastore.h2.H2Module
  5. We first need to run KairosDB once:
    /var/lib/kairosdb/bin/ run

    You will see the following picture:

  6. Now we have to start KairosDB in the background:
    /var/lib/kairosdb/bin/ start

    As a result we will see the following picture in our browser (don’t forget that the default port is 8080)

Now we are at the final step: we have to show vCloud Director where to store the data.

  1. First we have to stop the vCloud Director service, configure the metrics repository, and start the service again:
    service vmware-vcd stop
    cd /opt/vmware/vcloud-director/bin/
    ./cell-management-tool configure-metrics --repository-host --repository-port 8080
    service vmware-vcd start; tail -f /opt/vmware/vcloud-director/logs/cell.log
  2. After this is done, give it some time for data to be collected, and then check the metrics on the KairosDB page.
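Besides the web page, KairosDB can be checked from the command line through its REST API. The sketch below uses the health check and metric names endpoints. The host name kairosdb.demo.lab and the function names are my assumptions, so replace them with your own values.

```shell
#!/bin/bash
# Hypothetical host name; the port is the KairosDB default mentioned above.
KAIROS_HOST="${KAIROS_HOST:-kairosdb.demo.lab}"
KAIROS_PORT="${KAIROS_PORT:-8080}"

# Builds the URL for a KairosDB REST API path below /api/v1/.
kairos_url() {
    printf 'http://%s:%s/api/v1/%s\n' "$KAIROS_HOST" "$KAIROS_PORT" "$1"
}

# Succeeds if the health check endpoint answers 204 (all checks healthy).
kairos_healthy() {
    local code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
        "$(kairos_url health/check)")
    [ "$code" = "204" ]
}

# Prints the JSON list of metric names KairosDB currently knows about.
kairos_metric_names() {
    curl -s --max-time 5 "$(kairos_url metricnames)"
}
```

Run kairos_healthy first. Once vCloud Director has been pushing data for a while, kairos_metric_names should return a non-empty list.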


Thank you for reading.

3 responses to “Collecting VM Metrics from vCloud using Cassandra and KairosDB”

  1. KC LO

    How do you set the retention period? where did you find info for the 114KB per days per metrics?

    1. D.Rusov

      It was actually taken from the whitepapers of VMware. It was a while ago but it was some document regarding Architecture of vCloud.

  2. KC LO

    Thanks! Do you know how to set the retention period?
