RHEV 4.2 metrics!!

Starting with the 4.2 release, it is possible to collect RHEV events and metrics and browse them in Kibana.
The new metrics store is not built in; it must be installed manually. In my case I created a new guest inside RHEV and installed the metrics store on top of it. Basically it consists of Elasticsearch and Kibana running as containers inside OpenShift. Yes, you read that right, OpenShift! So you end up with a small OpenShift instance running multiple containers. This is the installation guide.

Here is a screenshot of one of the Kibana example dashboards:


RHEV 4.2 – Rhevm shell

Starting with the 4.2 release of Red Hat Virtualization, the rhevm shell is deprecated. The command is still there, but don’t use it, because it “does not know” the new objects introduced with the new API version.
So, how can you handle automation? The answer is: Ansible!
This is a great opportunity to have fun with Ansible. Where can you begin?
With a first, simple playbook:

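- hosts: localhost
  connection: local

  tasks:
    # ovirt_auth logs in and stores the SSO token in the ovirt_auth fact
    - name: Obtain SSO token
      ovirt_auth:
        url: "https://rhevm_url/ovirt-engine/api"
        username: "admin@internal"
        password: "insert_the_password"
        insecure: true

    # ovirt_vms_facts fills the ovirt_vms fact
    # (the module was renamed ovirt_vm_info in later Ansible releases)
    - name: List vms
      ovirt_vms_facts:
        pattern: cluster=Default
        auth: "{{ ovirt_auth }}"

    - debug:
        var: ovirt_vms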

You first have to install the ovirt-ansible-roles package, then you can run this playbook.
It does not use external variables and it connects to rhevm in insecure mode (no certificate validation), so it is the simplest playbook you can use against rhevm to understand how things work.
This playbook returns every detail about the VMs in the Default cluster on rhevm. It’s extremely verbose, but as I said this is just a starting point 🙂
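
Since the output is so verbose, a small filter can help once you have the VM list in hand. The following is only a minimal Python sketch: it assumes each entry of the ovirt_vms list is a dictionary with "name" and "status" keys, and the sample data is hypothetical, not taken from a real rhevm.

```python
def summarize_vms(vms):
    """Return sorted (name, status) pairs from a list of VM dictionaries."""
    summary = []
    for vm in vms:
        # Fall back to placeholders if a key is missing from an entry
        summary.append((vm.get("name", "<unnamed>"), vm.get("status", "unknown")))
    return sorted(summary)


# Hypothetical sample shaped like two entries of the ovirt_vms list
sample = [
    {"name": "web01", "status": "up", "memory": 4294967296},
    {"name": "db01", "status": "down", "memory": 8589934592},
]

for name, status in summarize_vms(sample):
    print(f"{name}: {status}")
```

The same idea could also be applied directly in the playbook with a Jinja2 filter; the Python version is just easier to try out on its own.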

Have fun!

Linux Redhat 7: How to clear boot directory

Recently I noticed that several VMware Linux templates had a /boot filesystem that was more than 90% full.
If you search the web you will find a lot of solutions based only on removing kernel RPMs. I disagree!
I started cleaning up /boot by removing the oldest kernels, but this was not enough.

At this point you must go to the /boot directory and look for rescue files:

# ls | grep rescue

The rescue files do not belong to any RPM package, so you can delete the oldest pairs manually. I suggest following these steps in order:

1) Pick a single rescue pair. If you want to know which kernel it belongs to, you can run lsinitrd initramfs-0-rescue-
2) Try to boot the system using the rescue entry from the previous step
3) If everything works, boot with the latest kernel and delete the old rescue pairs
4) Lastly, update the GRUB2 configuration:  grub2-mkconfig -o /boot/grub2/grub.cfg
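
To see at a glance which rescue pairs exist and which are the oldest, the listing step can be sketched in Python. This is a read-only helper and it deletes nothing. It assumes the usual RHEL 7 naming, vmlinuz-0-rescue-<id> plus initramfs-0-rescue-<id>.img, and the function name rescue_pairs is mine.

```python
import os
import re

# Assumed naming: vmlinuz-0-rescue-<id> and initramfs-0-rescue-<id>.img
RESCUE_RE = re.compile(r"^(vmlinuz|initramfs)-0-rescue-([0-9a-f]+)(\.img)?$")


def rescue_pairs(boot_dir):
    """Group rescue files by id and return the groups oldest first."""
    groups = {}
    for name in os.listdir(boot_dir):
        match = RESCUE_RE.match(name)
        if match:
            groups.setdefault(match.group(2), []).append(name)

    def newest_mtime(item):
        # Age of a pair = modification time of its most recent file
        return max(os.path.getmtime(os.path.join(boot_dir, n)) for n in item[1])

    ordered = sorted(groups.items(), key=newest_mtime)
    return [(rescue_id, sorted(names)) for rescue_id, names in ordered]


# Example use on a real system:
#   for rescue_id, names in rescue_pairs("/boot"):
#       print(rescue_id, names)
```

The last entry in the returned list is the most recent pair, which is normally the one to keep after you have verified that it boots.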

Thanks to Paolo Fruci and Marco Simonetti for helping me deal with this issue, we had fun working on it together 🙂

RHEV 4.2: How to check hosted-engine status via cli

The hosted engine is a special virtual machine in the RHEV architecture.
It has a dedicated command, hosted-engine.

How can I check from the CLI which host the engine is running on?
How can I check which hosts are valid candidates for hosting the engine?

The answer is:

[root@dcs-kvm01 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date : True
Hostname : dcs-kvm01.net
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : caa0ce0d
local_conf_timestamp : 353495
Host timestamp : 353495
Extra metadata (valid at timestamp):
timestamp=353495 (Fri Aug 10 16:34:24 2018)
vm_conf_refresh_time=353495 (Fri Aug 10 16:34:25 2018)

--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date : True
Hostname : dcs-kvm02.net
Host ID : 2
Engine status : {"health": "good", "vm": "up", "detail": "Up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : b1f91d1e
local_conf_timestamp : 351112
Host timestamp : 351112
Extra metadata (valid at timestamp):
timestamp=351112 (Fri Aug 10 16:34:20 2018)
vm_conf_refresh_time=351112 (Fri Aug 10 16:34:20 2018)

In my case the hosted engine is running on the dcs-kvm02 host. Another useful piece of information is the Score: each host is evaluated on several metrics (ping to the default gateway, filesystems, CPU load, and so on), and each metric contributes to the score. The best possible result is 3400, which means the host is a perfect candidate for hosting the engine.
From this output you can also see that the cluster is made up of 2 hypervisors.
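
If you want to script this check, the text output can be parsed with a few lines of Python. The following is a minimal sketch that assumes the output format shown above; the function names are mine, and treating any host with a score above zero as a candidate is a simplification on my side.

```python
import json
import re

HEADER_RE = re.compile(r"--== Host (\S+) status ==--")
# "Key : value" lines; lines such as "timestamp=..." are skipped on purpose
FIELD_RE = re.compile(r"^\s*([^:=]+?)\s*:\s*(.*)$")


def parse_vm_status(text):
    """Return one dictionary of fields per host block."""
    hosts = []
    current = None
    for line in text.splitlines():
        if HEADER_RE.search(line):
            current = {}
            hosts.append(current)
            continue
        field = FIELD_RE.match(line)
        if current is not None and field:
            current[field.group(1)] = field.group(2)
    return hosts


def engine_host(hosts):
    """Return the hostname whose engine status reports the VM as up."""
    for host in hosts:
        try:
            status = json.loads(host.get("Engine status", "{}"))
        except ValueError:
            continue  # status was not valid JSON, skip this host
        if status.get("vm") == "up":
            return host.get("Hostname")
    return None


def candidates(hosts):
    """Return hostnames with a positive score, best score first."""
    scored = [(int(h.get("Score", 0)), h.get("Hostname")) for h in hosts]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [name for score, name in ranked if score > 0]


# Shortened copy of the output shown above
sample = """
--== Host 1 status ==--
Hostname : dcs-kvm01.net
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
--== Host 2 status ==--
Hostname : dcs-kvm02.net
Engine status : {"health": "good", "vm": "up", "detail": "Up"}
Score : 3400
"""

hosts = parse_vm_status(sample)
print("Engine runs on:", engine_host(hosts))
print("Candidates:", candidates(hosts))
```

On a real hypervisor you would feed the function the output of hosted-engine --vm-status instead of the sample string.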

First RHEV 4.2 up & running!

Yesterday I installed and configured a new RHEV 4.2 infrastructure, and it was so much fun!

I installed a couple of hypervisors on DELL M630 blades and rhevm via hosted engine. I saw the new Ansible-based deployment; it is simpler and faster than the previous one. I also appreciated the cleanup utility for when a hosted engine deployment fails. I remember that in 4.0, in the same situation, I had to hunt down the written files manually and clean them up. It was painful to deal with the vdsm configuration files.

I also watched Ansible add the second hypervisor, wow!!
It seems more stable and robust than the previous minor releases. The next step in the project will be to migrate virtual machines from VMware and run multiple stress tests.

I was surprised when I connected to the new dashboard for the first time; it reminds me of CloudForms 🙂

This is the new dashboard

PDU connection

After an unexpected power outage, the customer asked me to connect to a PDU and look for log events. I had never connected to a PDU before, and I had not used a serial cable in 5-6 years, so it was a lot of fun!!

I connected via serial cable and found what I was looking for using CTRL-L 🙂
