CAOS

The instances of CAOS are hosted on cld-caos and the dashboards are reachable on:

The links are reachable only from inside the Padova LAN. To connect from outside, an SSH tunnel must be opened, e.g.:

$ ssh -v -o TCPKeepAlive=yes -N -L 4000:cld-caos.cloud.pd.infn.it:443 gate.pd.infn.it

and then browse to the forwarded local port (https://localhost:4000 for the tunnel above):

Operations

The instances are managed by Docker (through docker-compose).

The templates are located in the following directories:

  • Cloud Area Padovana: /root/caos/cap-prod
  • Cloud Veneto: /root/caos/cedc-prod
  • EGI Fed-Cloud: /root/caos/egi-prod

Inside each directory a docker-compose.yml file contains the configuration for the particular instance.
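
The exact service layout differs per instance, but a docker-compose.yml for CAOS typically wires together the TSDB, collector and dashboard services described in the Configuration section below. The sketch here is illustrative only: image names, tags and values are placeholders, not the production settings.

```yaml
# Illustrative sketch only -- image names/tags and values are placeholders.
version: '2'
services:
  tsdb:
    image: caos-tsdb:some-tag        # placeholder image/tag
    environment:
      - CAOS_TSDB_PORT=4444
      - CAOS_TSDB_DB_HOSTNAME=db-host
      - CAOS_TSDB_DB_NAME=caos_prod
  collector:
    image: caos-collector:some-tag   # placeholder image/tag
    environment:
      - CAOS_COLLECTOR_TSDB_API_URL=http://tsdb:4444/api/v1
  dashboard:
    image: caos-dashboard:some-tag   # placeholder image/tag
    environment:
      - CAOS_DASHBOARD_TSDB_HOST=tsdb
      - CAOS_DASHBOARD_TSDB_PORT=4444
```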

MySQL databases

The instances store data in the following MySQL databases:

  • Cloud Area Padovana: 192.168.60.10:6306/caos_prod
  • Cloud Veneto: 192.168.60.180:5306/caos_prod
  • EGI Fed-Cloud: 192.168.114.10:3306/caos_prod

How to start/stop an instance

Instances can be started by docker-compose up, for example:

# cd /root/caos/cap-prod
# docker-compose up -d

Instances can be stopped by docker-compose down, for example:

# cd /root/caos/cap-prod
# docker-compose down

How to update an instance

To update an instance, e.g. after changing the version of an image or a configuration variable, issue the following command:

docker-compose up -d

Update for Ocata (ongoing)

In the collector section of the docker-compose.yml file, set the environment variable

CAOS_COLLECTOR_OPENSTACK_VERSION=ocata

This enables the correct collection of the wall clock time.
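
In docker-compose.yml this would look like the excerpt below (other keys of the collector service omitted):

```yaml
collector:
  environment:
    - CAOS_COLLECTOR_OPENSTACK_VERSION=ocata
```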

How to create a new instance

1. Create a docker network
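
The page does not record the exact command for this step; assuming a standard user-defined bridge network (the name caos-cap below is hypothetical), it would be along the lines of:

```shell
# Create a user-defined bridge network for the instance's containers
# (the name "caos-cap" is hypothetical -- it must match the network
# referenced in the instance's docker-compose.yml)
docker network create caos-cap
```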

2. Create a database

Create a database, for example:

CREATE DATABASE caos;
 
GRANT ALL ON caos.* TO 'caos'@'192.168.60.%' IDENTIFIED BY '***';
GRANT ALL ON caos.* TO 'caos'@'localhost' IDENTIFIED BY '***';

3. Create an OpenStack user

openstack user create --password=*** caos
openstack role add --project admin --user caos admin

Then check connectivity to the DB and, if necessary, migrate the schema with:

docker-compose run --rm tsdb dbcheck
docker-compose run --rm tsdb migrate

4. Create a read-only user in MongoDB

use ceilometer

db.createUser({
  user: "caos",
  pwd: "***",
  roles: [
    { role: "read", db: "ceilometer" }
  ]
})

HTTPS/SSL

HTTPS/SSL communication is managed by the nginx proxy. The certificates (self-signed) are located at:

  • certificate: /root/caos/nginx/certificate.crt
  • private key: /root/caos/nginx/privateKey.key
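
To regenerate the self-signed pair, a standard openssl invocation can be used (the subject CN here is an assumption; set it to the host actually serving the dashboards):

```shell
# Generate a new RSA key and self-signed certificate valid for one year
openssl req -x509 -nodes -newkey rsa:2048 \
  -keyout privateKey.key \
  -out certificate.crt \
  -days 365 \
  -subj "/CN=cld-caos.cloud.pd.infn.it"
```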

After changing the above files, remember to restart the proxy:

# cd /root/caos/nginx
# docker-compose restart

Configuration

The TSDB, collector and dashboard can be easily configured through environment variables, whose meaning is explained below.

TSDB

  • CAOS_TSDB_LOGGER_LOG_FILE_LEVEL (e.g. debug): Log level
  • CAOS_TSDB_PORT (e.g. 4444): Port on which to expose the service
  • CAOS_TSDB_DB_USERNAME (e.g. caos_user): Database username
  • CAOS_TSDB_DB_PASSWORD (e.g. CAOS_DB_PWD): Database password
  • CAOS_TSDB_DB_NAME (e.g. caos_db): Database name
  • CAOS_TSDB_DB_HOSTNAME (e.g. db-host): Database host
  • CAOS_TSDB_DB_PORT (e.g. 3306): Database port
  • CAOS_TSDB_DB_POOL_SIZE (e.g. 1): Number of connections to the database
  • CAOS_TSDB_AUTH_TOKEN_TTL (e.g. 86400): Time to live of authentication tokens, in seconds
  • CAOS_TSDB_AUTH_SECRET_KEY (e.g. 0aaudveXM4+AcgYDTDj7wWDGfQ0MR4iiS7PpEWaueTo=): Random key used for signing tokens; can be generated, for example, with openssl rand -base64 32
  • CAOS_TSDB_AUTH_IDENTITY_USERNAME (e.g. admin): Username used to access the service
  • CAOS_TSDB_AUTH_IDENTITY_PASSWORD (e.g. ADMIN_PASS): Password used to access the service
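
As noted above, the signing key can be generated with openssl; for example:

```shell
# 32 random bytes, base64-encoded (yields a 44-character string)
CAOS_TSDB_AUTH_SECRET_KEY=$(openssl rand -base64 32)
echo "$CAOS_TSDB_AUTH_SECRET_KEY"
```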

Collector

  • OS_*: OpenStack authentication variables
  • CAOS_COLLECTOR_TSDB_API_URL (e.g. http://localhost:4444/api/v1): URL of the TSDB API
  • CAOS_COLLECTOR_TSDB_USERNAME (e.g. admin): TSDB username
  • CAOS_COLLECTOR_TSDB_PASSWORD (e.g. ADMIN_PASS): TSDB password
  • CAOS_COLLECTOR_OPENSTACK_VERSION (e.g. ocata): OpenStack version
  • CAOS_COLLECTOR_CEILOMETER_POLLING_PERIOD (e.g. 600): Ceilometer polling period, in seconds
  • CAOS_COLLECTOR_CEILOMETER_BACKEND (e.g. gnocchi): Backend to query data from ('gnocchi' or 'mongodb')

The following variables are used if CAOS_COLLECTOR_CEILOMETER_BACKEND is set to mongodb:

  • CAOS_COLLECTOR_MONGODB (e.g. mongodb://caos:pass@mongo-host:27017/ceilometer): MongoDB connection string
  • CAOS_COLLECTOR_MONGODB_CONNECTION_TIMEOUT (e.g. 1): MongoDB connection timeout, in seconds

The following variables are used if CAOS_COLLECTOR_CEILOMETER_BACKEND is set to gnocchi:

  • CAOS_COLLECTOR_CEILOMETER_GNOCCHI_POLICY_GRANULARITY (e.g. 300): Granularity of the policy used to store data in Gnocchi, in seconds

Dashboard

  • CAOS_DASHBOARD_TSDB_HOST (e.g. localhost): TSDB host
  • CAOS_DASHBOARD_TSDB_PORT (e.g. 4444): TSDB port
  • CAOS_DASHBOARD_BASE_NAME (e.g. site): If the dashboard is exposed under a sub-URL, this must match the sub-URL; for example, if the dashboard will be exposed at http://some-host/site-name, set CAOS_DASHBOARD_BASE_NAME to site-name
  • CAOS_DASHBOARD_SITE_NAME (e.g. Site Name): Name of the site, shown on the login page

CPU/RAM allocation ratio

The CPU and RAM allocation ratios can be set in the file caos-collector.conf.yaml. For example, for cap-prod the file /root/caos/cap-prod/caos-collector.conf.yaml includes a section like:

schedulers:
  ...
 
  hypervisors:
    misfire_grace_time: 300
    minute: '*/30'
    jobs:
      - 'hypervisors_state --allocation-ratio="{cpu: {default: 4, cld-np-09.cloud.pd.infn.it: 1 }, ram: {default: 1.5, cld-np-09.cloud.pd.infn.it: 1 } }"'

The default allocation ratios (4 for CPU and 1.5 for RAM) are set under the default key of cpu and/or ram. Node-specific values can be set by adding the node name as a key, as in the example above, where the node cld-np-09.cloud.pd.infn.it has both its CPU and RAM allocation ratios set to 1.
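
As a sanity check of what a ratio implies, the effective VCPU capacity of a node is its physical core count times the CPU allocation ratio (the 32-core figure below is hypothetical):

```shell
# Effective VCPU capacity = physical cores x CPU allocation ratio
awk 'BEGIN {
  cores = 32                                    # hypothetical node size
  printf "default ratio 4:   %d VCPUs\n", cores * 4
  printf "cld-np-09 ratio 1: %d VCPUs\n", cores * 1
}'
```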

If changes are made to the configuration file, remember to restart the collector:

# cd /root/caos/cap-prod
# docker-compose restart collector

Note for Ocata: starting with Ocata, the collector reads the allocation ratios from the Nova placement API. The above configuration is therefore no longer required, but it can still be used to override the values.

Meaning of the graphs

CPU

CPU related data

  • CPU Time: CPU time consumed over the specified granularity
  • TOTAL CPUs: Total CPU time available (number of cores, counting hyperthreads, times the granularity)
  • Wall Clock Time: VCPU time consumed over the specified granularity
  • TOTAL VCPUs: Total VCPU time available (taking overcommitment into account)
  • Quota: Total VCPU time available as given by quota.

CPU Efficiency

Ratio between CPU Time and Wall Clock Time
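
For example, if the VMs consumed 1200 seconds of host CPU time while accumulating 4800 VCPU-seconds of wall clock time over the same interval, the efficiency is 0.25 (the numbers are illustrative):

```shell
# CPU efficiency = CPU Time / Wall Clock Time
awk 'BEGIN { printf "%.2f\n", 1200 / 4800 }'   # prints 0.25
```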

VCPU

  • Used VCPUs: Number of used VCPUs
  • TOTAL VCPUs: Total VCPUs available (taking overcommitment into account).
  • Quota: Total VCPUs available as given by quota.

VRAM

  • VRAM usage
  • Quota

Instances

  • Active VMs: Number of active VMs
  • Deleted VMs: Number of deleted VMs
  • Quota: Total VMs available as set by quota

Usages

Resource usage, expressed as a percentage of the corresponding quota.

  • CPU efficiency: ratio between CPU Time and Wall Clock Time
  • VCPU: ratio between Used VCPUs and VCPUs quota
  • VRAM: ratio between Used VRAM and VRAM quota
  • Instances: ratio between Active VMs and VMs quota
progetti/cloud-areapd/operations/production_cloud/caos.txt · Last modified: 2018/05/07 15:14 by chiarel1@infn.it