//Page last modified 2023/11/28 13:26 by verlato@infn.it (previous revision: 2019/02/04 14:18).//
====== Rocky-CentOS7 Testbed ======
Fully integrated Resource Provider [[https://
=== EGI Monitoring/ ===
  * [[https://
  * [[https://
  * [[http://
  * [[https://
  * [[https://
  * [[https://
=== Local Monitoring/ ===
  * [[http://
  * [[http://
  * [[http://
  * [[http://
  * [[http://
=== Local dashboard ===
  * [[https://
| + | |||
| + | ===== Layout ===== | ||
| + | |||
| + | * Controller + Network node: **egi-cloud.pd.infn.it** | ||
| + | |||
| + | * Compute nodes: **cloud-01: | ||
| + | | ||
| + | * Storage node (images and block storage): **cld-stg-01.pd.infn.it** | ||
| + | |||
| + | * OneData provider: **one-data-01.pd.infn.it** | ||
| + | |||
| + | * Cloudkeeper, | ||
| + | |||
| + | * Cloud site-BDII: **egi-cloud-sbdii.pd.infn.it** (VM on cert-03 server) | ||
| + | |||
| + | * Accounting SSM sender: **cert-37.pd.infn.it** (VM on cert-03 server) | ||
| + | |||
| + | * Network layout available [[http:// | ||
| + | |||
| + | |||
===== OpenStack configuration =====

Controller/

We created one project for each supported EGI FedCloud VO, a router, and various nets and subnets, obtaining the following network topology:

{{:

We mount the partitions for the glance and cinder services (the cinder one is not in the fstab file) from 192.168.61.100 with the NFS driver:
<code bash>
yum install -y nfs-utils
mkdir -p /
cat<<
192.168.61.100:/
EOF
mount -a
</code>
We use some specific configurations for the cinder services, following this documentation: [[http://

===== EGI FedCloud specific configuration =====

(see [[https://

Install the CA certificates and the software for fetching the CRLs on both the Controller (egi-cloud) and the Compute nodes (cloud-01:
<code bash>
systemctl stop httpd
curl -L http://
yum install -y ca-policy-egi-core fetch-crl
systemctl enable fetch-crl-cron.service
systemctl start fetch-crl-cron.service
cd /
ln -s /
update-ca-trust extract
</code>
On the **egi-cloud-ha** node, also install the CMD-OS repo:
<code bash>
yum -y install http://
</code>
==== Install AAI integration and VOMS support components ====
Taken from [[https://

To be executed on the **egi-cloud.pd.infn.it** node:
<code bash>
vo=(ops dteam fedcloud.egi.eu enmr.eu)
volast=enmr.eu
EGIHOST=egi-cloud.pd.infn.it
KYPORT=443
HZPORT=8443
yum install -y gridsite mod_auth_openidc
sed -i "
sed -i "
sed -i "

openstack-config --set /
openstack-config --set /
openstack-config --set /
/websso/
curl -L https://
callback_template.html
systemctl restart httpd.service
source admin-openrc.sh
openstack identity provider create --remote-id https://
echo [ > mapping.egi.json
echo [ > mapping.voms.json
for i in ${vo[@]}
do



cat <<
{
  "
  {
    "
    "
    "
  },
  "
  "
  }
}
],
"
{
  "
},
{
  "
  "
  "
]
},
{
  "
  "
  "
  "
]
}
]
EOF
[ $i = $volast ] || ( echo "
[ $i = $volast ] && ( echo "
[ $i = $volast ] && ( echo "
cat <<
{
  "
  {
    "
    "
    "
  },
  "
  "
  }
}
],
"
{
  "
},
{
  "
  "
  "
],
"
}
]
EOF
[ $i = $volast ] || ( echo "
[ $i = $volast ] && ( echo "
[ $i = $volast ] && ( echo "
done
openstack mapping create --rules mapping.egi.json egi-mapping
openstack federation protocol create --identity-provider egi.eu --mapping egi-mapping openid
openstack mapping create --rules mapping.voms.json voms
openstack
| + | |||
| + | mkdir -p / | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | mkdir -p / | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | mkdir -p / | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | mkdir -p / | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | cat > / | ||
| + | / | ||
| + | / | ||
| + | EOF | ||
| + | # | ||
| + | cat << | ||
| + | Listen $KYPORT | ||
| + | |||
| + | < | ||
| + | OIDCSSLValidateServer Off | ||
| + | OIDCProviderTokenEndpointAuth client_secret_basic | ||
| + | OIDCResponseType " | ||
| + | OIDCClaimPrefix " | ||
| + | OIDCClaimDelimiter ; | ||
| + | OIDCScope " | ||
| + | OIDCProviderMetadataURL https:// | ||
| + | OIDCClientID <your OIDC client token> | ||
| + | OIDCClientSecret <yout OIDC client secret> | ||
| + | OIDCCryptoPassphrase somePASSPHRASE | ||
| + | OIDCRedirectURI https:// | ||
| + | |||
| + | # OAuth for CLI access | ||
| + | OIDCOAuthIntrospectionEndpoint | ||
| + | OIDCOAuthClientID <yout OIDC client token> | ||
| + | OIDCOAuthClientSecret <yout OIDC client secret> | ||
| + | # OIDCOAuthRemoteUserClaim | ||
| + | |||
| + | # Increase Shm cache size for supporting long entitlements | ||
| + | OIDCCacheShmEntrySizeMax 33297 | ||
| + | |||
| + | # Use the IGTF trust anchors for CAs and CRLs | ||
| + | SSLCACertificatePath / | ||
| + | SSLCARevocationPath / | ||
| + | SSLCACertificateFile $CA_CERT | ||
| + | SSLEngine | ||
| + | SSLCertificateFile | ||
| + | SSLCertificateKeyFile | ||
| + | # Verify clients if they send their certificate | ||
| + | SSLVerifyClient | ||
| + | SSLVerifyDepth | ||
| + | SSLOptions | ||
| + | SSLProtocol | ||
| + | SSLCipherSuite | ||
| + | WSGIDaemonProcess keystone-public processes=5 threads=1 user=keystone group=keystone display-name=%{GROUP} | ||
| + | WSGIProcessGroup keystone-public | ||
| + | WSGIScriptAlias / / | ||
| + | WSGIApplicationGroup %{GLOBAL} | ||
| + | WSGIPassAuthorization On | ||
| + | LimitRequestBody 114688 | ||
| + | < | ||
| + | ErrorLogFormat " | ||
| + | </ | ||
| + | ErrorLog / | ||
| + | CustomLog / | ||
| + | < | ||
| + | < | ||
| + | Require all granted | ||
| + | </ | ||
| + | < | ||
| + | Order allow,deny | ||
| + | Allow from all | ||
| + | </ | ||
| + | </ | ||
| + | < | ||
| + | # populate ENV variables | ||
| + | GridSiteEnvs on | ||
| + | # turn off directory listings | ||
| + | GridSiteIndexes off | ||
| + | # accept GSI proxies from clients | ||
| + | GridSiteGSIProxyLimit 4 | ||
| + | # disable GridSite method extensions | ||
| + | GridSiteMethods "" | ||
| + | |||
| + | Require all granted | ||
| + | Options -MultiViews | ||
| + | </ | ||
| + | < | ||
| + | AuthType | ||
| + | Require | ||
| + | # | ||
| + | LogLevel | ||
| + | </ | ||
| + | |||
| + | < | ||
| + | Authtype oauth20 | ||
| + | Require | ||
| + | # | ||
| + | LogLevel | ||
| + | </ | ||
| + | </ | ||
| + | Alias /identity / | ||
| + | < | ||
| + | SetHandler wsgi-script | ||
| + | Options +ExecCGI | ||
| + | |||
| + | WSGIProcessGroup keystone-public | ||
| + | WSGIApplicationGroup %{GLOBAL} | ||
| + | WSGIPassAuthorization On | ||
| + | </ | ||
| + | EOF | ||
| + | sed -i " | ||
| + | source admin-openrc.sh | ||
| + | for i in public internal admin | ||
| + | do | ||
| + | | ||
| + | | ||
| + | done | ||
| + | systemctl restart httpd.service | ||
| + | | ||
| + | </ | ||
OpenStack Dashboard (Horizon) configuration:
  * Edit /
<code bash>
OPENSTACK_KEYSTONE_URL = "
WEBSSO_ENABLED = True
WEBSSO_INITIAL_CHOICE = "

WEBSSO_CHOICES = (
    ("
    (""
)
</code>
To change the dashboard logo, copy the right svg file into /

For publicly exposing some OpenStack services on https, do not forget to create the files /

==== Install FedCloud BDII ====

(See [[https://
Install the resource BDII and the cloud-info-provider on **egi-cloud-ha** (with the CMD-OS repo already installed):
<code bash>
yum -y install bdii cloud-info-provider cloud-info-provider-openstack
</code>
Customize the configuration file /

Customize the file /
<code bash>
export OS_AUTH_URL=https://
export OS_PROJECT_DOMAIN_ID=default
export OS_REGION_NAME=RegionOne
export OS_USER_DOMAIN_ID=default
export OS_PROJECT_NAME=admin
export OS_IDENTITY_API_VERSION=3
export OS_USERNAME=accounting
export OS_PASSWORD=<
export OS_AUTH_TYPE=password
export OS_CACERT=/
</code>
Create the file /
<code bash>
cat<<
#!/bin/sh

. /

for P in $(openstack project list -c Name -f value); do
  cloud-info-provider-service --yaml / \
    --os-tenant-name $P \
    --middleware openstack
done
EOF
</code>
Run the cloud-info-provider script manually and check that the output returns the complete LDIF. To do so, execute:
<code bash>
chmod +x /
/
/
</code>
Now you can start the bdii service:
<code bash>
systemctl start bdii
</code>
Use the command below to see whether the information is being published:
<code bash>
ldapsearch -x -h localhost -p 2170 -b o=glue
</code>
Do not forget to open port 2170:
<code bash>
firewall-cmd --add-port=2170/
firewall-cmd --permanent --add-port=2170/
systemctl restart firewalld
</code>
Information on how to set up the site-BDII on **egi-cloud-sbdii.pd.infn.it** is available [[https://

Add your cloud-info-provider to your site-BDII **egi-cloud-sbdii.pd.infn.it** by adding new lines in the site.def like this:
<code bash>
BDII_REGIONS="
BDII_CLOUD_URL="
BDII_BDII_URL="
</code>
| + | |||
| + | ==== Use the same APEL/SSM of grid site ==== | ||
| + | |||
| + | Cloud usage records are sent to APEL through the ssmsend program installed in **cert-37.pd.infn.it**: | ||
| + | <code bash> | ||
| + | [root@cert-37 ~]# cat / | ||
| + | # send buffered usage records to APEL | ||
| + | 30 */24 * * * root / | ||
| + | </ | ||
| + | It is therefore neede to install and configure NFS on **egi-cloud-ha**: | ||
| + | <code bash> | ||
| + | [root@egi-cloud-ha ~]# yum -y install nfs-utils | ||
| + | [root@egi-cloud-ha ~]# mkdir -p / | ||
| + | [root@egi-cloud-ha ~]# cat<< | ||
| + | / | ||
| + | EOF | ||
| + | [root@egi-cloud-ha ~]# systemctl start nfs-server | ||
| + | </ | ||
| + | In case of APEL nagios probe failure, check if / | ||
| + | |||
| + | To check if accounting records are properly received by APEL server look at [[http:// | ||
| + | |||
==== Install the accounting system (cASO) ====

(see [[https://

On **egi-cloud** create the accounting user and role, and set the proper policies:
<code bash>
openstack user create --domain default --password <
openstack role create accounting
for i in VO:
cat<<
"
"
EOF
</code>
Install cASO on **egi-cloud-ha** (with the CMD-OS repo already installed):
<code bash>
yum -y install caso
</code>
Edit the /
<code bash>
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
</code>
Create the directories:
<code bash>
mkdir /
</code>
Test it:
<code bash>
caso-extract -v -d
</code>
Create the cron job:
<code bash>
cat <<
# extract and send usage records to APEL/
10 * * * * root /
EOF
</code>

==== Install Cloudkeeper and Cloudkeeper-OS ====

On **egi-cloud.pd.infn.it** create a cloudkeeper user in keystone:
<code bash>
openstack user create --domain default --password CLOUDKEEPER_PASS cloudkeeper
</code>
and, for each project, add the cloudkeeper user with the user role:
<code bash>
for i in VO:ops VO:
</code>
Install Cloudkeeper and Cloudkeeper-OS on **egi-cloud-ha** (with the CMD-OS repo already installed):
<code bash>
yum -y install cloudkeeper cloudkeeper-os
</code>
Edit /
<code bash>
- https://
- https://
- https://
</code>
Edit the /
<code bash>
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
openstack-config --set /
</code>
Create the /
<code bash>
cat<<
{
  "
    "
  },
  "
    "
  },
  "
    "
  }
}
EOF
</code>
Enable and start the services:
<code bash>
systemctl enable cloudkeeper-os
systemctl start cloudkeeper-os
systemctl enable cloudkeeper.timer
systemctl start cloudkeeper.timer
</code>

==== Installing Squid for CVMFS (optional) ====

Install and configure squid on cloud-01 and cloud-02 for use from the VMs (see https://
<code bash>
yum install -y squid
sed -i "
cat<<
minimum_expiry_time 0

max_filedesc 8192
maximum_object_size 1024 MB

cache_mem 128 MB
maximum_object_size_in_memory 128 KB
# 50 GB disk cache
cache_dir ufs /
acl cvmfs dst cvmfs-stratum-one.cern.ch
acl cvmfs dst cernvmfs.gridpp.rl.ac.uk
acl cvmfs dst cvmfs.racf.bnl.gov
acl cvmfs dst cvmfs02.grid.sinica.edu.tw
acl cvmfs dst cvmfs.fnal.gov
acl cvmfs dst cvmfs-atlas-nightlies.cern.ch
acl cvmfs dst cvmfs-egi.gridpp.rl.ac.uk
acl cvmfs dst klei.nikhef.nl
acl cvmfs dst cvmfsrepo.lcg.triumf.ca
acl cvmfs dst cvmfsrep.grid.sinica.edu.tw
acl cvmfs dst cvmfs-s1bnl.opensciencegrid.org
acl cvmfs dst cvmfs-s1fnal.opensciencegrid.org
http_access allow cvmfs
EOF
rm -rf /
mkdir -p /
chown -R squid.squid /
squid -k parse
squid -z
ulimit -n 8192
systemctl start squid
firewall-cmd --permanent --add-port 3128/tcp
systemctl restart firewalld
</code>
Use CVMFS_HTTP_PROXY="

Actually, it is better to use the already existing squids:
CVMFS_HTTP_PROXY="

==== Local Accounting ====
A local accounting system based on Grafana, InfluxDB and Collectd has been set up following the instructions [[https://

==== Local Monitoring ====
=== Ganglia ===
  * Install ganglia-gmond on all servers
  * Configure the cluster and host fields in **/
  * Finally: systemctl enable gmond.service;
=== Nagios ===
  * Install on the compute nodes nsca-client,
  * Copy the file **cld-nagios:/
  * Then do on all compute nodes:
<code bash>
$ echo encryption_method=1 >> /
$ usermod -a -G libvirt nagios
$ sed -i '
# then be sure the files below are in /
$ ls /
check_kvm
$ cat <<EOF > crontab.txt
# Puppet Name: nagios_check_kvm
0 */1 * * * /
EOF
$ crontab crontab.txt
$ crontab -l
</code>
  * On the controller node, add in /
<code bash>
"
</code>
and in /
<code bash>
"
</code>
  * Create in the VO:dteam project a cirros VM with the tiny flavour, named nagios-probe, with an access key named dteam-key (saving the private key file dteam-key.pem in the egi-cloud /root directory), and take note of its ID and private IP. Then, on the cld-nagios server, put its ID in the file **/
  * On the cld-nagios server, check/

==== Security incidents and IP traceability ====
See [[https://
On egi-cloud, install the [[https://
<code bash>
[root@egi-cloud ~]# os-ip-trace 90.147.77.229
+--------------------------------------+-----------+---------------------+---------------------+
| device id                            | user name |
+--------------------------------------+-----------+---------------------+---------------------+
| 3002b1f1-bca3-4e4f-b21e-8de12c0b926e |
+--------------------------------------+-----------+---------------------+---------------------+
</code>
Save and archive the important log files:
  * On egi-cloud and each compute node cloud-0%, add the line "*.* @@192.168.60.31:
  * On cld-foreman,
Install ulogd on the controller node:
<code bash>
yum install -y libnetfilter_log
yum localinstall -y http://
yum localinstall -y http://
</code>
and configure /
Start the service:
<code bash>
systemctl enable ulogd
systemctl start ulogd
</code>
Finally, be sure that /

==== Troubleshooting ====

  * Passwordless ssh access to egi-cloud from cld-nagios, and from egi-cloud to cloud-0*, has already been configured
  * If cld-nagios does not ping egi-cloud, be sure that the rule "route add -net 192.168.60.0 netmask 255.255.255.0 gw 192.168.114.1"
  * In case of Nagios alarms, try to restart all the cloud services by doing the following:
<code bash>
$ ssh root@egi-cloud
[root@egi-cloud ~]# ./
[root@egi-cloud ~]# for i in $(seq 1 7); do ssh cloud-0$i ./
</code>
  * Resubmit the Nagios probe and check whether it works again
  * In case the problem persists, check the consistency of the DB by executing the script below (this also fixes the issue where the quota overview in the dashboard is not consistent with the VMs actually active):
<code bash>
[root@egi-cloud ~]# python nova-quota-sync.py
</code>
  * In case of an EGI Nagios alarm, check that the user running the Nagios probes does not also belong to tenants other than "

  * In case of a reboot of the egi-cloud server:
    * check its network configuration (use IPMI if it is not reachable): all 4 interfaces must be up and the default gateway must be 90.147.77.254.
    * check the DNS in /
    * check the routing with $ route -n; if needed, do: $ ip route replace default via 90.147.77.254. Also be sure to have a route for the 90.147.77.0 network.
    * check whether the storage mountpoints 192.168.61.100:/
    * check whether port 8472 is open on the local firewall (it is used by linuxbridge vxlan networks)
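The routing check in the list above can be scripted for reuse. A minimal sketch that parses saved ''ip route'' output — ''has_route'' and the demo table are illustrative; on the real node, feed it ''ip route show'' directly:

```shell
# has_route DEST GW -- true if the routing table on stdin sends DEST via GW
has_route() {
  grep -q "^$1 via $2"
}

# Demo: a stand-in routing table of the shape `ip route show` prints
cat > /tmp/routes.demo <<'EOF'
default via 90.147.77.254 dev em1
90.147.77.0/24 dev em1 proto kernel scope link
EOF

has_route default 90.147.77.254 < /tmp/routes.demo && echo "default gateway OK"
```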
| + | |||
| + | * in case of reboot of cloud-0* server (use IPMI if not reachable): all 3 interfaces must be up and the default destination must have 192.168.114.1 as gateway | ||
| + | * check its network configuration | ||
| + | * check if all partitions in /etc/fstab are properly mounted (do: $ df -h) | ||
| + | |||
| + | * In case of network instabilities, | ||
| + | <code bash> | ||
| + | [root@egi-cloud ~]# / | ||
| + | generic-receive-offload: | ||
| + | </ | ||
| + | |||
| + | * Also check if / | ||
| + | <code bash> | ||
| + | [root@egi-cloud ~]# cat / | ||
| + | #!/bin/bash | ||
| + | case " | ||
| + | em1) | ||
| + | / | ||
| + | ;; | ||
| + | em2) | ||
| + | / | ||
| + | ;; | ||
| + | em3) | ||
| + | / | ||
| + | ;; | ||
| + | em4) | ||
| + | / | ||
| + | ;; | ||
| + | esac | ||
| + | exit 0 | ||
| + | </ | ||
| + | |||
| + | * If you need to change the project quotas, check " | ||
| + | <code bash> | ||
| + | [root@egi-cloud ~]# source admin-openrc.sh | ||
| + | [root@egi-cloud ~]# openstack quota set --cores 184 VO:enmr.eu | ||
| + | </ | ||
