OKD on vSphere

This repo is a collection of Terraform and Ansible scripts that automate the OKD/OCP 4.x installation on a vSphere environment.

The main goal is to ask for only a minimal set of required information and let Terraform, Ansible and the OKD/OCP installer do the hard work.

The included scripts are meant to be run from a workstation/execution host (called the execution host from here on). Tested versions are:

  • Terraform v1.3.5 with provider hashicorp/vsphere v2.2.0
  • Ansible [core 2.13.6]

What is included

  • vSphere env (Terraform):
    • bastion host creation, from an existing template
    • definition of a set of roles needed by OKD/OCP installer to perform a smooth deploy
    • given a vSphere user (used as a service account), defined roles are automatically assigned
    • definition of Ansible inventory and templates to be used in the next step
    • returning of the bastion IP
  • Bastion host configuration:
    • creation of an ansible sudoer user, used when running the next playbook (optional)
    • install-config.yaml generation
    • OKD/OCP installer download and installation
    • OKD/OCP client download and installation
    • enabling oc bash completion
    • download and trusting of vCenter certificates
    • configuration of the installation dir as a git repo (committing the generated install-config.yaml)

What is NOT included

These scripts don’t include:

  • vSphere service account creation, it needs to be created BEFORE Terraform execution
  • A proper VM template on vSphere env running a Linux OS (a RH-like one if it’s possible)
  • Protection against plain-text passwords in Terraform state files and Ansible vars: use encrypted dirs and/or vaults on your own
  • Additional configuration and anything not mentioned in the previous section
  • Any further installation/setup/deploy regarding the installed OKD/OCP cluster.
  • Anything that can guess your desired configuration and/or env details, so carefully fill out the config and var files as indicated below before running Terraform/Ansible

In general, all requirements are the same as for a standard OKD/OCP installation.

All design, requirements validation and preparatory activities are still needed in advance, as for a “manual” installation; please follow the official OKD/OCP documentation.

Information to be gathered

vSphere env

  • vCenter URL/hostname (FQDN or valid alternate subject name)
  • Computing cluster
  • Datacenter
  • Network (to be used as machine network)
  • Administrator level account (to perform Terraform runs)
  • Service account user (used by OKD/OCP installer/MachineConfig integration)
  • Service account password
  • Default Datastore
  • VM template name
  • VM template guest OS type
  • Folder name
  • Bastion VM details:
    • name
    • vCPUs
    • assigned RAM (in MB)
    • disk size (in GB)

OKD/OCP target cluster

  • OKD/OCP version to be installed (see the OKD and OCP release lists, respectively)
  • Base domain
  • Cluster name
  • Installation dir location (inside the bastion host)
  • Compute (worker) nodes sizing:
    • Assigned cores
    • Cores per socket
    • Assigned RAM memory (in MB)
    • Disk size (in GB)
    • Cardinality (number of replicas)
  • Control plane (master) nodes sizing:
    • Assigned cores
    • Cores per socket
    • Assigned RAM memory (in MB)
    • Disk size (in GB)
    • Cardinality (number of replicas): 3 is a magic number, don’t change it
  • Cluster network CIDR (if different from default)
  • Service network CIDR (if different from default)
  • Machine network CIDR
  • VIPs:
    • api (api.<cluster_name>.<base_domain>)
    • ingress (*.apps.<cluster_name>.<base_domain>)
  • Pull secret (needed for RH Insights, etc.; otherwise leave the default fake pull secret)
  • Management public key (allows ssh into the nodes while bootstrapping - recommended)
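
The network values gathered above can be sanity-checked before they go into the var files. The sketch below, in plain shell, tests whether an IPv4 address falls inside a CIDR. It assumes that the api and ingress VIPs are expected to belong to the machine network, which is worth confirming against the official documentation for your version. ip_to_int and ip_in_cidr are hypothetical helper names that are not part of this repo, and the sketch does not handle octets written with leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  # Split the argument on "." into the positional parameters.
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) when address $1 belongs to CIDR $2.
ip_in_cidr() {
  local ip="$1" network="${2%/*}" prefix="${2#*/}" mask
  # Build the 32-bit netmask from the prefix length.
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$network") & mask )) ]
}
```

For example, with placeholder addresses from the documentation ranges, ip_in_cidr 192.0.2.10 192.0.2.0/24 succeeds, while ip_in_cidr 198.51.100.5 192.0.2.0/24 fails.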

Management pubkeys

It is strongly recommended to use one or more management public keys; they are useful when troubleshooting the installation.

Management public keys can be inserted in:

  • pubkeys list in vars/pubkeys.yml (allowed pubkeys grant access to the bastion host when running Ansible playbooks)
  • ssh_key var in vars/bastion.yaml (allowed pubkeys grant access to bootstrapping OKD/OCP nodes)

Usage walkthrough

Preface: to run, the Terraform scripts need the password of an administrative user account. To avoid storing it in configuration files, it can be passed through an environment variable populated from another source, such as a password manager.

For example, the env var can be defined, used and unset in a short bash one-liner while using pass or a similar password manager:

export TF_VAR_vsphere_password=$(pass vcenter/Administrator@vsphere.local); terraform <plan|apply|destroy>; unset TF_VAR_vsphere_password

In the following, wherever terraform <plan|apply|destroy> commands are mentioned, a similar approach is assumed.
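
As a variation on the one-liner above, the password can be scoped to a single command so that it is never exported into the interactive shell. This is only a sketch: run_with_vsphere_password and the PASSWORD_CMD override are hypothetical names, not part of this repo, and the secret lookup defaults to the same pass command used above.

```shell
# Run a command with TF_VAR_vsphere_password set only in that command's environment.
# $1 is the password-manager entry; the remaining arguments are the command to run.
run_with_vsphere_password() {
  if [ "$#" -lt 2 ]; then
    echo "usage: run_with_vsphere_password <secret-entry> <command> [args...]" >&2
    return 2
  fi
  local entry="$1" password
  shift
  # PASSWORD_CMD is a hypothetical hook; it falls back to the pass CLI.
  password="$("${PASSWORD_CMD:-pass}" "$entry")" || return 1
  # The assignment prefix applies to this one command only.
  TF_VAR_vsphere_password="$password" "$@"
}
```

A typical call would be run_with_vsphere_password vcenter/Administrator@vsphere.local terraform plan. Standard input stays attached to the terminal, so the interactive confirmation of terraform apply still works.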

NOTE: when passwords are passed to Terraform, the related credentials are stored as PLAIN TEXT in .state files after use, so be careful about who can access your Terraform project dir.
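
One basic precaution, sketched below, is to restrict the project dir and its state files to the current user. lock_down_state is a hypothetical helper name, and the sketch assumes a local state kept in files ending in .tfstate or .tfstate.backup. It complements encrypted dirs or vaults and does not replace them.

```shell
# Restrict a Terraform project dir and its local state files to the owner.
lock_down_state() {
  local dir="${1:-.}" f
  chmod 700 "$dir" || return 1
  for f in "$dir"/*.tfstate "$dir"/*.tfstate.backup; do
    # Skip patterns that matched no file.
    [ -e "$f" ] || continue
    chmod 600 "$f" || return 1
  done
}
```

Running lock_down_state from the project dir after each terraform apply keeps newly written state files private.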

As mentioned before, a dedicated user needs to be created in advance on vSphere as a service account.

The installation phase can be summarized in the following steps:

  1. Clone this repo and cd into the local copy:
git clone git@baltig.infn.it:rorru/okd-on-vsphere.git
cd okd-on-vsphere

  2. Copy all configuration example files and edit them to reflect your existing/desired env, using the information gathered earlier:

cp terraform.tfvars.example terraform.tfvars
cp vars/bastion.yaml.example vars/bastion.yaml
cp vars/pubkeys.yml.example vars/pubkeys.yml
vim terraform.tfvars
vim vars/bastion.yaml
vim vars/pubkeys.yml
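
When the steps are repeated on an existing checkout, a plain cp would overwrite files that were already edited. A minimal sketch of a safer copy follows; copy_examples is a hypothetical helper name, and it is meant to be run from the repo root.

```shell
# Copy each *.example file to its real name unless the target already exists.
copy_examples() {
  local example target
  for example in terraform.tfvars.example vars/bastion.yaml.example vars/pubkeys.yml.example; do
    # Strip the ".example" suffix to get the target file name.
    target="${example%.example}"
    if [ -e "$target" ]; then
      echo "keeping existing $target"
    else
      cp "$example" "$target" && echo "created $target"
    fi
  done
}
```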

  3. Initialize the Terraform project:

terraform init

  4. Define Terraform plan:

terraform plan

and ALWAYS REVIEW YOUR PLAN OUTPUT

  5. Apply your changes:

terraform apply

and ALWAYS REVIEW YOUR APPLY OUTPUT BEFORE CONFIRMING. Terraform then creates all needed resources; script execution stops after waiting for bastion host connectivity. At this point, all infrastructure resources are defined and Ansible will take care of configuring the bastion host.
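
One way to make sure that what gets applied is exactly what was reviewed is to save the plan to a file and apply that file. The sketch below uses the standard terraform plan -out, terraform show and terraform apply <planfile> forms. plan_for_review, apply_reviewed_plan and the okd.tfplan file name are hypothetical, and the admin password still has to be supplied as described in the preface.

```shell
# Write the plan to a file and print it in human-readable form for review.
plan_for_review() {
  local plan_file="${1:-okd.tfplan}"
  terraform plan -out="$plan_file" || return 1
  terraform show -no-color "$plan_file"
}

# Apply the previously saved plan; Terraform does not ask for confirmation here,
# so the review has to happen before this call.
apply_reviewed_plan() {
  terraform apply "${1:-okd.tfplan}"
}
```

A saved plan file may contain sensitive values, so it deserves the same care as the state files and can be deleted once applied.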

  6. Grant passwordless access to the bastion host from the execution host. To do so, run the playbook that creates and configures a sudo ansible user on the bastion host. The playbook runs as root by default, so if your VM template is configured for password access, use the -k option to be prompted for the root password:

ansible-playbook -k enable_ansible_access.yaml

If your template allows access using a sudoer user (often using a pubkey), use the form:

ansible-playbook -b -u <sudoer_user> enable_ansible_access.yaml

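Add the -K option to the latter if password access should be enforced (no pubkey is authorized on the bastion host). Note that in ansible-playbook, -k prompts for the SSH connection password, while -K prompts for the privilege escalation (become) password.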
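
Before moving on, it can help to confirm that key-based access really works. The sketch below relies on the OpenSSH BatchMode option, which disables password prompts, so the command only succeeds when a key is accepted. The helper names are hypothetical, the ansible user name is taken from the ssh command used later in this walkthrough, and passwordless sudo for that user is an assumption about what the playbook configures.

```shell
# Succeed only if key-based SSH login works without any prompt.
check_passwordless_ssh() {
  local host="$1" user="${2:-ansible}"
  ssh -o BatchMode=yes -o ConnectTimeout=10 -l "$user" "$host" true
}

# Succeed only if the remote user can also run sudo without a password (assumption).
check_passwordless_sudo() {
  local host="$1" user="${2:-ansible}"
  ssh -o BatchMode=yes -o ConnectTimeout=10 -l "$user" "$host" sudo -n true
}
```

For example: check_passwordless_ssh <bastion_ip> && echo "bastion reachable with a key".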

  7. Another playbook needs to be run to configure the bastion host:

ansible-playbook bastion_setup.yaml

  8. At this point, to launch the OKD/OCP installer, we need to access the bastion directly. If you don’t remember the IP assigned to your bastion, run terraform output to show it again. Then type:

ssh -l ansible <bastion_ip>

Escalate to super user privileges (if needed):

[ansible@bastion ~]$ sudo -i
[root@bastion ~]#

and cd to the install dir location specified in vars/bastion.yaml. For example, given a var file with content like:

platform:
  version: 4.10.0-0.okd-2022-07-09-073606
  ...
install_location:
  home_path: "/root"
  install_dir: "OKD-{{ platform.version }}"
  ...

you need to cd into the /root/OKD-4.10.0-0.okd-2022-07-09-073606 dir.

  9. Review the install-config.yaml file content and launch the OKD/OCP installer (use tmux or screen to avoid terminal disconnection):

openshift-install create cluster --dir <installation_dir>
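
As a sketch of the tmux approach, the installer can be started in a detached session that survives an SSH disconnection. start_install_in_tmux and the okd-install session name are hypothetical, and the simple quoting assumes an installation dir path without single quotes.

```shell
# Start the installer in a detached tmux session.
# $1 is the installation dir, $2 an optional session name.
start_install_in_tmux() {
  local install_dir="$1" session="${2:-okd-install}"
  tmux new-session -d -s "$session" \
    "openshift-install create cluster --dir '$install_dir'"
}
```

After start_install_in_tmux <installation_dir>, reattach at any time with tmux attach -t okd-install. The session closes when the installer exits, so the final output is best recovered from the .openshift_install.log file that the installer writes in the installation dir.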

A complete installation takes approximately 40 minutes using the default sizing parameters for master and worker nodes. Before it exits, the installer shows the administrative credentials for your new cluster. If you lose or forget them, open <installation_dir>/auth/kubeadmin-password for the kubeadmin user password (usable on the graphical console); if you prefer to use the oc client, a KUBECONFIG env var can be defined:

export KUBECONFIG=<installation_dir>/auth/kubeconfig

Alternatively, copy <installation_dir>/auth/kubeconfig to ~/.kube/config.
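
A minimal sketch of that copy, which keeps a backup of any existing config and restricts the file to the current user, could look like this. install_kubeconfig and the .bak suffix are hypothetical choices.

```shell
# Install the cluster kubeconfig as the default one for the current user.
# $1 is the installation dir.
install_kubeconfig() {
  local src="$1/auth/kubeconfig" dest="$HOME/.kube/config"
  [ -f "$src" ] || { echo "no kubeconfig found at $src" >&2; return 1; }
  mkdir -p "$HOME/.kube" || return 1
  # Keep a copy of any previous config instead of silently overwriting it.
  if [ -e "$dest" ]; then
    cp "$dest" "$dest.bak" || return 1
  fi
  cp "$src" "$dest" && chmod 600 "$dest"
}
```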

To obtain the OKD/OCP web console URL, type:

oc whoami --show-console

Destroying the cluster

To destroy the cluster, follow these steps:

  1. Access your bastion. If you don’t remember the IP assigned to your bastion, run terraform output to show it again. Then type:
ssh -l ansible <bastion_ip>

Escalate to super user privileges (if needed):

[ansible@bastion ~]$ sudo -i
[root@bastion ~]#

and cd into your install directory.

  2. Run:

openshift-install destroy cluster --dir <installation_dir>

All OKD/OCP node VMs will be deleted. Now the bastion itself needs to be destroyed.
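
The destroy command reads the cluster metadata that the installer saved in the installation dir, so pointing it at the wrong dir fails. A small guard, sketched below, checks for metadata.json first. destroy_cluster_checked and the OPENSHIFT_INSTALL override are hypothetical names.

```shell
# Run the destroy only when the dir looks like a real installation dir.
destroy_cluster_checked() {
  local install_dir="$1"
  if [ ! -f "$install_dir/metadata.json" ]; then
    echo "no metadata.json in $install_dir, refusing to run destroy" >&2
    return 1
  fi
  # OPENSHIFT_INSTALL is a hypothetical hook to point at a specific installer binary.
  "${OPENSHIFT_INSTALL:-openshift-install}" destroy cluster --dir "$install_dir"
}
```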

  3. Exit the bastion shell and, from the Terraform project directory on the execution host, run:

terraform destroy

and ALWAYS REVIEW YOUR DESTROY OUTPUT BEFORE CONFIRMING. The bastion host will be destroyed and all permissions assigned to the given service account user will be revoked.

Main caveat

Some permissions automatically assigned to the given service account user apply at the root of vSphere computing resources, at “vCenter” level.

When updating an existing plan using terraform <plan|apply|destroy>, Terraform makes an in-place change, often resulting in the removal of other “vCenter” level permissions. Because of this, again, ALWAYS REVIEW YOUR TERRAFORM OUTPUT BEFORE CONFIRMING. If some “vCenter” level permissions are about to be destroyed:

  1. Stop the current Terraform execution (CTRL-c)
  2. Remove the “vCenter” level permissions and the related role from the Terraform state:
terraform state rm vsphere_entity_permissions.vcenter-permissions vsphere_role.okd-sa-vcenter-role

  3. Then manually remove the permission assigned to the service account user on vSphere
  4. Re-run the terraform <plan|apply|destroy> command just interrupted
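
To make the review less error-prone, the plan text can be filtered for resources that are about to be removed. The sketch below assumes the human-readable plan output marks such resources with "will be destroyed" or "must be replaced", as recent Terraform versions do. list_destructive_changes is a hypothetical helper name.

```shell
# Read plan text on stdin and print only the resources about to be removed.
# Exit status is 0 when at least one such line is found, 1 otherwise.
list_destructive_changes() {
  grep -E '# .* (will be destroyed|must be replaced)'
}
```

For example, terraform plan -no-color | list_destructive_changes prints nothing when no resource is being destroyed or replaced. If vsphere_entity_permissions.vcenter-permissions or vsphere_role.okd-sa-vcenter-role show up in its output, follow the steps above before applying.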

strutture/lnf/dr/calcolo/sistemi/okd/okd_on_vsphere.txt · Last modified: 2022/12/13 00:22 by rorru@infn.it
