Deploying an RKE2 Cluster with CAPI, Rancher Turtles, and Proxmox

If you’ve recently set up Rancher and want to deploy your first Kubernetes cluster on Proxmox, this guide will walk you through the process. By combining Cluster API (CAPI), Rancher Turtles, and Proxmox, you can build and manage clusters directly from Rancher. CAPI enables declarative management of Kubernetes clusters, Rancher Turtles brings those CAPI-provisioned clusters under Rancher’s management, and Proxmox serves as your infrastructure provider.

Understanding the Components

  1. Cluster API (CAPI): A Kubernetes sub-project that provides declarative APIs for cluster creation, configuration, and management.
  2. Rancher Turtles: A Rancher extension that integrates CAPI with Rancher, so CAPI-provisioned clusters (such as RKE2 clusters) can be imported into and managed from Rancher.
  3. Proxmox: An open-source virtualization platform that you will use as your infrastructure provider.
  4. RKE2: A Kubernetes distribution focused on security and compliance.

Prerequisites

Before you begin, ensure you have the following:

  • A Proxmox VE template: A VM template (typically a cloud-init enabled image) that will be cloned for your cluster nodes.
  • A Rancher instance: Rancher should be installed and accessible.

Auto-Importing Your Cluster into Rancher Manager

To simplify cluster management, configure Rancher Manager to automatically import your new CAPI cluster. You can:

Import all CAPI clusters in a namespace:
Label the namespace containing your clusters:

kubectl label namespace default cluster-api.cattle.io/rancher-auto-import=true

Import a specific cluster:
Label an individual cluster:

kubectl label cluster.cluster.x-k8s.io my-cluster cluster-api.cattle.io/rancher-auto-import=true

By applying one of these labels, your newly created CAPI cluster will be automatically imported into Rancher Manager, allowing you to manage it alongside your other Kubernetes clusters.
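If you prefer a declarative setup, the namespace label can also be part of a manifest instead of a kubectl command (a sketch; the namespace name default matches the command above):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: default
  labels:
    # Rancher Turtles watches for this label and auto-imports CAPI clusters
    cluster-api.cattle.io/rancher-auto-import: "true"
```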

Installation

Setting Up Proxmox User and API Token

Use a dedicated Proxmox VE user with an API token. You can create this through the UI or by running the following commands on the Proxmox VE node:

pveum user add capmox@pve
pveum aclmod / -user capmox@pve -role PVEAdmin
pveum user token add capmox@pve capi -privsep 0

These commands create a user named capmox@pve, grant it the PVEAdmin role, and create an API token named capi for this user with privilege separation disabled (-privsep 0), so the token inherits the user’s permissions.

Install Rancher Turtles

helm repo add rancher-turtles https://rancher-sandbox.github.io/rancher-turtles/
helm repo update
helm install rancher-turtles rancher-turtles/rancher-turtles -n cattle-system --create-namespace

Install IPAM Components

Deploy the IP Address Management (IPAM) components. The Cluster API in-cluster IPAM provider allows clusters to manage pools of IP addresses using Kubernetes-native resources:

kubectl apply -f https://github.com/kubernetes-sigs/cluster-api-ipam-provider-in-cluster/releases/download/v1.0.0/ipam-components.yaml
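The in-cluster provider manages pools through InClusterIPPool resources. You won’t normally create one by hand here (the Proxmox provider derives a pool from the ProxmoxCluster’s ipv4Config), but as an illustration of what these components manage, a pool matching the network used later in this guide would look roughly like this:

```yaml
apiVersion: ipam.cluster.x-k8s.io/v1alpha2
kind: InClusterIPPool
metadata:
  name: my-cluster-ipv4-pool   # hypothetical name for illustration
  namespace: default
spec:
  addresses:
    - 192.168.12.220-192.168.12.222   # range handed out to cluster nodes
  prefix: 22
  gateway: 192.168.12.1
```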

Deploy Proxmox Infrastructure Components

Install the Proxmox-specific infrastructure components. The components contain:

  • Custom Resource Definitions (CRDs): New Kubernetes resource types for Proxmox-specific infrastructure (like ProxmoxCluster, ProxmoxMachine, etc.).
  • Controller Deployment: A controller (operator) that watches for Cluster API resources and provisions/manages Proxmox VMs accordingly.
  • RBAC roles and bindings: Permissions so the controller can manage resources securely.

Download the Proxmox infrastructure components YAML and update the following values before applying:

curl -LO https://github.com/ionos-cloud/cluster-api-provider-proxmox/releases/latest/download/infrastructure-components.yaml

Edit the file to set your Proxmox credentials:

capmox-manager-credentials:
  secret: ${PROXMOX_SECRET=""}
  token: ${PROXMOX_TOKEN=""}
  url: ${PROXMOX_URL=""}

Then apply the YAML:

kubectl apply -f infrastructure-components.yaml

Cluster Configuration

Now, let's look at the key parts of the cluster configuration:

1. Define the Cluster

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
  namespace: default
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
    kind: ProxmoxCluster
    name: my-cluster
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: RKE2ControlPlane
    name: my-cluster-control-plane
  clusterNetwork:
    pods:
      cidrBlocks: [192.168.0.0/22]

2. Configure the Proxmox Cluster

Modify the allowed Proxmox nodes, control plane endpoint, and IP configuration to match your Proxmox environment and network setup. The controlPlaneEndpoint should point to a load balancer that forwards traffic to the control-plane nodes. If you don’t have a load balancer, consider deploying kube-vip via the RKE2ControlPlane manifest.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxCluster
metadata:
  name: my-cluster
  namespace: default
spec:
  allowedNodes:
    - proxmox1
    - proxmox2
    - proxmox3
  controlPlaneEndpoint:
    host: 192.168.12.219
    port: 6443
  dnsServers:
      - 192.168.12.22
      - 1.1.1.1
  ipv4Config:
    addresses:
      - 192.168.12.220-192.168.12.222
    prefix: 22
    gateway: 192.168.12.1

3. Define the Machine Template

This template defines how new machines should be created in Proxmox. Adjust the source node, template ID, storage, resource allocations, and network settings based on your available resources and requirements.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxMachineTemplate
metadata:
  name: my-cluster-control
  namespace: default
spec:
  template:
    spec:
      sourceNode: "proxmox1"
      templateID: 100
      full: true
      storage: ceph
      numCores: 2
      memoryMiB: 8192
      disks:
        bootVolume:
          sizeGb: 100
          disk: scsi0
      network:
        default:
          vlan: 200
          bridge: vmbr0
          model: virtio

4. Set Up the RKE2 Control Plane

Update the RKE2 version, adjust the number of replicas for high availability, and modify or add any custom configurations as needed. The manifest also includes a template for kube-vip; if you plan to deploy kube-vip, update the content section of /var/lib/rancher/rke2/server/manifests/kubevip.yaml accordingly.

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: RKE2ControlPlane
metadata:
  name: my-cluster-control-plane
  namespace: default
spec:
  version: v1.31.4+rke2r1
  replicas: 1
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
    kind: ProxmoxMachineTemplate
    name:  my-cluster-control
  rolloutStrategy:
    type: RollingUpdate
  files:
    - path: "/var/lib/rancher/rke2/server/manifests/coredns-config.yaml"
      owner: "root:root"
      permissions: "0640"
      content: |
        apiVersion: helm.cattle.io/v1
        kind: HelmChartConfig
        metadata:
          name: rke2-coredns
          namespace: kube-system
        spec:
          valuesContent: |-
            tolerations:
            - key: "node.cloudprovider.kubernetes.io/uninitialized"
              value: "true"
              effect: "NoSchedule"
    - path: "/var/lib/rancher/rke2/server/manifests/kubevip.yaml"
      owner: "root:root"
      permissions: "0640"
      content: |
        # KubeVIP configuration...
  preRKE2Commands:
    - apt-get update && apt-get install -y qemu-guest-agent
    - systemctl enable qemu-guest-agent --now
  serverConfig:
    cni: cilium
  agentConfig:
    kubelet:
      extraArgs:
        - --cloud-provider=external
        - --provider-id=proxmox://{{ ds.meta_data.instance_id }}
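The kube-vip content elided in the manifest above could be filled with a DaemonSet along these lines. This is only an illustrative sketch: the image tag, the interface name eth0, and the VIP (taken from controlPlaneEndpoint.host above) are assumptions, and the ServiceAccount/RBAC objects kube-vip may need are omitted for brevity; consult the kube-vip documentation for the current recommended manifest:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-vip-ds
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube-vip-ds
  template:
    metadata:
      labels:
        name: kube-vip-ds
    spec:
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/control-plane: "true"
      tolerations:
        - effect: NoSchedule
          operator: Exists
        - effect: NoExecute
          operator: Exists
      containers:
        - name: kube-vip
          image: ghcr.io/kube-vip/kube-vip:v0.8.0   # assumption: pin to a current release
          args: ["manager"]
          env:
            - name: address          # the VIP; matches controlPlaneEndpoint.host above
              value: "192.168.12.219"
            - name: port
              value: "6443"
            - name: vip_interface    # assumption: NIC name inside the template image
              value: "eth0"
            - name: vip_arp          # advertise the VIP via ARP
              value: "true"
            - name: cp_enable        # enable control-plane VIP mode
              value: "true"
          securityContext:
            capabilities:
              add: ["NET_ADMIN", "NET_RAW"]
```

Because RKE2 auto-applies anything placed in /var/lib/rancher/rke2/server/manifests, embedding this manifest in the files section above is enough to deploy it on the first control-plane node.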

Deploying Your Cluster

To deploy your cluster, save the entire YAML configuration to a file (e.g., my-cluster.yaml) and apply it using kubectl. This will initiate the cluster creation process. CAPI and Rancher Turtles will work together to provision the necessary resources in Proxmox and set up your RKE2 cluster.

kubectl apply -f my-cluster.yaml

[Screenshot: the CAPI cluster imported into Rancher Manager]

Troubleshooting

If you encounter issues during deployment or management, check the logs of these system pods:

  • rancher-turtles-system: Manages Rancher Turtles integration.
  • rke2-bootstrap-system: Handles bootstrapping for RKE2 nodes.
  • rke2-control-plane-system: Manages RKE2 control plane operations.
  • capi-system: Runs core Cluster API controllers.
  • capmox-system: Contains Proxmox-specific CAPI provider controllers.
  • capi-ipam-in-cluster-system: Manages in-cluster IP address allocation.