Tigera sponsored this post.
Amazon’s custom-built Graviton processor allows users to create Arm instances in the AWS public cloud, and Rancher K3s is an excellent way to run Kubernetes on them. As a lightweight, single-binary implementation of Kubernetes that supports Arm, K3s simplifies cluster initialization down to executing a single command.
In an earlier article, I discussed how the Arm architecture is becoming a rival to x86 in cloud computing, and the steps you can take to prepare for this shift. Continuing that narrative, in this article I’ll look at an example of the Calico eBPF data plane running on AWS, using Terraform to bootstrap the install to AWS and Rancher K3s to deploy the cluster.
A few of Calico’s components needed changes for Arm compatibility, including updates to enable eBPF and builds targeting the ARM64 environment:
- Tigera Operator – The recommended way to install Calico.
- go-build – A container environment packed with all the utilities that Calico requires in its compilation process.
- calico-node – The pod that hosts Felix, the per-node agent that carries control plane decisions into your cluster.
Let’s discuss how to run an Arm-powered Kubernetes cluster equipped with Calico for security and eBPF as the data plane.
What is eBPF?
eBPF is a virtual machine embedded within the Linux kernel. It allows small programs to be loaded into the kernel and attached to hooks, which are triggered when some event occurs. This allows the behavior of the kernel to be customized.
Calico’s eBPF data plane makes use of BPF functionality to allow source IP preservation, direct server return (DSR) and even better performance. A full explanation of eBPF and its capabilities is out of the scope of this article, but if you are interested in learning more you can check out this post, which talks about eBPF capabilities and why you should use it.
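To make this more concrete, here is a quick way to peek at those kernel-attached programs once an eBPF data plane is running. This is purely exploratory and assumes the bpftool utility is installed on the node (it usually ships in the distribution’s linux-tools packages):

# List the eBPF programs currently loaded in the kernel (requires root).
sudo bpftool prog list

# Show programs attached to network hooks (tc/XDP), which is where a data plane like Calico's attaches.
sudo bpftool net show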
Performance
Calico’s eBPF data plane is an alternative to its standard Linux data plane (based on iptables) that pushes the performance limits further. Calico uses BPF programs to process packets rapidly and efficiently without ever leaving the packet-processing context of the Linux kernel. The efficiency achieved by this method is close to that of natively compiled code in the kernel.
Note: AWS has a limit on single flows of traffic between EC2 instances that restricts this type of traffic to 5 Gbps.
The chart below shows the total CPU utilization consumed by the benchmarks, measured in vCPUs. This is the sum of both client and server CPU utilization.
These statistics were collected using two A1.xlarge instances, this K8s benchmark suite, and multiple instances of iperf with parallel connections to each other.
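For a rough sense of how such throughput numbers are gathered, a single flow can be measured with iperf3 between two instances. The flags below (parallel streams, fixed duration) are just one plausible invocation, and the address is a placeholder:

# On the receiving instance, start an iperf3 server:
iperf3 -s

# On the sending instance, open 8 parallel streams for 30 seconds:
iperf3 -c <receiver-address> -P 8 -t 30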
Demo
This section provides the steps necessary to deploy a Rancher K3s Arm-based cluster equipped with Calico’s eBPF data plane on AWS public cloud infrastructure.
Before We Begin
This demo uses Arm-based (Graviton) EC2 instances from Amazon. In addition to an Amazon account, make sure the following applications are installed on your system:
- Install Terraform v1.0.0+
- Install aws-cli v2.2.13+
- Install calicoctl
Cluster Preparation
Use the following command to download the Terraform cluster template:
curl -OL https://github.com/frozenprocess/demo-cluster/archive/refs/heads/master.zip
Unzip the archive, then browse to the calico-k3s-aws folder within demo-cluster:
cd calico-k3s-aws
Inside the calico-k3s-aws folder, you will find an example variable file called terraform.tfvars-example. Removing the -example suffix from this file (so it becomes terraform.tfvars) provides an easy way to modify some of the attributes of the cluster that will be deployed using Terraform.
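On Linux or macOS, one simple way to do this is to copy the example file into place, keeping the original around as a reference:

cp terraform.tfvars-example terraform.tfvars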
For this example, since our document focuses on setting up an Arm environment, you need to modify the following values accordingly:
availability_zone_names = ["us-west-2a", "us-west-2c"]
image_id                = "ami-06d1fcb7a93046a55"
instance_type           = "m6g.large"
Note: At the time of writing, the AWS us-west-2 region only offers ARM64 instances in the us-west-2a and us-west-2c availability zones.
Terraform uses “providers” to connect to a variety of environments. Use the following command to download the “providers” related to this demo:
terraform init
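Optionally, before creating anything, you can preview the resources Terraform intends to create. The terraform plan command prints the proposed changes without applying them:

terraform plan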
Now that our Terraform project is ready, use apply to populate the resources. Two vanilla Ubuntu EC2 VMs, one virtual private cloud (VPC), one internet gateway (IGW), two subnets and one default route will be created after running the following command.
Note: Executing the following command will create resources in your AWS Account. Take a minute to review the resources before approving the deployment.
terraform apply
There should be an output similar to the following after a successful deployment:
instance_1_private_ip = "172.16.2.183"
instance_1_public_ip = "34.219.2.61"
instance_2_private_ip = "172.16.1.81"
instance_2_public_ip = "54.187.109.181"
K3s Server Installation
The K3s server is the process that plays the role of the control plane. Its installation creates important files that are crucial for controlling and maintaining the cluster.
Use the instance_1_public_ip value and the calico-demo.pem file to connect to instance one.
Note: In Windows, use PuTTY to complete this step. More information can be found here.
ssh ubuntu@34.219.2.61 -i calico-demo.pem
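If SSH refuses the key because its file permissions are too open (a common complaint with freshly downloaded .pem files), tighten the permissions and retry:

chmod 400 calico-demo.pem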
K3s’ single binary approach provides a nearly magical experience for installing a cluster. Simply run the following command and the cluster will be ready in a matter of seconds!
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.21.3+k3s1 K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--flannel-backend=none --tls-san=k3s.local --cluster-cidr=192.168.0.0/16 --disable-network-policy --disable=traefik" sh -
Despite its tiny size, K3s is packed with a lot of features. In this demo, we are going to disable Flannel, Traefik, and K3s default network policy to simplify the process. (If you would like to know more about these features, please check out this link.)
Let’s check the state of our cluster using kubectl:
kubectl get pods -A
You should see a result similar to the following. Notice that all of the pods are in a Pending state. This is because there is no container network interface (CNI) plugin installed in our cluster.
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
kube-system   local-path-provisioner-5ff76fc89d-pm2f5   0/1     Pending   0          23s
kube-system   metrics-server-86cbb8457f-6mmbd           0/1     Pending   0          23s
kube-system   coredns-7448499f4d-g6d9c                  0/1     Pending   0          23s
Installing Calico
The Tigera Operator is the recommended way to install Calico, and operators in general provide a controlled way to interact with custom resources.
Note: Calico ARM64 eBPF support is in its early stages. You’ll need to modify the tigera-operator manifest and add -arm64 to the end of the image: quay.io/tigera/operator:master line. In the command below, I’ve used sed to do this automatically. However, if you are using Windows or have trouble with the following command, consider doing this step manually with a text editor and then applying your modified manifest using kubectl create -f <<myfile.yaml>>.
Use the following command to install Tigera Operator on the cluster:
curl https://docs.projectcalico.org/master/manifests/tigera-operator.yaml | sed 's#:master#:master-arm64#' | kubectl create -f -
It is possible to verify operator installation using the following command:
kubectl get pods -n tigera-operator
Next, you should see an output similar to the following:
NAME                               READY   STATUS    RESTARTS   AGE
tigera-operator-86c4fc874f-86x8r   1/1     Running   0          56s
Calico is packed with features and is tunable down to its basic components. After the operator is installed, it continuously watches for a configuration resource of kind: Installation named default and uses it to configure Calico in the cluster.
Use the following command to begin the installation:
kubectl create -f - <<EOF
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    bgp: "Disabled"
    ipPools:
    - blockSize: 26
      cidr: 192.168.0.0/16
      encapsulation: VXLAN
      natOutgoing: Enabled
      nodeSelector: all()
EOF
Use the following command to verify Calico deployment has finished:
kubectl rollout status ds/calico-node -n calico-system
There should be an output similar to the following:
daemon set "calico-node" successfully rolled out
At this point, we have a working node in our K3s cluster equipped with Calico using the standard Linux data plane.
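As an optional sanity check, the operator also publishes its view of the installation through a TigeraStatus resource; the calico component should eventually report as available (exact output columns may vary by operator version):

kubectl get tigerastatus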
K3s Agent Installation
Adding K3s worker nodes is as easy as installing the server. Simply copy the token:
sudo cat /var/lib/rancher/k3s/server/node-token
You should see an output similar to the following:
K10d650693ae9d1c33239dee97b00c5a5f669c9921525f9b02f83c68cd7decae829::server:695a8f016aeb78245bb527f81fe42cd6
Use the instance_2_public_ip value and the calico-demo.pem file that was created earlier by our Terraform project and SSH into the second instance:
ssh ubuntu@54.187.109.181 -i calico-demo.pem
Next, use the following command to install the agent and join the worker node to the cluster. (Note: Please change the IP address, port and token in the following command to suit your environment. You need to specify the internal IP of the first node, instance_1_private_ip, not the external IP.)
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.21.3+k3s1 K3S_URL=https://172.16.2.183:6443 K3S_TOKEN=K10d650693ae9d1c33239dee97b00c5a5f669c9921525f9b02f83c68cd7decae829::server:695a8f016aeb78245bb527f81fe42cd6 sh -
Back at the server, execute the following command to verify the newly installed worker:
kubectl get nodes
There should be an output similar to the following:
NAME              STATUS   ROLES                  AGE     VERSION
ip-172-16-2-183   Ready    control-plane,master   7m39s   v1.21.1+k3s1
ip-172-16-1-81    Ready    <none>                 16s     v1.21.1+k3s1
Enable eBPF
By default, Calico is set to use the iptables data plane. Kubernetes exposes a service named kubernetes in the default namespace, and kube-proxy proxies traffic for that service to the API server.

Since Calico can take over the kube-proxy responsibilities in eBPF mode, kube-proxy can safely be bypassed. However, we have to allow Calico to talk directly to the API server to prevent any interruption.
Use the following command to determine the API server information:
kubectl get endpoints kubernetes -o wide
NAME         ENDPOINTS           AGE
kubernetes   172.16.1.211:6443   4m18s
By using a ConfigMap, we can tell Calico how to directly contact the cluster API server.
Note: The address and port might differ in your environment. Make sure these values are typed correctly, since an incorrect value can trigger a crash loop for the tigera-operator pod.
Use the following command to create the required ConfigMap:
cat << EOF > kubernetes-services-endpoint.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: kubernetes-services-endpoint
  namespace: tigera-operator
data:
  KUBERNETES_SERVICE_HOST: "172.16.2.183"
  KUBERNETES_SERVICE_PORT: "6443"
EOF
Use kubectl apply to apply the file and create the ConfigMap:
kubectl apply -f kubernetes-services-endpoint.yaml
It might take 60 seconds for the ConfigMap to be picked up by the cluster. After that, use the following command to restart the operator in order to pick up the changes:
kubectl delete pod -n tigera-operator -l k8s-app=tigera-operator
K3s embeds the kube-proxy process, which makes it hard to disable. Since both kube-proxy and eBPF try to interact with the cluster’s data flow, we must change the Felix configuration parameter BPFKubeProxyIptablesCleanupEnabled to false. If both kube-proxy and BPFKubeProxyIptablesCleanupEnabled are enabled, kube-proxy will write its iptables rules and Felix will try to clean them up, resulting in iptables flapping between the two.
Export the cluster credentials file that was created by the server installation process:
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
Use the following command to disable iptables cleanup:
calicoctl patch felixconfiguration default --patch='{"spec": {"bpfKubeProxyIptablesCleanupEnabled": false}}'
Execute the following command to switch the cluster data plane to eBPF:
kubectl patch installation.operator.tigera.io default --type merge -p '{"spec":{"calicoNetwork":{"linuxDataplane":"BPF", "hostPorts":null}}}'
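If you ever need to roll back to the standard Linux data plane, the same field can be patched in the other direction. To the best of my knowledge the value for the iptables-based data plane is Iptables, but double-check the Installation resource reference for your Calico version:

kubectl patch installation.operator.tigera.io default --type merge -p '{"spec":{"calicoNetwork":{"linuxDataplane":"Iptables"}}}'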
Let’s verify our cluster’s health and node architecture by executing the following command:
kubectl get nodes -L kubernetes.io/arch
NAME              STATUS   ROLES                  AGE     VERSION        ARCH
ip-172-16-2-244   Ready    <none>                 65s     v1.21.3+k3s1   arm64
ip-172-16-1-224   Ready    control-plane,master   2m21s   v1.21.3+k3s1   arm64
That’s it! You now have a multinode K3s cluster secured with Calico that uses the eBPF data plane.
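If you would like one final confirmation of which data plane the operator has configured, you can read the setting back from the Installation resource (a quick check, assuming the patch above was applied):

kubectl get installation.operator.tigera.io default -o jsonpath='{.spec.calicoNetwork.linuxDataplane}'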
Clean Up
In the demo folder, issue the following command to remove all the resources created for this demo:
terraform destroy -auto-approve
Conclusion
K3s’s minimalistic approach makes it invaluable for resource-constrained environments, and its single binary approach allows for faster deployment across any infrastructure. Users can push this resource efficiency even further by using the Calico eBPF data plane, while also benefiting from Calico’s other features such as its feature-rich network policy language, BGP routing and more.
In this article, I explored how to install a multinode K3s cluster equipped with Calico and how to swap the standard Linux data plane with eBPF. I’ve only scratched the surface of Calico’s eBPF capabilities here. For more information, take a look at our eBPF documentation.
If you enjoyed this blog post, you might also like:
- Blog: Introducing the Calico eBPF data plane
- Getting started with Calico network policy
- Free, online webinars, workshops and resources
- Learn about Calico Cloud
Did you know you can become a certified Calico operator? Learn Kubernetes and container networking and security fundamentals using Calico in this free, self-paced certification course.
InApps Technology is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Tigera.
Featured image via Pixabay.