Introduction
Kubernetes is an open-source container orchestration system that helps you get the most out of your machines. Using Kubernetes, however, raises the problem of managing pod access to various AWS services. This article covers how to overcome that problem with purpose-built tools. Here’s how we’ve organized the information:
- Why managing access can be a problem
- Managing access through Kube2iam
- Managing access through KIAM
- IAM Roles for Service Accounts (IRSA)
Why is Managing Access to AWS Services a Problem?
Imagine this: A Kubernetes node is hosting an application pod that needs access to AWS DynamoDB tables. Meanwhile, another pod on the same node needs access to an AWS S3 bucket. For both applications to work properly, the Kubernetes worker node must access both the DynamoDB tables and the S3 bucket at the same time.
Now, think about this happening to hundreds of pods, all requiring access to various AWS resources. The pods are constantly being scheduled on a Kubernetes cluster that needs to access several different AWS services simultaneously… It’s a lot!
One way to solve this would be to give the Kubernetes node—and, therefore, the pods—access to all AWS resources. However, this leaves your system an easy target: if a single pod or node is compromised, an attacker gains access to your entire AWS infrastructure. To avoid this, you can use tools like Kube2iam, Kiam, and IRSA (IAM Roles for Service Accounts) to manage Kubernetes pod access to AWS resources. The best part? All the access API calls and authentication metrics can be pulled by Prometheus and visualized in Grafana. If you want to try the Prometheus/Grafana part, get onto our MetricFire free trial and start sending your data.
Diving into Implementation with Kube2iam
Overall Architecture
Kube2iam is deployed as a DaemonSet in your cluster. Therefore, a Kube2iam pod will be scheduled to run on every worker node of your Kubernetes cluster. Whenever a different pod makes an AWS API call to access resources, that call will be intercepted by the Kube2iam pod running on that node. Kube2iam then ensures the pod is assigned appropriate credentials to access the resource.
You must also specify an Identity and Access Management (IAM) role in the pod spec. Under the hood, the Kube2iam pod retrieves temporary credentials for the caller’s IAM role and returns them to the caller. All Amazon Elastic Compute Cloud (EC2) metadata API calls are proxied through the Kube2iam pod. (The Kube2iam pod must run with host networking enabled so it can make the EC2 metadata API calls itself.)
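To see this interception in action once Kube2iam is running, you can query the caller identity from inside an annotated pod. This is a quick sanity-check sketch; it assumes the AWS CLI is available in the container image, and <your-pod> stands in for a pod annotated with an IAM role:
# The AWS CLI resolves credentials through the EC2 metadata endpoint,
# which Kube2iam intercepts on the node.
kubectl exec -it <your-pod> -- aws sts get-caller-identity
# The returned ARN should reference the pod's annotated role, e.g.:
# "Arn": "arn:aws:sts::<account-id>:assumed-role/my-role/..."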
Implementation
Creating and Attaching IAM Roles
1. Create an IAM role named my-role with access to the required AWS resources (such as an AWS S3 bucket).
2. Follow these steps to enable a trust relationship between this role and the role attached to the Kubernetes worker nodes. (Make sure that the role attached to the worker nodes has very limited permissions: all API calls and access requests are made by containers running on the nodes, which receive credentials through Kube2iam, so the worker node IAM role does not need access to many AWS resources. A consolidated CLI sketch of steps 1 and 2 follows the manifest in step 3.)
a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab
b. Click ‘Edit trust relationship’
c. Add the following content to the policy:
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<ARN_KUBERNETES_NODES_IAM_ROLE>"
},
"Action": "sts:AssumeRole"
}
d. Enable ‘Assume role’ for Node Pool IAM roles. Add the following content to Nodes IAM policy:
{
"Sid": "",
"Effect": "Allow",
"Action": [
"sts:AssumeRole"
],
"Resource": [
"arn:aws:iam::810085094893:instance-profile/*"
]
}
3. Add the IAM role's name to the deployment as an annotation.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
  ...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
      ...
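If you prefer the AWS CLI to the console, steps 1 and 2 can be sketched roughly as follows. The file names are hypothetical, and trust.json must be a complete policy document containing the sts:AssumeRole statement from step 2c:
# Create the role with a trust policy that lets the node role assume it
aws iam create-role \
  --role-name my-role \
  --assume-role-policy-document file://trust.json

# Grant the role access to the resources your pods need, e.g. S3 read-only
aws iam attach-role-policy \
  --role-name my-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess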
Deploying Kube2iam
1. Create the service account, ClusterRole, and ClusterRoleBinding to be used by the Kube2iam pods. The ClusterRole should have 'get', 'watch' and 'list' access to namespaces and pods under all API groups. You can use the manifest below to create them:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kube2iam
namespace: kube-system
---
apiVersion: v1
kind: List
items:
- apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kube2iam
rules:
- apiGroups: [""]
resources: ["namespaces","pods"]
verbs: ["get","watch","list"]
- apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kube2iam
subjects:
- kind: ServiceAccount
name: kube2iam
namespace: kube-system
roleRef:
kind: ClusterRole
name: kube2iam
apiGroup: rbac.authorization.k8s.io
---
2. Deploy the Kube2iam DaemonSet by using the manifest below:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: kube2iam
labels:
app: kube2iam
namespace: kube-system
spec:
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
name: kube2iam
spec:
hostNetwork: true
serviceAccount: kube2iam
containers:
- image: jtblin/kube2iam:latest
name: kube2iam
args:
- "--auto-discover-base-arn"
- "--iptables=true"
- "--host-ip=$(HOST_IP)"
- "--host-interface=cali+"
- "--verbose"
- "--debug"
env:
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
ports:
- containerPort: 8181
hostPort: 8181
name: http
securityContext:
privileged: true
---
Note: The Kube2iam container runs with the arguments --iptables=true and --host-ip=$(HOST_IP), and with a privileged security context:
...
securityContext:
privileged: true
...
These settings prevent containers running in other pods from directly accessing the EC2 metadata API and gaining unwanted access to AWS resources: traffic from containers to 169.254.169.254 is redirected to the Kube2iam proxy. Alternatively, the same rule can be applied by running the following command on each Kubernetes worker node:
iptables \
--append PREROUTING \
--protocol tcp \
--destination 169.254.169.254 \
--dport 80 \
--in-interface docker0 \
--jump DNAT \
--table nat \
--to-destination `curl 169.254.169.254/latest/meta-data/local-ipv4`:8181
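To confirm the rule is in place on a node, a plain iptables listing works (nothing Kube2iam-specific here):
# Look for the DNAT rule redirecting metadata traffic to port 8181
sudo iptables -t nat -L PREROUTING -n | grep 169.254.169.254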
Testing Access from a Test Pod
To check whether your Kube2iam deployment and IAM settings work, you can deploy a test pod with an IAM role specified as an annotation. If everything works, you should be able to check which IAM role gets attached to your pod. This can easily be verified by querying the EC2 metadata API. Let’s deploy a test pod using the manifest below:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: access-test
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: access-test
minReadySeconds: 5
template:
metadata:
labels:
app: access-test
annotations:
iam.amazonaws.com/role: my-role
spec:
containers:
- name: access-test
image: "iotapi322/worker:v4"
Run the following command inside the test pod:
curl 169.254.169.254/latest/meta-data/iam/security-credentials/
You should get my-role as the response to this API call.
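To inspect the temporary credentials themselves, append the role name to the same endpoint. The response below shows the standard EC2 metadata credentials format with placeholder values:
curl 169.254.169.254/latest/meta-data/iam/security-credentials/my-role
# {
#   "Code": "Success",
#   "Type": "AWS-HMAC",
#   "AccessKeyId": "ASIA...",
#   "SecretAccessKey": "...",
#   "Token": "...",
#   "Expiration": "2019-08-01T12:00:00Z"
# }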
I highly recommend tailing the logs of the Kube2iam pod running on that node to gain a deeper understanding of how and when the API calls are being intercepted. Once the setup works as expected, you should turn off verbosity in the Kube2iam deployment to avoid bombarding your logging backend.
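For example, using the labels from the DaemonSet manifest above:
# Stream logs from the Kube2iam pods; pick out the pod on your test
# pod's node if you want to follow a single instance
kubectl -n kube-system logs -f -l name=kube2iam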
Kiam
While very helpful, Kube2iam has two significant shortcomings that Kiam aims to resolve:
- Data races under load: with very high spikes in application load and many pods in the cluster, Kube2iam sometimes returns incorrect credentials to pods. The GitHub issue can be referenced here.
- Pre-fetched credentials: Kiam assigns credentials for the IAM role specified in the pod spec before the container processes boot in the pod. Pre-fetching the credentials reduces start latency and improves reliability.
Additional features of Kiam include:
- Use of structured logging to improve integration with your Elasticsearch, Logstash, Kibana (ELK) setup, with pod names, roles, access key IDs, etc.
- Metrics that track response times, cache hit rates, etc. Prometheus readily scrapes these metrics and renders them in Grafana (see the sketch after this list).
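As a sketch of the metrics side: recent Kiam releases expose a Prometheus endpoint when the server (or agent) is started with the flags below. The flag names are taken from the upstream Kiam documentation rather than the manifests in this article, so verify them against your Kiam version:
# Add to the kiam server/agent container args (assumed flags):
#   --prometheus-listen-addr=0.0.0.0:9620
#   --prometheus-sync-interval=10s
# Then spot-check the endpoint:
POD=$(kubectl -n kiam get pods -l role=server -o jsonpath='{.items[0].metadata.name}')
kubectl -n kiam port-forward "$POD" 9620:9620 &
curl -s localhost:9620/metrics | head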
Overall Architecture
Kiam is based on an agent-server architecture.
- Kiam Agent: This process is typically deployed as a DaemonSet to ensure that pods have no direct access to the AWS metadata API. Instead, the Kiam agent runs an HTTP proxy that intercepts credential requests and passes everything else through.
- Kiam Server: This process is responsible for connecting to the Kubernetes API server to watch pods and for communicating with the AWS Security Token Service (STS) to request credentials. It also maintains a cache of credentials for roles currently in use by running pods, refreshing credentials every few minutes and storing them before the pods need them.
Implementation
As with Kube2iam, a pod gets credentials for an IAM role by specifying it as an annotation in the deployment manifest. Additionally, you need to specify which IAM roles can be allocated inside a particular namespace using an annotation on the namespace itself. This enhances security and gives you fine-grained control over IAM roles.
Creating and Attaching IAM Roles
1. Create an IAM role named kiam-server with appropriate access to AWS resources.
2. Enable a trust relationship between the kiam-server role and the role attached to the Kubernetes master nodes by following these steps. (Make sure that the role attached to the worker nodes has very limited permissions: all API calls and access requests are made by containers running on the nodes, which receive credentials through Kiam, so the worker node IAM roles do not need access to many AWS resources.)
a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab.
b. Click on ‘Edit trust relationship’.
c. Add the following content to the policy:
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<ARN_KUBERNETES_MASTER_IAM_ROLE>"
},
"Action": "sts:AssumeRole"
}
3. Add an in-line policy to the kiam-server role.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"sts:AssumeRole"
],
"Resource": "*"
}
]
}
4. Create the IAM role (let's call it my-role) with appropriate access to AWS resources.
5. Enable a trust relationship between the newly created role and the kiam-server role.
To do so:
a. Go to the newly created role in the AWS console and select ‘Trust relationships’
b. Click ‘Edit trust relationship’.
c. Add the following content to the policy:
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<ARN_KIAM-SERVER_IAM_ROLE>"
},
"Action": "sts:AssumeRole"
}
6. Enable ‘Assume Role’ for the master pool IAM roles. Add the following content as an in-line policy to the master IAM roles (a consolidated CLI sketch of these IAM steps follows this list):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "<ARN_KIAM-SERVER_IAM_ROLE>"
    }
  ]
}
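As noted above, here is a rough CLI equivalent of the trust and in-line policy steps. The file names are hypothetical, and each JSON file must be a complete policy document:
# Step 3: attach the sts:AssumeRole in-line policy to the kiam-server role
aws iam put-role-policy \
  --role-name kiam-server \
  --policy-name assume-any-role \
  --policy-document file://kiam-server-inline.json

# Steps 2 and 5: the master role may assume kiam-server, and
# kiam-server may assume my-role
aws iam update-assume-role-policy \
  --role-name kiam-server \
  --policy-document file://trust-master.json
aws iam update-assume-role-policy \
  --role-name my-role \
  --policy-document file://trust-kiam-server.json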
All communication between the Kiam agents and servers is TLS encrypted, which enhances security. To set this up, we first need to deploy cert-manager in our Kubernetes cluster and generate certificates for agent-server communication.
Deploying Cert Manager and Generating Certificates
1. Install the custom resource definition resources separately.
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
2. Create the namespace for cert-manager.
kubectl create namespace cert-manager
3. Label the cert-manager namespace to disable resource validation.
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
4. Add the Jetstack Helm repository.
helm repo add jetstack https://charts.jetstack.io
5. Update your local Helm chart repository cache.
helm repo update
6. Install the cert-manager Helm chart.
helm install --name cert-manager --namespace cert-manager --version v0.8.0 jetstack/cert-manager
Generate CA Private Key and Self-signed Certificate for Kiam Agent-server TLS
1. Generate the CRT file.
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=kiam" -days 3650 -reqexts v3_req -extensions v3_ca -out ca.crt
2. Save the CA key pair as a secret in Kubernetes.
kubectl create secret tls kiam-ca-key-pair \
--cert=ca.crt \
--key=ca.key \
--namespace=cert-manager
3. Deploy cluster issuer and issue the certificate.
a. Create the Kiam namespace.
apiVersion: v1
kind: Namespace
metadata:
name: kiam
annotations:
iam.amazonaws.com/permitted: ".*"
---
b. Deploy the cluster issuer and issue the certificate.
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
name: kiam-ca-issuer
namespace: kiam
spec:
ca:
secretName: kiam-ca-key-pair
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: kiam-agent
namespace: kiam
spec:
secretName: kiam-agent-tls
issuerRef:
name: kiam-ca-issuer
kind: ClusterIssuer
commonName: kiam
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: kiam-server
namespace: kiam
spec:
secretName: kiam-server-tls
issuerRef:
name: kiam-ca-issuer
kind: ClusterIssuer
commonName: kiam
dnsNames:
- kiam-server
- kiam-server:443
- localhost
- localhost:443
- localhost:9610
---
4. Test if certificates are issued correctly.
kubectl -n kiam get secret kiam-agent-tls -o yaml
kubectl -n kiam get secret kiam-server-tls -o yaml
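Beyond checking that the secrets exist, you can decode the issued certificate and confirm its subject and validity window with standard tooling:
kubectl -n kiam get secret kiam-server-tls \
  -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -dates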
Annotating Resources
1. Add the IAM role’s name to the deployment as an annotation.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
  ...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
      ...
2. Add the role annotation to the namespace in which the pods will run. (You don’t need to do this with Kube2iam.)
apiVersion: v1
kind: Namespace
metadata:
name: default
annotations:
iam.amazonaws.com/permitted: ".*"
By default, no roles are permitted. You can use a regex, as shown above, to allow all roles, or you can specify particular roles per namespace (see the example below).
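For example, to permit only a single role in the default namespace instead of the catch-all regex (the role name is illustrative):
kubectl annotate namespace default \
  iam.amazonaws.com/permitted="my-role" --overwrite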
Deploying Kiam Agent and Server
Kiam Server
The manifest below deploys the following:
- The Kiam server DaemonSet, which will run on Kubernetes master nodes (configured to use the TLS secret created above)
- Kiam Server service
- Service account, ClusterRole and ClusterRoleBinding required by Kiam server
---
kind: ServiceAccount
apiVersion: v1
metadata:
name: kiam-server
namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kiam-read
rules:
- apiGroups:
- ""
resources:
- namespaces
- pods
verbs:
- watch
- get
- list
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kiam-read
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kiam-read
subjects:
- kind: ServiceAccount
name: kiam-server
namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kiam-write
rules:
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kiam-write
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kiam-write
subjects:
- kind: ServiceAccount
name: kiam-server
namespace: kiam
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
namespace: kiam
name: kiam-server
spec:
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
app: kiam
role: server
spec:
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: kiam-server
      nodeSelector:
        kubernetes.io/role: master
      volumes:
      - name: ssl-certs
        hostPath:
          path: /etc/ssl/certs
- name: tls
secret:
secretName: kiam-server-tls
containers:
- name: kiam
image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
imagePullPolicy: Always
command:
- /kiam
args:
- server
- --level=info
- --bind=0.0.0.0:443
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --role-base-arn-autodetect
- --assume-role-arn=<KIAM_SERVER_ROLE_ARN>
- --sync=1m
volumeMounts:
- mountPath: /etc/ssl/certs
name: ssl-certs
- mountPath: /etc/kiam/tls
name: tls
livenessProbe:
exec:
command:
- /kiam
- health
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --server-address=localhost:443
- --gateway-timeout-creation=1s
- --timeout=5s
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 10
readinessProbe:
exec:
command:
- /kiam
- health
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --server-address=localhost:443
- --gateway-timeout-creation=1s
- --timeout=5s
initialDelaySeconds: 3
periodSeconds: 10
timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: kiam-server
namespace: kiam
spec:
clusterIP: None
selector:
app: kiam
role: server
ports:
- name: grpclb
port: 443
targetPort: 443
protocol: TCP
Note:
- The scheduler toleration and node selector here ensure that the Kiam server pods get scheduled on Kubernetes master nodes only. This is why we enabled the trust relationship between the kiam-server IAM role and the IAM role attached to the Kubernetes master nodes (above).
...
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
...
...
nodeSelector:
kubernetes.io/role: master
...
- The kiam-server role ARN is provided as an argument to the Kiam server container. Make sure you update the <KIAM_SERVER_ROLE_ARN> field in the manifest above to the ARN of the role you created.
- The ClusterRole and ClusterRoleBinding created for the Kiam server grant it the minimal permissions it needs to operate. Consider them carefully before changing them.
- Ensure the path to SSL Certs is set correctly according to the secret you created using cert-manager certificates. This is important to establish secure communication between the Kiam server and Kiam agent pods.
Kiam Agent
The manifest below deploys the Kiam agent DaemonSet, which will run on Kubernetes worker nodes only:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
namespace: kiam
name: kiam-agent
spec:
template:
metadata:
labels:
app: kiam
role: agent
spec:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
volumes:
- name: ssl-certs
hostPath:
path: /etc/ssl/certs
- name: tls
secret:
secretName: kiam-agent-tls
- name: xtables
hostPath:
path: /run/xtables.lock
type: FileOrCreate
containers:
- name: kiam
securityContext:
capabilities:
add: ["NET_ADMIN"]
image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
imagePullPolicy: Always
command:
- /kiam
args:
- agent
- --iptables
- --host-interface=cali+
- --json-log
- --port=8181
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --server-address=kiam-server:443
- --gateway-timeout-creation=30s
env:
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
volumeMounts:
- mountPath: /etc/ssl/certs
name: ssl-certs
- mountPath: /etc/kiam/tls
name: tls
- mountPath: /var/run/xtables.lock
name: xtables
livenessProbe:
httpGet:
path: /ping
port: 8181
initialDelaySeconds: 3
periodSeconds: 3
Note that the Kiam agent also runs with host networking enabled, similar to Kube2iam. Also, one of the arguments to the Kiam agent’s container is the service name used to reach the Kiam server, in this case kiam-server:443. Therefore, we should deploy the Kiam server before deploying the Kiam agent.
The container argument --gateway-timeout-creation defines how long the agent waits for the Kiam server pod to come up before trying to connect. It can be tweaked depending on how long pods take to come up in your Kubernetes cluster; a thirty-second waiting period is usually enough.
Testing
The processes for testing the Kiam and Kube2iam setups are the same. You can use a test pod and curl the metadata to check the assigned role. Please ensure that both deployment and namespace are properly annotated.
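For instance, reusing the access-test deployment from the Kube2iam section (assuming curl is available in the image and the default namespace carries the permitted annotation shown earlier):
# Query the metadata API from the test pod via the Kiam agent proxy
POD=$(kubectl get pods -l app=access-test -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$POD" -- curl -s 169.254.169.254/latest/meta-data/iam/security-credentials/
# Expected output: my-role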
IAM Roles for Service Accounts (IRSA)
Recently, AWS released its own mechanism for giving pods access to AWS resources: IAM Roles for Service Accounts (IRSA). Since the role is associated with a service account, it can be shared by all pods that use that service account. IRSA is available both in AWS EKS and in kops-based installations. You can read more about it here.
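On EKS, a minimal IRSA setup with eksctl might look like the sketch below. The cluster name and attached policy are placeholders; eksctl creates the IAM role and annotates the service account with eks.amazonaws.com/role-arn for you:
# One-time: associate an OIDC identity provider with the cluster
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve

# Create a service account bound to an IAM role with S3 read-only access
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace default \
  --name s3-reader \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve
# Pods that set serviceAccountName: s3-reader now receive credentials for that role.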
Conclusion
The tools covered in this blog help manage access from Kubernetes pods to AWS resources, and each has pros and cons.
While Kube2iam is the easiest to implement, its ease of setup comes at the cost of reliability: Kube2iam might not perform reliably under high-load conditions. It is better suited to non-production environments or scenarios that don’t experience major traffic surges.
IRSA requires more work than Kube2iam but, given Amazon’s detailed documentation, it may prove easier to implement correctly. Because it is so recent, there were few real-world implementations of IRSA in the industry at the time this article was written.
Kiam’s implementation needs cert-manager running, and, unlike with Kube2iam, you need to annotate the namespace along with the deployment. Regardless, we highly recommend Kiam: it fits almost every case, provided you have the resources to run cert-manager and your master nodes can handle a DaemonSet running on them. Using the manifests provided in this post will make your setup seamless and production-ready.
If you want to try visualizing your metrics on Grafana dashboards powered by Prometheus, sign up for the MetricFire free trial today. You can also sign up for a demo and talk to us directly about what monitoring solutions work for you.
This article was written by our guest blogger Vaibhav Thakur. If you liked this article, check out his LinkedIn for more.