Introduction
Kubernetes is an open-source container orchestration system that helps you get the most out of your machines. Using Kubernetes, however, raises the problem of managing pod access to various AWS services. This article covers how to overcome that problem with purpose-built tools. Here’s how we’ve organized the information:
- Why managing access can be a problem
- Managing access through Kube2iam
- Managing access through KIAM
- IAM Roles for Service Accounts (IRSA)
Why is Managing Access to AWS Services a Problem?
Imagine this: A Kubernetes node is hosting an application pod that needs access to AWS DynamoDB tables. Meanwhile, another pod on the same node needs access to an AWS S3 bucket. For both applications to work properly, the Kubernetes worker node must access both the DynamoDB tables and the S3 bucket at the same time.
Now, think about this happening to hundreds of pods, all requiring access to various AWS resources. The pods are constantly being scheduled on a Kubernetes cluster that needs to access several different AWS services simultaneously… It’s a lot!
One way to solve this would be to give the Kubernetes node—and, therefore, the pods—access to all AWS resources. However, this leaves your system an easy target: if a single pod or node is compromised, an attacker gains access to your entire AWS infrastructure. To avoid this, you can use tools like Kube2iam, Kiam, and IRSA (IAM Roles for Service Accounts) to manage Kubernetes pod access to AWS resources. The best part? All the access API calls and authentication metrics can be pulled by Prometheus and visualized in Grafana. If you want to try the Prometheus/Grafana part, get onto our MetricFire free trial and start sending your data.
Diving into Implementation with Kube2iam
Overall Architecture
Kube2iam is deployed as a DaemonSet in your cluster. Therefore, a Kube2iam pod will be scheduled to run on every worker node of your Kubernetes cluster. Whenever a different pod makes an AWS API call to access resources, that call will be intercepted by the Kube2iam pod running on that node. Kube2iam then ensures the pod is assigned appropriate credentials to access the resource.
You must also specify an Identity and Access Management (IAM) role in the pod spec. Under the hood, the Kube2iam pod retrieves temporary credentials for the caller’s IAM role and returns them to the caller. All Amazon Elastic Compute Cloud (EC2) metadata API calls are proxied through the Kube2iam pod. (The Kube2iam pod must run with host networking enabled so it can make the EC2 metadata API calls itself.)
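To see this interception in action once Kube2iam is running, you can query the caller identity from inside an annotated pod. This is a quick sanity-check sketch; it assumes the AWS CLI is available in the container image, and <your-pod> stands in for a pod annotated with an IAM role:
# The AWS CLI resolves credentials through the EC2 metadata endpoint,
# which Kube2iam intercepts on the node.
kubectl exec -it <your-pod> -- aws sts get-caller-identity
# The returned ARN should reference the pod's annotated role, e.g.:
# "Arn": "arn:aws:sts::<account-id>:assumed-role/my-role/..."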
Implementation
Creating and Attaching IAM Roles
1. Create an IAM role named my-role with access to the required AWS resources (such as an AWS S3 bucket).
2. Follow these steps to enable a trust relationship between this role and the role attached to the Kubernetes worker nodes. (Make sure that the role attached to the worker nodes has very limited permissions: all API calls and access requests are made by containers running on the nodes, which receive credentials through Kube2iam, so the worker node IAM role does not need access to many AWS resources. A consolidated CLI sketch of steps 1 and 2 follows the manifest in step 3.)
a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab
b. Click ‘Edit trust relationship’
c. Add the following content to the policy:
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<ARN_KUBERNETES_NODES_IAM_ROLE>"
},
"Action": "sts:AssumeRole"
}
d. Enable ‘Assume role’ for Node Pool IAM roles. Add the following content to Nodes IAM policy:
{
"Sid": "",
"Effect": "Allow",
"Action": [
"sts:AssumeRole"
],
"Resource": [
"arn:aws:iam::810085094893:instance-profile/*"
]
}
3. Add the IAM role's name to the deployment as an annotation.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
  ...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
      ...
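If you prefer the AWS CLI to the console, steps 1 and 2 can be sketched roughly as follows. The file names are hypothetical, and trust.json must be a complete policy document containing the sts:AssumeRole statement from step 2c:
# Create the role with a trust policy that lets the node role assume it
aws iam create-role \
  --role-name my-role \
  --assume-role-policy-document file://trust.json

# Grant the role access to the resources your pods need, e.g. S3 read-only
aws iam attach-role-policy \
  --role-name my-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess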
Deploying Kube2iam
1. Create the service account, ClusterRole, and ClusterRoleBinding to be used by the Kube2iam pods. The ClusterRole should have 'get', 'watch' and 'list' access to namespaces and pods under all API groups. You can use the manifest below to create them:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kube2iam
namespace: kube-system
---
apiVersion: v1
kind: List
items:
- apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kube2iam
rules:
- apiGroups: [""]
resources: ["namespaces","pods"]
verbs: ["get","watch","list"]
- apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kube2iam
subjects:
- kind: ServiceAccount
name: kube2iam
namespace: kube-system
roleRef:
kind: ClusterRole
name: kube2iam
apiGroup: rbac.authorization.k8s.io
---
2. Deploy the Kube2iam DaemonSet by using the manifest below:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: kube2iam
labels:
app: kube2iam
namespace: kube-system
spec:
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
name: kube2iam
spec:
hostNetwork: true
serviceAccount: kube2iam
containers:
- image: jtblin/kube2iam:latest
name: kube2iam
args:
- "--auto-discover-base-arn"
- "--iptables=true"
- "--host-ip=$(HOST_IP)"
- "--host-interface=cali+"
- "--verbose"
- "--debug"
env:
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
ports:
- containerPort: 8181
hostPort: 8181
name: http
securityContext:
privileged: true
---
Note: The Kube2iam container runs with the arguments --iptables=true and --host-ip=$(HOST_IP), and with a privileged security context:
...
securityContext:
privileged: true
...
These settings prevent containers running in other pods from directly accessing the EC2 metadata API and gaining unwanted access to AWS resources: traffic from containers to 169.254.169.254 is redirected to the Kube2iam proxy. Alternatively, the same rule can be applied by running the following command on each Kubernetes worker node:
iptables \
--append PREROUTING \
--protocol tcp \
--destination 169.254.169.254 \
--dport 80 \
--in-interface docker0 \
--jump DNAT \
--table nat \
--to-destination `curl 169.254.169.254/latest/meta-data/local-ipv4`:8181
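To confirm the rule is in place on a node, a plain iptables listing works (nothing Kube2iam-specific here):
# Look for the DNAT rule redirecting metadata traffic to port 8181
sudo iptables -t nat -L PREROUTING -n | grep 169.254.169.254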
Testing Access from a Test Pod
To check whether your Kube2iam deployment and IAM settings work, you can deploy a test pod with an IAM role specified as an annotation. If everything works, you should be able to check which IAM role gets attached to your pod. This can easily be verified by querying the EC2 metadata API. Let’s deploy a test pod using the manifest below:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: access-test
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: access-test
minReadySeconds: 5
template:
metadata:
labels:
app: access-test
annotations:
iam.amazonaws.com/role: my-role
spec:
containers:
- name: access-test
image: "iotapi322/worker:v4"
Run the following command inside the test pod:
curl 169.254.169.254/latest/meta-data/iam/security-credentials/
You should get my-role as the response to this API call.
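To inspect the temporary credentials themselves, append the role name to the same endpoint. The response below shows the standard EC2 metadata credentials format with placeholder values:
curl 169.254.169.254/latest/meta-data/iam/security-credentials/my-role
# {
#   "Code": "Success",
#   "Type": "AWS-HMAC",
#   "AccessKeyId": "ASIA...",
#   "SecretAccessKey": "...",
#   "Token": "...",
#   "Expiration": "2019-08-01T12:00:00Z"
# }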
I highly recommend tailing the logs of the Kube2iam pod running on that node to gain a deeper understanding of how and when the API calls are being intercepted. Once the setup works as expected, you should turn off verbosity in the Kube2iam deployment to avoid bombarding your logging backend.
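For example, using the labels from the DaemonSet manifest above:
# Stream logs from the Kube2iam pods; pick out the pod on your test
# pod's node if you want to follow a single instance
kubectl -n kube-system logs -f -l name=kube2iam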
Kiam
While very helpful, Kube2iam has two significant shortcomings that Kiam aims to resolve:
- Data races under load: with very high spikes in application load and many pods in the cluster, Kube2iam sometimes returns incorrect credentials to pods. The GitHub issue can be referenced here.
- Pre-fetched credentials: Kiam assigns credentials for the IAM role specified in the pod spec before the container processes boot in the pod. Pre-fetching the credentials reduces start latency and improves reliability.
Additional features of Kiam include:
- Use of structured logging to improve integration with your Elasticsearch, Logstash, Kibana (ELK) setup, with pod names, roles, access key IDs, etc.
- Metrics that track response times, cache hit rates, etc. Prometheus readily scrapes these metrics and renders them in Grafana (see the sketch after this list).
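As a sketch of the metrics side: recent Kiam releases expose a Prometheus endpoint when the server (or agent) is started with the flags below. The flag names are taken from the upstream Kiam documentation rather than the manifests in this article, so verify them against your Kiam version:
# Add to the kiam server/agent container args (assumed flags):
#   --prometheus-listen-addr=0.0.0.0:9620
#   --prometheus-sync-interval=10s
# Then spot-check the endpoint:
POD=$(kubectl -n kiam get pods -l role=server -o jsonpath='{.items[0].metadata.name}')
kubectl -n kiam port-forward "$POD" 9620:9620 &
curl -s localhost:9620/metrics | head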
Overall Architecture
Kiam is based on an agent-server architecture.
- Kiam Agent: This process is typically deployed as a DaemonSet to ensure that pods have no direct access to the AWS metadata API. Instead, the Kiam agent runs an HTTP proxy that intercepts credential requests and passes everything else through.
- Kiam Server: This process is responsible for connecting to the Kubernetes API server to watch pods and for communicating with the AWS Security Token Service (STS) to request credentials. It also maintains a cache of credentials for roles currently in use by running pods, refreshing credentials every few minutes and storing them before the pods need them.
Implementation
As with Kube2iam, a pod gets credentials for an IAM role by specifying it as an annotation in the deployment manifest. Additionally, you need to specify which IAM roles can be allocated inside a particular namespace using an annotation on the namespace itself. This enhances security and gives you fine-grained control over IAM roles.
Creating and Attaching IAM Roles
1. Create an IAM role named kiam-server with appropriate access to AWS resources.
2. Enable a trust relationship between the kiam-server role and the role attached to the Kubernetes master nodes by following these steps. (Make sure that the role attached to the worker nodes has very limited permissions: all API calls and access requests are made by containers running on the nodes, which receive credentials through Kiam, so the worker node IAM roles do not need access to many AWS resources.)
a. Go to the newly created role in the AWS console and select the ‘Trust relationships’ tab.
b. Click on ‘Edit trust relationship’.
c. Add the following content to the policy:
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<ARN_KUBERNETES_MASTER_IAM_ROLE>"
},
"Action": "sts:AssumeRole"
}
3. Add an in-line policy to the kiam-server role.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"sts:AssumeRole"
],
"Resource": "*"
}
]
}
4. Create the IAM role (let's call it my-role) with appropriate access to AWS resources.
5. Enable a trust relationship between the newly created role and the kiam-server role.
To do so:
a. Go to the newly created role in the AWS console and select ‘Trust relationships’
b. Click ‘Edit trust relationship’.
c. Add the following content to the policy:
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<ARN_KIAM-SERVER_IAM_ROLE>"
},
"Action": "sts:AssumeRole"
}
6. Enable ‘Assume Role’ for the master pool IAM roles. Add the following content as an in-line policy to the master IAM roles (a consolidated CLI sketch of these IAM steps follows this list):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "<ARN_KIAM-SERVER_IAM_ROLE>"
    }
  ]
}
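As noted above, here is a rough CLI equivalent of the trust and in-line policy steps. The file names are hypothetical, and each JSON file must be a complete policy document:
# Step 3: attach the sts:AssumeRole in-line policy to the kiam-server role
aws iam put-role-policy \
  --role-name kiam-server \
  --policy-name assume-any-role \
  --policy-document file://kiam-server-inline.json

# Steps 2 and 5: the master role may assume kiam-server, and
# kiam-server may assume my-role
aws iam update-assume-role-policy \
  --role-name kiam-server \
  --policy-document file://trust-master.json
aws iam update-assume-role-policy \
  --role-name my-role \
  --policy-document file://trust-kiam-server.json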
All communication between the Kiam agents and servers is TLS encrypted, which enhances security. To set this up, we first need to deploy cert-manager in our Kubernetes cluster and generate certificates for agent-server communication.
Deploying Cert Manager and Generating Certificates
1. Install the custom resource definition resources separately.
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
2. Create the namespace for cert-manager.
kubectl create namespace cert-manager
3. Label the cert-manager namespace to disable resource validation.
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
4. Add the Jetstack Helm repository.
helm repo add jetstack https://charts.jetstack.io
5. Update your local Helm chart repository cache.
helm repo update
6. Install the cert-manager Helm chart.
helm install --name cert-manager --namespace cert-manager --version v0.8.0 jetstack/cert-manager
Generate CA Private Key and Self-signed Certificate for Kiam Agent-server TLS
1. Generate the CRT file.
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=kiam" -days 3650 -reqexts v3_req -extensions v3_ca -out ca.crt
2. Save the CA key pair as a secret in Kubernetes.
kubectl create secret tls kiam-ca-key-pair \
--cert=ca.crt \
--key=ca.key \
--namespace=cert-manager
3. Deploy cluster issuer and issue the certificate.
a. Create the Kiam namespace.
apiVersion: v1
kind: Namespace
metadata:
name: kiam
annotations:
iam.amazonaws.com/permitted: ".*"
---
b. Deploy the cluster issuer and issue the certificate.
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
name: kiam-ca-issuer
namespace: kiam
spec:
ca:
secretName: kiam-ca-key-pair
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: kiam-agent
namespace: kiam
spec:
secretName: kiam-agent-tls
issuerRef:
name: kiam-ca-issuer
kind: ClusterIssuer
commonName: kiam
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: kiam-server
namespace: kiam
spec:
secretName: kiam-server-tls
issuerRef:
name: kiam-ca-issuer
kind: ClusterIssuer
commonName: kiam
dnsNames:
- kiam-server
- kiam-server:443
- localhost
- localhost:443
- localhost:9610
---
4. Test if certificates are issued correctly.
kubectl -n kiam get secret kiam-agent-tls -o yaml
kubectl -n kiam get secret kiam-server-tls -o yaml
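Beyond checking that the secrets exist, you can decode the issued certificate and confirm its subject and validity window with standard tooling:
kubectl -n kiam get secret kiam-server-tls \
  -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -dates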
Annotating Resources
1. Add the IAM role’s name to the deployment as an annotation.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
  ...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
      ...
2. Add the role annotation to the namespace in which the pods will run. (You don’t need to do this with Kube2iam.)
apiVersion: v1
kind: Namespace
metadata:
name: default
annotations:
iam.amazonaws.com/permitted: ".*"
By default, no roles are permitted. You can use a regex, as shown above, to allow all roles, or you can specify particular roles per namespace (see the example below).
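For example, to permit only a single role in the default namespace instead of the catch-all regex (the role name is illustrative):
kubectl annotate namespace default \
  iam.amazonaws.com/permitted="my-role" --overwrite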
Deploying Kiam Agent and Server
Kiam Server
The manifest below deploys the following:
- The Kiam server DaemonSet, which will run on Kubernetes master nodes (configured to use the TLS secret created above)
- Kiam Server service
- Service account, ClusterRole and ClusterRoleBinding required by Kiam server
---
kind: ServiceAccount
apiVersion: v1
metadata:
name: kiam-server
namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kiam-read
rules:
- apiGroups:
- ""
resources:
- namespaces
- pods
verbs:
- watch
- get
- list
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kiam-read
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kiam-read
subjects:
- kind: ServiceAccount
name: kiam-server
namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kiam-write
rules:
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kiam-write
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kiam-write
subjects:
- kind: ServiceAccount
name: kiam-server
namespace: kiam
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
namespace: kiam
name: kiam-server
spec:
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
app: kiam
role: server
spec:
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: kiam-server
      nodeSelector:
        kubernetes.io/role: master
      volumes:
      - name: ssl-certs
        hostPath:
          path: /etc/ssl/certs
- name: tls
secret:
secretName: kiam-server-tls
containers:
- name: kiam
image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
imagePullPolicy: Always
command:
- /kiam
args:
- server
- --level=info
- --bind=0.0.0.0:443
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --role-base-arn-autodetect
- --assume-role-arn=<KIAM_SERVER_ROLE_ARN>
- --sync=1m
volumeMounts:
- mountPath: /etc/ssl/certs
name: ssl-certs
- mountPath: /etc/kiam/tls
name: tls
livenessProbe:
exec:
command:
- /kiam
- health
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --server-address=localhost:443
- --gateway-timeout-creation=1s
- --timeout=5s
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 10
readinessProbe:
exec:
command:
- /kiam
- health
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --server-address=localhost:443
- --gateway-timeout-creation=1s
- --timeout=5s
initialDelaySeconds: 3
periodSeconds: 10
timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: kiam-server
namespace: kiam
spec:
clusterIP: None
selector:
app: kiam
role: server
ports:
- name: grpclb
port: 443
targetPort: 443
protocol: TCP
Note:
- The scheduler toleration and node selector here ensure that the Kiam server pods get scheduled on Kubernetes master nodes only. This is why we enabled the trust relationship between the kiam-server IAM role and the IAM role attached to the Kubernetes master nodes (above).
...
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
...
...
nodeSelector:
kubernetes.io/role: master
...
- The kiam-server role ARN is provided as an argument to the Kiam server container. Make sure you update the <KIAM_SERVER_ROLE_ARN> field in the manifest above to the ARN of the role you created.
- The ClusterRole and ClusterRoleBinding created for the Kiam server grant it the minimal permissions it needs to operate. Consider them carefully before changing them.
- Ensure the path to SSL Certs is set correctly according to the secret you created using cert-manager certificates. This is important to establish secure communication between the Kiam server and Kiam agent pods.
Kiam Agent
The manifest below deploys the Kiam agent DaemonSet, which will run on Kubernetes worker nodes only:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
namespace: kiam
name: kiam-agent
spec:
template:
metadata:
labels:
app: kiam
role: agent
spec:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
volumes:
- name: ssl-certs
hostPath:
path: /etc/ssl/certs
- name: tls
secret:
secretName: kiam-agent-tls
- name: xtables
hostPath:
path: /run/xtables.lock
type: FileOrCreate
containers:
- name: kiam
securityContext:
capabilities:
add: ["NET_ADMIN"]
image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
imagePullPolicy: Always
command:
- /kiam
args:
- agent
- --iptables
- --host-interface=cali+
- --json-log
- --port=8181
- --cert=/etc/kiam/tls/tls.crt
- --key=/etc/kiam/tls/tls.key
- --ca=/etc/kiam/tls/ca.crt
- --server-address=kiam-server:443
- --gateway-timeout-creation=30s
env:
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
volumeMounts:
- mountPath: /etc/ssl/certs
name: ssl-certs
- mountPath: /etc/kiam/tls
name: tls
- mountPath: /var/run/xtables.lock
name: xtables
livenessProbe:
httpGet:
path: /ping
port: 8181
initialDelaySeconds: 3
periodSeconds: 3
Note that the Kiam agent also runs with host networking enabled, similar to Kube2iam. Also, one of the arguments to the Kiam agent’s container is the service name used to reach the Kiam server, in this case kiam-server:443. Therefore, we should deploy the Kiam server before deploying the Kiam agent.
The container argument --gateway-timeout-creation defines how long the agent waits for the Kiam server pod to come up before trying to connect. It can be tweaked depending on how long pods take to come up in your Kubernetes cluster; a thirty-second waiting period is usually enough.
Testing
The processes for testing the Kiam and Kube2iam setups are the same. You can use a test pod and curl the metadata to check the assigned role. Please ensure that both deployment and namespace are properly annotated.
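For instance, reusing the access-test deployment from the Kube2iam section (assuming curl is available in the image and the default namespace carries the permitted annotation shown earlier):
# Query the metadata API from the test pod via the Kiam agent proxy
POD=$(kubectl get pods -l app=access-test -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$POD" -- curl -s 169.254.169.254/latest/meta-data/iam/security-credentials/
# Expected output: my-role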
IAM Roles for Service Accounts (IRSA)
Recently, AWS released its own mechanism for giving pods access to AWS resources: IAM Roles for Service Accounts (IRSA). Since the role is associated with a service account, it can be shared by all pods that use that service account. IRSA is available both in AWS EKS and in kops-based installations. You can read more about it here.
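On EKS, a minimal IRSA setup with eksctl might look like the sketch below. The cluster name and attached policy are placeholders; eksctl creates the IAM role and annotates the service account with eks.amazonaws.com/role-arn for you:
# One-time: associate an OIDC identity provider with the cluster
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve

# Create a service account bound to an IAM role with S3 read-only access
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace default \
  --name s3-reader \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve
# Pods that set serviceAccountName: s3-reader now receive credentials for that role.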
Conclusion
The tools covered in this blog help manage access from Kubernetes pods to AWS resources, and each has pros and cons.
While Kube2iam is the easiest to implement, its ease of setup comes at the cost of reliability: Kube2iam might not perform reliably under high-load conditions. It is better suited to non-production environments or scenarios that don’t experience major traffic surges.
IRSA requires more work than Kube2iam but, given Amazon’s detailed documentation, it may prove easier to implement correctly. Because it is so recent, there were few real-world implementations of IRSA in the industry at the time this article was written.
Kiam’s implementation needs cert-manager running, and, unlike with Kube2iam, you need to annotate the namespace along with the deployment. Regardless, we highly recommend Kiam: it fits almost every case, provided you have the resources to run cert-manager and your master nodes can handle a DaemonSet running on them. Using the manifests provided in this post will make your setup seamless and production-ready.
If you want to try visualizing your metrics on Grafana dashboards powered by Prometheus, sign up for the MetricFire free trial today. You can also sign up for a demo and talk to us directly about what monitoring solutions work for you.
This article was written by our guest blogger Vaibhav Thakur. If you liked this article, check out his LinkedIn for more.