Trying ekscloudwatch

A memo on trying ekscloudwatch, as a follow-up to "Trying Falco on EKS 2".

Component Version
EKS 1.19
Platform version eks.5
Falco 0.29.1
Falco chart 1.15.3
ekscloudwatch ekscloudwatch-0.3

Reference links

Preparing the cluster

Create a cluster with Kubernetes 1.19.

cat << EOF > cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: ekscloudwatch
  region: ap-northeast-1
  version: "1.19"
vpc:
  cidr: "10.0.0.0/16"

availabilityZones:
  - ap-northeast-1a
  - ap-northeast-1c

managedNodeGroups:
  - name: managed-ng-1
    minSize: 2
    maxSize: 2
    desiredCapacity: 2
    privateNetworking: true

cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

iam:
  withOIDC: true
EOF
eksctl create cluster -f cluster.yaml

Deploying Falco

Install Falco using the init container approach. Setting auditLog.enabled=true makes the chart create a Service (used later as the audit event endpoint).

cat << EOF > values.yaml
image:
  repository: falcosecurity/falco-no-driver

auditLog:
  enabled: true

extraInitContainers:
  - name: driver-loader
    image: docker.io/falcosecurity/falco-driver-loader:0.29.1
    imagePullPolicy: Always
    securityContext:
      privileged: true
    volumeMounts:
      - mountPath: /host/proc
        name: proc-fs
        readOnly: true
      - mountPath: /host/boot
        name: boot-fs
        readOnly: true
      - mountPath: /host/lib/modules
        name: lib-modules
      - mountPath: /host/usr
        name: usr-fs
        readOnly: true
      - mountPath: /host/etc
        name: etc-fs
        readOnly: true

falcosidekick:
  enabled: true
  webui:
    enabled: true
EOF
$ helm upgrade --install falco falcosecurity/falco -n falco --create-namespace -f values.yaml
Release "falco" does not exist. Installing it now.
NAME: falco
LAST DEPLOYED: Wed Jul 14 03:23:33 2021
NAMESPACE: falco
STATUS: deployed
REVISION: 1
NOTES:
Falco agents are spinning up on each node in your cluster. After a few
seconds, they are going to start monitoring your containers looking for
security issues.


No further action should be required.

Check the Pods and Services.

$ k -n falco get po
NAME                                      READY   STATUS    RESTARTS   AGE
falco-falcosidekick-5cbc97b7d9-887hg      1/1     Running   0          5m27s
falco-falcosidekick-5cbc97b7d9-rnbwn      1/1     Running   0          5m27s
falco-falcosidekick-ui-79c4d8b546-c9jcp   1/1     Running   0          5m27s
falco-nrffv                               1/1     Running   0          5m27s
falco-v8hh2                               1/1     Running   0          5m27s
$ k -n falco get svc
NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
falco                    ClusterIP   172.20.175.25    <none>        8765/TCP   5m31s
falco-falcosidekick      ClusterIP   172.20.61.171    <none>        2801/TCP   5m31s
falco-falcosidekick-ui   ClusterIP   172.20.179.147   <none>        2802/TCP   5m31s

Access falcosidekick-ui via port forwarding.

k -n falco port-forward svc/falco-falcosidekick-ui 2802

EKS setup required before deploying ekscloudwatch

Control plane logging has already been enabled.

The instructions say to create a VPC endpoint and allow access to it from the cluster security group, but this should not be necessary when CloudWatch is reached over the internet.
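If you did want to follow that instruction and keep the traffic inside the VPC, a CloudWatch Logs interface endpoint could be created roughly like this (a sketch only; the VPC, subnet, and security group IDs are placeholders):

aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxx \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.ap-northeast-1.logs \
  --subnet-ids subnet-xxxxxxxx subnet-yyyyyyyy \
  --security-group-ids sg-xxxxxxxx \
  --private-dns-enabled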

CloudWatchReadOnlyAccess is required, so it will be granted later via IRSA.

Deploying ekscloudwatch

Clone the repository.

git clone https://github.com/sysdiglabs/ekscloudwatch.git

Based on ekscloudwatch-config.yaml in the cloned repository, customize the Namespace and the destination service for Falco. Note that the path is k8s-audit, not k8s_audit. The cluster name and region are documented as optional, but they could not be auto-detected here, so they are set explicitly. The polling interval is also shortened since this is only for testing.

cat << "EOF" > ekscloudwatch-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ekscloudwatch-config
  namespace: falco
data:
  # Required: Endpoint to forward audit events to, such as Sysdig Secure agent
  # The agent must expose a k8s audit server (k8s_audit_server_port must be configured in the agent as well)
  endpoint: "http://falco:8765/k8s-audit"

  # Required: Cloudwatch polling interval
  cw_polling: "1m"

  # Required: CloudWatch query filter
  cw_filter: '{ $.sourceIPs[0] != "::1" && $.sourceIPs[0] != "127.0.0.1" }'

  # Optional: both the EKS cluster name and region must be set
  # This can be omitted if the EC2 instance can perform the ec2metadata and ec2:DescribeInstances action
  cluster_name: "ekscloudwatch"
  aws_region: "ap-northeast-1"
EOF

Create the ConfigMap.

k apply -f ekscloudwatch-config.yaml

Create a ServiceAccount.

k -n falco create sa eks-cloudwatch

Grant the required permissions via IRSA.

eksctl create iamserviceaccount \
    --name eks-cloudwatch \
    --namespace falco \
    --cluster ekscloudwatch \
    --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchReadOnlyAccess \
    --override-existing-serviceaccounts \
    --approve
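To confirm that eksctl attached the role to the ServiceAccount, check the annotation (the same kind of check is done for aws-node later in this memo):

k -n falco get sa eks-cloudwatch -o yaml | grep role-arn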

Fix the Namespace in deployment.yaml as well, and specify the serviceAccountName.

cat << EOF > ekscloudwatch-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eks-cloudwatch
  namespace: falco
spec:
  minReadySeconds: 5
  replicas: 1
  selector:
    matchLabels:
      app: eks-cloudwatch
  template:
    metadata:
      labels:
        app: eks-cloudwatch
    spec:
      serviceAccountName: eks-cloudwatch
      containers:
        - image: sysdiglabs/k8sauditlogforwarder:ekscloudwatch-0.3
          imagePullPolicy: Always
          name: eks-cloudwatch-container
          env:
            - name: ENDPOINT
              valueFrom:
                configMapKeyRef:
                  name: ekscloudwatch-config
                  key: endpoint
            - name: CLUSTER_NAME
              valueFrom:
                configMapKeyRef:
                  name: ekscloudwatch-config
                  key: cluster_name
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: ekscloudwatch-config
                  key: aws_region
            - name: CW_POLLING
              valueFrom:
                configMapKeyRef:
                  name: ekscloudwatch-config
                  key: cw_polling
            - name: CW_FILTER
              valueFrom:
                configMapKeyRef:
                  name: ekscloudwatch-config
                  key: cw_filter
EOF

Deploy it.

k apply -f ekscloudwatch-deployment.yaml

Confirm that the Pod is running.

$ k -n falco get po
NAME                                      READY   STATUS    RESTARTS   AGE
eks-cloudwatch-66f84688d9-9rtjj           1/1     Running   0          13s
falco-falcosidekick-5cbc97b7d9-887hg      1/1     Running   0          16m
falco-falcosidekick-5cbc97b7d9-rnbwn      1/1     Running   0          16m
falco-falcosidekick-ui-79c4d8b546-c9jcp   1/1     Running   0          16m
falco-nrffv                               1/1     Running   0          16m
falco-v8hh2                               1/1     Running   0          16m
$ k -n falco logs eks-cloudwatch-66f84688d9-9rtjj
2021/07/13 18:39:35 Release 0.3
2021/07/13 18:39:35 Cloudwatch EKS log started
2021/07/13 18:39:38 386 logs sent to the agent (386 total)
2021/07/13 18:39:38 386 total logs

Run a test based on the following example from the Falco documentation.

For the test, create a ConfigMap containing AWS access keys.

cat << "EOF" > my-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
  namespace: default
data:
  ui.properties: |
    color.good=purple
    color.bad=yellow
    allow.textmode=true
  access.properties: |
    aws_access_key_id = MY-ID
    aws_secret_access_key = MY-KEY
EOF
k apply -f my-config.yaml

Watch the logs for a while.
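For example, tail one of the Falco Pods listed earlier (the Pod name comes from the output above and will differ in your cluster):

k -n falco logs -f falco-nrffv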

The following alert appeared. Because falcosidekick is enabled, the output is in JSON format.

{"output":"18:41:09.285203968: Notice K8s ConfigMap Created (user=kubernetes-admin configmap=my-config ns=default resp=201 decision=allow reason=)","priority":"Notice","rule":"K8s ConfigMap Created","time":"2021-07-13T18:41:09.285203968Z", "output_fields": {"jevt.time":"18:41:09.285203968","ka.auth.decision":"allow","ka.auth.reason":"","ka.response.code":"201","ka.target.name":"my-config","ka.target.namespace":"default","ka.user.name":"kubernetes-admin"}}


The Falco documentation shows output like the following, so the result here is slightly different.

17:18:28.428398080: Warning K8s ConfigMap with private credential (user=minikube-user verb=create configmap=my-config config={"access.properties":"aws_access_key_id = MY-ID\naws_secret_access_key = MY-KEY\n","ui.properties":"color.good=purple\ncolor.bad=yellow\nallow.textmode=true\n"})

Query with Logs Insights.

fields @timestamp, @message
| sort @timestamp desc
| filter @message like /my-config/
| limit 20

An event like the following has been recorded.

{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "Metadata",
    "auditID": "3f90731e-1bd7-449d-bd0a-158d5e702ec4",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/namespaces/default/configmaps?fieldManager=kubectl-client-side-apply",
    "verb": "create",
    "user": {
        "username": "kubernetes-admin",
        "uid": "heptio-authenticator-aws:190189382900:AIDASYSBLVT2NYMBDBBGS",
        "groups": [
            "system:masters",
            "system:authenticated"
        ],
        "extra": {
            "accessKeyId": [
                "AKIASYSBLVT2CGD36GGA"
            ]
        }
    },
    "sourceIPs": [
        "27.0.3.145"
    ],
    "userAgent": "kubectl/v1.21.2 (darwin/amd64) kubernetes/092fbfb",
    "objectRef": {
        "resource": "configmaps",
        "namespace": "default",
        "name": "my-config",
        "apiVersion": "v1"
    },
    "responseStatus": {
        "metadata": {},
        "code": 201
    },
    "requestReceivedTimestamp": "2021-07-13T18:41:09.271404Z",
    "stageTimestamp": "2021-07-13T18:41:09.285204Z",
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": ""
    }
}

The Falco documentation assumes audit events at the RequestResponse level, whereas EKS only records them at the Metadata level, which presumably explains the difference.

In EKS, the audit log policy cannot be customized.

According to the EKS Best Practices Guide, edits to aws-auth are captured at the RequestResponse level.

The reference to the base policy has drifted, so it is hard to tell, but these days it is probably this one.
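For reference, in upstream Kubernetes a rule that records ConfigMap writes at the RequestResponse level would look roughly like the following audit policy (a sketch only; EKS does not allow a custom audit policy to be applied):

cat << EOF > audit-policy-example.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # record full request and response bodies for Secrets and ConfigMaps
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # record everything else at the Metadata level
  - level: Metadata
EOF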

Logs like the following also appeared, but it is not clear what they are about.

{"output":"18:39:28.992314112: Notice K8s Serviceaccount Created (user=system:node:ip-10-0-104-136.ap-northeast-1.compute.internal user=eks-cloudwatch ns=falco resp=201 decision=allow reason=)","priority":"Notice","rule":"K8s Serviceaccount Created","time":"2021-07-13T18:39:28.992314112Z", "output_fields": {"jevt.time":"18:39:28.992314112","ka.auth.decision":"allow","ka.auth.reason":"","ka.response.code":"201","ka.target.name":"eks-cloudwatch","ka.target.namespace":"falco","ka.user.name":"system:node:ip-10-0-104-136.ap-northeast-1.compute.internal"}}
{"output":"18:41:01.996801024: Notice K8s Serviceaccount Created (user=system:kube-controller-manager user=generic-garbage-collector ns=kube-system resp=201 decision=allow reason=RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\")","priority":"Notice","rule":"K8s Serviceaccount Created","time":"2021-07-13T18:41:01.996801024Z", "output_fields": {"jevt.time":"18:41:01.996801024","ka.auth.decision":"allow","ka.auth.reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\"","ka.response.code":"201","ka.target.name":"generic-garbage-collector","ka.target.namespace":"kube-system","ka.user.name":"system:kube-controller-manager"}}

Trying Falco on EKS 2

I have tried Falco before, and also tried Falco on EKS, but this is a memo on trying Falco on EKS once more.

Component Version
EKS 1.19
Platform version eks.5
Falco 0.29.1
Falco chart 1.15.3

Preparing the cluster

Create a cluster with Kubernetes 1.19.

cat << EOF > cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: falco
  region: ap-northeast-1
  version: "1.19"
vpc:
  cidr: "10.2.0.0/16"

availabilityZones:
  - ap-northeast-1a
  - ap-northeast-1c

managedNodeGroups:
  - name: managed-ng-1
    minSize: 2
    maxSize: 2
    desiredCapacity: 2
    privateNetworking: true

cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

iam:
  withOIDC: true
EOF
eksctl create cluster -f cluster.yaml

Deploying a sample app

Create a sample Nginx Deployment.

cat << EOF > deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
                - arm64
      containers:
      - name: nginx
        image: nginx:1.19.2
        ports:
        - containerPort: 80
EOF
$ kubectl apply -f deployment.yaml
deployment.apps/nginx created

Check the Deployment.

$ kubectl get deployments --all-namespaces
NAMESPACE     NAME      READY   UP-TO-DATE   AVAILABLE   AGE
default       nginx     3/3     3            3           13s
kube-system   coredns   2/2     2            2           4h17m

Deploying Fluent Bit

Instead of the repository below, the Container Insights procedure is used this time.

Create the amazon-cloudwatch Namespace.

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cloudwatch-namespace.yaml

Create the ConfigMap.

ClusterName=falco
RegionName=ap-northeast-1
FluentBitHttpPort='2020'
FluentBitReadFromHead='Off'
[[ ${FluentBitReadFromHead} = 'On' ]] && FluentBitReadFromTail='Off'|| FluentBitReadFromTail='On'
[[ -z ${FluentBitHttpPort} ]] && FluentBitHttpServer='Off' || FluentBitHttpServer='On'
kubectl create configmap fluent-bit-cluster-info \
--from-literal=cluster.name=${ClusterName} \
--from-literal=http.server=${FluentBitHttpServer} \
--from-literal=http.port=${FluentBitHttpPort} \
--from-literal=read.head=${FluentBitReadFromHead} \
--from-literal=read.tail=${FluentBitReadFromTail} \
--from-literal=logs.region=${RegionName} -n amazon-cloudwatch
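The stored values can be double-checked afterwards:

kubectl -n amazon-cloudwatch get cm fluent-bit-cluster-info -o yaml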

Deploy Fluent Bit.

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/fluent-bit/fluent-bit.yaml

Attach the CloudWatchAgentServerPolicy policy via IRSA.

eksctl create iamserviceaccount \
    --name fluent-bit \
    --namespace amazon-cloudwatch \
    --cluster falco \
    --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
    --override-existing-serviceaccounts \
    --approve

Check the Fluent Bit configuration.

$ k -n amazon-cloudwatch get cm fluent-bit-config -o yaml | k neat
apiVersion: v1
data:
  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/*.log
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 application.*
        Path                /var/log/containers/fluent-bit*
        Parser              docker
        DB                  /var/fluent-bit/state/flb_log.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 application.*
        Path                /var/log/containers/cloudwatch-agent*
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  cwagent_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_cwagent.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                kubernetes
        Match               application.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     application.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off

    [OUTPUT]
        Name                cloudwatch_logs
        Match               application.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
        log_stream_prefix   ${HOST_NAME}-
        auto_create_group   true
        extra_user_agent    container-insights
  dataplane-log.conf: |
    [INPUT]
        Name                systemd
        Tag                 dataplane.systemd.*
        Systemd_Filter      _SYSTEMD_UNIT=docker.service
        Systemd_Filter      _SYSTEMD_UNIT=kubelet.service
        DB                  /var/fluent-bit/state/systemd.db
        Path                /var/log/journal
        Read_From_Tail      ${READ_FROM_TAIL}

    [INPUT]
        Name                tail
        Tag                 dataplane.tail.*
        Path                /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_dataplane_tail.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                modify
        Match               dataplane.systemd.*
        Rename              _HOSTNAME                   hostname
        Rename              _SYSTEMD_UNIT               systemd_unit
        Rename              MESSAGE                     message
        Remove_regex        ^((?!hostname|systemd_unit|message).)*$

    [FILTER]
        Name                aws
        Match               dataplane.*
        imds_version        v1

    [OUTPUT]
        Name                cloudwatch_logs
        Match               dataplane.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/dataplane
        log_stream_prefix   ${HOST_NAME}-
        auto_create_group   true
        extra_user_agent    container-insights
  fluent-bit.conf: "[SERVICE]\n    Flush                     5\n    Log_Level                 info\n
    \   Daemon                    off\n    Parsers_File              parsers.conf\n
    \   HTTP_Server               ${HTTP_SERVER}\n    HTTP_Listen               0.0.0.0\n
    \   HTTP_Port                 ${HTTP_PORT}\n    storage.path              /var/fluent-bit/state/flb-storage/\n
    \   storage.sync              normal\n    storage.checksum          off\n    storage.backlog.mem_limit
    5M\n    \n@INCLUDE application-log.conf\n@INCLUDE dataplane-log.conf\n@INCLUDE
    host-log.conf\n"
  host-log.conf: |
    [INPUT]
        Name                tail
        Tag                 host.dmesg
        Path                /var/log/dmesg
        Parser              syslog
        DB                  /var/fluent-bit/state/flb_dmesg.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 host.messages
        Path                /var/log/messages
        Parser              syslog
        DB                  /var/fluent-bit/state/flb_messages.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 host.secure
        Path                /var/log/secure
        Parser              syslog
        DB                  /var/fluent-bit/state/flb_secure.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                aws
        Match               host.*
        imds_version        v1

    [OUTPUT]
        Name                cloudwatch_logs
        Match               host.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/host
        log_stream_prefix   ${HOST_NAME}.
        auto_create_group   true
        extra_user_agent    container-insights
  parsers.conf: |
    [PARSER]
        Name                docker
        Format              json
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                syslog
        Format              regex
        Regex               ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key            time
        Time_Format         %b %d %H:%M:%S

    [PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                cwagent_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
kind: ConfigMap
metadata:
  labels:
    k8s-app: fluent-bit
  name: fluent-bit-config
  namespace: amazon-cloudwatch

Attach this policy to the EKS nodes. Nowadays IRSA would be the way to do this, but it is left as is.

POLICY_ARN=$(aws iam list-policies | jq -r '.[][] | select(.PolicyName == "EKS-CloudWatchLogs") | .Arn')
ROLE_NAME=$(aws iam list-roles | jq -r '.[][] | select( .RoleName | contains("falco") and contains("NodeInstanceRole") ) | .RoleName')
aws iam attach-role-policy --role-name ${ROLE_NAME} --policy-arn ${POLICY_ARN}

The Fluent Bit configuration file looks like the following, so just change the region to ap-northeast-1.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  labels:
    app.kubernetes.io/name: fluentbit
data:
  fluent-bit.conf: |
    [SERVICE]
        Parsers_File  parsers.conf
    [INPUT]
        Name              tail
        Tag               falco.*
        Path              /var/log/containers/falco*.log
        Parser            falco
        DB                /var/log/flb_falco.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
    [OUTPUT]
        Name cloudwatch
        Match falco.**
        region ap-northeast-1
        log_group_name falco
        log_stream_name alerts
        auto_create_group true
  parsers.conf: |
    [PARSER]
        Name        falco
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   Off
        # Command      |  Decoder | Field | Optional Action
        # =============|==================|=================
        Decode_Field_As   json    log

Confirm in the management console that the logs are being collected.

Deploying Falco

Helm

Add the Falco Helm chart repository.

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

Check the repositories.

$ helm repo list
NAME                    URL                                                            
(snip)
falcosecurity           https://falcosecurity.github.io/charts   

Clone the repository.

git clone https://github.com/falcosecurity/charts

The rule files are stored in the rules folder of the falco chart directory.

Check the charts.

$ helm search repo falco
NAME                            CHART VERSION   APP VERSION DESCRIPTION
falcosecurity/falco             1.15.3          0.29.1      Falco
falcosecurity/falco-exporter    0.5.1           0.5.0       Prometheus Metrics Exporter for Falco output ev...
falcosecurity/falcosidekick     0.3.9           2.23.1      A simple daemon to help you with falco's outputs
stable/falco                    1.1.8           0.0.1       DEPRECATED - incubator/falco

Check the default values.

helm inspect values falcosecurity/falco

Install Falco using the init container approach.

cat << EOF > values.yaml
image:
  repository: falcosecurity/falco-no-driver

extraInitContainers:
  - name: driver-loader
    image: docker.io/falcosecurity/falco-driver-loader:0.29.1
    imagePullPolicy: Always
    securityContext:
      privileged: true
    volumeMounts:
      - mountPath: /host/proc
        name: proc-fs
        readOnly: true
      - mountPath: /host/boot
        name: boot-fs
        readOnly: true
      - mountPath: /host/lib/modules
        name: lib-modules
      - mountPath: /host/usr
        name: usr-fs
        readOnly: true
      - mountPath: /host/etc
        name: etc-fs
        readOnly: true
EOF
$ helm upgrade --install falco falcosecurity/falco -n falco --create-namespace -f values.yaml
Release "falco" does not exist. Installing it now.
NAME: falco
LAST DEPLOYED: Mon Jul 12 17:37:37 2021
NAMESPACE: falco
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Falco agents are spinning up on each node in your cluster. After a few
seconds, they are going to start monitoring your containers looking for
security issues.


No further action should be required.


Tip:
You can easily forward Falco events to Slack, Kafka, AWS Lambda and more with falcosidekick.
Full list of outputs: https://github.com/falcosecurity/charts/falcosidekick.
You can enable its deployment with `--set falcosidekick.enabled=true` or in your values.yaml.
See: https://github.com/falcosecurity/charts/blob/master/falcosidekick/values.yaml for configuration values.

Check the Pods.

$ k -n falco get po
NAME          READY   STATUS    RESTARTS   AGE
falco-jcqg4   1/1     Running   0          73s
falco-q5dfx   1/1     Running   0          73s

Looking at the logs, detection seems to be working.

$ k -n falco logs -f falco-jcqg4
Mon Jul 12 09:20:36 2021: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
Mon Jul 12 09:20:36 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Mon Jul 12 09:20:36 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Mon Jul 12 09:20:36 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Mon Jul 12 09:20:36 2021: Starting internal webserver, listening on port 8765
09:20:36.994128000: Notice Privileged container started (user=<NA> user_loginuid=0 command=container:98f37173b8a2 k8s.ns=falco k8s.pod=falco-jcqg4 container=98f37173b8a2 image=falcosecurity/falco-no-driver:0.29.1) k8s.ns=falco k8s.pod=falco-jcqg4 container=98f37173b8a2
09:20:43.432799926: Notice Unexpected connection to K8s API Server from container (command=flb-pipeline -e /fluent-bit/firehose.so -e /fluent-bit/cloudwatch.so -e /fluent-bit/kinesis.so -c /fluent-bit/etc/fluent-bit.conf k8s.ns=amazon-cloudwatch k8s.pod=fluent-bit-2hx86 container=f836f31bf693 image=amazon/aws-for-fluent-bit:2.10.0 connection=10.2.122.129:34174->172.20.0.1:443) k8s.ns=amazon-cloudwatch k8s.pod=fluent-bit-2hx86 container=f836f31bf693
09:20:43.542446397: Notice Unexpected connection to K8s API Server from container (command=flb-pipeline -e /fluent-bit/firehose.so -e /fluent-bit/cloudwatch.so -e /fluent-bit/kinesis.so -c /fluent-bit/etc/fluent-bit.conf k8s.ns=amazon-cloudwatch k8s.pod=fluent-bit-2hx86 container=f836f31bf693 image=amazon/aws-for-fluent-bit:2.10.0 connection=10.2.122.129:34176->172.20.0.1:443) k8s.ns=amazon-cloudwatch k8s.pod=fluent-bit-2hx86 container=f836f31bf693

Run kubectl exec from another terminal and confirm that it is detected.

$ k -n default exec -it nginx-c75788bfd-wwczv -- bash
root@nginx-c75788bfd-wwczv:/# exit
exit

It was detected.

09:23:51.478516126: Notice A shell was spawned in a container with an attached terminal (user=<NA> user_loginuid=-1 k8s.ns=default k8s.pod=nginx-c75788bfd-wwczv container=0da1313c2a74 shell=bash parent=runc cmdline=bash terminal=34816 container_id=0da1313c2a74 image=nginx) k8s.ns=default k8s.pod=nginx-c75788bfd-wwczv container=0da1313c2a74
09:24:09.330380954: Warning Shell history had been deleted or renamed (user=<NA> user_loginuid=-1 type=openat command=bash fd.name=/root/.bash_history name=/root/.bash_history path=<NA> oldpath=<NA> k8s.ns=default k8s.pod=nginx-c75788bfd-wwczv container=0da1313c2a74) k8s.ns=default k8s.pod=nginx-c75788bfd-wwczv container=0da1313c2a74

Log in to the instance and check the kernel module.

[root@ip-10-2-115-196 ~]# lsmod | grep falco
falco                 647168  2

Note that deleting the chart does not unload the kernel module; it stays loaded until the node is rebooted.
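Instead of rebooting, the module can also be unloaded by hand (assuming nothing is still using it):

rmmod falco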

Delete the chart.

helm delete falco -n falco
k delete ns falco

Systemd

Install it.

rpm --import https://falco.org/repo/falcosecurity-3672BA8F.asc
curl -s -o /etc/yum.repos.d/falcosecurity.repo https://falco.org/repo/falcosecurity-rpm.repo
yum -y install kernel-devel-$(uname -r)
yum -y install falco

Run Falco.

systemctl enable falco
systemctl start falco

Check it.

[root@ip-10-2-115-196 ~]# systemctl status falco
● falco.service - Falco: Container Native Runtime Security
   Loaded: loaded (/usr/lib/systemd/system/falco.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-07-12 09:31:48 UTC; 39s ago
     Docs: https://falco.org/docs/
  Process: 5448 ExecStartPre=/sbin/modprobe falco (code=exited, status=0/SUCCESS)
 Main PID: 5473 (falco)
    Tasks: 11
   Memory: 27.6M
   CGroup: /system.slice/falco.service
           └─5473 /usr/bin/falco --pidfile=/var/run/falco.pid

Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Falco initialized with configuration file /etc/falco/falco.yaml
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Loading rules from file /etc/falco/falco_rules.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Loading rules from file /etc/falco/falco_rules.local.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Jul 12 09:31:49 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Starting internal webserver, listening on port 8765
Jul 12 09:31:49 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:49 2021: Starting internal webserver, listening on port 8765
[root@ip-10-2-115-196 ~]#

Check the logs.

[root@ip-10-2-115-196 ~]# journalctl -fu falco
-- Logs begin at Sun 2021-07-11 22:45:25 UTC. --
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Falco initialized with configuration file /etc/falco/falco.yaml
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Loading rules from file /etc/falco/falco_rules.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Loading rules from file /etc/falco/falco_rules.local.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Jul 12 09:31:48 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:48 2021: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Jul 12 09:31:49 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Starting internal webserver, listening on port 8765
Jul 12 09:31:49 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: Mon Jul 12 09:31:49 2021: Starting internal webserver, listening on port 8765

Check the unit definition file.

[root@ip-10-2-115-196 ~]# cat /usr/lib/systemd/system/falco.service
[Unit]
Description=Falco: Container Native Runtime Security
Documentation=https://falco.org/docs/

[Service]
Type=simple
User=root
ExecStartPre=/sbin/modprobe falco
ExecStart=/usr/bin/falco --pidfile=/var/run/falco.pid
ExecStopPost=/sbin/rmmod falco
UMask=0077
TimeoutSec=30
RestartSec=15s
Restart=on-failure
PrivateTmp=true
NoNewPrivileges=yes
ProtectHome=read-only
ProtectSystem=full
ProtectKernelTunables=true
RestrictRealtime=true
RestrictAddressFamilies=~AF_PACKET

[Install]
WantedBy=multi-user.target

As before, running kubectl exec into a Pod on this node was detected.

Jul 12 09:35:00 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: 09:35:00.109286185: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 k8s_nginx_nginx-c75788bfd-wwczv_default_cb43721e-2e33-4d7d-94a3-ada53f591889_1 (id=0da1313c2a74) shell=bash parent=runc cmdline=bash terminal=34816 container_id=0da1313c2a74 image=nginx)
Jul 12 09:35:00 ip-10-2-115-196.ap-northeast-1.compute.internal falco[5473]: 09:35:00.109286185: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 k8s_nginx_nginx-c75788bfd-wwczv_default_cb43721e-2e33-4d7d-94a3-ada53f591889_1 (id=0da1313c2a74) shell=bash parent=runc cmdline=bash terminal=34816 container_id=0da1313c2a74 image=nginx)

There are two lines presumably because the following settings overlap.

# Send information logs to stderr and/or syslog Note these are *not* security
# notification logs! These are just Falco lifecycle (and possibly error) logs.
log_stderr: true
log_syslog: true

syslog_output:
  enabled: true

stdout_output:
  enabled: true

Two lines also appeared in /var/log/messages.

Jul 12 09:48:26 ip-10-2-115-196 falco: 09:48:26.733184775: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 k8s_nginx_nginx-c75788bfd-wwczv_default_cb43721e-2e33-4d7d-94a3-ada53f591889_1 (id=0da1313c2a74) shell=bash parent=runc cmdline=bash terminal=34816 container_id=0da1313c2a74 image=nginx)
Jul 12 09:48:26 ip-10-2-115-196 falco: 09:48:26.733184775: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 k8s_nginx_nginx-c75788bfd-wwczv_default_cb43721e-2e33-4d7d-94a3-ada53f591889_1 (id=0da1313c2a74) shell=bash parent=runc cmdline=bash terminal=34816 container_id=0da1313c2a74 image=nginx)

When running under systemd, it seems better to set

syslog_output:
  enabled: false

so that alerts are not emitted twice.

The alerts were also collected by the Container Insights Fluent Bit.

{
    "host": "ip-10-2-115-196",
    "ident": "falco",
    "message": "09:48:26.733184775: Notice A shell was spawned in a container with an attached terminal (user=root user_loginuid=-1 k8s_nginx_nginx-c75788bfd-wwczv_default_cb43721e-2e33-4d7d-94a3-ada53f591889_1 (id=0da1313c2a74) shell=bash parent=runc cmdline=bash terminal=34816 container_id=0da1313c2a74 image=nginx)",
    "az": "ap-northeast-1c",
    "ec2_instance_id": "i-02304c16a8ae787c7"
}

Add a managed node group with eksctl. Writing it as follows installs Falco through user data.

cat << "EOF" > managed-ng-2.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: falco
  region: ap-northeast-1

managedNodeGroups:
  - name: managed-ng-2
    minSize: 1
    maxSize: 1
    desiredCapacity: 1
    privateNetworking: true
    preBootstrapCommands:
      - |
        #!/bin/bash

        set -o errexit
        set -o pipefail
        set -o nounset

        rpm --import https://falco.org/repo/falcosecurity-3672BA8F.asc
        curl -s -o /etc/yum.repos.d/falcosecurity.repo https://falco.org/repo/falcosecurity-rpm.repo
        yum -y install kernel-devel-$(uname -r)
        yum -y install falco
        sed -i -e '/syslog_output/ { N; s/enabled: true/enabled: false/ }' /etc/falco/falco.yaml
        systemctl enable falco
        systemctl start falco
EOF
eksctl create nodegroup -f managed-ng-2.yaml

Replacing the line after a match with sed

A memo on using sed to replace the line immediately after a matching line.

syslog_output:
  enabled: true

stdout_output:
  enabled: true

Given YAML like this, how do you change enabled to false on the line below syslog_output:?

Do it like this.

cat << EOF > test.yaml
syslog_output:
  enabled: true

stdout_output:
  enabled: true
EOF

(gsed is used here because this is on a Mac)

$ gsed -e '/syslog_output/ { N; s/enabled: true/enabled: false/ }' test.yaml
syslog_output:
  enabled: false

stdout_output:
  enabled: true

Logs Insights notes

A memo on how to use Logs Insights.

CloudTrail logs

Check which APIs a specific IAM role called, using the CloudTrail logs.

fields @timestamp, eventSource, eventName, @message
| filter @message like /eksctl-audit-bridge-addon-iamserviceaccount-Role1-1522OIC96C2M1/
| sort @timestamp desc
| limit 20

Checking the flags of EKS control plane components

Check the kube-controller-manager flags.

fields @timestamp, @message, @logStream, @log
| filter @logStream like "kube-controller-manager"
| filter @message like "FLAG"
| sort @timestamp desc

Pod logs

Given logs collected by fluent-bit in a format like the following,

{
    "log": "[2021/07/12 07:14:27] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Sent 2 events to CloudWatch\n",
    "stream": "stderr",
    "kubernetes": {
        "pod_name": "fluent-bit-2hx86",
        "namespace_name": "amazon-cloudwatch",
        "pod_id": "41e471dc-ec14-4bbe-bd83-8caecbb4de99",
        "host": "ip-10-2-115-196.ap-northeast-1.compute.internal",
        "container_name": "fluent-bit",
        "docker_id": "c23dcbc5d1fa4fdbca3b0ab5be549495553c2830b7518dec64c31a9249dbff80",
        "container_hash": "amazon/aws-for-fluent-bit@sha256:1d1519cec7815c9cd665c989d745697f0feb07f5c1c73c192548a6cf53250466",
        "container_image": "amazon/aws-for-fluent-bit:2.10.0"
    }
}

filter by Pod name and look at just the log body.

fields @timestamp, log
| filter kubernetes.pod_name like "fluent-bit"
| sort @timestamp desc
| limit 20

Usage from the CLI

Try running a query from the CLI.

start_time=$(gdate --date "1 hour ago" +%s)
end_time=$(gdate --date now +%s)
aws logs start-query \
  --log-group-name '/aws/containerinsights/falco/application' \
  --start-time ${start_time} \
  --end-time ${end_time} \
  --query-string 'fields @timestamp, log
| filter kubernetes.pod_name like "fluent-bit"
| sort @timestamp desc
| limit 20'
{
    "queryId": "2d8a1658-6b62-4db8-b687-99c8790d0161"
}
$ aws logs get-query-results --query-id "2d8a1658-6b62-4db8-b687-99c8790d0161"
{
    "results": [
        [
            {
                "field": "@timestamp",
                "value": "2021-07-12 07:40:12.600"
            },
            {
                "field": "log",
                "value": "[2021/07/12 07:40:12] [ info] [output:cloudwatch_logs:cloudwatch_logs.2] Sent 2 events to CloudWatch\n"
            },
            {
                "field": "@ptr",
                "value": "CnIKOQo1MTkwMTg5MzgyOTAwOi9hd3MvY29udGFpbmVyaW5zaWdodHMvZmFsY28vYXBwbGljYXRpb24QABI1GhgCBgJ8V/EAAAAEj2q/kgAGDr8WoAAAABIgASiirpzNqS8wuIujzakvOBJAkFhIlBlQ8hEQERgB"
            }
        ],
        [

(snip)

        ]
    ],
    "statistics": {
        "recordsMatched": 1970.0,
        "recordsScanned": 1994.0,
        "bytesScanned": 1253144.0
    },
    "status": "Complete"
}
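Rather than calling get-query-results by hand until the query finishes, the status can be polled in a small loop (a sketch that reuses the queryId from above):

query_id="2d8a1658-6b62-4db8-b687-99c8790d0161"
# poll until the query status becomes Complete, then print the results
while [[ "$(aws logs get-query-results --query-id "${query_id}" --query status --output text)" != "Complete" ]]; do
  sleep 2
done
aws logs get-query-results --query-id "${query_id}"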

Trying Cilium on EKS 2

When I tried Cilium on EKS before, it did not work well as-is, so about half a year later this is a memo on trying it again with the latest version of each component.

Component Version Notes
eksctl 0.54.0
Kubernetes version 1.20
Platform version eks.1
VPC CNI Plugin 1.7.10
Cilium 1.10.1

The latest VPC CNI Plugin is 1.8.0, but the latest 1.7 patch release seems to be the recommended one, so that is what is used here.

Creating the cluster

Create the cluster.

cat <<EOF > cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: cilium
  region: ap-northeast-1
  version: "1.20"
vpc:
  cidr: "10.0.0.0/16"

availabilityZones:
  - ap-northeast-1a
  - ap-northeast-1c

managedNodeGroups:
  - name: managed-ng-1
    minSize: 2
    maxSize: 2
    desiredCapacity: 2
    ssh:
      allow: true
      publicKeyName: default
      # enableSsm: true

cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

iam:
  withOIDC: true
EOF
eksctl create cluster -f cluster.yaml

Updating the VPC CNI Plugin

Check the version. The Cilium documentation also says to use 1.7.9 or later.

$ k get ds -n kube-system -o wide
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS   IMAGES                                                                                SELECTOR
aws-node     2         2         2       2            2           <none>          23m   aws-node     602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni:v1.7.5-eksbuild.1    k8s-app=aws-node
kube-proxy   2         2         2       2            2           <none>          23m   kube-proxy   602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/eks/kube-proxy:v1.20.4-eksbuild.2   k8s-app=kube-proxy

eksctl automatically runs aws-node with IRSA, so check the role ARN it uses.

$ k get sa -n kube-system aws-node -o yaml | grep role-arn
    eks.amazonaws.com/role-arn: arn:aws:iam::XXXXXXXXXXXX:role/eksctl-cilium-addon-iamserviceaccount-kube-s-Role1-PUQJWEEGQJXC

Convert it into an EKS addon and upgrade the version.

eksctl create addon --cluster cilium \
  --name vpc-cni --version 1.7.10 \
  --service-account-role-arn=arn:aws:iam::XXXXXXXXXXXX:role/eksctl-cilium-addon-iamserviceaccount-kube-s-Role1-PUQJWEEGQJXC \
  --force
$ k get ds -n kube-system -o wide
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS   IMAGES                                                                                SELECTOR
aws-node     2         2         2       2            2           <none>          31m   aws-node     602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni:v1.7.10-eksbuild.1   k8s-app=aws-node
kube-proxy   2         2         2       2            2           <none>          31m   kube-proxy   602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/eks/kube-proxy:v1.20.4-eksbuild.2   k8s-app=kube-proxy
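The addon can also be checked from the AWS side (a quick sketch using aws eks describe-addon):

aws eks describe-addon --cluster-name cilium --addon-name vpc-cni \
  --query 'addon.[addonVersion,status]'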

Installing Cilium

Add the Helm repository.

helm repo add cilium https://helm.cilium.io/
helm repo update

Install Cilium with Helm.

helm install cilium cilium/cilium --version 1.10.1 \
  --namespace kube-system \
  --set cni.chainingMode=aws-cni \
  --set enableIPv4Masquerade=false \
  --set tunnel=disabled \
  --set nodeinit.enabled=true \
  --set endpointRoutes.enabled=true

Confirm that Cilium has been installed.

$ kubectl get po -A
NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE
kube-system   aws-node-v8zjq                     1/1     Running   0          4m11s
kube-system   aws-node-zsc4s                     1/1     Running   0          3m37s
kube-system   cilium-57vtb                       1/1     Running   0          39s
kube-system   cilium-dfr7x                       1/1     Running   0          39s
kube-system   cilium-node-init-5cxj2             1/1     Running   0          39s
kube-system   cilium-node-init-rnt69             1/1     Running   0          39s
kube-system   cilium-operator-689d85cb47-bmtjd   1/1     Running   0          39s
kube-system   cilium-operator-689d85cb47-jk4tm   1/1     Running   0          39s
kube-system   coredns-54bc78bc49-bmqkk           1/1     Running   0          13s
kube-system   coredns-54bc78bc49-kphgv           1/1     Running   0          28s
kube-system   kube-proxy-n6sdq                   1/1     Running   0          20m
kube-system   kube-proxy-rcp65                   1/1     Running   0          20m

This time there was no recurrence of the earlier problem where CoreDNS, which is restarted automatically so that Cilium can apply policies to it, failed to come back up; everything looks fine.

The following issue has also been closed.

As described in the procedure, check which Pods need to be restarted.

for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
     ceps=$(kubectl -n "${ns}" get cep \
         -o jsonpath='{.items[*].metadata.name}')
     pods=$(kubectl -n "${ns}" get pod \
         -o custom-columns=NAME:.metadata.name,NETWORK:.spec.hostNetwork \
         | grep -E '\s(<none>|false)' | awk '{print $1}' | tr '\n' ' ')
     ncep=$(echo "${pods} ${ceps}" | tr ' ' '\n' | sort | uniq -u | paste -s -d ' ' -)
     for pod in $(echo $ncep); do
       echo "${ns}/${pod}";
     done
done

None.

Testing

Connectivity tests can now be run from the cilium CLI, so try testing that way.

Install the CLI.

curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-darwin-amd64.tar.gz{,.sha256sum}
shasum -a 256 -c cilium-darwin-amd64.tar.gz.sha256sum
tar xzvfC cilium-darwin-amd64.tar.gz ${HOME}/bin
rm cilium-darwin-amd64.tar.gz{,.sha256sum}
$ cilium version
cilium-cli: v0.8.2 compiled with go1.16.5 on darwin/amd64
$ cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 2, Ready: 2/2, Available: 2/2
Deployment        cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
Containers:       cilium             Running: 2
                  cilium-operator    Running: 2
Image versions    cilium             quay.io/cilium/cilium:v1.10.1@sha256:f5fcdfd4929af5a8903b02da61332eea41dcdb512420b8c807e2e2904270561c: 2
                  cilium-operator    quay.io/cilium/operator-generic:v1.10.1@sha256:a1588ee00a15f2f2b419e4acd36bd57d64a5f10eb52d0fd4de689e558a913cd8: 2

Run the tests.

$ cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
✨ [cilium.ap-northeast-1.eksctl.io] Creating namespace for connectivity check...
✨ [cilium.ap-northeast-1.eksctl.io] Deploying echo-same-node service...
✨ [cilium.ap-northeast-1.eksctl.io] Deploying same-node deployment...
✨ [cilium.ap-northeast-1.eksctl.io] Deploying client deployment...
✨ [cilium.ap-northeast-1.eksctl.io] Deploying client2 deployment...
✨ [cilium.ap-northeast-1.eksctl.io] Deploying echo-other-node service...
✨ [cilium.ap-northeast-1.eksctl.io] Deploying other-node deployment...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for deployments [client client2 echo-same-node] to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for deployments [echo-other-node] to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for CiliumEndpoint for pod cilium-test/client-7b7bf54b85-75nmw to appear...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for CiliumEndpoint for pod cilium-test/client2-666976c95b-n29pg to appear...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for CiliumEndpoint for pod cilium-test/echo-other-node-697d5d69b7-qxfm5 to appear...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-7967996674-qvm6t to appear...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for NodePort 10.0.1.129:30561 (cilium-test/echo-other-node) to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for NodePort 10.0.1.129:31548 (cilium-test/echo-same-node) to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for NodePort 10.0.58.170:30561 (cilium-test/echo-other-node) to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for NodePort 10.0.58.170:31548 (cilium-test/echo-same-node) to become ready...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for Cilium pod kube-system/cilium-57vtb to have all the pod IPs in eBPF ipcache...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for Cilium pod kube-system/cilium-dfr7x to have all the pod IPs in eBPF ipcache...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for pod cilium-test/client-7b7bf54b85-75nmw to reach kube-dns service...
⌛ [cilium.ap-northeast-1.eksctl.io] Waiting for pod cilium-test/client2-666976c95b-n29pg to reach kube-dns service...
🔭 Enabling Hubble telescope...
⚠️  Unable to contact Hubble Relay, disabling Hubble telescope and flow validation: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:4245: connect: connection refused"
ℹ️  Expose Relay locally with: kubectl port-forward -n kube-system deployment/hubble-relay 4245:4245
🏃 Running tests...

[=] Test [no-policies]
.............................
[=] Test [client-ingress]
..
[=] Test [echo-ingress]
....
[=] Test [to-fqdns]
..
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-fqdns-google' to namespace 'cilium-test'..
  [-] Scenario [to-fqdns/pod-to-world]
  [.] Action [to-fqdns/pod-to-world/https-to-google: cilium-test/client2-666976c95b-n29pg (10.0.39.210) -> google-https (google.com:443)]
  [.] Action [to-fqdns/pod-to-world/http-to-google: cilium-test/client-7b7bf54b85-75nmw (10.0.31.143) -> google-http (google.com:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://google.com:80" failed: command terminated with exit code 22
  [.] Action [to-fqdns/pod-to-world/http-to-www-google: cilium-test/client-7b7bf54b85-75nmw (10.0.31.143) -> www-google-http (www.google.com:80)]
  ℹ️  📜 Deleting CiliumNetworkPolicy 'client-egress-to-fqdns-google' from namespace 'cilium-test'..

[=] Test [to-entities-world]
...
[=] Test [allow-all]
.........................
[=] Test [dns-only]
.......
[=] Test [client-egress]
....
[=] Test [to-cidr-1111]
....
📋 Test Report
❌ 1/9 tests failed (1/81 actions), 0 warnings, 0 tests skipped, 0 scenarios skipped:
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-google: cilium-test/client-7b7bf54b85-75nmw (10.0.31.143) -> google-http (google.com:80)

Error: Connectivity test failed: 1 tests failed

The connectivity check to Google is failing.

Without any policy applied, there is no problem.

$ k run pod1 --image=nginx
pod/pod1 created
$ k get po
NAME   READY   STATUS    RESTARTS   AGE
pod1   1/1     Running   0          10s
$ k exec -it pod1 -- bash
root@pod1:/# curl http://google.com/
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
root@pod1:/# exit
exit

The caveats section notes that some advanced features are limited, so this may be one of them.

Trying Starboard Operator

A memo on trying the Starboard Operator.

Starboard comes in both a CLI and an Operator form.

Installation

It can be installed with static manifests, Helm, or Operator Lifecycle Manager. Here it is installed with Helm.

helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm repo update
helm upgrade --install starboard-operator aqua/starboard-operator \
  -n starboard-operator --create-namespace \
  --set=targetNamespaces="" \
  --version 0.5.3

Grant the operator read access to ECR.

eksctl create iamserviceaccount \
  --name starboard-operator \
  --namespace starboard-operator \
  --cluster staging \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly \
  --approve \
  --override-existing-serviceaccounts

To avoid GitHub rate limiting, create a personal access token and configure it. No scopes are required.

GITHUB_TOKEN=<your token>

kubectl patch secret starboard -n starboard-operator \
  --type merge \
  -p "$(cat <<EOF
{
  "data": {
    "trivy.githubToken": "$(echo -n $GITHUB_TOKEN | base64)"
  }
}
EOF
)"

Delete the Pod once, just to be safe.

k delete pod -n starboard-operator --all

Check the Pod.

$ k get pod -n starboard-operator
NAME                                  READY   STATUS    RESTARTS   AGE
starboard-operator-7fff5747c4-zwp59   1/1     Running   0          55s

Scanning

Check the vulnerability scan results.

$ kubectl get vulnerabilityreports -o wide -A
NAMESPACE            NAME                                                                      REPOSITORY                                     TAG                  SCANNER   AGE     CRITICAL   HIGH   MEDIUM   LOW   UNKNOWN
argocd               replicaset-argocd-dex-server-5dd657bd9-dex                                dexidp/dex                                     v2.27.0              Trivy     80s     0          9      8        3     0
argocd               replicaset-argocd-dex-server-66ff89cb7b-dex                               dexidp/dex                                     v2.27.0              Trivy     111s    0          9      8        3     0
argocd               replicaset-argocd-dex-server-fd74c7c8c-dex                                dexidp/dex                                     v2.27.0              Trivy     110s    0          9      8        3     0
argocd               replicaset-argocd-redis-66b48966cb-redis                                  library/redis                                  5.0.10-alpine        Trivy     56s     0          5      2        0     0
argocd               replicaset-argocd-redis-759b6bc7f4-redis                                  library/redis                                  6.2.1-alpine         Trivy     112s    0          0      0        0     0
argocd               replicaset-argocd-repo-server-6c495f858f-argocd-repo-server               argoproj/argocd                                v2.0.0               Trivy     28s     0          6      43       110   0
argocd               replicaset-argocd-repo-server-79d884f4f6-argocd-repo-server               argoproj/argocd                                v1.8.2               Trivy     40s     4          138    95       471   5
argocd               replicaset-argocd-repo-server-84d58ff546-argocd-repo-server               argoproj/argocd                                v2.0.1               Trivy     81s     0          3      38       108   0
argocd               replicaset-argocd-server-6dccb89f65-argocd-server                         argoproj/argocd                                v1.8.2               Trivy     35s     4          138    95       471   5
argocd               replicaset-argocd-server-7fd556c67c-argocd-server                         argoproj/argocd                                v2.0.1               Trivy     2m51s   0          3      38       108   0
argocd               replicaset-argocd-server-859b4b5578-argocd-server                         argoproj/argocd                                v2.0.0               Trivy     2m29s   0          6      43       110   0
argocd               statefulset-argocd-application-controller-argocd-application-controller   argoproj/argocd                                v2.0.1               Trivy     2m53s   0          3      38       108   0
backend              replicaset-backend-678944684b-backend                                     backend                                        75994d8              Trivy     75s     2          22     11       74    0
backend              replicaset-backend-7945cd669c-backend                                     backend                                        c65764f              Trivy     3m23s   2          22     11       74    0
backend              replicaset-backend-7d8b8f99cc-backend                                     backend                                        9ac248f              Trivy     80s     2          22     11       74    0
backend              replicaset-backend-b68bc665c-backend                                      backend                                        3d5f54d              Trivy     2m28s   0          7      4        2     0
calico-system        daemonset-calico-node-calico-node                                         calico/node                                    v3.17.1              Trivy     3m23s   0          0      0        0     0
calico-system        replicaset-calico-kube-controllers-5d786d9bbc-calico-kube-controllers     calico/kube-controllers                        v3.17.1              Trivy     110s    0          0      0        0     0
calico-system        replicaset-calico-typha-74fdb8b6f-calico-typha                            calico/typha                                   v3.17.1              Trivy     3m22s   0          0      0        0     0
cert-manager         replicaset-cert-manager-649c5f88bc-cert-manager                           jetstack/cert-manager-controller               v1.0.2               Trivy     2m26s   0          0      0        0     0
cert-manager         replicaset-cert-manager-68ff46b886-cert-manager                           jetstack/cert-manager-controller               v1.1.1               Trivy     82s     0          0      0        0     0
cert-manager         replicaset-cert-manager-cainjector-7cdbb9c945-cert-manager                jetstack/cert-manager-cainjector               v1.1.1               Trivy     2m26s   0          0      0        0     0
cert-manager         replicaset-cert-manager-cainjector-9747d56-cert-manager                   jetstack/cert-manager-cainjector               v1.0.2               Trivy     2m51s   0          0      0        0     0
cert-manager         replicaset-cert-manager-webhook-67584ff488-cert-manager                   jetstack/cert-manager-webhook                  v1.1.1               Trivy     3m22s   0          0      0        0     0
cert-manager         replicaset-cert-manager-webhook-849c7b574f-cert-manager                   jetstack/cert-manager-webhook                  v1.0.2               Trivy     82s     0          0      0        0     0
default              replicaset-nginx-6d4cf56db6-nginx                                         library/nginx                                  1.16                 Trivy     2m48s   13         45     29       92    0
default              replicaset-nginx-db749865c-nginx                                          library/nginx                                  1.17                 Trivy     2m52s   13         43     27       92    0
external-secrets     replicaset-external-secrets-56fbfc9687-kubernetes-external-secrets        external-secrets/kubernetes-external-secrets   7.2.1                Trivy     2m47s   0          0      0        0     0
external-secrets     replicaset-external-secrets-658cc9b744-kubernetes-external-secrets        godaddy/kubernetes-external-secrets            6.0.0                Trivy     3m18s   0          12     9        2     0
external-secrets     replicaset-external-secrets-69444c8577-kubernetes-external-secrets        external-secrets/kubernetes-external-secrets   6.1.0                Trivy     2m47s   0          10     9        2     0
external-secrets     replicaset-external-secrets-7cfc59f6d7-kubernetes-external-secrets        external-secrets/kubernetes-external-secrets   7.2.1                Trivy     2m51s   0          0      0        0     0
frontend             replicaset-frontend-57b979f9bb-frontend                                   frontend                                       bc03a29              Trivy     2m26s   2          22     11       74    0
frontend             replicaset-frontend-66bc7f9b57-frontend                                   frontend                                       0845ad7              Trivy     110s    0          7      4        2     0
frontend             replicaset-frontend-66d48f89df-frontend                                   frontend                                       9f0263c              Trivy     3m21s   2          22     11       74    0
frontend             replicaset-frontend-675b6f8bfb-frontend                                   frontend                                       a12db35              Trivy     2m27s   2          22     11       74    0
frontend             replicaset-frontend-7cc57c4fb4-frontend                                   frontend                                       48aa94e              Trivy     76s     2          22     11       74    0
frontend             replicaset-frontend-844fb64db4-frontend                                   frontend                                       aa38612              Trivy     55s     2          22     11       74    0
frontend             replicaset-frontend-dc89db794-frontend                                    frontend                                       0845ad7              Trivy     112s    0          7      4        2     0
frontend             replicaset-frontend-f487b9f88-frontend                                    frontend                                       8b40ef7              Trivy     78s     2          22     11       74    0
gatekeeper-system    replicaset-gatekeeper-audit-54b5f86d57-manager                            openpolicyagent/gatekeeper                     v3.3.0               Trivy     110s    0          0      0        0     0
gatekeeper-system    replicaset-gatekeeper-controller-manager-5b96bd668-manager                openpolicyagent/gatekeeper                     v3.3.0               Trivy     55s     0          0      0        0     0
kube-system          daemonset-aws-node-aws-node                                               amazon-k8s-cni                                 v1.7.10-eksbuild.1   Trivy     3m20s   0          0      0        0     0
kube-system          daemonset-kube-proxy-kube-proxy                                           eks/kube-proxy                                 v1.19.6-eksbuild.2   Trivy     3m20s   2          22     12       75    0
kube-system          replicaset-aws-load-balancer-controller-85ff4bfbc7-controller             amazon/aws-alb-ingress-controller              v2.1.0               Trivy     52s     0          0      0        0     0
kube-system          replicaset-aws-load-balancer-controller-dd979d56b-controller              amazon/aws-alb-ingress-controller              v2.1.3               Trivy     2m44s   0          0      0        0     0
kube-system          replicaset-coredns-59847d77c8-coredns                                     eks/coredns                                    v1.8.0-eksbuild.1    Trivy     2m30s   0          0      0        0     0
kube-system          replicaset-coredns-86f7d88d77-coredns                                     eks/coredns                                    v1.7.0-eksbuild.1    Trivy     111s    0          0      0        0     0
starboard-operator   replicaset-starboard-operator-7fff5747c4-starboard-operator               aquasec/starboard-operator                     0.10.3               Trivy     110s    0          0      0        0     0
tigera-operator      replicaset-tigera-operator-657cc89589-tigera-operator                     tigera/operator                                v1.13.2              Trivy     83s     0          0      0        0     0

Check the configuration audit scan results.

$ kubectl get configauditreports -o wide -A
NAMESPACE            NAME                                                 SCANNER   AGE   DANGER   WARNING   PASS
argocd               replicaset-argocd-dex-server-5dd657bd9               Polaris   25m   2        12        11
argocd               replicaset-argocd-dex-server-66ff89cb7b              Polaris   25m   2        12        11
argocd               replicaset-argocd-dex-server-fd74c7c8c               Polaris   25m   2        12        11
argocd               replicaset-argocd-redis-66b48966cb                   Polaris   25m   1        8         8
argocd               replicaset-argocd-redis-759b6bc7f4                   Polaris   25m   1        8         8
argocd               replicaset-argocd-repo-server-6c495f858f             Polaris   25m   0        7         10
argocd               replicaset-argocd-repo-server-79d884f4f6             Polaris   25m   1        7         9
argocd               replicaset-argocd-repo-server-84d58ff546             Polaris   25m   0        7         10
argocd               replicaset-argocd-server-6dccb89f65                  Polaris   22m   1        7         9
argocd               replicaset-argocd-server-7fd556c67c                  Polaris   25m   0        7         10
argocd               replicaset-argocd-server-859b4b5578                  Polaris   23m   0        7         10
argocd               statefulset-argocd-application-controller            Polaris   25m   0        7         10
backend              replicaset-backend-678944684b                        Polaris   24m   1        9         7
backend              replicaset-backend-7945cd669c                        Polaris   23m   1        9         7
backend              replicaset-backend-7d8b8f99cc                        Polaris   24m   1        9         7
backend              replicaset-backend-b68bc665c                         Polaris   22m   1        9         7
calico-system        daemonset-calico-node                                Polaris   25m   4        11        10
calico-system        replicaset-calico-kube-controllers-5d786d9bbc        Polaris   23m   1        8         8
calico-system        replicaset-calico-typha-74fdb8b6f                    Polaris   24m   1        9         7
cert-manager         replicaset-cert-manager-649c5f88bc                   Polaris   23m   1        1         7
cert-manager         replicaset-cert-manager-68ff46b886                   Polaris   22m   1        1         7
cert-manager         replicaset-cert-manager-cainjector-7cdbb9c945        Polaris   24m   1        1         7
cert-manager         replicaset-cert-manager-cainjector-9747d56           Polaris   24m   1        1         7
cert-manager         replicaset-cert-manager-webhook-67584ff488           Polaris   23m   1        1         7
cert-manager         replicaset-cert-manager-webhook-849c7b574f           Polaris   22m   1        1         7
default              replicaset-nginx-6d4cf56db6                          Polaris   23m   1        9         7
default              replicaset-nginx-db749865c                           Polaris   22m   1        9         7
external-secrets     replicaset-external-secrets-56fbfc9687               Polaris   25m   1        8         8
external-secrets     replicaset-external-secrets-658cc9b744               Polaris   22m   1        8         8
external-secrets     replicaset-external-secrets-69444c8577               Polaris   23m   1        8         8
external-secrets     replicaset-external-secrets-7cfc59f6d7               Polaris   24m   1        8         8
frontend             replicaset-frontend-57b979f9bb                       Polaris   22m   1        9         7
frontend             replicaset-frontend-66bc7f9b57                       Polaris   24m   1        9         7
frontend             replicaset-frontend-66d48f89df                       Polaris   24m   1        9         7
frontend             replicaset-frontend-675b6f8bfb                       Polaris   24m   1        9         7
frontend             replicaset-frontend-7cc57c4fb4                       Polaris   23m   1        9         7
frontend             replicaset-frontend-844fb64db4                       Polaris   23m   1        9         7
frontend             replicaset-frontend-dc89db794                        Polaris   22m   1        9         7
frontend             replicaset-frontend-f487b9f88                        Polaris   24m   1        9         7
gatekeeper-system    replicaset-gatekeeper-audit-54b5f86d57               Polaris   25m   0        1         16
gatekeeper-system    replicaset-gatekeeper-controller-manager-5b96bd668   Polaris   23m   0        1         16
kube-system          daemonset-aws-node                                   Polaris   25m   4        11        10
kube-system          daemonset-kube-proxy                                 Polaris   24m   1        1         3
kube-system          replicaset-aws-load-balancer-controller-85ff4bfbc7   Polaris   23m   0        2         15
kube-system          replicaset-aws-load-balancer-controller-dd979d56b    Polaris   23m   0        2         15
kube-system          replicaset-coredns-59847d77c8                        Polaris   23m   0        3         14
kube-system          replicaset-coredns-86f7d88d77                        Polaris   24m   0        3         14
starboard-operator   replicaset-starboard-operator-7fff5747c4             Polaris   20m   0        5         12
tigera-operator      replicaset-tigera-operator-657cc89589                Polaris   23m   1        10        6

Check the kube-bench scan results.

$ kubectl get ciskubebenchreports -o wide
NAME                                             SCANNER      AGE   FAIL   WARN   INFO   PASS
ip-10-1-108-42.ap-northeast-1.compute.internal   kube-bench   28m   0      38     0      14
ip-10-1-71-185.ap-northeast-1.compute.internal   kube-bench   28m   0      38     0      14

kube-hunter is not supported by the Operator.

$ kubectl get kubehunterreports -o wide
No resources found

The Operator does not ship with a dashboard, so inspecting the results from here on is the same as with the CLI.

Trying the Starboard CLI

Notes from trying out the Starboard CLI.

Starboard comes in two forms: a CLI and an Operator.

Installation

The CLI can be installed as a binary, as a kubectl plugin, or as a container image; this time I downloaded the binary.
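
As a rough reference, a binary install looks something like the following. This is only a sketch: the release asset name is an assumption, so check the GitHub releases page for the exact file name for your platform.

# Sketch of a binary install; the asset name is an assumption, verify it on
# https://github.com/aquasecurity/starboard/releases before running
curl -LO https://github.com/aquasecurity/starboard/releases/download/v0.10.3/starboard_linux_x86_64.tar.gz
tar xzf starboard_linux_x86_64.tar.gz
sudo mv starboard /usr/local/bin/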

$ starboard version
Starboard Version: {Version:0.10.3 Commit:5bd33431a239b98be4a3287563b8664a9b3d5707 Date:2021-05-14T12:20:34Z}

Running starboard init sets up the cluster: a namespace is created, along with the CRDs for the security reports and ConfigMaps holding the configuration.

$ starboard init
$ k get cm -n starboard
NAME                       DATA   AGE
starboard                  11     8s
starboard-polaris-config   1      7s
$ kubectl api-resources --api-group aquasecurity.github.io
NAME                   SHORTNAMES    APIVERSION                        NAMESPACED   KIND
ciskubebenchreports    kubebench     aquasecurity.github.io/v1alpha1   false        CISKubeBenchReport
configauditreports     configaudit   aquasecurity.github.io/v1alpha1   true         ConfigAuditReport
kubehunterreports      kubehunter    aquasecurity.github.io/v1alpha1   false        KubeHunterReport
vulnerabilityreports   vuln,vulns    aquasecurity.github.io/v1alpha1   true         VulnerabilityReport

Running scans

Four kinds of scans can be run.

Trivy

Create an Nginx Deployment.

$ kubectl -n default create deployment nginx --image nginx:1.16
deployment.apps/nginx created

Run the scanner. The target resource must be specified.

starboard -n default scan vulnerabilityreports deployment/nginx

Behind the scenes, a Pod is run via a Job.
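
To see this for yourself, you can watch Jobs and Pods in another terminal while the scan runs. Listing across all namespaces avoids having to know which namespace the scan Job is created in, which may differ between versions.

# Watch the temporary scan Job and its Pod appear and complete while the scan runs
kubectl get jobs,pods -A --watch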

Confirm that the report has been created. Reports are generated per container.

$ kubectl get vulnerabilityreports -o wide -A
NAMESPACE   NAME                     REPOSITORY      TAG    SCANNER   AGE   CRITICAL   HIGH   MEDIUM   LOW   UNKNOWN
default     deployment-nginx-nginx   library/nginx   1.16   Trivy     52s   13         45     29       92    0

Check the report. The output is long, so the results are omitted.

starboard -n default get vulnerabilities deployment/nginx -o yaml
# or
kubectl -n default get vulnerabilityreports deployment-nginx-nginx -o yaml
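
If only the severity counts are of interest, a jsonpath query keeps the output short. This assumes the counts live under .report.summary, matching the report structure shown elsewhere in this post.

# Print only the summary section of the vulnerability report
kubectl -n default get vulnerabilityreports deployment-nginx-nginx \
  -o jsonpath='{.report.summary}{"\n"}'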

Polaris

Run Polaris. The target resource must be specified.

starboard -n default scan configauditreports deployment/nginx

Confirm that the report has been created.

$ kubectl get configauditreport -o wide -A
NAME               SCANNER   AGE   DANGER   WARNING   PASS
deployment-nginx   Polaris   24s   1        9         7

Check the report. The output is long, so the results are omitted.

starboard -n default get configaudit deployment/nginx -o yaml

HTML output

Output the report as HTML and review it.

starboard -n default get report deployment/nginx > nginx.deploy.html
open nginx.deploy.html

The results of both Trivy and Polaris are included in the HTML output.

f:id:sotoiwa:20210519071514p:plain

f:id:sotoiwa:20210519071530p:plain

A Namespace or a Node can also be specified as the target.
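
For example, an HTML report for an entire Namespace could be generated like this (a hypothetical invocation based on the statement above; the output file name is arbitrary).

starboard get report namespace/default > default.ns.html
open default.ns.html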

kube-bench

Run the scanner. No target needs to be specified; the scan runs against every node.

starboard scan ciskubebenchreports

Confirm that the reports have been created.

$ kubectl get ciskubebenchreports -o wide
NAME                                             SCANNER      AGE    FAIL   WARN   INFO   PASS
ip-10-1-108-42.ap-northeast-1.compute.internal   kube-bench   8m1s   0      38     0      14
ip-10-1-71-185.ap-northeast-1.compute.internal   kube-bench   8m1s   0      38     0      14

Check the report. This report type is not supported by the starboard get command. The output is long, so the results are omitted.

kubectl get ciskubebenchreports ip-10-1-108-42.ap-northeast-1.compute.internal -o yaml

HTML output is available; support for it was merged only recently (#396).

starboard get report node/ip-10-1-108-42.ap-northeast-1.compute.internal > ip-10-1-108-42.ap-northeast-1.compute.internal.html
open ip-10-1-108-42.ap-northeast-1.compute.internal.html

f:id:sotoiwa:20210519071555p:plain

kube-hunter

Run the scanner. The target is the cluster itself, so no target needs to be specified.

starboard scan kubehunterreports

Confirm that the report has been created.

$ kubectl get kubehunterreports -o wide
NAME      SCANNER       AGE   HIGH   MEDIUM   LOW
cluster   kube-hunter   65s   0      1        1

Check the report.

$ kubectl get kubehunterreports cluster -o yaml
apiVersion: aquasecurity.github.io/v1alpha1
kind: KubeHunterReport
metadata:
  creationTimestamp: "2021-05-11T16:05:19Z"
  generation: 1
  labels:
    starboard.resource.kind: Cluster
    starboard.resource.name: cluster
  name: cluster
  resourceVersion: "44234551"
  selfLink: /apis/aquasecurity.github.io/v1alpha1/kubehunterreports/cluster
  uid: 37c647e2-f006-45d0-8f98-3a418d39bfb4
report:
  scanner:
    name: kube-hunter
    vendor: Aqua Security
    version: 0.4.1
  summary:
    highCount: 0
    lowCount: 1
    mediumCount: 1
    unknownCount: 0
  updateTimestamp: "2021-05-11T16:05:19Z"
  vulnerabilities:
  - avd_reference: https://avd.aquasec.com/kube-hunter/none/
    category: Access Risk
    description: |-
      CAP_NET_RAW is enabled by default for pods.
          If an attacker manages to compromise a pod,
          they could potentially take advantage of this capability to perform network
          attacks on other pods running on the same node
    evidence: ""
    severity: low
    vulnerability: CAP_NET_RAW Enabled
  - avd_reference: https://avd.aquasec.com/kube-hunter/khv002/
    category: Information Disclosure
    description: 'The kubernetes version could be obtained from the /version endpoint '
    evidence: v1.19.6-eks-49a6c0
    severity: medium
    vulnerability: K8s Version Disclosure

HTML output is not supported for this report.

Cleanup

Clean everything up.

starboard cleanup