One option would be to simply shell out to the kubectl or gcloud CLIs, which I consider a last resort that should be avoided at all costs for Terraform modules, because of long-term maintainability. Frankly, calling CLI commands like this has no place in declarative infrastructure as far as I'm concerned. So I wanted to see how far I could get without calling kubectl and gcloud like the official Google modules do.

Creating the google_service_account, google_project_iam_member and google_service_account_key resources is easy enough. I'm sure this is overly simplified and I may have to add more roles as my evaluation continues.

resource "google_service_account" "current" {
  project      = local.project_id
  account_id   = local.cluster_name
  display_name = "${local.cluster_name} gke-connect agent"
}

resource "google_project_iam_member" "current" {
  project = local.project_id
  role    = "roles/gkehub.connect"
  member  = "serviceAccount:${google_service_account.current.email}"
}

resource "google_service_account_key" "current" {
  service_account_id = google_service_account.current.name
}

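Should the evaluation indeed surface more required roles, one way to keep this tidy would be to iterate over a list of roles instead of duplicating the binding. A small sketch; the extra role shown is only a placeholder, not one I know to be required:

# Illustrative only: bind any additional roles to the same service account
# by iterating over a list. "roles/gkehub.viewer" is a placeholder role.
resource "google_project_iam_member" "additional" {
  for_each = toset([
    "roles/gkehub.viewer",
  ])

  project = local.project_id
  role    = each.value
  member  = "serviceAccount:${google_service_account.current.email}"
}
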
Registering the cluster with the GKE Hub then only requires a google_gke_hub_membership resource.

resource "google_gke_hub_membership" "current" {
  provider = google-beta

  project       = local.project_id
  membership_id = local.cluster_name
  description   = "${local.cluster_name} hub membership"
}

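One possible refinement: the env var patched into the agent deployment further below references the membership by its fully qualified name, which appears to contain the project number. Instead of hardcoding that number, it could be looked up with the google_project data source. A sketch, assuming the redacted value in the patch really is the project number:

# Sketch: derive the membership reference instead of hardcoding the project
# number. Assumes the agent expects
# //gkehub.googleapis.com/projects/<project number>/locations/global/memberships/<name>.
data "google_project" "current" {
  project_id = local.project_id
}

locals {
  membership_ref = "//gkehub.googleapis.com/projects/${data.google_project.current.number}/locations/global/memberships/${local.cluster_name}"
}

The hardcoded value in the patch below could then be replaced with local.membership_ref.
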
What remains is deploying the GKE Connect agent into the cluster. Normally this is done with the gcloud beta container hub memberships register CLI command. But the command has a --manifest-output-file parameter that allows writing the Kubernetes resources to a file instead of applying them to the cluster directly. So instead of calling the gcloud command from Terraform, I opted to write the manifests to a YAML file and use them as the base that I patch in a kustomization_overlay using my Kustomization provider.

data "kustomization_overlay" "current" {
namespace = "gke-connect"
resources = [
"${path.module}/upstream_manifest/anthos.yaml"
]
secret_generator {
name = "creds-gcp"
type = "replace"
    literals = [
      "creds-gcp.json=${base64decode(google_service_account_key.current.private_key)}"
    ]
  }

  patches {
    # this breaks if the order of env vars in the upstream YAML changes
    patch = <<-EOF
      - op: replace
        path: /spec/template/spec/containers/0/env/6/value
        value: "//gkehub.googleapis.com/projects/xxxxxxxxxxxx/locations/global/memberships/${local.cluster_name}"
    EOF
    target = {
      group     = "apps"
      version   = "v1"
      kind      = "Deployment"
      name      = "gke-connect-agent-20210514-00-00"
      namespace = "gke-connect"
    }
  }
}

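To then apply the generated manifests, the Kustomization provider's usual pattern is to loop over the ids the data source returns. A minimal sketch, assuming the kustomization provider is configured against the target cluster:

# Apply every manifest produced by the overlay. Each id maps to one
# Kubernetes resource rendered by the kustomization_overlay data source.
resource "kustomization_resource" "current" {
  for_each = data.kustomization_overlay.current.ids

  manifest = data.kustomization_overlay.current.manifests[each.value]
}
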
The YAML that gcloud writes to disk can't be committed to version control, because it includes a Kubernetes secret with the plaintext service account key embedded. The key file is unfortunately a required parameter of the hub memberships register command. So I had to delete this secret from the YAML file. And I have to remember doing this whenever I rerun the command to update my base with the latest upstream manifests.

In the kustomization_overlay data source, I then use a secret_generator to create a Kubernetes secret using the private key from the google_service_account_key resource.

It might be nicer to change the agent deployment to use envFrom and set the environment variables dynamically in the overlay using a config_map_generator. But the downside of this is, again, that there's one more modification to the upstream YAML which has to be repeated every time it is updated.

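For illustration, the overlay side of that alternative could look roughly like the snippet below, added inside the data "kustomization_overlay" "current" block. The config map name, container name and environment variable are placeholders I made up, and the envFrom change is sketched as a strategic merge patch rather than an edit to the upstream file:

  # Hypothetical sketch: generate a config map in the overlay ...
  config_map_generator {
    name      = "gke-connect-agent-env"
    namespace = "gke-connect"
    literals = [
      "EXAMPLE_SETTING=${local.cluster_name}",
    ]
  }

  patches {
    # ... and add envFrom to the agent container via a strategic merge patch.
    # The container name below is a placeholder, not from the upstream YAML.
    # Kustomize's name reference handling should point the configMapRef at
    # the generated, hash-suffixed config map.
    patch = <<-EOF
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: gke-connect-agent-20210514-00-00
        namespace: gke-connect
      spec:
        template:
          spec:
            containers:
            - name: gke-connect-agent
              envFrom:
              - configMapRef:
                  name: gke-connect-agent-env
    EOF
  }
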
Speaking of updates: the upstream deployment name embeds a date-based version, gke-connect-agent-20210514-00-00. Call me a pessimist, but I totally expect this to become a problem with updates in the future.

After applying everything, the cluster at first showed an unreachable warning. As it turned out, this was due to the agent pod crash looping with a permission denied error.

kubectl -n gke-connect logs gke-connect-agent-20210514-00-00-66d94cff9d-tzw5t
2021/07/02 11:40:33.277997 connect_agent.go:17: GKE Connect Agent. Log timestamps in UTC.
2021/07/02 11:40:33.298969 connect_agent.go:21: error creating tunnel: unable to retrieve namespace "kube-system" to be used as externalID: namespaces "kube-system" is forbidden: User "system:serviceaccount:gke-connect:connect-agent-sa" cannot get resource "namespaces" in API group "" in the namespace "kube-system"

From skimming the gcloud generated YAML, I remembered there were plenty of RBAC related resources included. Digging in, it turned out the generated YAML has a Role and a RoleBinding. And if you followed carefully, you probably guessed the issue already. Here's the respective part of the generated resources:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    hub.gke.io/project: "ie-gcp-poc"
    version: 20210514-00-00
  name: gke-connect-namespace-getter
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    hub.gke.io/project: "ie-gcp-poc"
    version: 20210514-00-00
  name: gke-connect-namespace-getter
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gke-connect-namespace-getter
subjects:
- kind: ServiceAccount
  name: connect-agent-sa
  namespace: gke-connect

But a Role and RoleBinding in the kube-system namespace can not grant permission to get the kube-system namespace, because namespaces are not namespaced resources.

So I changed the Role to a ClusterRole and the RoleBinding to a ClusterRoleBinding and reapplied my Terraform configuration. And I now have a running agent that established a tunnel to the Anthos control plane and prints lots of log messages. I have yet to dig into what it actually does there.

I don't know whether this issue only occurs when the --manifest-output-file parameter is used, or if the RBAC configuration is also broken when directly applying the Kubernetes resources to the cluster using the gcloud CLI.

Should the remaining setup also be driven from Terraform? The kubectl based Terraform module for that may suggest so. But why wouldn't I do everything after the cluster is connected to Anthos using Anthos Config Management?