Integrating Harbor, KeyCloak and Istio

Jehoszafat Zimnowoda

Developer @ Red Kubes

This is my story about building a multi-tenant Kubernetes environment that provides various DevOps teams (tenants) with their own Kubernetes namespace and private container registry (Harbor v2.1.0), with Single Sign-On (Keycloak v10.0.0) and a service mesh (Istio 1.6.14) included.

Harbor, A Fat But Versatile Container Registry#

Harbor provides a container image registry, vulnerability scanning, container image signing and validation, and OIDC-based authentication and authorization. The fully featured version is composed of ten microservices. It is also a CNCF-graduated open-source project.

In Harbor, a project represents a container image registry exposed under a unique URL, for example "harbor.otomi.io/team-demo/", where team-demo is the project name.

By creating projects you can achieve a multi-tenant container image repository for workloads in your Kubernetes cluster.

In Harbor, you can also define project membership, first by defining OIDC groups and then assigning them to a given project. For example, the OIDC group team-demo is a member of the team-demo project.

Next, we want pods from a given Kubernetes namespace to pull container images from a private registry. Harbor robot accounts are made for that purpose. I recommend creating two robot accounts in each Harbor project: the first one to be used by Kubernetes as an image pull secret in a given namespace, and the second one for the CI/CD pipeline.
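
For instance, the pull robot account's credentials can be turned into an image pull secret with kubectl. This is a minimal sketch; the secret name, namespace, and robot account name/token are placeholders:

$ kubectl create secret docker-registry harbor-pull-secret \
    --namespace team-demo \
    --docker-server=harbor.otomi.io \
    --docker-username='<robot-account-name>' \
    --docker-password='<robot-account-token>'

The secret can then be referenced from a pod spec (or from the namespace's default service account) via imagePullSecrets.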

Multi-Tenancy With Keycloak And Harbor#

Keycloak can be used as an identity provider. In Harbor, a user can log in by performing an OIDC authorization code flow with Keycloak. Upon successful authentication, the user is redirected back to Harbor together with an ID token: a JSON Web Token (JWT) signed by the identity provider (Keycloak).

Harbor can verify the JWT signature and automatically assign a user to a role and a project, based on the groups claim from the ID token.

The following code snippet presents an ID token with a groups claim:

{
  "iss": "https://keycloak.otomi.io/realms/master",
  "sub": "xyz",
  "name": "Joe Doe",
  "groups": ["team-dev", "team-demo"],
  "given_name": "Joe",
  "family_name": "Doe",
  "email": "joe.doe@otomi.io"
}

There is a "Joe Doe" user that belongs to team-dev and team-demo groups, which in Harbor can be matched to predefined OIDC groups. The ID token is issued by the Keycloak (iss property) that is running in the same Kubernetes cluster. Harbor can be configured to leverage ID tokens by specifying a set of authentication parameters.

The OIDC endpoint URL is matched against the iss property from the ID token. Next, the OIDC client ID and OIDC client secret are used by Harbor to authenticate with a client at Keycloak. The group claim name is crucial for enabling Harbor's OIDC group matching.

If you want to perform an automatic user onboarding process, you should provide the following OIDC scopes: openid (covering the iss and sub properties) and email (covering the email and email_verified properties).
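
The same settings can also be applied through Harbor's configurations API instead of the dashboard. A hedged sketch; the URLs, client ID and secret are placeholders, and you should double-check the configuration keys against your Harbor version:

$ curl -u "admin:<harbor-admin-password>" -X PUT \
    -H "Content-Type: application/json" \
    https://harbor.otomi.io/api/v2.0/configurations \
    -d '{
      "auth_mode": "oidc_auth",
      "oidc_name": "keycloak",
      "oidc_endpoint": "https://keycloak.otomi.io/realms/master",
      "oidc_client_id": "otomi",
      "oidc_client_secret": "<client-secret>",
      "oidc_groups_claim": "groups",
      "oidc_scope": "openid,email",
      "oidc_auto_onboard": true,
      "oidc_verify_cert": true
    }'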

Do not forget about Keycloak, which requires additional configuration of the OIDC client.

The client ID is Otomi and the client secret is defined in the credentials tab. There are also valid redirect URIs and web origins that have to be set, so that a user can be redirected from and to the Harbor dashboard upon successful login.
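
If you prefer to script this instead of clicking through the Keycloak admin console, the bundled kcadm.sh CLI can create such a client. A minimal sketch, assuming a confidential client named otomi in the master realm and Harbor's standard OIDC callback path; all values are illustrative:

$ kcadm.sh create clients -r master \
    -s clientId=otomi \
    -s protocol=openid-connect \
    -s publicClient=false \
    -s 'secret=<client-secret>' \
    -s 'redirectUris=["https://harbor.otomi.io/c/oidc/callback"]' \
    -s 'webOrigins=["https://harbor.otomi.io"]'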

Secure Connectivity With Istio Service Mesh#

Istio ensures service interconnectivity, encrypted traffic (mTLS), and routing (VirtualService + Gateways). Integrating Harbor with Istio is mostly about setting up proper URI routing.

Harbor is composed of ten microservices:

  • Chartmuseum
  • Clair
  • Core
  • Database
  • Jobservice
  • Portal
  • Redis
  • Registry
  • Trivy
  • Notary

The Harbor Helm chart also provides Nginx as a reverse proxy service. You don't need it; instead, you can deploy the following Istio VirtualService:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: harbor
spec:
  hosts:
    - harbor.otomi.io
  http:
    - match:
        - uri:
            prefix: '/api/'
        - uri:
            prefix: '/c/'
        - uri:
            prefix: '/chartrepo/'
        - uri:
            prefix: '/service/'
        - uri:
            prefix: '/v1/'
        - uri:
            prefix: '/v2/'
      route:
        - destination:
            host: harbor-harbor-core.harbor.svc.cluster.local
            port:
              number: 80
    - match:
        - uri:
            prefix: /
      rewrite:
        uri: /
      route:
        - destination:
            host: harbor-harbor-portal.harbor.svc.cluster.local
            port:
              number: 80

The VirtualService routes the following URI paths to the Harbor core service:

  • /api/
  • /c/
  • /chartrepo/
  • /service/
  • /v1/
  • /v2/

All other URI paths are routed to the Harbor portal service (dashboard).

The destination host in the harbor VirtualService is a fully qualified domain name (FQDN) that indicates the Kubernetes namespace of the Harbor services. This makes it possible for the Istio ingress gateway to route the incoming traffic.

You might have noticed that traffic is routed to port 80 (HTTP) instead of 443 (HTTPS). This is because I disabled Harbor's internal TLS in favor of the Istio proxy sidecar, which enforces mTLS for each Harbor service.
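
For completeness, mesh-internal mTLS for the Harbor services can be enforced with a namespace-wide PeerAuthentication resource. A minimal sketch, assuming Harbor runs in the harbor namespace:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: harbor
spec:
  mtls:
    mode: STRICT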

Automation With Otomi To Support Multi-Tenancy#

With Otomi, we strive to integrate best-of-breed open-source projects and provide multi-tenancy awareness out of the box.

Multi-tenancy is challenging and requires configuration automation to ensure scalability.

Part of Otomi's automation is to configure applications, so they are aware of each other. We do it either by using a declarative approach when that is possible, or else by interacting with their (REST) APIs directly.

We have generated REST API clients based on the OpenAPI specification for Harbor and Keycloak. You are welcome to use our factory for building and publishing OpenAPI clients.

Next, we implemented idempotent tasks that leverage these REST API clients and automate service configuration. These run as Kubernetes jobs whenever the configuration changes. For Harbor, we have automated the creation of projects, OIDC groups, project membership, and OIDC settings.
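
Under the hood such a task boils down to a handful of REST calls. As an illustration only (admin credentials and project name are placeholders), creating a project via the Harbor v2 API looks roughly like this:

$ curl -u "admin:<harbor-admin-password>" -X POST \
    -H "Content-Type: application/json" \
    https://harbor.otomi.io/api/v2.0/projects \
    -d '{"project_name": "team-demo", "metadata": {"public": "false"}}'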

For Keycloak, we have automated the configuration of the external identity provider, group name normalization, deriving the client ID and client secret, and more. Inspired by either of them? Then take a look at our open-source code.

Pitfalls#

Each OSS project has its own goals and milestones, so it may be challenging to integrate various projects to work together. Here, I share just a few issues that I stumbled upon.

Harbor#

Neither the container image registry provided by Harbor nor the Docker CLI supports the OIDC protocol. Instead, they use username/password-based authentication. This means that whenever you perform docker login/push/pull commands, the HTTPS traffic from the Docker client to the container registry does not carry a JWT. Make sure to exclude the /v1/, /v2/ and /service/ Harbor URI paths from JWT verification. Otherwise, you won't be able to use the registry.
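
How you exclude those paths depends on where JWT verification is enforced. If, for instance, it happens at the Istio ingress gateway via a RequestAuthentication, an AuthorizationPolicy along these lines could express the exception. This is only a sketch: the selector, host and resource names are illustrative, not Otomi's actual configuration:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: harbor-jwt
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  action: ALLOW
  rules:
    # Docker client traffic: no JWT required on the registry paths
    - to:
        - operation:
            hosts: ["harbor.otomi.io"]
            paths: ["/v1/*", "/v2/*", "/service/*"]
    # everything else must present a valid token
    - from:
        - source:
            requestPrincipals: ["*"]
      to:
        - operation:
            hosts: ["harbor.otomi.io"]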

Next, OIDC users may experience issues with their Docker credentials (CLI secrets) suddenly being invalidated. The CLI secret depends on the validity of the ID token, which has nothing to do with the container registry itself. This hybrid security solution is something a regular Docker user does not expect and can be a source of many misunderstandings.

Follow this thread to get more insights.

The good news is that if you are an automation freak like me, you don't actually need CLI secrets. Instead, you can use Harbor robot accounts, which do not depend on OIDC authentication.
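
A robot account behaves like a regular username/password pair, so a CI/CD pipeline can simply do the following (account name, token, and image name are placeholders):

$ docker login harbor.otomi.io -u '<robot-account-name>' -p '<robot-account-token>'
$ docker push harbor.otomi.io/team-demo/my-app:1.0.0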

Keycloak Or Other Identity Provider#

If your organization decides to migrate users to another identity provider you may experience a duplicated user error: "Conflict, the user with the same username or email has been onboarded".

This is because the sub and/or iss claims in the ID token may change, so the same user trying to log in to the Harbor dashboard will be treated as a new one. The onboarding process starts but fails, because Harbor requires each user to have a unique email address. I ended up removing the existing OIDC users from Harbor and allowing them to onboard once again. Interestingly, the community of Harbor users is having a broad debate about using the OIDC protocol and has not agreed on a final solution so far. I encourage you to take a look here for a very insightful conversation about it.

Istio#

While making Harbor services part of the Istio service mesh, it is very important that the Kubernetes services use port names that follow the Istio naming convention. For example, the Harbor registry service should have the port name http-registry instead of registry. See the following example:

apiVersion: v1
kind: Service
metadata:
  name: harbor-harbor-registry
spec:
  ports:
    - name: http-registry
      port: 5000
      protocol: TCP
      targetPort: 5000
    - name: http-controller
      port: 8080
      protocol: TCP
      targetPort: 8080

If a service port name does not follow the Istio convention, the Harbor core service is not able to communicate with the Harbor registry service in the Istio service mesh. Attempting to log in to the Docker registry will then end with an "authentication required" error.

Takeaways#

  • Harbor is a suitable solution for deploying a self-hosted container image registry in a multi-tenant Kubernetes cluster. Nevertheless, the lack of configuration automation makes it hard to maintain in a constantly changing environment
  • The OIDC groups claim can be used for granting users a default role and access to Harbor project(s)
  • While working with Istio, pay close attention to named ports
  • Stay in touch with the open-source communities of the projects you are using, as they are a true treasure trove of information

I hope this article gives you good insight into a more advanced Harbor integration in a Kubernetes cluster.

This article was originally posted by Jehoszafat Zimnowoda on medium.com.

Universal OPA Policies Development

Alin Spinu

Developer @ Red Kubes

Introduction#

This is a story from the trenches of writing Rego policies, and about how to unravel the cumbersome process of working with Gatekeeper vs Conftest for validating Kubernetes resources.

Working with policy compliant Kubernetes clusters is on the radar for a lot of companies these days, especially if you're walking the path towards popular certifications like ISO/IEC 27001 for Information Security Management.

Hands-on using OPA in Otomi Container Platform#

As some of you probably already know, the Kubernetes-native PodSecurityPolicy resource is going to be deprecated (see the GitHub and Google docs). This leaves the way open for external projects like Open Policy Agent (OPA) to be used as the new standard for developing and enforcing policy rules.

Otomi uses OPA as the standard for providing policy enforcement because of its popularity and commitment to the Kubernetes community, its ease of use, and its future development plans.

One of the key principles of Otomi Container Platform is that it is easy to use and provides clarity for integrations, allowing developers to easily extend any platform feature, and providing integrated security for everything from the ground up.

OPA ecosystem common knowledge#

Policy decisions are handled by means of admission controllers such as the OPA kube-mgmt project or Gatekeeper, which I will touch upon in a minute. But also remember that we can validate things with the Rego query language on any plain files using static analysis tools like Conftest. The list of supported Conftest formats includes (but is not limited to): JSON, YAML, Dockerfile, INI files, XML, etc. Here's the problem: Conftest and Gatekeeper are on the path of diverging. Although they speak the same Rego language, the two disagree on some aspects.

In-Cluster vs Static Resources Policy Wars#

Working in a policy-constrained environment is like having "parental controls" turned on for unprivileged users, allowing administrators to decide what kind of resources and setup are safest for their flock of Kubernetes clusters. From an application developer's perspective, being denied access to deploy some resources means that they are not adhering to the rules imposed for that environment and should find and fix the missing links in their setup.

Policy administrators/developers, on the other hand, struggle with finding the correct enforcement strategies and with adjusting policy parameters according to the desired state, or with allowing certain exclusions for cases where policy enforcement does not make sense, for example system namespaces, cloud-vendor-specific namespaces, or anything that should avoid intervention by default. There is no golden rule for policy adoption, and you are in charge of overcoming your own mistakes if something is not right.

Let's start with the simple use case of running policy checks against any kind of YAML resource. Then I'll move on to more details about in-cluster Kubernetes admission review objects.

Conftest in action#

With a simple command we can test if a helm chart is violating any rules in the provided policies folder.

$ helm template stable/keycloak | conftest test --policy ./policies/ --all-namespaces -
FAIL - Policy: container-limits - container <keycloak> has no resource limits
FAIL - Policy: container-limits - container <keycloak-test> has no resource limits
162 tests, 160 passed, 0 warnings, 2 failures, 0 exceptions

The generated YAML files are streamed into Conftest and the policies are tested one by one.

By examining the log message, we can see that the container-limits policy is marking two resources as failures. Now all we have to do is modify the templates to provide a sensible amount of resource limits for the indicated containers, and our policy checks will pass successfully! Hooray
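
For example, adding something along these lines to the container spec in the templates satisfies the policy (the exact values are illustrative and up to you):

resources:
  limits:
    cpu: 500m
    memory: 512Mi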

This is pretty useful if you want to adopt new Helm applications but don't want to deploy anything to the cluster unless it's well examined for violations. Conftest supports passing values to the policies using the --data option, which allows policy designers to configure different settings through parameters. Parameters can help us control any aspect of creating configurable rules for resources. I will return to that in a moment.
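
As a sketch of what that looks like on the command line (the values file path is illustrative):

$ helm template stable/keycloak | conftest test --policy ./policies/ --data ./policies/values.yaml --all-namespaces -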

Running Gatekeeper#

Gatekeeper is becoming the new standard for implementing security policies in Kubernetes, endorsing a broad ecosystem to spread ideas. Enforcing policy decisions works by using a validating webhook that intercepts any request authenticated by the api-server and checks whether the request meets the defined policy, rejecting it otherwise.

Trying to create a non-conformant resource object in a Gatekeeper-enabled cluster results in an error with a message explaining why the request was rejected.

$ helm template charts/redis | kubectl apply -f -
secret/redis created
configmap/redis created
service/redis-master created
Error from server (
[denied by banned-image-tags] Policy: banned-image-tags - container <redis> has banned image tag <latest>, banned tags are {"latest", "master"}
[denied by psp-allowed-users] Policy: psp-allowed-users - Container redis is attempting to run as disallowed user 0. Allowed runAsUser: {"rule": "MustRunAsNonRoot"}
)
error when creating "redis/templates/redis-master-statefulset.yaml": admission webhook "validation.gatekeeper.sh" denied the request

As you can see, some of the resources get created, but the command fails with a denial from the admission webhook.

To make this resource valid, small tweaks to the image tag and securityContext fields will do the trick.
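
For instance, pinning the image tag and running as a non-root user in the StatefulSet template would satisfy both policies (the values below are illustrative):

containers:
  - name: redis
    image: redis:6.0.9        # pinned tag instead of 'latest'
    securityContext:
      runAsNonRoot: true
      runAsUser: 1001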

Policy Development in Gatekeeper context#

Similar to the plain policy files that Conftest uses, the Gatekeeper policy library works by loading into memory a collection of Rego files wrapped in Kubernetes CRDs called ConstraintTemplates.

To enforce a policy for certain resource types, we need to instantiate a Constraint (a custom resource) that acts as a blueprint with the desired values or parameters.

Example Gatekeeper setup for Config and Constraint resources:

--- # Gatekeeper Config
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  match:
    - excludedNamespaces:
        - gatekeeper-system
        - kube-system
      processes:
        - '*'
  sync:
    syncOnly:
      - group: ''
        kind: Pod
        version: v1
--- # ContainerLimits Constraint
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: ContainerLimits
metadata:
  name: containerlimits
spec:
  match:
    kinds:
      - apiGroups:
          - apps
          - ''
        kinds:
          - DaemonSet
          - Deployment
          - StatefulSet
          - Pod
  parameters:
    container-limits:
      enabled: false
      cpu: '2'
      memory: 2000Mi

So far this looks pretty easy. We decide to enable this policy for all namespaces except for "gatekeeper-system" and "kube-system", and Gatekeeper will test all our containers for resource limits.

Hold on... Where is the definition for this policy? Rego rules are defined in CRD files called ConstraintTemplates, and they need to be deployed prior to the Constraint instance.

--- # ConstraintTemplate for container-limits policy
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: containerlimits
spec:
  crd:
    spec:
      names:
        kind: ContainerLimits
      validation: # Schema for the `parameters` field
        openAPIV3Schema:
          ...
  targets:
    - target: admission.k8s.gatekeeper.sh
      libs:
        - |-
          package lib.utils
          ...
      rego: |-
        package containerlimits
        ...

To simplify the example, we've cut out parts of the file, but you can also view it in the Gatekeeper Library repo. The "rego" property is where the actual policy definition lives, and any dependency libraries can be declared in the "libs" list. We won't go into how Rego rules for violations work now, but you can enjoy all the fun of learning a powerful query language inspired by the decades-old Datalog.

Planning for Unification#

While the process of creating new policies seems to bear a lot of weight, luckily there is a simple CLI tool called Konstraint to speed up our work.

You can define the rules in Rego files, and Konstraint will wrap your policy and all dependency libraries in the necessary CRD files for Gatekeeper. Konstraint has core library functions for working with all kinds of Kubernetes native objects, which makes it an indispensable tool for Rego development. OK, so this means we only care about writing Rego files, and we can test them in both Gatekeeper and Conftest.
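
For example, generating the ConstraintTemplate and Constraint CRDs from a folder of Rego policies is a one-liner (the paths are illustrative; check the Konstraint docs for the exact flags of your version):

$ konstraint create ./policies --output ./constraints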

Open Issues in OPA Community#

While that is true, after delving deeper into the bowels of Gatekeeper and Conftest, we found conflicting design concepts which completely ruin this unified policy development mindset.

The way you can define parameters in the two tools is different, and Gatekeeper has a restrictive parser which does not allow arbitrary imports in the Rego definition. There has been an attempt to change this hard limitation in a discussion [here](https://github.com/open-policy-agent/gatekeeper/issues/1046).

Current State of Unified Rego#

To echo the subject once again, we are interested in reducing the boilerplate and simplifying the process of developing policies.

[ Same Rego == different contexts ]

We want to work with one common set of Rego policy files and use the same source code both for testing "static files" generated from CI builds and in any "gatekeeping context" where objects are created via the Kubernetes API. Let's cut to the chase and say we already made this happen and delivered a working solution in Otomi, by using the available community tools, lots of integration work, and a lot of customization. We now have a rich collection of policies and utility functions defined for our Kubernetes clusters. You can browse them in the Otomi-core repository.

How Istio is Mutating Objects in the Background#

So by now we understand all the nitty-gritty details about Rego policies and Gatekeeper, but there will always be external factors changing the state of the world, and you can find yourself in a closed-box situation with nowhere to go.

This situation becomes a nightmare when using the Istio mesh for networking. In reality, Istio creates a "subspace" of resources by injecting a sidecar container into all the pods in namespaces where service mesh communication is enabled. This kind of container sometimes interferes with the security constraints design strategy.

Otomi Policy Features#

To create some flexibility, we have further extended the policy exception capabilities by examining granular annotation information for every resource under analysis.

Coming back to the parameters idea described a while back: if policy files can read resource files as raw input, why not allow certain exceptions by annotating the resources we want to skip checking, similar to the excludedNamespaces option for Gatekeeper.

Rego has a rich built-in library system and is a powerful language, allowing us to easily create robust utility functions for this design.

To give an example, using the following annotations will allow an entire pod, or certain containers in the pod, to be excluded from one or more policies:

# Annotation for entire pod
policy.otomi.io/ignore: psp-allowed-repos
# Annotation for Istio sidecar based containers
policy.otomi.io/ignore-sidecar: psp-allowed-users,psp-capabilities
# Annotation for specific container (name=app-container)
policy.otomi.io/ignore/app-container: banned-image-tags

Another example of exclusion is to turn off a policy entirely by disabling it in the baseline settings:

# baseline configuration
policies:
  container-limits:
    enabled: false
    cpu: '2'
    memory: 2Gi
  banned-image-tags:
    enabled: false
    tags:
      - latest
      - master
  psp-host-filesystem:
    enabled: true
    allowedHostPaths:
      - pathPrefix: /tmp/
        readOnly: false
  psp-allowed-users:
    enabled: true
    runAsUser:
      rule: MustRunAsNonRoot
    runAsGroup:
      rule: MayRunAs
      ranges:
        - min: 1
          max: 65535
    supplementalGroups:
      rule: MayRunAs
      ranges:
        - min: 1
          max: 65535
    fsGroup:
      rule: MayRunAs
      ranges:
        - min: 1
          max: 65535
  psp-host-security:
    enabled: true
  psp-host-networking-ports:
    enabled: true
  psp-privileged:
    enabled: true
  psp-capabilities:
    enabled: true
    allowedCapabilities:
      - NET_BIND_SERVICE
      - NET_RAW
  psp-forbidden-sysctls:
    enabled: true
    forbiddenSysctls:
      - kernel.*
      - net.*
      - abi.*
      - fs.*
      - sunrpc.*
      - user.*
      - vm.*
  psp-apparmor:
    enabled: true
    allowedProfiles:
      - runtime/default
      - docker/default
  psp-selinux:
    enabled: false
    seLinuxContext: RunAsAny
  psp-seccomp:
    enabled: false
    allowedProfiles:
      - runtime/default
  psp-allowed-repos:
    enabled: true
    repos:
      - docker.io
      - gke.otomi.cloud
      - aks.otomi.cloud
      - eks.otomi.cloud

Conclusion#

To pin down the state of the landscape: designing Kubernetes policies is still evolving towards new heights, and I can imagine that in the near future more and more projects will share the same policies from a registry like the policy-hub.

We think creating a common understanding about unified Rego is the key to a sunny future.

This article was originally posted by Alin Spinu on Medium.