Installing Kubeflow on VMware Tanzu Kubernetes Grid Cluster (TKC)

Source -> Kubeflow.org
Source -> VMware

What is Kubeflow?

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

What is Tanzu Kubernetes Grid Cluster?

A Tanzu Kubernetes Grid cluster (TKC) is VMware's opinionated, production-ready Kubernetes cluster that can run across hybrid and multi-cloud environments.

Read more details here:

https://tanzu.vmware.com/kubernetes-grid

Now, let’s discuss how to install Kubeflow on TKC.

In this post, I will use the Charmed Operator to install Kubeflow on TKC. To learn more about charmed operators, see: https://charmed-kubeflow.io/docs

Kubeflow Installation Prerequisites

1. Install the juju client on a Linux server.

$ snap install juju --classic
juju 2.9.11 from Canonical✓ installed

Quick note: “Juju provides easy, intelligent application orchestration on top of Kubernetes.” For more details, visit:

https://tanzu.vmware.com/kubernetes-grid

Validate that the juju client installed successfully.

$ juju help
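
The later steps use both juju and kubectl, so a small pre-flight check can save some time. The sketch below is only an illustration: the function name require_cli is my own and is not part of Juju or Kubernetes.

```shell
# Return success if the named CLI is on PATH; otherwise print a hint and fail.
# require_cli is a hypothetical helper name used only for this example.
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing CLI: $1" >&2
  return 1
}

# Example pre-flight check before the remaining steps:
#   require_cli juju && require_cli kubectl && juju version
```

If either tool is missing, the check fails with a short message before any cluster changes are attempted.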

2. Connect Juju to Tanzu Kubernetes Grid Cluster (TKC)

$ juju add-k8s mytkgcluster --cluster-name=<name of your cluster> \
    --storage=<storage class name>
This operation can be applied to both a copy on this client and to the one on a controller.
No current controller was detected and there are no registered controllers on this client: either bootstrap one or register one.
k8s substrate "<Cluster name>" added as cloud "mytkgcluster" with storage provisioned
by the existing "tanzu-storage-policy" storage class.
You can now bootstrap to this cloud by running 'juju bootstrap mytkgcluster'.
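
The two placeholders in step 2 come from your own environment. As a rough sketch, the snippet below assembles the add-k8s command from variables and prints it for review. The cluster name my-tkc-cluster is a made-up example, the storage class is the one shown in the output above, and build_add_k8s_cmd is a hypothetical helper name.

```shell
# Example values; replace them with your own.
CLOUD_NAME="mytkgcluster"
CLUSTER_NAME="my-tkc-cluster"          # hypothetical; list yours with: kubectl config get-clusters
STORAGE_CLASS="tanzu-storage-policy"   # list yours with: kubectl get storageclass

# Print the juju add-k8s command line for the given cloud, cluster and storage class.
build_add_k8s_cmd() {
  printf 'juju add-k8s %s --cluster-name=%s --storage=%s\n' "$1" "$2" "$3"
}

# Review the printed command, then run it yourself.
build_add_k8s_cmd "$CLOUD_NAME" "$CLUSTER_NAME" "$STORAGE_CLASS"
```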

3. Create a controller. To operate workloads on a Kubernetes cluster, Juju uses controllers.

$ juju bootstrap mytkgcluster my-tkg-controller
Creating Juju controller "my-tkg-controller" on mytkgcluster
Bootstrap to generic Kubernetes cluster
Fetching Juju Dashboard 0.8.1
Creating k8s resources for controller "controller-my-tkg-controller"
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.110.11.40 to verify accessibility...
Bootstrap complete, controller "my-tkg-controller" is now available in namespace "controller-my-tkg-controller"
Now you can run
juju add-model <model-name>
to create a new model to deploy k8s workloads.

4. Validate the deployed resources.

$ kubectl get all -n controller-my-tkg-controller
NAME                                 READY   STATUS    RESTARTS   AGE
pod/controller-0                     2/2     Running   2          3m8s
pod/modeloperator-696db856f9-xc2nw   1/1     Running   0          2m14s
NAME                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
service/controller-service   ClusterIP   10.110.11.40   <none>        17070/TCP   3m11s
service/modeloperator        ClusterIP   10.102.11.92   <none>        17071/TCP   95s
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/modeloperator   1/1     1            1           95s
NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/modeloperator-696db856f9   1         1         1       2m15s
NAME                          READY   AGE
statefulset.apps/controller   1/1     3m8s

5. Create a model. A model in Juju is a blank canvas where operators will be deployed, and it has a one-to-one relationship with a Kubernetes namespace.

$ juju add-model kubeflow
Added 'kubeflow' model with credential 'mytkgcluster' for user 'admin'

Kubeflow Installation

To deploy the full Kubeflow bundle, you need at least 50 GB of free disk space, 14 GB of RAM, and 2 CPUs available on your machine/VM. In this post, I will show how to deploy kubeflow-lite.

1. Install kubeflow-lite.

$ juju deploy cs:kubeflow-lite
Located bundle "kubeflow-lite" in charm-store, revision 54
Located charm "admission-webhook" in charm-store, revision 10
Located charm "argo-controller" in charm-store, revision 51
Located charm "dex-auth" in charm-store, revision 60
Located charm "istio-ingressgateway" in charm-store, revision 20
Located charm "istio-pilot" in charm-store, revision 20
Located charm "jupyter-controller" in charm-store, revision 56
Located charm "jupyter-ui" in charm-store, revision 10
Located charm "kfp-api" in charm-store, revision 12
Located charm "mariadb-k8s" in charm-store, revision 35
Located charm "kfp-persistence" in charm-store, revision 9
Located charm "kfp-schedwf" in charm-store, revision 9
Located charm "kfp-ui" in charm-store, revision 12
Located charm "kfp-viewer" in charm-store, revision 9
Located charm "kfp-viz" in charm-store, revision 8
Located charm "kubeflow-dashboard" in charm-store, revision 56
Located charm "kubeflow-profiles" in charm-store, revision 52
Located charm "kubeflow-volumes" in charm-store, revision 0
Located charm "minio" in charm-store, revision 55
Located charm "mlmd" in charm-store, revision 5
Located charm "oidc-gatekeeper" in charm-store, revision 54
Located charm "pytorch-operator" in charm-store, revision 53
Located charm "seldon-core" in charm-store, revision 50
Located charm "tfjob-operator" in charm-store, revision 1
Executing changes:
- upload charm admission-webhook from charm-store with architecture=amd64
- deploy application admission-webhook from charm-store with 1 unit
added resource oci-image
- set annotations for admission-webhook
- upload charm argo-controller from charm-store with architecture=amd64
- deploy application argo-controller from charm-store with 1 unit
added resource oci-image
- set annotations for argo-controller
- upload charm dex-auth from charm-store with architecture=amd64
- deploy application dex-auth from charm-store with 1 unit
added resource oci-image
- set annotations for dex-auth
- upload charm istio-ingressgateway from charm-store with architecture=amd64
- deploy application istio-ingressgateway from charm-store with 1 unit
added resource oci-image
- set annotations for istio-ingressgateway
- upload charm istio-pilot from charm-store with architecture=amd64
- deploy application istio-pilot from charm-store with 1 unit
added resource oci-image
- set annotations for istio-pilot
- upload charm jupyter-controller from charm-store with architecture=amd64
- deploy application jupyter-controller from charm-store with 1 unit
added resource oci-image
- set annotations for jupyter-controller
- upload charm jupyter-ui from charm-store with architecture=amd64
- deploy application jupyter-ui from charm-store with 1 unit
added resource oci-image
- set annotations for jupyter-ui
- upload charm kfp-api from charm-store with architecture=amd64
- deploy application kfp-api from charm-store with 1 unit
added resource oci-image
- set annotations for kfp-api
- upload charm mariadb-k8s from charm-store with architecture=amd64
- deploy application kfp-db from charm-store with 1 unit using mariadb-k8s
- set annotations for kfp-db
- upload charm kfp-persistence from charm-store with architecture=amd64
- deploy application kfp-persistence from charm-store with 1 unit
added resource oci-image
- set annotations for kfp-persistence
- upload charm kfp-schedwf from charm-store with architecture=amd64
- deploy application kfp-schedwf from charm-store with 1 unit
added resource oci-image
- set annotations for kfp-schedwf
- upload charm kfp-ui from charm-store with architecture=amd64
- deploy application kfp-ui from charm-store with 1 unit
added resource oci-image
- set annotations for kfp-ui
- upload charm kfp-viewer from charm-store with architecture=amd64
- deploy application kfp-viewer from charm-store with 1 unit
added resource oci-image
- set annotations for kfp-viewer
- upload charm kfp-viz from charm-store with architecture=amd64
- deploy application kfp-viz from charm-store with 1 unit
added resource oci-image
- set annotations for kfp-viz
- upload charm kubeflow-dashboard from charm-store with architecture=amd64
- deploy application kubeflow-dashboard from charm-store with 1 unit
added resource oci-image
- set annotations for kubeflow-dashboard
- upload charm kubeflow-profiles from charm-store with architecture=amd64
- deploy application kubeflow-profiles from charm-store with 1 unit
added resource kfam-image
added resource profile-image
- set annotations for kubeflow-profiles
- upload charm kubeflow-volumes from charm-store with architecture=amd64
- deploy application kubeflow-volumes from charm-store with 1 unit
added resource oci-image
- set annotations for kubeflow-volumes
- upload charm minio from charm-store with architecture=amd64
- deploy application minio from charm-store with 1 unit
added resource oci-image
- set annotations for minio
- upload charm mlmd from charm-store with architecture=amd64
- deploy application mlmd from charm-store with 1 unit
added resource oci-image
- set annotations for mlmd
- upload charm oidc-gatekeeper from charm-store with architecture=amd64
- deploy application oidc-gatekeeper from charm-store with 1 unit
added resource oci-image
- set annotations for oidc-gatekeeper
- upload charm pytorch-operator from charm-store with architecture=amd64
- deploy application pytorch-operator from charm-store with 1 unit
added resource oci-image
- set annotations for pytorch-operator
- upload charm seldon-core from charm-store with architecture=amd64
- deploy application seldon-controller-manager from charm-store with 1 unit using seldon-core
added resource oci-image
- set annotations for seldon-controller-manager
- upload charm tfjob-operator from charm-store with architecture=amd64
- deploy application tfjob-operator from charm-store with 1 unit
added resource oci-image
- set annotations for tfjob-operator
- add relation argo-controller - minio
- add relation dex-auth:oidc-client - oidc-gatekeeper:oidc-client
- add relation istio-pilot:ingress - dex-auth:ingress
- add relation istio-pilot:ingress - jupyter-ui:ingress
- add relation istio-pilot:ingress - kfp-ui:ingress
- add relation istio-pilot:ingress - kubeflow-dashboard:ingress
- add relation istio-pilot:ingress - kubeflow-volumes:ingress
- add relation istio-pilot:istio-pilot - istio-ingressgateway:istio-pilot
- add relation istio-pilot:ingress - oidc-gatekeeper:ingress
- add relation istio-pilot:ingress-auth - oidc-gatekeeper:ingress-auth
- add relation kfp-api - kfp-db
- add relation kfp-api:kfp-api - kfp-persistence:kfp-api
- add relation kfp-api:kfp-api - kfp-ui:kfp-api
- add relation kfp-api:kfp-viz - kfp-viz:kfp-viz
- add relation kfp-api:object-storage - minio:object-storage
- add relation kfp-ui:object-storage - minio:object-storage
- add relation kubeflow-profiles - kubeflow-dashboard
Deploy of bundle completed.

2. Validate the installation by checking the status of the pods in the kubeflow namespace in TKC.

kubeflow pods

3. The deployment takes around 20 minutes. Once the installation succeeds, move on to the next step and access the Kubeflow UI.
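
The status check in step 2 can be sketched roughly as follows. This is a minimal example, not the exact command behind the screenshot: the helper name count_not_ready is my own, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...).

```shell
# Count pods whose STATUS column is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output on stdin.
# count_not_ready is a hypothetical helper name used only for this example.
count_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get pods -n kubeflow --no-headers | count_not_ready
# Juju's own view of the model is available with:
#   juju status
```

When the count reaches 0, every pod in the namespace is either running or has completed.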

Accessing Kubeflow UI

I will demonstrate a very simple way to access Kubeflow: port forwarding. Other methods work as well.

1. Run port forwarding to access the Kubeflow UI.
Port forwarding
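
The exact port-forward command is only in the screenshot, so here is a hedged sketch instead. It assumes that the UI is exposed through the istio-ingressgateway Service in the kubeflow namespace and that the Service listens on port 8080. Both are assumptions; confirm them with kubectl get svc -n kubeflow before running anything. The helper name port_forward_cmd is hypothetical.

```shell
# Assumed values (not taken from the original post); verify with:
#   kubectl get svc -n kubeflow
NAMESPACE="kubeflow"
SERVICE="istio-ingressgateway"   # assumed Service name
LOCAL_PORT=8080                  # port to open on localhost
REMOTE_PORT=8080                 # assumed Service port

# Print the kubectl port-forward command line for the given values.
port_forward_cmd() {
  printf 'kubectl port-forward -n %s svc/%s %s:%s\n' "$1" "$2" "$3" "$4"
}

# Review the printed command, then run it yourself once the values are confirmed.
port_forward_cmd "$NAMESPACE" "$SERVICE" "$LOCAL_PORT" "$REMOTE_PORT"
```

With these values the UI would be reachable on localhost at the chosen local port for as long as the port-forward process keeps running.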

2. Access the Kubeflow GUI using localhost and the forwarded port.

Kubeflow first page

3. Provide the namespace name. This is a Kubeflow namespace, not a TKC one.

kubeflow second page

4. Here is your Kubeflow landing page.

kubeflow landing page

5. Kubeflow is now successfully installed, and I am able to access the GUI.

If you are a data scientist, this dashboard is for you 🙂

Uninstall Kubeflow

Kubeflow takes a lot of resources to run, so I have uninstalled it. Here is one simple command that you can run too.

$ juju destroy-model kubeflow --destroy-storage
Destroy model

Wait approximately 10 minutes, and Juju will take care of the complete cleanup.
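
To confirm that the cleanup finished, you can check that the kubeflow namespace is gone. A minimal sketch, assuming the model's namespace is named kubeflow as above; namespace_gone is a hypothetical helper name.

```shell
# Succeed when the given namespace is absent from the list read on stdin.
# Expects `kubectl get namespaces -o name` output, one "namespace/<name>" per line.
# namespace_gone is a hypothetical helper name used only for this example.
namespace_gone() {
  ! grep -qx "namespace/$1"
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get namespaces -o name | namespace_gone kubeflow && echo "cleanup complete"
# The model should also disappear from:
#   juju models
```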
