Katonic AWS Private Cluster
This guide provides instructions for installing, operating, administering, and configuring the Katonic Platform within your AWS Kubernetes cluster. The information presented here is specifically relevant to users of Katonic who possess self-installation licenses.
Hardware Configurations
This configuration is specifically designed to provide high availability (HA) and optimal performance for various use cases. Its purpose is to deliver superior performance, enabling real-time execution of analytics, machine learning (ML), and artificial intelligence (AI) applications within a production pipeline.
Katonic on EKS
The Katonic MLOps platform can be deployed on Amazon EKS. In this setup, the Katonic architecture uses AWS resources to meet the platform's operational requirements. Below is the architecture of the Katonic platform on a private cluster deployed using AWS EKS:
The diagram above depicts the Katonic Platform on a private cluster with an external Application LoadBalancer, which is allocated a public IP address outside of the VPC's CIDR. This configuration allows public access to the Katonic Platform over the internet. Although the platform is reachable from outside, it is still housed within a private cluster, so the underlying services and infrastructure are shielded from direct public internet access. External traffic enters through the external load balancer, which distributes it across the resources in the private cluster.
- The Kubernetes control plane is handled by EKS, which provides managed Kubernetes masters.
- Katonic uses a dedicated Auto Scaling Group (ASG) of EKS workers to host the Katonic platform.
- ASGs of EKS workers host elastic compute for Katonic executions.
- AWS S3 is used to store entire platform backups.
- AWS EFS is used to store Katonic Datasets.
- An AWS Internet Gateway allows internet access to the Jump Host and other instances.
- An AWS NAT Gateway lets the private cluster initiate outbound traffic while preventing unsolicited inbound traffic.
- The kubernetes.io/aws-ebs provisioner is used to create persistent volumes for Katonic executions.
- Katonic cannot be installed on EKS Fargate since Fargate does not support stateful workloads with persistent volumes.
All AWS services listed previously are required except GPU compute instances, which are optional. Your annual Katonic license fee does not include any charges incurred from using AWS services. You can find detailed pricing information for the Amazon services listed above at AWS Pricing.
Architecture of Katonic Platform on Private Cluster with Internal Application LoadBalancer:
The Katonic Platform with an internal Application LoadBalancer is assigned a private IP address within the VPC's CIDR. This configuration blocks public access to the Katonic Platform, so the platform is accessible only from within the VPC, as shown in the image above. An internal ALB routes traffic within a VPC and is not directly accessible from the internet.
Setting up an EKS Cluster for Katonic Platform
This section provides a detailed guide on how to configure an Amazon EKS cluster to work seamlessly with Katonic. To successfully set up an EKS cluster for Katonic, it is essential to have a solid understanding of the following AWS services:
- Elastic Kubernetes Service (EKS)
- Identity and Access Management (IAM)
- Virtual Private Cloud (VPC) Networking
- Elastic Block Store (EBS)
- Elastic File System (EFS)
- S3 Object Storage
Additionally, a basic understanding of Kubernetes concepts such as node pools, CNI networking, storage classes, and autoscaling, as well as Docker, will prove invaluable during the cluster deployment.
Security Considerations
To provision an EKS cluster, you must create IAM policies in the AWS console. Katonic recommends following the standard security practice of granting least privilege when creating IAM policies: start with minimal privileges and grant elevated privileges only when necessary. For more information, refer to the AWS guidance on granting least privilege.
IAM Permissions for User
In order to complete the installation, the IAM user must have the following AWS permissions. These permissions include both AWS managed policies and custom policies that need to be created and attached to the IAM user.
AWS Managed:
Custom Managed:
These IAM policies must be created manually in your AWS account and attached to the IAM user. Use the link to get the policy JSON.
Additionally, for backups, you need to add the policy provided in the S3 Object Storage section of this documentation.
Managing Service Quotas
Amazon maintains default service quotas for each of the previously listed services. You can check the default Service Quotas and manage your quotas by logging in to the AWS Service Quotas Console.
Creating Elastic Kubernetes Service (EKS)
By default, the Katonic installer creates an EKS cluster. However, certain prerequisites must be met for the cluster to be built properly, such as provisioning a VPC and its private and public subnets.
Follow the steps below to provision the resources for a private cluster:
- Provision a VPC with DNS Support and DNS Hostnames enabled.
- Create three subnets in total: two private subnets and one public subnet in the same VPC.
- Create security groups for the endpoints, then create VPC endpoints for EFS and S3.
- Create separate route tables for the private and public subnets and associate them with the corresponding subnets.
- Create an Internet Gateway and attach it to the VPC.
- Create a NAT Gateway in the public subnet and add a route to it in the private route table.
- Create security groups for the JumpHost, then create the JumpHost (50 GB disk recommended) as an EC2 instance in the public subnet and install the AWS CLI, kubectl, Docker, eksctl, and Terraform on it.
- Create a cluster using both private subnets with endpoint access set to private. Create three node groups named platform, compute, and deployment.
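As an illustration of the first step only, the following is a minimal sketch of creating a VPC with both DNS attributes enabled. It assumes the `ec2` argument is a boto3 EC2 client (for example `boto3.client("ec2")`); the function name `create_dns_enabled_vpc` is hypothetical and is not part of the Katonic installer. Note that the EC2 API accepts only one attribute per `modify_vpc_attribute` call, so two calls are needed.

```python
def create_dns_enabled_vpc(ec2, cidr_block):
    """Create a VPC and enable DNS support and DNS hostnames on it.

    `ec2` is assumed to be a boto3 EC2 client, or any object exposing the
    same create_vpc / modify_vpc_attribute methods.
    """
    vpc_id = ec2.create_vpc(CidrBlock=cidr_block)["Vpc"]["VpcId"]
    # Only one attribute can be changed per modify_vpc_attribute call.
    ec2.modify_vpc_attribute(VpcId=vpc_id, EnableDnsSupport={"Value": True})
    ec2.modify_vpc_attribute(VpcId=vpc_id, EnableDnsHostnames={"Value": True})
    return vpc_id
```

With real credentials this would be called as `create_dns_enabled_vpc(boto3.client("ec2"), "10.0.0.0/21")`, where the CIDR is an example value. The remaining steps (subnets, route tables, gateways, endpoints) are not covered by this sketch.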
Note: To deploy the Katonic platform on an EKS cluster, create a Virtual Private Cloud (VPC) with a CIDR range of /21 or larger (a prefix length of 21 or shorter), and two subnets with CIDR ranges of /24 each. This matters because the platform allocates IP addresses for pods from within the subnet's IP range. A subnet with a prefix length of /25 or longer (that is, a /25 or smaller address range) will not function properly for this purpose. For the JumpHost in the public subnet, the IP range can be as small as /28.
Note: AWS only allows CIDR blocks between /16 and /28.
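The sizing rules in the two notes can be checked before any resources are created. The sketch below uses Python's standard `ipaddress` module; the function name `check_cidr_plan` and the example address ranges are illustrative assumptions, not values prescribed by Katonic.

```python
import ipaddress

def check_cidr_plan(vpc_cidr, private_subnets, jump_host_subnet):
    """Return a list of problems; an empty list means the plan fits the notes."""
    problems = []
    vpc = ipaddress.ip_network(vpc_cidr)
    # A larger prefix length means a smaller address range.
    if vpc.prefixlen < 16:
        problems.append(f"VPC {vpc} is larger than the /16 AWS maximum")
    if vpc.prefixlen > 21:
        problems.append(f"VPC {vpc} is smaller than /21")
    for cidr in list(private_subnets) + [jump_host_subnet]:
        net = ipaddress.ip_network(cidr)
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside the VPC range {vpc}")
    for cidr in private_subnets:
        net = ipaddress.ip_network(cidr)
        if net.prefixlen > 24:
            problems.append(f"private subnet {net} is smaller than /24")
    jump = ipaddress.ip_network(jump_host_subnet)
    if jump.prefixlen > 28:
        problems.append(f"jump host subnet {jump} is smaller than the /28 AWS minimum")
    return problems

if __name__ == "__main__":
    print(check_cidr_plan("10.0.0.0/21", ["10.0.0.0/24", "10.0.1.0/24"], "10.0.7.0/28"))
```

The check does not detect overlapping subnets; it only covers the size and containment rules stated above.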
Dynamic Block Storage
EKS clusters come pre-configured with several EBS-backed storage classes. For improved input and output performance, Katonic recommends using gp3 disks. The default gp3-based storage class (kfs) is created automatically by the Katonic installer.
If you are manually creating a cluster, you will need to create an EBS gp3-based storage class named kfs. To do this, you need to install and configure the EBS CSI driver in the EKS cluster. Refer to the documentation for instructions on creating the gp3-based storage class.
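For the manual path, a gp3 storage class manifest might look like the sketch below. It assumes the EBS CSI driver is installed, whose provisioner name is `ebs.csi.aws.com`. The binding mode and volume expansion settings are assumptions chosen for illustration; the settings the Katonic installer actually applies are not stated here. The helper name `gp3_storage_class` is hypothetical.

```python
import json

def gp3_storage_class(name="kfs"):
    """Build a StorageClass manifest for gp3 volumes via the EBS CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",  # requires the EBS CSI driver
        "parameters": {"type": "gp3"},
        # Assumption: delay binding until a pod is scheduled, so the volume
        # is created in the same availability zone as the node.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,  # assumption, not a stated requirement
    }

if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(gp3_storage_class(), indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f <file>`.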
Dynamic Shared Storage
To enable dynamic shared storage, you need to provision an EFS file system and configure an access point that allows access from the EKS cluster.
The Katonic Installer provides an optional parameter, shared_storage.create, to create an AWS Elastic File System. It automatically creates the AWS Elastic File System and configures the kfs-shared storage class to use it.
If you are manually creating a cluster and want to use shared storage, you need to create an AWS Elastic File System and configure the kfs-shared storage class to utilize it. Refer to the documentation for instructions on creating an AWS Elastic File System based storage class.
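For the manual path, a kfs-shared storage class could be sketched as follows. This assumes the EFS CSI driver (provisioner `efs.csi.aws.com`) is installed and uses its access-point provisioning mode; the directory permissions and the example file system ID are placeholders, and the helper name `efs_storage_class` is hypothetical. The configuration the Katonic installer generates may differ.

```python
import json

def efs_storage_class(file_system_id, name="kfs-shared"):
    """Build a StorageClass manifest for dynamic EFS access-point provisioning."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "efs.csi.aws.com",  # requires the EFS CSI driver
        "parameters": {
            "provisioningMode": "efs-ap",  # one EFS access point per volume
            "fileSystemId": file_system_id,
            "directoryPerms": "700",  # placeholder permissions
        },
    }

if __name__ == "__main__":
    # "fs-0123456789abcdef0" is a made-up ID; use your own file system's ID.
    print(json.dumps(efs_storage_class("fs-0123456789abcdef0"), indent=2))
```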
S3 Object Storage
To store platform backups, you must create an Amazon S3 bucket and grant access to it to the IAM user account responsible for the installation. This can be accomplished by attaching the IAM policy below to that user, replacing <backup-bucket-name> with the name of your bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<backup-bucket-name>"
    },
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::<backup-bucket-name>/*"
    }
  ]
}
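To avoid hand-editing the placeholder, the same policy can be generated for a given bucket name. This is a small sketch; the function name `backup_bucket_policy` and the bucket name in the example are hypothetical.

```python
import json

def backup_bucket_policy(bucket_name):
    """Return the backup policy above with the bucket name filled in."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets", "s3:GetBucketLocation"],
                "Resource": "*",
            },
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,  # bucket-level action
            },
            {
                "Sid": "VisualEditor2",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": f"{bucket_arn}/*",  # object-level actions
            },
        ],
    }

if __name__ == "__main__":
    print(json.dumps(backup_bucket_policy("my-katonic-backups"), indent=2))
```

Listing the bucket is granted on the bucket ARN itself, while object reads, writes, and deletes are granted on the `/*` object ARN, matching the split in the policy above.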
AWS EKS Cluster Autoscaling
If you intend to deploy the Cluster Autoscaler in your cluster, follow the Cluster Autoscaler documentation.
Domain
To ensure proper operation, Katonic must be configured to serve from a specific Fully Qualified Domain Name (FQDN). If you want to serve Katonic securely over HTTPS, you will also need an SSL certificate that covers the chosen domain name. Make sure to record the FQDN for use during the Katonic installation process.
By default, every version of the Katonic Platform can be served from the .katonic.ai domain. If you have your own domain, you can also use it with any version of the platform.
Resources Provisioned Post-Installation
When the platform is installed, the following resources are created. Take this into account when selecting your installation configuration.
SR NO. | TYPE | AMOUNT | WHEN | NOTES |
---|---|---|---|---|
1 | Classic Elastic Load Balancer | 1 | Always | Only 1 is required. Automatically gets created by EKS when required. |
2 | Network interface | 1 per node | Always | |
3 | OS boot disk (AWS EBS) | 1 per node | Always | |
4 | VPC | 1 | The platform is deployed to a new VPC. | |
5 | Security Group | 1 | Always | See Security Groups Configuration (AWS). |
6 | EKS Cluster | 1 | EKS is used as the application cluster | Version 1.28 |
7 | EKS Managed Nodes | varies depending on configuration | EKS nodes are used to manage Kubernetes Workloads | |
8 | AWS EFS | 1 | When you enable shared storage while installing the Katonic platform. | |
Kubernetes (EKS) version
Katonic MLOps platform version 4.5 has been validated with Kubernetes (EKS) version 1.28 and above.
Network plugin
Katonic relies on Kubernetes network policies to manage secure communication between pods in the cluster. Network policies are implemented by the network plugin, so your cluster must use a networking solution that supports NetworkPolicy, such as Calico.
See the AWS documentation on installing Calico for your EKS cluster.
If you use the Amazon VPC CNI for networking, with only the NetworkPolicy enforcement components of Calico, you must ensure the subnets you use for your cluster have CIDR ranges of sufficient size, as every deployed pod in the cluster is assigned an IP address from an elastic network interface and consumes a subnet address. Katonic recommends at least a /23 CIDR for the cluster.
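To see why subnet size matters, it helps to count the addresses a range actually offers. AWS reserves five addresses in every subnet (the first four and the last one). The sketch below estimates the remaining capacity; the function names and the node and pod counts are illustrative assumptions, and real consumption is usually higher because the VPC CNI keeps spare addresses warm on each node.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one

def usable_addresses(cidr):
    """Addresses left in a subnet after the AWS-reserved ones are removed."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET

def fits(cidr, nodes, pods):
    """Rough lower-bound check: one address per node plus one per pod."""
    return nodes + pods <= usable_addresses(cidr)

if __name__ == "__main__":
    for cidr in ("10.0.0.0/23", "10.0.0.0/24", "10.0.0.0/25"):
        print(cidr, usable_addresses(cidr))
```

A /23 leaves 507 usable addresses, a /24 leaves 251, and a /25 leaves 123.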
Data Visualisation
Katonic MLOps platform 4.5 includes Superset version 2.0.1 for data visualisation.
You require an additional DNS name if you're installing Superset.
Example:
- If the domain name used to access the platform is katonic.tesla.com,
- then the domain for data visualisation would be dash-katonic.tesla.com.
Connectors
Katonic MLOps platform 4.5 includes Airbyte version 0.40.32 for connectors.
You require an additional DNS name if you're installing Airbyte.
Example:
- If the domain name used to access the platform is katonic.tesla.com,
- then the domain for connectors would be connectors-katonic.tesla.com.
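The data visualisation and connectors examples follow the same pattern: a fixed prefix joined to the platform domain with a hyphen. A minimal sketch of that naming rule, inferred from the two examples only (the function name `addon_domains` is hypothetical):

```python
def addon_domains(platform_domain):
    """Derive the extra DNS names used by Superset and Airbyte."""
    return {
        "superset": f"dash-{platform_domain}",
        "airbyte": f"connectors-{platform_domain}",
    }

if __name__ == "__main__":
    for component, domain in addon_domains("katonic.tesla.com").items():
        print(component, domain)
```

Each derived name needs its own DNS record, and if the platform is served over HTTPS, the certificate presumably has to cover these names as well.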
Katonic Platform Installation
Installation of the Katonic platform has been segmented based on product. When you click the link, you will be redirected to the installation process documentation.