Advanced Configuration
This section covers advanced configuration options.
AWS
Using a GPU requires an Amazon Machine Image (AMI) in your desired AWS region. If you are using the ap-southeast-2 region, a public AMI is already available.
Currently, the P2, P3, G4dn, and G6 AWS EC2 instance families have been tested. The example below uses the public AMI for a p2.xlarge GPU node:
aws:
  region: "ap-southeast-2"
  nodes:
    - vmType: t2.medium
      count: 1
      role: control-plane
    - vmType: t3.2xlarge
      count: 2
      role: worker
    - vmType: d3.2xlarge
      count: 1
      role: worker
    - vmType: p2.xlarge
      count: 1
      awsAMI: ami-0db4a0fc42c49f8c6
      role: gpu
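As a quick sanity check before editing the node list, the sketch below tests whether an instance type belongs to one of the families listed above as tested (P2, P3, G4dn, G6). The function name `is_tested_gpu_family` is a hypothetical helper for illustration and is not part of Kubox.

```shell
# Hypothetical helper (not part of Kubox): succeeds when the instance type
# belongs to a GPU family that this page lists as tested.
is_tested_gpu_family() {
  local vm_type="$1"
  local family="${vm_type%%.*}"   # "p2.xlarge" -> "p2"
  case "$family" in
    p2|p3|g4dn|g6) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: check the GPU node used in the configuration above.
if is_tested_gpu_family "p2.xlarge"; then
  echo "tested"
else
  echo "untested"
fi
```

An instance type outside these families may still work; the check only reflects what has been tested so far.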
Creating a GPU AMI in other regions
To use GPU instances in other AWS regions, follow the process below to create your own AMI. This involves downloading and preparing a Talos OS image for use as an Amazon Machine Image (AMI) in AWS. The Talos OS image is first downloaded and decompressed, then uploaded to an Amazon S3 bucket. From there, it is imported into AWS as an EBS snapshot. The script monitors the import progress until it is complete. Once the snapshot is ready, it is registered as an AMI with the necessary configurations, such as storage settings and virtualization type. This AMI can then be used to launch EC2 instances running Talos OS.
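The process relies on a handful of command-line tools (`aws`, `curl`, `xz`, and, later on this page, `jq`). A minimal preflight sketch, assuming a hypothetical helper named `missing_tools`, could report any that are not installed before you start:

```shell
# Hypothetical preflight helper: prints each named command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Report the tools used on this page that are not installed.
missing="$(missing_tools aws curl xz jq)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing required tools:" $missing >&2
fi
```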
aws_region="ap-southeast-2"
talos_gpu_decompressed_file_name="aws-amd64.raw"
talos_gpu_versioned_file_name="aws-amd64-gpu-v1.8.3.raw"
amiName="talos-aws-amd64-gpu-v1.8.3"
aws_bucket="my-talos-images"
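The Talos version appears in several of the variables above. As an optional variation, the sketch below derives them from a single `talos_version` variable. That variable is an assumption introduced here for illustration and is not part of the original script.

```shell
# talos_version is an assumed extra variable; the resulting values match the ones above.
talos_version="v1.8.3"
talos_gpu_versioned_file_name="aws-amd64-gpu-${talos_version}.raw"
amiName="talos-aws-amd64-gpu-${talos_version}"

echo "${talos_gpu_versioned_file_name}"   # aws-amd64-gpu-v1.8.3.raw
echo "${amiName}"                         # talos-aws-amd64-gpu-v1.8.3
```

The download URL and the snapshot description used later also hard-code the version, so they would need the same change when moving to another release.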
The Talos Linux Image Factory, developed by Sidero Labs, Inc., provides downloadable boot assets for Talos Linux. The URL below downloads the Talos v1.8.3 OS image, which includes the NVIDIA GPU extensions required for Kubox:
curl -O https://factory.talos.dev/image/af8eb82417d3deaa94d2ef19c3b590b0dac1b2549d0b9b35b3da2bc325de75f7/v1.8.3/aws-amd64.raw.xz
The image is then decompressed using the xz command:
xz --decompress ./aws-amd64.raw.xz
The decompressed file is uploaded to the specified S3 bucket:
aws s3 cp ./${talos_gpu_decompressed_file_name} "s3://${aws_bucket}/${talos_gpu_versioned_file_name}" --quiet
Once uploaded, the image is imported as an EBS snapshot using the import-snapshot command:
import_task_id=$(
  aws ec2 import-snapshot \
    --region "${aws_region}" \
    --description "Talos v1.8.3 GPU" \
    --disk-container "Format=raw,UserBucket={S3Bucket=${aws_bucket},S3Key=${talos_gpu_versioned_file_name}}" \
    --query 'ImportTaskId' \
    --output text
)
Store the ImportTaskId for monitoring the import progress.
echo "Import task id ${import_task_id}"
The script continuously checks the status of the import task. It waits for the status to change to “completed” and then retrieves the Snapshot ID.
snapshot_id=""
while [ -z "$snapshot_id" ]; do
  # Query the current status of the import task
  snapshot_state="$(aws ec2 describe-import-snapshot-tasks \
    --region "${aws_region}" \
    --import-task-ids "${import_task_id}" \
    --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.Status' \
    --output text)"
  if [ "$snapshot_state" = "completed" ]; then
    # Import finished: retrieve the resulting snapshot ID
    snapshot_id="$(aws ec2 describe-import-snapshot-tasks \
      --region "${aws_region}" \
      --import-task-ids "${import_task_id}" \
      --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.SnapshotId' \
      --output text)"
    echo "$snapshot_id"
    break
  else
    echo "Waiting for snapshot import to complete..."
    sleep 30
  fi
done
echo "Snapshot id ${snapshot_id}"
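The loop above waits indefinitely, so a failed import would never end it. The sketch below is one possible variation that adds a retry limit and stops early on a failure status. The helper names `wait_for_completed` and `import_status` are hypothetical, and treating `deleting` and `deleted` as failure statuses is an assumption that should be checked against the AWS documentation.

```shell
# Hypothetical helper: polls a status command until it prints "completed".
# Arguments: <status command> <max attempts> <delay in seconds>
# Returns 0 on completion, 1 on an assumed failure status, 2 on timeout.
wait_for_completed() {
  local status_cmd="$1" max_attempts="$2" delay="$3"
  local attempt=0 status
  while [ "$attempt" -lt "$max_attempts" ]; do
    status="$("$status_cmd")"
    case "$status" in
      completed) return 0 ;;
      deleting|deleted)   # assumed to indicate a failed or cancelled import
        echo "Import ended with status: ${status}" >&2
        return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for snapshot import" >&2
  return 2
}

# Status command built from the same describe-import-snapshot-tasks call used above.
import_status() {
  aws ec2 describe-import-snapshot-tasks \
    --region "${aws_region}" \
    --import-task-ids "${import_task_id}" \
    --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.Status' \
    --output text
}

# Usage (up to 120 checks, 30 seconds apart):
#   wait_for_completed import_status 120 30
```

Passing the status command as an argument keeps the polling logic separate from the AWS call, which makes it easy to try out with a stub function.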
Once the snapshot is ready, the script registers it as an AMI:
# Register the image
aws ec2 register-image \
  --region "${aws_region}" \
  --block-device-mappings "DeviceName=/dev/xvda,VirtualName=talos,Ebs={DeleteOnTermination=true,SnapshotId=${snapshot_id},VolumeSize=80,VolumeType=gp2}" \
  --root-device-name /dev/xvda \
  --virtualization-type hvm \
  --architecture x86_64 \
  --ena-support \
  --name "${amiName}"
Finally, check that the image exists and print its ID:
existing_image_id="$(aws ec2 describe-images \
  --region "${aws_region}" \
  --filters "Name=name,Values=${amiName}" \
  --query 'Images[0].ImageId' \
  --output text)"
echo "AMI id ${existing_image_id}"
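With `--output text`, the AWS CLI prints `None` when the query matches nothing, so a non-empty `existing_image_id` does not by itself prove the AMI exists. A minimal sketch of a stricter check, using a hypothetical helper named `is_ami_id`:

```shell
# Hypothetical helper: succeeds only when the value looks like an AMI ID.
# This rejects both an empty string and the literal "None".
is_ami_id() {
  case "$1" in
    ami-?*) return 0 ;;
    *) return 1 ;;
  esac
}

# existing_image_id comes from the describe-images call above.
if is_ami_id "${existing_image_id:-}"; then
  echo "AMI registered: ${existing_image_id}"
else
  echo "AMI not found" >&2
fi
```

The resulting AMI ID is the value to set as `awsAMI` on the GPU node in the cluster configuration.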
Create an AWS IAM role and attach the AWS managed AmazonEBSCSIDriverPolicy to it. This policy grants the permissions needed to manage Amazon EBS volumes through the Kubernetes Container Storage Interface (CSI). It enables Kubox to dynamically create EBS volumes, making it possible to run stateful workloads.
Run these commands to create the IAM role and a matching instance profile:
# Create the role and allow EC2 instances to assume it
aws iam create-role \
  --role-name KuboxEC2InstanceRole \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Service": "ec2.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }' \
  --description "Role for Kubox EC2 instances with AmazonEBSCSIDriverPolicy attached"
# Create an instance profile with the same name
aws iam create-instance-profile \
  --instance-profile-name KuboxEC2InstanceRole
# Add the role to the instance profile
aws iam add-role-to-instance-profile \
  --instance-profile-name KuboxEC2InstanceRole \
  --role-name KuboxEC2InstanceRole
Run this command to attach the AWS managed AmazonEBSCSIDriverPolicy policy:
aws iam attach-role-policy \
  --role-name KuboxEC2InstanceRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
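To confirm the attachment, one option is to list the managed policies on the role and look for the policy name. The sketch below wraps `aws iam list-attached-role-policies` in a hypothetical helper named `role_has_policy`.

```shell
# Hypothetical helper: succeeds when the named managed policy is attached to the role.
role_has_policy() {
  local role_name="$1" policy_name="$2"
  # Text output is tab-separated; put one policy name per line, then match a whole line.
  aws iam list-attached-role-policies \
    --role-name "$role_name" \
    --query 'AttachedPolicies[].PolicyName' \
    --output text | tr '\t' '\n' | grep -qx "$policy_name"
}

# Usage:
#   role_has_policy KuboxEC2InstanceRole AmazonEBSCSIDriverPolicy && echo "attached"
```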
The KuboxCreatorAccess policy is optional but highly recommended, as it provides the permissions that users or groups need to create and delete Kubox clusters. It allows the creation, management, and deletion of AWS resources, including EC2 instances, networking components (such as VPCs, subnets, security groups, and routes), and Elastic Load Balancing (ELB) resources. Because these permissions are broad, assign this policy only to trusted roles or groups. The policy document references your AWS account ID, so export it first (this command requires jq):
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --output json | jq -r .Account)
Run this command to create the policy.
aws iam create-policy \
  --policy-name KuboxCreatorAccess \
  --policy-document "$(cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:DeleteSubnet",
        "ec2:DescribeInstances",
        "ec2:DescribeSpotInstanceRequests",
        "ec2:DescribeInstanceAttribute",
        "ec2:CreateVpc",
        "ec2:AttachInternetGateway",
        "ec2:DescribeVpcAttribute",
        "ec2:AssociateRouteTable",
        "ec2:DescribeInternetGateways",
        "ec2:DescribeNetworkInterfaces",
        "ec2:StartInstances",
        "ec2:CreateRoute",
        "ec2:CreateInternetGateway",
        "ec2:RevokeSecurityGroupEgress",
        "ec2:GetSecurityGroupsForVpc",
        "ec2:CreateSecurityGroup",
        "ec2:DescribeVolumes",
        "ec2:DescribeAccountAttributes",
        "ec2:DeleteInternetGateway",
        "ec2:ModifyInstanceAttribute",
        "ec2:DescribeNetworkAcls",
        "ec2:DescribeRouteTables",
        "ec2:AuthorizeSecurityGroupEgress",
        "ec2:TerminateInstances",
        "ec2:DescribeTags",
        "ec2:CreateTags",
        "ec2:RegisterImage",
        "ec2:ImportSnapshot",
        "ec2:DeleteRoute",
        "ec2:RunInstances",
        "ec2:DetachInternetGateway",
        "ec2:StopInstances",
        "ec2:DisassociateRouteTable",
        "ec2:DescribeInstanceCreditSpecifications",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeImportSnapshotTasks",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:DescribeImages",
        "ec2:DescribeSecurityGroupRules",
        "ec2:DescribeVpcs",
        "ec2:DeleteSecurityGroup",
        "ec2:DescribeInstanceTypes",
        "ec2:DeleteVpc",
        "ec2:DescribeAvailabilityZones",
        "ec2:CreateSubnet",
        "ec2:DescribeSubnets"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:CreateLoadBalancer",
        "elasticloadbalancing:DescribeTags",
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:CreateTargetGroup",
        "elasticloadbalancing:DeregisterTargets",
        "elasticloadbalancing:DeleteTargetGroup",
        "elasticloadbalancing:DeleteLoadBalancer",
        "elasticloadbalancing:DescribeLoadBalancerAttributes",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:CreateListener",
        "elasticloadbalancing:DescribeTargetGroupAttributes",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:DescribeTargetGroups",
        "elasticloadbalancing:ModifyLoadBalancerAttributes",
        "elasticloadbalancing:ModifyTargetGroupAttributes",
        "elasticloadbalancing:DeleteListener"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KuboxEC2InstanceRole"
    }
  ]
}
EOF
)" \
  --description "Policy for creating and deleting AWS resources by Kubox"
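Once the policy exists, it still has to be assigned to whoever creates clusters. The sketch below shows one way to attach it to an IAM group. The helper names and the group name `kubox-admins` are hypothetical; substitute a trusted group of your own.

```shell
# Builds the ARN of the customer managed policy created above
# (assumes the policy was created with the default path "/").
kubox_policy_arn() {
  local account_id="$1"
  echo "arn:aws:iam::${account_id}:policy/KuboxCreatorAccess"
}

# Attaches the policy to an IAM group; AWS_ACCOUNT_ID is the variable exported earlier.
attach_kubox_policy_to_group() {
  local group_name="$1"
  aws iam attach-group-policy \
    --group-name "$group_name" \
    --policy-arn "$(kubox_policy_arn "${AWS_ACCOUNT_ID}")"
}

# Usage ("kubox-admins" is a hypothetical group name):
#   attach_kubox_policy_to_group kubox-admins
```

Attaching to a group rather than to individual users keeps the set of people with these broad permissions easy to review.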