This is a guest post by AWS Data Hero and co-founder of Conduktor, Stephane Maarek.
Deploying Apache Kafka on AWS is now easier, thanks to Amazon Managed Streaming for Apache Kafka (Amazon MSK). In a few clicks, it provides you with a production-ready Kafka cluster on which you can run your applications and create data streams.
Apache Kafka is an open-source project, and no official user interfaces are available. The lack of visibility into Apache Kafka is a factor in the slow development of applications.
The recent announcement of the Conduktor Platform makes Amazon MSK operations simple, and you can solve Kafka issues end to end with solutions for testing, monitoring, data quality, governance, and security.
You can use the Conduktor Platform to monitor both types of MSK clusters, provisioned and serverless. In this post, we demonstrate how to use AWS Identity and Access Management (IAM) based security to administer our MSK cluster.
We look at how we can deploy the Conduktor Platform on Amazon MSK in a production-ready deployment so you can try it out today.
The solution is fully serverless and customizable. Everything is deployed using AWS CloudFormation templates.
The source code and CloudFormation templates used in this post are available in the GitHub repo.
To implement this solution, we complete the following high-level steps:
- Deploy a CloudFormation template to create our customized Docker image for the Conduktor Platform using AWS CodeBuild.
- Optionally, deploy an MSK cluster in provisioned or serverless mode using a CloudFormation template.
- Deploy the Conduktor Platform as an AWS Fargate container against our MSK cluster using a CloudFormation template.
Create a customized configuration for the Conduktor Platform
The Conduktor Platform uses a YAML configuration file to define the cluster connection endpoints. Therefore, we must create a customized Docker image of the Conduktor Platform that’s able to connect to a cluster on Amazon MSK with a customized YAML file. For this, we use CodeBuild, and we store our configuration files in Amazon Simple Storage Service (Amazon S3). The final image is stored in Amazon Elastic Container Registry (Amazon ECR). The following diagram illustrates this workflow.
- Deploy the first CloudFormation template to create the following resources:
- An S3 bucket to store our configuration files.
- An ECR repository to store our final Docker image.
- A CodeBuild project to build that Docker image.
- An IAM role and policy to allow CodeBuild to perform the build.
Now we need to upload our files into Amazon S3.
- Upload the following files:
- The file buildspec.yml, which is used by CodeBuild to build our main Docker image.
- The Dockerfile, which contains instructions on how to build our final Docker image.
- The folder conduktor-platform-config (as is), which contains the configuration files to connect to Amazon MSK.
- At this stage, you can customize the conduktor-platform.yaml file so that it connects to one MSK cluster.
Alternatively, you can connect to multiple MSK clusters or external ones by specifying multiple Kafka bootstrap servers in the same file. You can also use the same configuration file to specify the schema registry URL, Kafka Connect connection details, and SSO.
A single-Region Conduktor Platform deployment can work for multi-Region MSK clusters, although natural latency is expected. For latency-sensitive usage, you can deploy this solution in every Region in which you’re using Amazon MSK.
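If you prefer to script the upload rather than use the console, the following is a minimal Python sketch of that step. It assumes the three items listed above sit in one local project directory and that their S3 keys should mirror their relative paths; the key layout, the function names `plan_uploads` and `upload_all`, and the bucket name in the usage comment are illustrative assumptions, so check them against what your buildspec.yml expects.

```python
from pathlib import Path


def plan_uploads(project_dir, config_folder="conduktor-platform-config"):
    """Return (local_path, s3_key) pairs for the files the build needs."""
    root = Path(project_dir)
    uploads = [
        (root / "buildspec.yml", "buildspec.yml"),
        (root / "Dockerfile", "Dockerfile"),
    ]
    # Keep the config folder "as is": each key mirrors the relative path.
    for path in sorted((root / config_folder).rglob("*")):
        if path.is_file():
            uploads.append((path, path.relative_to(root).as_posix()))
    return uploads


def upload_all(s3_client, bucket, uploads):
    """Upload every planned file with the S3 client's upload_file call."""
    for local_path, key in uploads:
        s3_client.upload_file(str(local_path), bucket, key)


# Usage (requires boto3 and AWS credentials; the bucket name is hypothetical):
#   import boto3
#   upload_all(boto3.client("s3"), "my-conduktor-config-bucket", plan_uploads("."))
```

Separating the plan from the upload lets you print and review the list of keys before anything is written to the bucket.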
After uploading the files and configurations to your S3 bucket, let’s run CodeBuild to generate a new image.
- On the CodeBuild console, navigate to the project and choose Start build.
The build should complete in about 3 minutes.
The final image is pushed to Amazon ECR by the script in our buildspec.yml file, which CodeBuild runs. We’re now done with our first step. Your Conduktor Platform setup can now fully connect to your MSK cluster.
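The same build can also be started from code instead of the console. The sketch below assumes a boto3 CodeBuild client is passed in and uses its `start_build` and `batch_get_builds` calls; the function name `run_build`, the polling interval, the timeout, and the project name in the usage comment are illustrative assumptions.

```python
import time


def run_build(codebuild, project_name, poll_seconds=15, timeout_seconds=900):
    """Start a CodeBuild build and poll until it is no longer IN_PROGRESS."""
    build_id = codebuild.start_build(projectName=project_name)["build"]["id"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        build = codebuild.batch_get_builds(ids=[build_id])["builds"][0]
        status = build["buildStatus"]
        if status != "IN_PROGRESS":
            # Typically SUCCEEDED, FAILED, FAULT, TIMED_OUT, or STOPPED.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"build {build_id} is still running")
        time.sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; the project name is hypothetical):
#   import boto3
#   print(run_build(boto3.client("codebuild"), "conduktor-platform-build"))
```

Any status other than SUCCEEDED means the image was not pushed, so check the build logs before moving on to the next step.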
Start the MSK cluster
If you already have an MSK cluster set up with IAM access control, you can skip this step. If not, you can create one using the provided CloudFormation template.
From the MSK cluster (the new one or existing one), retrieve two essential pieces of information:
We use IAM access control so that we only need to use IAM policies to connect to our cluster.
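As a hedged illustration, the sketch below reads connection details from an existing cluster with a boto3 `kafka` client. It assumes the two pieces of information are the IAM bootstrap servers string and the cluster's security group IDs, which is inferred from the parameters the final template asks for rather than stated here. The function name `msk_connection_details` is hypothetical.

```python
def msk_connection_details(kafka, cluster_arn):
    """Return the IAM bootstrap servers and security groups of an MSK cluster."""
    brokers = kafka.get_bootstrap_brokers(ClusterArn=cluster_arn)
    info = kafka.describe_cluster_v2(ClusterArn=cluster_arn)["ClusterInfo"]
    if info["ClusterType"] == "SERVERLESS":
        # Serverless clusters list security groups per VPC configuration.
        groups = [
            group_id
            for vpc_config in info["Serverless"]["VpcConfigs"]
            for group_id in vpc_config.get("SecurityGroupIds", [])
        ]
    else:
        groups = info["Provisioned"]["BrokerNodeGroupInfo"].get("SecurityGroups", [])
    return {
        # None when IAM access control is not enabled on the cluster.
        "bootstrap_servers": brokers.get("BootstrapBrokerStringSaslIam"),
        "security_group_ids": groups,
    }


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   print(msk_connection_details(boto3.client("kafka"), "arn:aws:kafka:..."))
```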
If you’re using another security mechanism (such as SASL/SCRAM), you need to modify the Conduktor configuration files with the right properties, upload them again into Amazon S3, and rebuild the Conduktor image using CodeBuild.
Conduktor supports every Kafka authentication method, including the ones supported by Amazon MSK: IAM access control, mutual TLS authentication, and user name/password using SASL/SCRAM.
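To make "the right properties" more concrete, here is a small sketch that renders the standard Kafka client properties for two of these mechanisms. The IAM values are the class names published by the aws-msk-iam-auth library, and the SCRAM values are the usual Apache Kafka settings with the SCRAM-SHA-512 mechanism that Amazon MSK uses. Where these lines belong inside the Conduktor configuration files is not shown here, and the function name `kafka_client_properties` is hypothetical.

```python
def kafka_client_properties(mechanism="iam", username=None, password=None):
    """Render Kafka client properties for an MSK authentication mechanism."""
    if mechanism == "iam":
        props = {
            "security.protocol": "SASL_SSL",
            "sasl.mechanism": "AWS_MSK_IAM",
            "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
            "sasl.client.callback.handler.class": (
                "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
            ),
        }
    elif mechanism == "scram":
        if not username or not password:
            raise ValueError("SASL/SCRAM needs a username and a password")
        props = {
            "security.protocol": "SASL_SSL",
            "sasl.mechanism": "SCRAM-SHA-512",
            "sasl.jaas.config": (
                "org.apache.kafka.common.security.scram.ScramLoginModule "
                f'required username="{username}" password="{password}";'
            ),
        }
    else:
        raise ValueError(f"unsupported mechanism: {mechanism}")
    return "\n".join(f"{key}={value}" for key, value in props.items())


print(kafka_client_properties("iam"))
```

With IAM access control, no secret appears in the properties, because the credentials come from the IAM role attached to the running task.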
Deploy the Conduktor Platform on Amazon ECS with Fargate
The last step is to deploy the Conduktor Platform. For this, we prefer running serverless solutions using Amazon Elastic Container Service (Amazon ECS) with Fargate. This allows you to right-size your containers in the future if your usage of Conduktor grows over time.
Conduktor stores persistent data in the /var/conduktor file system folder. It uses this folder to store configuration, cache computation results, store logs, and run an internal database (for example, if you start creating data masking rules). For the persistence layer, we use Amazon Elastic File System (Amazon EFS), an elastic network file system that can be mounted on Fargate.
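The provided CloudFormation template takes care of this wiring. Purely as an illustration of how an EFS volume is attached to a Fargate task, the sketch below builds the arguments for the ECS `register_task_definition` call. The family and volume names, the CPU and memory sizes, and container port 8080 are assumptions made for this example and are not taken from the template.

```python
def conduktor_task_definition(image_uri, file_system_id, execution_role_arn, task_role_arn):
    """Build register_task_definition arguments with EFS mounted at /var/conduktor."""
    volume_name = "conduktor-data"  # hypothetical volume name
    return {
        "family": "conduktor-platform",  # hypothetical family name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # the network mode Fargate requires
        "cpu": "2048",  # assumed size: 2 vCPU
        "memory": "4096",  # assumed size: 4 GB
        "executionRoleArn": execution_role_arn,
        "taskRoleArn": task_role_arn,  # role whose IAM policies grant MSK access
        "volumes": [
            {
                "name": volume_name,
                "efsVolumeConfiguration": {
                    "fileSystemId": file_system_id,
                    "transitEncryption": "ENABLED",
                },
            }
        ],
        "containerDefinitions": [
            {
                "name": "conduktor-platform",
                "image": image_uri,
                "essential": True,
                "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],  # assumed port
                "mountPoints": [
                    {
                        "sourceVolume": volume_name,
                        "containerPath": "/var/conduktor",
                        "readOnly": False,
                    }
                ],
            }
        ],
    }


# Usage (requires boto3 and AWS credentials; all identifiers are placeholders):
#   import boto3
#   boto3.client("ecs").register_task_definition(
#       **conduktor_task_definition(image_uri, "fs-...", exec_role_arn, task_role_arn))
```

Because the data lives on the EFS volume and not in the container, a replaced or resized task picks up the same configuration, logs, and internal database.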
Finally, we expose our Fargate container through an Application Load Balancer, giving us a public static DNS endpoint to reach the Conduktor Platform and full control over the network security for accessing it. The following diagram illustrates our architecture.
We deploy our last CloudFormation file and specify some important parameters:
- MSKBookstrapServersURL – This parameter is necessary to tell Conduktor which MSK cluster to connect to
- MSKSecurityGroupID – The MSK security group is necessary to allow the template to add a security group ingress rule to it, thereby allowing our ECS task to connect to the MSK cluster
- PublicSubnetIDs – The public subnet IDs are for your Application Load Balancer
- SubnetIDs – The subnet IDs are for your ECS task and can be the same subnets or private subnets (as long as they have access to the MSK cluster and the other public subnets)
- VpcID – This is the VPC you’re deploying to
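If you deploy this template from code instead of the console, the parameters above can be assembled as in the following sketch, which uses the CloudFormation `create_stack` call and its `stack_create_complete` waiter. The parameter keys are copied exactly as they are spelled in the list above, so verify them against the template in the repository. The function names, the stack name, and the choice of capabilities are assumptions.

```python
def conduktor_stack_parameters(bootstrap_servers, msk_security_group_id,
                               public_subnet_ids, subnet_ids, vpc_id):
    """Build the CloudFormation Parameters list for the last template."""
    values = {
        "MSKBookstrapServersURL": bootstrap_servers,  # spelled as in the text
        "MSKSecurityGroupID": msk_security_group_id,
        # List-type parameters are passed as comma-separated strings.
        "PublicSubnetIDs": ",".join(public_subnet_ids),
        "SubnetIDs": ",".join(subnet_ids),
        "VpcID": vpc_id,
    }
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]


def deploy_stack(cloudformation, stack_name, template_body, parameters):
    """Create the stack and block until creation completes."""
    cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        # Assumed: the template creates IAM roles, which needs an acknowledgment.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    )
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)


# Usage (requires boto3 and AWS credentials; the stack name is hypothetical):
#   import boto3
#   body = open("3 – create ECS Service.yaml").read()
#   deploy_stack(boto3.client("cloudformation"), "conduktor-ecs", body, params)
```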
After deploying the template, on the Outputs tab of the stack, you can find the Application Load Balancer URL.
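The same outputs can be read programmatically. This minimal sketch uses the CloudFormation `describe_stacks` call and returns every output as a dictionary, because the exact output key that holds the load balancer URL is not named here. The function name and the stack name in the usage comment are hypothetical.

```python
def stack_outputs(cloudformation, stack_name):
    """Return the outputs of a CloudFormation stack as a plain dictionary."""
    stack = cloudformation.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {
        output["OutputKey"]: output["OutputValue"]
        for output in stack.get("Outputs", [])
    }


# Usage (requires boto3 and AWS credentials; the stack name is hypothetical):
#   import boto3
#   for key, value in stack_outputs(boto3.client("cloudformation"), "conduktor-ecs").items():
#       print(key, value)
```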
We use this URL and log in to the Conduktor Platform with the user name [email protected] and password password. These login credentials can be changed using the YAML configuration file, and you can even enable SSO and LDAP.
On the Conduktor console, you can start creating topics, producing data, consuming data, and much more! AWS Glue Schema Registry support is coming soon, and Confluent Schema Registry compatibility is already available.
To clean up your AWS account, perform the following steps in order:
- Delete the third CloudFormation template (3 – create ECS Service.yaml).
- Delete the second CloudFormation template (2 – create MSK cluster.yaml).
- Empty the contents of your S3 bucket.
- Delete all your images in your ECR repository.
- Delete the first CloudFormation template (1 – base conduktor.yaml).
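The order matters because the first stack owns the S3 bucket and the ECR repository, and CloudFormation can’t delete either while it still holds content. The following sketch runs the steps above in the same order with boto3 clients that are passed in. The stack, bucket, and repository names are placeholders you supply, the function names are hypothetical, and a versioned bucket would additionally need its object versions removed.

```python
def delete_stack_and_wait(cloudformation, stack_name):
    """Delete one stack and block until the deletion completes."""
    cloudformation.delete_stack(StackName=stack_name)
    cloudformation.get_waiter("stack_delete_complete").wait(StackName=stack_name)


def empty_bucket(s3, bucket):
    """Delete every object in a non-versioned bucket, one page at a time."""
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        objects = [{"Key": item["Key"]} for item in page.get("Contents", [])]
        if objects:
            s3.delete_objects(Bucket=bucket, Delete={"Objects": objects})


def delete_all_images(ecr, repository):
    """Collect all image IDs first, then delete them in batches of 100."""
    image_ids = []
    for page in ecr.get_paginator("list_images").paginate(repositoryName=repository):
        image_ids.extend(page.get("imageIds", []))
    for start in range(0, len(image_ids), 100):
        ecr.batch_delete_image(
            repositoryName=repository, imageIds=image_ids[start:start + 100]
        )


def clean_up(cloudformation, s3, ecr, stacks, bucket, repository):
    """Run the cleanup steps in order; stacks = (ecs, msk or None, base)."""
    ecs_stack, msk_stack, base_stack = stacks
    delete_stack_and_wait(cloudformation, ecs_stack)
    if msk_stack:  # skipped when you used an existing MSK cluster
        delete_stack_and_wait(cloudformation, msk_stack)
    empty_bucket(s3, bucket)
    delete_all_images(ecr, repository)
    delete_stack_and_wait(cloudformation, base_stack)
```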
You can use the Conduktor Platform against as many MSK clusters as desired by editing the file conduktor-platform.yaml. You can even connect to your clusters running elsewhere, for example on Amazon Elastic Compute Cloud (Amazon EC2).
On our roadmap, we’re working on a complete integration with Amazon MSK, including AWS Glue Schema Registry support, Amazon MSK Connect support, and full monitoring capabilities.
The Conduktor Platform offers a limited free tier with no time limit. Head to Conduktor’s Get Started page and create an account to start using the Platform alongside MSK clusters today.
About the Author
Stéphane Maarek is the co-founder of Conduktor. He’s also the lead instructor on Udemy for learning Apache Kafka and AWS Certifications, having taught these technologies to over 1.5 million learners. Through Conduktor, he wants to democratize access to Apache Kafka and make its usage seamless and enterprise-ready.