Case Study - Kafka migration to Amazon MSK for improved visibility, scalability, security and cost control - Peachjar
Peachjar is a SaaS platform for delivering communications in the education market. They have a containerized, distributed solution that sends out millions of emails, newsletters, and other communications to teachers, students, and parents. Communication between services on the platform is critical infrastructure for Peachjar, and they brought on Stratus10 to help migrate their Kafka messaging system off a third-party hosting provider and onto their own infrastructure.
With Kafka running in their AWS environment on MSK, Peachjar was able to reduce costs, improve visibility into their infrastructure, leverage infrastructure as code, and control the scalability of their infrastructure.
- Peachjar brought their Kafka infrastructure in-house, giving them maximum control over configuration, monitoring, and maintenance while reducing costs.
- Stratus10 reduced the hosting cost of Kafka by more than 20%, lowered data egress costs substantially, and gave full visibility into how messages flow through the Peachjar system.
- Peachjar is in a better position to recover from disasters and outages, and is able to proactively monitor their environment.
Peachjar is a medium-size educational technology company founded in 2011 that produces eLearning, communication, and other software solutions targeted at increasing parent and student engagement in the learning environment. Their target market is K-12 school districts.
About Kafka: Apache Kafka is an open-source distributed event streaming platform capable of handling trillions of events a day. It was initially conceived as a messaging queue, but since being created and open sourced by LinkedIn in 2011, Kafka has evolved into a full-fledged event streaming platform.
Peachjar’s platform is a distributed, microservice architecture consisting of 60+ services and hundreds of containers. These services communicate with each other using Kafka topics, and they send hundreds of thousands of messages per day. Peachjar needed to optimize their hosting costs, improve their monitoring and alerting capabilities, and sought more granular control of their environment.
Peachjar was spending a substantial portion of their cloud budget on the third-party hosted Kafka solution. Although the third-party provider was also hosted in AWS, they were not able to provide VPC endpoints, so Peachjar incurred data egress costs for all messaging between services. In addition, the cluster was excessively large for the traffic it carried, which wasted infrastructure budget.
Additionally, Peachjar needed to improve their monitoring and alerting capabilities with Kafka. With their vendor, they had little visibility into the performance of the cluster, and weren’t able to proactively set alerts when important events occurred.
Finally, Peachjar needed more fine-grained control of their Kafka environment. They wanted to manage scaling of the cluster so they would not have to over-provision infrastructure, and wanted the ability to modify how messages were serialized and transported.
Why AWS and Stratus10
Amazon Managed Streaming for Apache Kafka (MSK) provides Peachjar with a scalable and affordable solution for hosting Kafka. It gives Peachjar maximum control over and monitoring of their cluster while avoiding the operational overhead of hosting the infrastructure on their own. Finally, through CloudFormation, MSK is deployed repeatably across their environments, which further automates their deployment and disaster recovery capabilities.
Stratus10 was the perfect partner to assist Peachjar in safely migrating a core component of their application to AWS. Our expertise across AWS’ suite of services ensured that Peachjar’s solution would be delivered using the best combination of tools available. Stratus10’s experience in delivering complex containerized solutions across the AWS ecosystem as well as previous projects with Peachjar gave the customer confidence that all scaling, cost, and performance targets would be met.
Stratus10 provided Peachjar with an automated, infrastructure-as-code approach to delivering the Kafka infrastructure. The CloudFormation template is deployed as a nested stack alongside other components of their environment that Stratus10 helped automate.
The Kafka cluster is deployed to Peachjar’s VPCs and leverages the same availability zones as their EKS cluster, ensuring that all network traffic remains local and does not incur any regional or egress costs.
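The deployment described above can be sketched as a small template generator. This is an illustrative sketch only, not Peachjar's actual template: the function name, the logical ID `KafkaCluster`, and every argument value are hypothetical. It assumes the cluster is declared as an `AWS::MSK::Cluster` resource whose client subnets are the same ones the EKS worker nodes use, and it enforces the MSK rule that the broker count is a multiple of the number of client subnets.

```python
import json


def msk_cluster_template(cluster_name, kafka_version, broker_count,
                         instance_type, subnet_ids, security_group_ids):
    """Return a CloudFormation template (as a dict) declaring one MSK cluster.

    All names and values are hypothetical placeholders for illustration.
    """
    # MSK spreads brokers evenly across the client subnets (one per AZ),
    # so the broker count must be a multiple of the subnet count.
    if not subnet_ids or broker_count % len(subnet_ids) != 0:
        raise ValueError(
            "broker_count must be a positive multiple of the number of subnets"
        )
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "KafkaCluster": {
                "Type": "AWS::MSK::Cluster",
                "Properties": {
                    "ClusterName": cluster_name,
                    "KafkaVersion": kafka_version,
                    "NumberOfBrokerNodes": broker_count,
                    "BrokerNodeGroupInfo": {
                        "InstanceType": instance_type,
                        # Assumption: these are the subnets the EKS nodes use,
                        # so brokers land in the same availability zones.
                        "ClientSubnets": list(subnet_ids),
                        "SecurityGroups": list(security_group_ids),
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    template = msk_cluster_template(
        "example-kafka", "2.8.1", 3, "kafka.m5.large",
        ["subnet-aaa", "subnet-bbb", "subnet-ccc"], ["sg-123"],
    )
    print(json.dumps(template, indent=2))
```

The generated JSON could be referenced from a parent stack as a nested stack. Generating it from code is a choice made here for illustration; a hand-written YAML template would serve equally well.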
Results and Benefits
Stratus10 was able to deliver a solution for Peachjar that addressed all of their challenges with the third-party solution.
Stratus10 helped Peachjar reduce their Kafka hosting costs by more than 20%. By leveraging an autoscaling cluster, Peachjar was able to reduce both the size of the cluster and the VM size to better align with their utilization. In addition, all data transfer costs were eliminated, freeing up infrastructure budget for other projects.
Stratus10 was also able to improve Peachjar’s visibility into their cluster. By leveraging CloudWatch, Peachjar is able to view granular metrics about their cluster’s health, topic sizes, scaling operations, and more. Dashboards were created to make monitoring the cluster easier, and alerts were set up and delivered through CloudFormation that notify both Stratus10 and Peachjar if any issues arise with the cluster.
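As a rough illustration of an alert delivered through CloudFormation, the sketch below builds one `AWS::CloudWatch::Alarm` resource for a broker's disk usage. The specific alarms Stratus10 configured are not described in this case study, so the metric choice, threshold, evaluation window, function name, and SNS topic ARN are all assumptions. `AWS/Kafka`, `KafkaDataLogsDiskUsed`, and the `Cluster Name` / `Broker ID` dimensions are the names MSK publishes to CloudWatch.

```python
def broker_disk_alarm(cluster_name, broker_id, sns_topic_arn,
                      threshold_percent=80.0):
    """Return a CloudFormation resource dict for a per-broker disk alarm.

    Hypothetical example: the threshold and evaluation window are
    placeholders, not values taken from the Peachjar deployment.
    """
    return {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "AlarmName": f"{cluster_name}-broker-{broker_id}-disk-used",
            "Namespace": "AWS/Kafka",
            # Percentage of disk space used for Kafka data logs.
            "MetricName": "KafkaDataLogsDiskUsed",
            "Dimensions": [
                {"Name": "Cluster Name", "Value": cluster_name},
                {"Name": "Broker ID", "Value": str(broker_id)},
            ],
            "Statistic": "Maximum",
            "Period": 300,            # seconds per datapoint
            "EvaluationPeriods": 3,   # 3 x 5 min above threshold to alarm
            "Threshold": float(threshold_percent),
            "ComparisonOperator": "GreaterThanThreshold",
            # Assumption: notifications fan out from an SNS topic.
            "AlarmActions": [sns_topic_arn],
        },
    }


if __name__ == "__main__":
    alarm = broker_disk_alarm(
        "example-kafka", 1,
        "arn:aws:sns:us-east-1:123456789012:example-alerts",
    )
    print(alarm["Properties"]["AlarmName"])
```

One alarm per broker keeps a single full disk from being hidden by a cluster-wide average; subscribing both teams to the same SNS topic is one way to notify Stratus10 and Peachjar together.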
Finally, Peachjar needed tighter control over their critical infrastructure. They needed the ability to deploy new environments easily, implement disaster recovery, and improve the scalability of the infrastructure. Using MSK and CloudFormation, Stratus10 was able to address each of these items.
Stratus10 is an Advanced AWS Partner Network (APN) Consulting Partner that helps companies migrate to the cloud or, if they are already on AWS, implement best practices. We specialize in application modernization, DevOps, migration, security, and cost optimization to help our clients take full advantage of the latest technologies AWS has to offer.
Use case: Kafka
Date: April 2022
Category: Big data, DevOps