redBus has been an early adopter of cloud technology and has been on AWS almost since AWS's inception. We run all our workloads on the AWS cloud, and as a company we have been part of AWS's various phases over the last 11-odd years.
Over the years, redBus has grown to be the leading online bus ticketing company, and we have adopted many of the services offered by AWS during our various stages of growth. Below is a snapshot of how redBus has progressed over the years and which AWS services we have used along the way.
Generation 1: S3, EC2 Classic, RDS, Route 53
Generation 2: CloudFront, SQS, SNS, ElastiCache
Generation 3: VPC, Redshift, EMR, Kinesis, Lambda
Generation 4: Athena, SageMaker, Containers (EKS, ECS) and most recently Fargate.
It is key to note that over the years we have been able to find the right mix of services and stack that has enabled us to scale and build out newer features for our customers.
As it should be with any organization, data is at the heart of almost everything we do at redBus. As for volumes, we generate terabytes of data every month. Our sources range from inventory data to transactional/payment data to user and usage data. We use these for a variety of analyses — comparative analysis, demand analysis, pattern analysis and more — to make key business and process decisions. For example, we analyse the success rates of our various payment gateways and modes, identify which gateway is working best on which channel, and have built the ability to dynamically switch gateways when the trend shifts.
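The gateway-switching idea above can be sketched in a few lines. This is a minimal illustrative sketch, not redBus's actual implementation: the gateway names, the sliding-window size, and the optimistic default for gateways with no history are all assumptions.

```python
from collections import deque


class GatewaySelector:
    """Route transactions to the payment gateway with the best recent
    success rate. Illustrative sketch only -- gateway names, window size,
    and defaults are assumptions, not redBus's actual values."""

    def __init__(self, gateways, window=100):
        # Sliding window of the most recent outcomes per gateway.
        self.history = {g: deque(maxlen=window) for g in gateways}

    def record(self, gateway, success):
        """Record one transaction outcome (True = success)."""
        self.history[gateway].append(1 if success else 0)

    def success_rate(self, gateway):
        h = self.history[gateway]
        # Optimistic default so a brand-new gateway still gets traffic.
        return sum(h) / len(h) if h else 1.0

    def pick(self):
        """Pick the gateway trending best right now."""
        return max(self.history, key=self.success_rate)
```

In a real system the same selection would typically be kept per channel (web, app, etc.), since the article notes that different gateways perform differently on different channels.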
To store, process and analyse all this data we use a range of AWS tools including Redshift, EMR and Athena. Redshift's ability to scale for peak workloads and to work on really large data sets has helped us produce analyses that would otherwise have taken much longer. Additionally, we started using Athena when it was released, and adopting Parquet on Athena has helped us store and analyse data in S3 more efficiently.
How can we have an article nowadays without the mandatory ML and AI mention? Outside of data processing for analytics and such, there are a few use cases within redBus where we need to leverage Machine Learning.
Since all our applications, workloads and data are on AWS, the obvious choice was to leverage AWS SageMaker. Our recent review-classification feature has been built on it. redBus has also incorporated AWS Lambda into many of its internal applications. Lambda allows one to run code without provisioning or managing servers, and it scales automatically. This is particularly interesting for use cases that are compute intensive but run only at specific intervals — like a report service that runs once or twice a day and may need to pore over huge volumes of data before computing and sending results. We have other Lambda use cases too: we use it to process and compress images uploaded by customers, and to run the backend of our Alexa skill and our self-help chat bot.
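The periodic-report use case above maps naturally onto Lambda's handler model. Here is a minimal sketch of what such a handler could look like; the event shape, field names and aggregation are illustrative assumptions — a real handler would read its data from S3 or Redshift and deliver the report via e-mail or SNS rather than just returning it.

```python
# Minimal sketch of a Lambda-style handler for a periodic report job.
# Event shape and field names ("records", "channel", "amount") are
# illustrative assumptions, not redBus's actual schema.

def report_handler(event, context=None):
    """Aggregate booking records from the event and return a summary."""
    records = event.get("records", [])
    total = sum(r["amount"] for r in records)

    # Break the total down per sales channel (web, app, ...).
    by_channel = {}
    for r in records:
        by_channel[r["channel"]] = by_channel.get(r["channel"], 0) + r["amount"]

    return {
        "count": len(records),
        "total_amount": total,
        "by_channel": by_channel,
    }
```

Wired to a scheduled trigger (e.g. an EventBridge/CloudWatch rule firing once or twice a day), a function like this incurs compute cost only while it runs, which is what makes Lambda attractive for bursty periodic workloads.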
While AWS provides a great deal of flexibility to launch instances or services and scale up quickly, the one thing that bothers almost everyone is cost. In our experience on AWS, they have continuously worked towards bringing prices down. That said, costs can really spiral if you do not keep an eye on them.
We at redBus have seen cost spikes for a few reasons. The primary one has been instances launched for temporary scale-ups, small workloads or POCs not being shut down after use. Another is using older-generation or wrong instance types for a workload — for example, using a compute-optimized instance for a workload that does not need one. In such cases you can end up spending much more than you ideally should.
However, all is not grey and bleak. There are a few things you can do to bring costs down.
1. Purchase and optimally utilize Reserved Instances (RIs). With good planning, purchasing and optimal usage, we have seen roughly 30%+ savings. Longer commitments with upfront payment can save even more — but you would need to be extremely sure of a long-term commitment, which depends on your business, your system architecture and how you expect your system to evolve.
2. Look at utilizing Lambdas where possible. Moving short-burst workloads to Lambda has saved us almost 70% in costs.
3. Monitor and shut down unused instances regularly. We have seen this can be a big source of leakage.
4. Look at Spot Instances. redBus has moved some of its workloads onto Spot Instances over the last few quarters with good savings, and we plan to move more where possible.
5. Finally, take a hard look at your architecture. Over a few years, code can become bulky and non-optimized. I also firmly believe that simplifying systems not only helps you scale faster, but also makes management, distribution and the rest dramatically easier. I can assure you, based on the work we have done at redBus, that we have saved good money by simplifying some of our systems. We also work closely with AWS SMEs on Well-Architected reviews to identify areas of improvement.
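To make the savings figures in the list above concrete, here is a rough back-of-the-envelope calculation. The hourly price and run-time assumptions are made-up placeholders, not actual AWS rates; only the discount percentages (~30%+ for RIs, ~70% for moving short-burst work to Lambda) come from our experience described above.

```python
# Back-of-the-envelope cost comparison. All prices and durations are
# made-up placeholders; only the savings percentages (~30% RI, ~70%
# Lambda) reflect the figures quoted in this article.

HOURS_PER_MONTH = 730


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Monthly cost for an instance billed at hourly_rate, running `hours`."""
    return hourly_rate * hours


# Always-on workload: hypothetical $0.10/hr instance.
on_demand = monthly_cost(0.10)
reserved = on_demand * (1 - 0.30)  # ~30% RI savings

# Short-burst workload: runs only ~2 hours a day (~60 hrs/month).
burst_on_demand = monthly_cost(0.10, hours=60)
burst_on_lambda = burst_on_demand * (1 - 0.70)  # ~70% Lambda savings

print(f"always-on, on-demand: ${on_demand:.2f}/month")
print(f"always-on, reserved:  ${reserved:.2f}/month")
print(f"burst, on-demand:     ${burst_on_demand:.2f}/month")
print(f"burst, on Lambda:     ${burst_on_lambda:.2f}/month")
```

The point of the exercise: for always-on workloads the RI discount dominates, while for workloads that run only a few hours a day, paying per invocation (Lambda) beats keeping an instance warm — which is why the two recommendations target different workload shapes.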
I hope this article gives you some insight into how we leverage AWS at redBus, and that our experiences in optimizing costs help you in some way.