How Teracloud helped Gerald add new services for its customers faster and at lower cost
Executive Summary
Gerald is on a mission to eliminate stress about paying bills on time. To do so, Gerald must ingest large volumes of information from different sources, process it, and generate distinct data assets that deliver its business value. Teracloud engineered a flexible model of data ingestion, processing, and storage pipelines that accommodates all use cases. With this model, Gerald reduced the cost and time needed to add new bill sources to its platform.
About Gerald Tech
Paying bills sucks. It sucks even more for about a third of Americans living paycheck to paycheck and struggling to pay bills on time. Gerald is on a mission to eliminate stress about it. Gerald is transforming bill payments by providing consumers with an application for linking and automatically paying their household bills while offering overdraft and late fee protection for all their biller accounts. An app tracks and pays bills so the users don’t have to.
Customer Challenge
Gerald’s continuous expansion to new services and new platforms required a solid data strategy. New data streams, new processing approaches, more insights, and better service to customers all mean more time and effort devoted to data. After the initial implementation, Gerald needed a systematic approach to data collection, processing, and reporting. Operating in the financial market also means handling sensitive customer data, so everything had to comply with the strictest financial regulations.
If the company had continued with hand-made solutions customized to each data source, it would not have been able to scale quickly enough, and its growth would have suffered as more and more services were added.
Main Challenges
- Security: Dealing with financial data.
- Scalability: More services are added every day, so scaling is one of the most important factors for the company. More data also requires a more intelligent approach to storage, so a data lake is included as a way to consolidate data.
- Reliability: Late or incorrect information results in a direct penalty for customers and, consequently, for the bottom line.
- Performance: Payment deadlines mean that processing bottlenecks cannot be allowed to introduce delays.
Why AWS
AWS was chosen because it helps create secure, high-performance, resilient, and efficient infrastructure for workloads and applications, which is what Gerald needed to build a strong data strategy.
Using AWS Lambda, AWS Glue, and Amazon EMR, we achieve our processing goals faster and at lower cost. The data lake capabilities of Amazon S3 and (in the future) AWS Lake Formation also mean lower cost and less development effort for data storage, long-term analytics, and machine learning. Finally, being able to integrate data storage, analysis, presentation, and usage in a serverless environment means lower costs overall.
Why Gerald chose Teracloud
Gerald wanted an experienced AWS partner with a strong background in data processing pipelines, a strong security record, and the capabilities to build a secure, scalable, high-performing, resilient, and efficient infrastructure for its applications and workloads. Teracloud had demonstrated these capabilities on past projects, along with fast implementation on similar initiatives.
Teracloud Solution
The solution was structured around 6 core areas and aligned with the AWS Well-Architected Framework:
- Core customer requirements
- Security
- Performance
- Reliability
- Cost optimization
- Operational excellence
The core of the solution is a blueprint for data pipelines that can be quickly adapted, deployed, and connected to Gerald’s existing internal services. After the initial design, this blueprint was used to implement the first set of data pipelines, both to validate the approach and to start feeding data to Gerald’s core algorithms.
The high-level design of the generic pipeline is presented below:
In this design, we leverage the power of AWS services to reach our goals faster and at lower cost:
- The AWS Lambda function that starts the pipeline can be triggered by external providers, by a cron job, or in response to an event inside Gerald’s infrastructure.
- AWS Glue enables us to run ETL jobs and catalog the data in a centralized way. It also allows us to share the catalogs with any other tool we decide to use in the future.
- Amazon EMR gives us the power to transform the data efficiently and with maximum flexibility, and using transient clusters lets us minimize cost.
- Adding Amazon SageMaker to the generic pipeline lets us add machine learning capabilities to any pipeline that requires them.
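To make the entry point and the transient-cluster idea more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Gerald's actual implementation: the function names, the bucket paths, the instance types, the release label, and the mapping of event shapes to trigger types are all hypothetical. The dictionary it builds follows the shape of the keyword arguments accepted by boto3's EMR `run_job_flow` call, and setting `KeepJobFlowAliveWhenNoSteps` to `False` is what makes the cluster transient.

```python
from typing import Any, Dict


def classify_trigger(event: Dict[str, Any]) -> str:
    """Guess which kind of caller started the pipeline (assumed event shapes)."""
    # EventBridge scheduled rules (the "cron job" case) set source to "aws.events".
    if event.get("source") == "aws.events":
        return "schedule"
    # S3 notifications arrive as a list of records whose eventSource is "aws:s3";
    # used here as a stand-in for an event inside the company's infrastructure.
    records = event.get("Records") or []
    if records and records[0].get("eventSource") == "aws:s3":
        return "internal_event"
    # API Gateway proxy events carry a requestContext; assumed here to be
    # the path external providers use.
    if "requestContext" in event:
        return "external_provider"
    return "unknown"


def transient_emr_request(pipeline: str, script_uri: str, log_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for an EMR run_job_flow call (values are assumptions)."""
    return {
        "Name": f"{pipeline}-transient",
        "ReleaseLabel": "emr-6.15.0",
        "LogUri": log_uri,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Transient cluster: terminate once the last step has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "transform",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Lambda-style entry point: identify the trigger, then prepare the cluster request."""
    trigger = classify_trigger(event)
    request = transient_emr_request(
        pipeline="bills-ingest",  # hypothetical pipeline name
        script_uri="s3://example-data-lake/jobs/transform.py",  # hypothetical path
        log_uri="s3://example-data-lake/emr-logs/",  # hypothetical path
    )
    # A deployed function would now submit the request, for example:
    #   boto3.client("emr").run_job_flow(**request)
    return {"trigger": trigger, "cluster_name": request["Name"]}
```

Keeping the request construction separate from the AWS call means the same blueprint can be unit-tested locally and reused for each new data source by changing only the pipeline name and script location.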
Results and Benefits
The new pipeline model results in immediate benefits versus the custom-made pipelines by leveraging the power of the AWS services. Some results and benefits are listed below, grouped by area of impact:
- Reliability: With the new architecture design and implementation we achieved the following benefits:
  - 99.999% availability of the pipeline.
  - Auto-scaling capabilities, because any number of parallel jobs can be triggered and processed by AWS Glue and EMR.
  - Failures can be isolated and fixed at the step where they occur, thanks to the step-by-step nature of the data processing pipeline.
- Performance: Running the main processing on transient EMR clusters and using Glue to keep the data schema up to date means minimal waste.
- Security: Teracloud implemented best practices when designing the pipelines and the subsystems that interact with them, such as:
  - Using fine-grained IAM permissions for roles.
  - Using a 3-tier architecture for the VPC and isolating the data in the more restricted data zone.
  - Enabling a data-access policy for S3 bucket access.
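As an illustration of what fine-grained IAM permissions and an S3 data-access policy can look like, the sketch below builds two policy documents as plain Python dictionaries. The bucket name, prefix, and function names are hypothetical, and the specific statements (read access limited to one prefix, and a bucket policy that denies requests not sent over TLS) are common AWS patterns chosen for the example, not a description of the policies Teracloud actually deployed.

```python
import json
from typing import Any, Dict


def pipeline_read_policy(bucket: str, prefix: str) -> Dict[str, Any]:
    """Identity policy for a pipeline role: read-only access to a single prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is limited to keys under the pipeline's own prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


def deny_insecure_transport_policy(bucket: str) -> Dict[str, Any]:
    """Bucket policy that rejects any request not made over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, printed as JSON ready to attach to a role.
    print(json.dumps(pipeline_read_policy("example-data-lake", "raw/bills"), indent=2))
```

Scoping each pipeline role to its own prefix keeps a misconfigured or compromised job from reading data that belongs to other sources.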
Next Steps
Gerald’s continuous growth means it requires more and more IT services to support and optimize this growth. The excellent experience of working with Teracloud as an AWS certified partner allowed us to start two lines of work:
1. Continue supporting the addition of new data sources, each with its own transformation and storage needs.
2. Extend Teracloud’s involvement in the design and development of other applications in Gerald’s ecosystem.
About the Partner
Teracloud is a fast-growing AWS Advanced Consulting Partner founded by certified cloud experts who specialize in migrating and deploying startups, enterprises, and everything in between to the cloud.
We have worked for companies in many different industries, such as airlines, healthcare, education, and e-commerce, designing, implementing, and managing cloud workloads with HA architecture under a 99.999% uptime SLA and PCI/HIPAA compliance requirements.
We also have a strong commitment to the cloud community: we host a local branch of the AWS User Group and meetups in support of education, evangelization, and evolution of the IT community. As an Amazon Web Services partner, we were invited to participate as speakers in AWS Community Day Buenos Aires 2019 and in AWS re:Invent 2019, the main AWS conference, in Las Vegas.