About the Client
Our client is one of the world’s largest online payment processing companies. The client wanted to build a robust, highly scalable application to process and store incoming click events in real time.
Client Challenges/Pain Points
- Architectural challenges in combining stream and batch events to ensure the resiliency of the input data
- Extracting, transforming, and merging unstructured events from multiple sources
- High volume of data
Reliant Vision introduced data science and analytics solutions to address these challenges. Our data scientists used open-source technologies, Apache Kafka and Hadoop HDFS, to ingest data from the various sources. Our analytics experts then processed the incoming big data with the Spark computing framework and stored the results in an Apache Hive warehouse for querying and analysis.
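A pipeline of this shape is commonly expressed as a Spark Structured Streaming job that reads click events from Kafka and writes them to HDFS-backed storage with checkpointing. The sketch below is illustrative only: the broker address, topic name, event schema, and HDFS paths are assumptions, not details from the engagement. The PySpark imports are kept inside the function so the sketch can be defined without a Spark installation present.

```python
def build_click_pipeline(brokers="broker:9092", topic="click-events"):
    """Sketch of a Kafka -> Spark -> HDFS/Hive streaming pipeline.

    All names (brokers, topic, schema fields, paths) are hypothetical
    placeholders for whatever the real deployment uses.
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    spark = (SparkSession.builder
             .appName("ClickEventPipeline")
             .enableHiveSupport()        # so processed data is queryable via Hive
             .getOrCreate())

    # Assumed shape of one click event; adjust to the real payload.
    schema = StructType([
        StructField("event_id", StringType()),
        StructField("page_url", StringType()),
        StructField("ts", TimestampType()),
    ])

    # Ingest the raw Kafka stream and parse the JSON payload.
    clicks = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", brokers)
              .option("subscribe", topic)
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    # Persist to HDFS with a checkpoint location, which is what gives
    # the pipeline restart/fault tolerance.
    return (clicks.writeStream
            .format("parquet")
            .option("path", "hdfs:///warehouse/clicks")
            .option("checkpointLocation", "hdfs:///checkpoints/clicks")
            .trigger(processingTime="1 minute")
            .start())
```

In production the output table would typically be registered in the Hive metastore so analysts can query it directly; the exact write format and trigger interval depend on latency and storage requirements.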
The technologies used were Apache Kafka, Hadoop HDFS, the Spark computing framework, and Apache Hive.
The major benefits that the customer received are:
- A full day’s worth of data processed in 15 minutes – 100 million real-time events handled per hour
- Brought together data from several sources creating a single source of truth
- The new streaming pipeline is fault-tolerant via checkpointing
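The checkpointing idea behind that last benefit can be shown in miniature: the pipeline periodically commits how far it has read, so after a crash it resumes from the last committed offset instead of reprocessing or losing data. This is a minimal, framework-free sketch of the concept (Spark's own checkpointing manages offsets and state internally); the file format and function names here are illustrative.

```python
import json
import os

def process_with_checkpoint(events, checkpoint_path, handler):
    """Process a sequence of events, committing progress to a checkpoint file.

    On restart, processing resumes from the last committed offset,
    giving at-least-once delivery semantics.
    """
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    for i in range(offset, len(events)):
        handler(events[i])
        # Commit progress after each event so a crash here loses no work.
        with open(checkpoint_path, "w") as f:
            json.dump({"offset": i + 1}, f)

    return offset  # the offset we resumed from
```

A fresh run processes everything from offset 0; if the checkpoint file records offset 2 (say, after a crash), a rerun skips the first two events and handles only the rest.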