Background and Challenge
Our client is a leading, global automotive group that designs, manufactures and distributes passenger and commercial vehicles, motorcycles, engines, and turbomachinery. They also offer related services including financing, leasing, and fleet management.
Our client was accumulating large amounts of data from cars on the road, robots in manufacturing facilities, IoT sensors, dealerships, registration records, and data generated internally by staff. Because the data was hot, the client needed a single data platform that integrated customer data and made it immediately accessible for use cases or analysis. Their existing on-prem solution was inflexible, and adding new capacity required long turnaround times.
The storage solution needed to be easily scalable and low-cost to keep up with the terabytes of incoming data. It also needed to integrate seamlessly with Tableau, the data visualization tool the client was already using.
Adastra implemented a data lake in Amazon S3 and created an enterprise data model on top of it, running on Amazon Redshift. With this design, curated data that arrives in S3 can then be moved into Redshift.
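As an illustration of the S3-to-Redshift step, here is a minimal Python sketch that builds a Redshift `COPY` statement. The table name, bucket path, and IAM role are hypothetical placeholders, and the Parquet file format is an assumption; the case study does not state how the data was actually loaded.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads Parquet files from an S3 prefix.

    All arguments are illustrative; Parquet is assumed as the curated file format.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the statement.
    print(
        build_copy_statement(
            table="edm.vehicle_telemetry",
            s3_prefix="s3://example-curated-bucket/telemetry/",
            iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
        )
    )
```

The resulting statement could be run from any SQL client connected to the cluster, or from a scheduled job.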
Adastra created data marts using stored procedures and AWS Glue jobs. This allowed us to overlay the enterprise data model on the data lake and mirror the client’s on-prem solution in the cloud.
We used the external tables feature in Amazon Redshift to query data that resides in Amazon S3, so the client can control how much data is stored in Redshift while keeping the rest ready to pull in for use cases. With this solution, the client can now perform ad hoc analysis and leave the remaining data in S3, reducing overall storage costs. The external tables can also be used as input for visualization or machine learning if desired.
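The external-table approach described above can be sketched as follows. This Python snippet only generates the DDL text for an external schema and an external table. The schema, table, column, bucket, and role names are hypothetical, and the use of a data catalog and Parquet files is an assumption rather than a detail confirmed by the case study.

```python
from typing import Mapping


def build_external_schema_ddl(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Return DDL that maps a Redshift external schema to a data catalog database."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def build_external_table_ddl(
    schema: str, table: str, columns: Mapping[str, str], s3_location: str
) -> str:
    """Return DDL for an external table whose data stays in S3 (Parquet assumed)."""
    column_lines = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in columns.items()
    )
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    role = "arn:aws:iam::123456789012:role/ExampleSpectrumRole"
    print(build_external_schema_ddl("lake", "example_lake_db", role))
    print(
        build_external_table_ddl(
            "lake",
            "sensor_readings",
            {"sensor_id": "VARCHAR(64)", "recorded_at": "TIMESTAMP", "reading": "DOUBLE PRECISION"},
            "s3://example-lake-bucket/sensor_readings/",
        )
    )
```

Once such a table exists, it can be queried with ordinary `SELECT` statements and joined to tables stored in Redshift, which is what lets the bulk of the data remain in S3.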
Adastra will continue to work with the client to support their strategy to be a data-driven and customer-centric company. We will be adding new features and refinements to the platform, to ensure we are continuously optimizing its performance.
Adastra’s solution met all the client’s requirements and more. Redshift satisfied their need for hot data storage: the client can easily extract data, and their daily reports now run in under an hour, compared with several hours on their on-prem solution. The solution integrates easily with Tableau, and the client has been successfully running 100+ jobs every morning to pull data into Redshift.
Their Redshift costs were reduced considerably thanks to Redshift RA3 instances, which decouple storage and compute so that each can be scaled as required.
Adastra’s work has set the client up to bring in new use cases more easily, as well as begin implementing data science solutions. Having the data in one central location on the cloud and using hot data storage allows data scientists to access it immediately to explore schemas and be productive as soon as possible. Redshift also easily integrates with Amazon SageMaker Notebook Instances.
Adastra had previously partnered with the client to build their cloud analytics platform on AWS and had delivered several other high-quality projects with them. Based on this track record, the client was eager to partner with Adastra again for their current solution.
Adastra is an Advanced AWS Partner. We specialize in delivering fast analytics on the cloud and provide a suite of services around Data Estate Modernization, AI & Analytics, Governance, and Managed Services.
Over our 20+ year history, we have built extensive experience in implementing and modernizing legacy systems. We leverage best practices and accelerators to implement Redshift, ensuring that our clients achieve fast-tracked delivery and value from their analytics engines.