Optimizing Feature Store and Model Execution for Kpler

Challenge
Kpler, a leader in market intelligence and data analytics, needed to optimize its Feature Store ingestion process and model execution workflows. The existing system suffered from long data processing times, database connection issues, and a lack of observability in model training. These inefficiencies made workflows resource-intensive, created operational bottlenecks, and hindered real-time, data-driven decision-making.

Our approach
To address these challenges, DS Stream focused on optimizing Kpler’s Feature Store ingestion, redesigning its architecture, and improving model execution workflows. First, we conducted a comprehensive review of the existing ingestion process, identifying inefficiencies and implementing monitoring tools to track data flows. Database access was improved by resolving connection pool exhaustion and setting up a read replica in collaboration with Kpler’s DevOps team. To ensure reliability, we introduced unit tests and failure management mechanisms.
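The case study does not disclose Kpler's exact database stack or settings, so the sketch below is illustrative only. It shows the general pattern of the two fixes described above: bounding the connection pool so ingestion workers cannot exhaust it, and routing read-heavy queries to a replica. Python, SQLAlchemy, and PostgreSQL are assumed here, and all connection strings and table names are hypothetical.

# Illustrative sketch only; not Kpler's actual configuration.
from sqlalchemy import create_engine, text

# A bounded pool with pre-ping and recycling guards against exhausted or stale
# connections during long-running ingestion jobs.
primary = create_engine(
    "postgresql+psycopg2://feature_store:***@primary-db:5432/features",  # hypothetical DSN
    pool_size=10,        # cap concurrent connections per worker
    max_overflow=5,      # allow short bursts above the cap
    pool_timeout=30,     # fail fast instead of hanging when the pool is empty
    pool_pre_ping=True,  # drop dead connections before handing them out
    pool_recycle=1800,   # recycle connections before the server closes them
)

# Read-heavy ingestion queries go to a replica, keeping the primary free for writes.
replica = create_engine(
    "postgresql+psycopg2://feature_store:***@read-replica:5432/features",  # hypothetical DSN
    pool_size=20,
    pool_pre_ping=True,
)

def load_raw_features(batch_id: str):
    """Read one raw-feature batch from the read replica."""
    with replica.connect() as conn:
        rows = conn.execute(
            text("SELECT * FROM raw_features WHERE batch_id = :b"), {"b": batch_id}
        )
        return rows.fetchall()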
Next, we analyzed how the Feature Store API interacted with different models and proposed improvements to enhance efficiency and scalability. A new framework was implemented, ensuring better integration with data pipelines and simplifying model access. Finally, we validated the entire model execution process, introducing solutions to prevent log loss and optimizing training workflows to improve traceability and performance.
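The framework itself is not described in detail in this case study, so the following sketch only illustrates the shape of the pattern above: a single feature-access interface shared by every model, plus per-run, file-backed logging so training output is not lost if a worker dies. Python, SQLAlchemy, and pandas are assumed, and all class names, table names, and paths are hypothetical.

# Hypothetical sketch of the pattern described above; not Kpler's actual framework.
import logging
from abc import ABC, abstractmethod
from typing import Sequence

import pandas as pd
from sqlalchemy import bindparam, text

class FeatureStoreClient(ABC):
    """Single entry point every model uses to fetch features."""

    @abstractmethod
    def get_features(self, entity_ids: Sequence[str], feature_names: Sequence[str]) -> pd.DataFrame:
        ...

class SQLFeatureStoreClient(FeatureStoreClient):
    def __init__(self, engine):
        self._engine = engine

    def get_features(self, entity_ids, feature_names):
        # One shared, well-tested query path instead of per-model, ad-hoc SQL.
        # In a real system feature_names would come from a validated registry.
        cols = ", ".join(feature_names)
        stmt = text(
            f"SELECT entity_id, {cols} FROM feature_view WHERE entity_id IN :ids"
        ).bindparams(bindparam("ids", expanding=True))
        with self._engine.connect() as conn:
            rows = conn.execute(stmt, {"ids": list(entity_ids)})
            return pd.DataFrame(rows.fetchall(), columns=list(rows.keys()))

def run_logger(run_id: str) -> logging.Logger:
    """Per-run logger writing to durable storage so training logs survive
    worker restarts; a FileHandler stands in for whatever sink is actually used."""
    logger = logging.getLogger(f"training.{run_id}")
    if not logger.handlers:
        handler = logging.FileHandler(f"/var/log/training/{run_id}.log")  # hypothetical path
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger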

The outcome
The collaboration between DS Stream and Kpler transformed Kpler's data workflows. The streamlined Feature Store ingestion process cut data processing time from four hours to just ten minutes. System stability improved as database connection issues were eliminated, ensuring smoother and more reliable operations. The new Feature Store framework provided a standardized approach, simplifying model integration and improving collaboration between the MLOps and Data Science teams. Model execution also became more traceable, preventing log loss and increasing overall reliability. Together, these improvements gave Kpler a more efficient, scalable, and transparent data processing system.