Position: Druid Specialist
Location: Basking Ridge, NJ (On-site 3 days/week)
As a Druid Specialist, you will use Apache Druid to run real-time analytics on high-velocity, high-volume streaming data. Your expertise will be critical to optimizing our data infrastructure for scalability and performance.
Key Responsibilities:
- Leverage Apache Druid for real-time analytics, ensuring efficient processing and analysis of streaming data from multiple sources.
- Tune and optimize Apache Druid clusters for scalability and throughput.
- Configure Druid components, including MiddleManagers and Historical nodes, for optimal data ingestion and query response times.
- Implement metrics monitoring and alerting with ELK/Grafana to track cluster performance.
- Design a robust consumption layer for reporting, facilitating insightful data analysis and visualization.
- Apply strong Python scripting and Spark skills to support data processing and analytics workflows.
- Utilize Kubernetes for orchestration and management of containerized applications, enhancing the deployment and scalability of data processing pipelines.
- Apply advanced SQL techniques, including window functions, CTEs, and subqueries, to address complex data analysis challenges.
Qualifications:
- 5+ years of hands-on experience with Apache Druid, including its real-time streaming capabilities.
- Proven track record of performance tuning and optimizing Apache Druid configurations.
- Expertise in Python, Spark, and Kubernetes, with a strong foundation in data processing and analytics.
- Advanced SQL skills, with the ability to perform complex data analysis.
- Experience setting up monitoring and alerting systems, preferably with ELK/Grafana.