Data Engineer

28-04-2026

Zwijnaarde

For our client in Zwijnaarde, we are looking for a Data Engineer.

Data Engineer - Databricks on AWS

Experienced Data Engineer specializing in Databricks on AWS, focused on building scalable, reliable, and cost-efficient data pipelines on a governed lakehouse platform. Strong expertise in ETL/ELT development, Unity Catalog governance, CI/CD automation, and monitoring.

Core Expertise


ETL/ELT Development (PySpark & SQL)

Design and build scalable data pipelines using PySpark and SQL on Delta Lake - implementing medallion architecture (bronze/silver/gold), incremental and CDC patterns, schema evolution, and optimization strategies.
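
A minimal sketch of the incremental, CDC-style bronze-to-silver pattern described above, using Delta Lake's Python MERGE API. All table, column, and watermark values are illustrative, not taken from a real project:

from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()  # provided by the Databricks runtime

# Incremental slice of the bronze table (hypothetical names throughout).
updates = (
    spark.read.table("lakehouse.bronze.orders")
    .where(F.col("_ingested_at") > F.lit("2026-04-27"))  # watermark for this run
    .dropDuplicates(["order_id"])                        # one row per key
)

# Upsert into silver: update changed rows, insert new ones.
silver = DeltaTable.forName(spark, "lakehouse.silver.orders")
(
    silver.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)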

Data Modeling

Design dimensional models (star schemas with dimension and fact tables) for analytics and define business-friendly metrics, following established lakehouse data modeling practices.
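
A hedged illustration of the gold-layer modeling above: deriving one dimension and one fact table from silver data. Every table and column name here is invented for the example:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Dimension: one row per customer.
(
    spark.read.table("lakehouse.silver.customers")
    .select("customer_id", "country", "segment")
    .dropDuplicates(["customer_id"])
    .write.format("delta").mode("overwrite")
    .saveAsTable("lakehouse.gold.dim_customer")
)

# Fact: daily revenue per customer, a business-friendly metric.
(
    spark.read.table("lakehouse.silver.orders")
    .groupBy("customer_id", F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"), F.count("*").alias("order_count"))
    .write.format("delta").mode("overwrite")
    .saveAsTable("lakehouse.gold.fct_orders")
)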


Databricks Platform & Infrastructure

Set up and configure Databricks workspaces, compute resources, and permissions. Manage cluster configurations, autoscaling, and pool strategies for optimal performance and cost.
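
As a sketch of cluster management in code, the snippet below provisions an autoscaling cluster with the Databricks SDK for Python, assuming workspace credentials are already configured; the cluster name, runtime version, node type, and tags are placeholders:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute

w = WorkspaceClient()  # reads credentials from env vars or ~/.databrickscfg

cluster = w.clusters.create(
    cluster_name="shared-etl",              # placeholder name
    spark_version="15.4.x-scala2.12",       # pin a tested LTS runtime
    node_type_id="m5.xlarge",               # AWS instance type for the workload
    autoscale=compute.AutoScale(min_workers=1, max_workers=4),
    autotermination_minutes=30,             # stop paying for idle compute
    custom_tags={"team": "data-platform", "cost_center": "1234"},
).result()                                  # block until the cluster is running
print(cluster.cluster_id)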


CI/CD & Deployment Automation

Build reproducible CI/CD pipelines for Databricks notebooks, jobs, and infrastructure using GitHub Actions, and define quality gates through automated testing, review approvals, and validation checks.
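
One concrete quality gate such a GitHub Actions workflow could run before deploying a Databricks Asset Bundle is a local PySpark unit test. The my_pipeline.transforms module and its clean_orders function are hypothetical stand-ins for real pipeline code:

# tests/test_transforms.py
import pytest
from pyspark.sql import SparkSession

from my_pipeline.transforms import clean_orders  # hypothetical module under test


@pytest.fixture(scope="session")
def spark():
    # Local session so the gate runs on the CI runner, without a cluster.
    return SparkSession.builder.master("local[2]").appName("ci-gate").getOrCreate()


def test_clean_orders_drops_null_keys(spark):
    raw = spark.createDataFrame([(1, 10.0), (None, 5.0)], ["order_id", "amount"])
    assert clean_orders(raw).filter("order_id IS NULL").count() == 0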


Unity Catalog & Data Governance

Implement fine-grained access controls, data lineage, metadata management, and catalog organization using Unity Catalog. Enforce security policies, tagging conventions, and auditability across workspaces and teams.
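
A governance-as-code sketch of the controls described above, expressed as Databricks SQL run through PySpark; catalog, schema, group, and tag names are illustrative:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read-only access to the gold layer for analysts.
spark.sql("GRANT USE CATALOG ON CATALOG lakehouse TO `data-analysts`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA lakehouse.gold TO `data-analysts`")

# Engineers get full control over the silver layer.
spark.sql("GRANT ALL PRIVILEGES ON SCHEMA lakehouse.silver TO `data-engineers`")

# Tag sensitive columns so audits and policies can key off the tag.
spark.sql(
    "ALTER TABLE lakehouse.silver.customers "
    "ALTER COLUMN email SET TAGS ('pii' = 'true')"
)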


Monitoring, Observability & Cost Control

Set up monitoring dashboards and alerting for Databricks to track cost, usage, capacity, cluster utilization, and job performance. Implement cost tagging, autoscaling policies, and cluster pool strategies to optimize spend.
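
One way to wire up the cost tracking above is to query the system.billing.usage system table (available once system schemas are enabled for the workspace) and group DBU consumption by a custom tag; the cost_center tag key is an assumed convention:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# DBU usage per cost-center tag over the last 30 days; in practice this
# feeds a dashboard or alert rather than an ad hoc query.
spark.sql("""
    SELECT
      usage_date,
      custom_tags['cost_center'] AS cost_center,
      SUM(usage_quantity)        AS dbus
    FROM system.billing.usage
    WHERE usage_date >= current_date() - INTERVAL 30 DAYS
    GROUP BY usage_date, custom_tags['cost_center']
    ORDER BY usage_date, cost_center
""").show()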


Collaboration & Knowledge Sharing

Work in an Agile/Scrum setting with cross-functional teams. Support software engineers with guidance, code reviews, and best practices when they build their own data use cases on the platform.


Tools & Technologies

• Databricks: Delta Lake, Databricks Jobs & Pipelines, Unity Catalog

• Languages: Python, PySpark, SQL

• AWS: S3, IAM, KMS, CloudWatch, VPC

• CI/CD & IaC: GitHub Actions, Terraform, Databricks Asset Bundles

• Data Quality: Great Expectations, Databricks Lakehouse Monitoring, or similar frameworks

• Monitoring: Databricks built-in monitoring, CloudWatch, custom dashboards


Key Responsibilities

• Design, build and maintain bronze/silver/gold data models and production-ready pipelines.

• Design and optimize ETL/ELT pipelines for large-scale data processing on Databricks.

• Implement Unity Catalog governance: access controls, lineage, metadata, and cross-team policies.

• Build and maintain CI/CD pipelines for Databricks workloads with proper testing and validation.

• Configure and manage Databricks workspaces and compute resources.

• Embed automated data quality checks into pipelines to ensure reliable, trusted data products (a minimal sketch follows this list).

• Work closely with data scientists, analysts, and product teams to translate requirements into scalable solutions.

• Support development teams with best practices, documentation, and guidance when onboarding to the platform.
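
A minimal, framework-agnostic sketch of the embedded quality checks referenced in the list above; in practice Great Expectations or Databricks Lakehouse Monitoring would own these rules, and the table and rule names here are invented:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("lakehouse.silver.orders")  # illustrative table

checks = {
    "no_null_keys": df.filter(F.col("order_id").isNull()).count() == 0,
    "no_negative_amounts": df.filter(F.col("amount") < 0).count() == 0,
}
failed = [name for name, ok in checks.items() if not ok]
if failed:
    # Failing the task keeps bad data from reaching downstream consumers.
    raise ValueError(f"Data quality checks failed: {failed}")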


Profile Summary

A hands-on data engineer who builds and maintains production-grade lakehouse solutions on Databricks on AWS: designing robust data pipelines, setting up CI/CD and governance with Unity Catalog, and ensuring full visibility into cost and platform health. Comfortable working in Agile environments and supporting other teams as they onboard to the platform.


Contact