Senior Databricks Data Engineer

Open-ended contract

Company description

Organization

Our Mission Statement

Digital and human resources at the center of the sustainable development of our society.

In a world of continuous transformation, accelerated by technological developments and societal challenges, it is necessary to adapt in an ongoing, agile way to meet the challenges of the future.

About Inetum

Inetum is a European leader in digital services. Inetum’s team of 27,000 consultants and specialists strives every day to make a digital impact for businesses, public sector entities and society. Inetum’s solutions aim to contribute to its clients’ performance and innovation as well as the common good.

Present in 19 countries with a dense network of sites, Inetum partners with major software publishers to meet the challenges of digital transformation with proximity and flexibility. Driven by its ambition for growth and scale, Inetum generated sales of 2.4 billion euros in 2024.

For further information, please visit www.inetum.com

Job description

You will develop, implement, and optimize complex Data Warehouse (DWH) and Data Lakehouse solutions using the Databricks platform (including Delta Lake, Unity Catalog, and Spark) to ensure a scalable, high-performance, and governed data foundation for analytics, reporting, and Machine Learning.

Responsibilities

A. Databricks Development and Architecture

Advanced Design and Implementation: Design and implement robust, scalable, and high-performance ETL/ELT data pipelines using PySpark/Scala and Databricks SQL on the Databricks platform.
Delta Lake: Implement and optimize the Medallion architecture (Bronze, Silver, Gold) using Delta Lake to ensure data quality, consistency, and historical tracking.
Lakehouse Platform: Implement the Lakehouse architecture efficiently on Databricks, combining best practices from DWH and Data Lake.
Performance Optimization: Optimize Databricks clusters, Spark operations, and Delta tables (e.g., Z-ordering, compaction, query tuning) to reduce latency and computational costs.
Streaming: Design and implement real-time/near-real-time data processing solutions using Spark Structured Streaming and Delta Live Tables (DLT).

B. Governance and Security

Unity Catalog: Implement and manage Unity Catalog for centralized data governance, fine-grained security (row/column-level security), and data lineage.
Data Quality: Define and implement data quality standards and rules (e.g., using DLT or Great Expectations) to maintain data integrity.

C. Operations and Collaboration

Orchestration: Develop and manage complex workflows using Databricks Workflows (Jobs) or external tools (e.g., Azure Data Factory, Airflow) to automate pipelines.
DevOps/CI/CD: Integrate Databricks pipelines into CI/CD processes using tools like Git, Databricks Repos, and Bundles.
Collaboration: Work closely with Data Scientists, Analysts, and Architects to understand business requirements and deliver optimal technical solutions.
Mentorship: Provide technical guidance and mentorship to junior developers and promote best practices.

Job requirements

A. Mandatory Knowledge (Expert Level)

Databricks Platform: Proven, expert-level experience with the entire Databricks ecosystem (Workspace, Cluster Management, Notebooks, Databricks SQL).
Apache Spark: In-depth knowledge of Spark architecture (RDD, DataFrames, Spark SQL) and advanced optimization techniques.
Delta Lake: Expertise in implementing and managing Delta Lake (ACID properties, Time Travel, Merge, Optimize, Vacuum).
Programming Languages: Advanced/expert-level proficiency in Python (with PySpark) and/or Scala (with Spark).
SQL: Advanced/expert-level skills in SQL and Data Modeling (Dimensional, 3NF, Data Vault).
Cloud: Solid experience with a major Cloud platform (AWS, Azure, or GCP), especially with storage services (S3, ADLS Gen2, GCS) and networking.

B. Additional Knowledge (Major Advantage)

Unity Catalog: Hands-on experience with implementing and managing Unity Catalog.
Lakeflow: Experience with Delta Live Tables (DLT) and Databricks Workflows.
ML/AI Concepts: Understanding of basic MLOps concepts and experience with MLflow to facilitate integration with Data Science teams.
DevOps: Experience with Terraform or equivalent tools for Infrastructure as Code (IaC).
Certifications: Databricks certifications (e.g., Databricks Certified Data Engineer Professional) are a significant advantage.

C. Education and Experience

Education: Bachelor’s degree in Computer Science, Engineering, Mathematics, or a relevant technical field.
Professional Experience: Minimum of 5 years of experience in Data Engineering, with at least 3 years of experience working with Databricks and Spark at scale.

Additional information

Benefits

Full access to a foreign language learning platform
Personalized access to tech learning platforms
Tailored workshops and trainings to sustain your growth
Medical insurance
Meal tickets
Monthly budget to allocate on a flexible benefits platform
Access to 7 Card services
Wellbeing activities and gatherings

Working model: hybrid - 2 days at the office

Country

Romania

Location

Bucharest

Employees can work remotely

Contract type

Open-ended contract

Join us to live your digital impact!
