Role Overview
Join our Data practice as a Data Engineer and help clients build modern data platforms whose pipelines turn raw operational data into reliable, decision-ready datasets. You will work with a modern open-source data stack (dbt, Dagster, Apache Iceberg, and DuckDB) on real production systems. This is a great role for someone who values working across multiple industries and problem domains, wants mentorship from senior data engineers, and is ready to take ownership of full data platform deliverables.
Responsibilities
- Build and maintain ELT pipelines using Airbyte, dbt, and Dagster (see the sketch after this list)
- Design lakehouse architectures on S3/GCS with Apache Iceberg or Delta Lake
- Implement data quality tests and alerting with dbt tests and Elementary
- Set up CDC pipelines with Debezium for real-time replication
- Support data consumers: analysts, ML engineers, and product teams
- Write clear technical documentation for data models and pipeline logic
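To make the first few bullets concrete, here is a minimal, illustrative sketch of a Dagster asset pair: an ingestion step standing in for an Airbyte sync, feeding a staging transformation of the kind dbt would own in production. Every name in it (the DuckDB file, the CSV extract, the table names) is hypothetical, not a snippet from our codebase.

```python
from dagster import asset, materialize
import duckdb

# Hypothetical local warehouse file for illustration only.
DB_PATH = "warehouse.duckdb"

@asset
def raw_orders() -> None:
    """Ingest a raw orders extract (stand-in for an Airbyte sync)."""
    con = duckdb.connect(DB_PATH)
    con.execute("""
        CREATE OR REPLACE TABLE raw_orders AS
        SELECT * FROM read_csv_auto('orders_export.csv')
    """)
    con.close()

@asset(deps=[raw_orders])
def stg_orders() -> None:
    """Staging model: the cleanup a dbt model would own in production."""
    con = duckdb.connect(DB_PATH)
    con.execute("""
        CREATE OR REPLACE TABLE stg_orders AS
        SELECT order_id, customer_id, CAST(ordered_at AS DATE) AS order_date
        FROM raw_orders
        WHERE order_id IS NOT NULL
    """)
    con.close()

if __name__ == "__main__":
    # Materialize both assets in dependency order.
    materialize([raw_orders, stg_orders])
```

In production this wiring sits behind Airbyte connections and dbt models rather than inline SQL, but the dependency-graph idea is the same.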
Requirements
- 2+ years in a data engineering role
- Strong SQL: you can write window functions and CTEs in your sleep (see the example after this list)
- Python for custom pipeline logic and data transformations
- Experience with dbt Core (models, tests, macros)
- Familiarity with at least one cloud data warehouse (Snowflake, BigQuery, or Redshift)
- Comfortable with Git and basic CI/CD
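To calibrate the SQL bar, here is a small, self-contained example: a CTE feeding a window function, run through DuckDB's Python API over made-up order data.

```python
import duckdb

con = duckdb.connect()  # in-memory database

# Toy data standing in for a real orders table.
con.execute("CREATE TABLE orders (customer_id INT, order_date DATE, amount DECIMAL(10, 2))")
con.execute("""
    INSERT INTO orders VALUES
        (1, '2024-01-05', 120.00),
        (1, '2024-02-10',  80.00),
        (2, '2024-01-20', 200.00)
""")

# A CTE feeding a window function: running spend per customer.
rows = con.execute("""
    WITH daily AS (
        SELECT customer_id, order_date, SUM(amount) AS day_total
        FROM orders
        GROUP BY customer_id, order_date
    )
    SELECT
        customer_id,
        order_date,
        SUM(day_total) OVER (
            PARTITION BY customer_id
            ORDER BY order_date
        ) AS running_total
    FROM daily
    ORDER BY customer_id, order_date
""").fetchall()

for row in rows:
    print(row)
```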
Frequently Asked Questions
Do I need to know Spark?
Helpful but not required. We use DuckDB and Trino for most analytical workloads, reserving Spark for large-scale batch jobs. If you know SQL and Python well, we can get you up to speed on the distributed computing side.
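As a rough illustration of that tooling choice, DuckDB can query Parquet files in object storage directly, which covers many workloads that might otherwise reach for Spark. The bucket path below is a placeholder, and the sketch assumes the httpfs extension and S3 credentials are configured.

```python
import duckdb

con = duckdb.connect()

# httpfs lets DuckDB read from S3/GCS over HTTP(S).
con.execute("INSTALL httpfs;")
con.execute("LOAD httpfs;")

# Placeholder path: point this at a real Parquet dataset.
# Credentials come from the environment or DuckDB's s3_* settings.
result = con.execute("""
    SELECT event_type, COUNT(*) AS n
    FROM read_parquet('s3://example-bucket/events/*.parquet')
    GROUP BY event_type
    ORDER BY n DESC
""").fetchall()

print(result)
```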
Role at a Glance
- Type: Full-time
- Experience: 2–5 years
- Location: Greater Noida / Remote
Apply for this Role
Or email your CV to contact@codexops.com
