Data Engineer – IAM Data Lake (Google Cloud Platform)

Location: Irving, TX (Preferred) or Ohio (Hybrid)
Duration: 12-Month Contract

W2 ONLY, NO C2C

Overview

We are seeking a skilled Data Engineer to support the design, development, and enhancement of an enterprise Identity and Access Management (IAM) Data Lake platform within Google Cloud Platform (GCP).

This role focuses on building scalable data lake solutions, developing data ingestion pipelines, and supporting large-scale data processing initiatives using modern cloud and big data technologies. The ideal candidate has hands-on experience with Google Cloud Platform, data lake architectures, and big data processing frameworks.

Experience with Hadoop/HDFS and cloud-native data engineering solutions is highly desirable.

Key Responsibilities

Design, build, and maintain scalable Data Lake solutions within Google Cloud Platform (GCP).
Develop and support batch and streaming data ingestion pipelines using GCP-native services and big data technologies.
Build and optimize data processing workflows to support enterprise-scale analytics and reporting requirements.
Design and implement data models, ingestion frameworks, and data transformation processes.
Develop and maintain PySpark-based data processing applications.
Utilize Apache Airflow to orchestrate and manage complex data workflows.
Implement and maintain CI/CD pipelines to support automated deployment and delivery of data engineering solutions.
Design and manage Pub/Sub-based streaming architectures and event-driven data processing workflows.
Support event schema design, schema evolution, and versioning best practices.
Implement incremental data ingestion strategies and Change Data Capture (CDC) patterns.
Develop APIs and integration solutions to support data consumption and data-sharing requirements.
Create and maintain curated datasets, analytical views, and data exposure layers for downstream consumers.
Collaborate with architecture, engineering, security, and business teams to ensure data solutions align with enterprise standards.

Required Qualifications

4+ years of experience working with Google Cloud Platform (GCP).
4+ years of experience building and supporting large-scale data processing solutions.
4+ years of experience with PySpark and distributed data processing.
4+ years of experience implementing CI/CD practices and deployment automation.
2+ years of experience building and maintaining data pipelines.
2+ years of experience with Apache Airflow.
Experience developing and integrating APIs.
Experience working with data lake architectures and cloud-based storage solutions.
Understanding of data modeling concepts and best practices.
Strong understanding of data processing frameworks and big data technologies.

Preferred Qualifications

Experience with Hadoop ecosystem technologies and HDFS.
Experience designing and implementing streaming architectures using Google Cloud Pub/Sub.
Familiarity with Change Data Capture (CDC) methodologies and incremental ingestion frameworks.
Experience building enterprise-scale IAM or security-related data platforms.
Knowledge of data governance, lifecycle management, and access control best practices within GCP.
Experience supporting analytical data platforms and data consumption frameworks.

Technical Skills

GCP skills are optional, and preference will be given to .NET experience, specifically the following:

Databases & Data Engineering

Microsoft SQL Server (Advanced)
T-SQL, Stored Procedures, Functions, Views
SQL Server Integration Services (SSIS) packages or equivalent concepts

Programming & Development

C# (.NET Framework / .NET Core)
Backend Development & Data Access Layers
Integration of SQL Server with .NET applications

Cloud & Data Platforms

Google Cloud Platform (GCP)
Google Cloud Storage (GCS)
Pub/Sub
Data Lake Architecture

Data Engineering & Processing

PySpark
Apache Airflow
Data Pipelines
Data Processing
Data Modeling
Change Data Capture (CDC)

Big Data Technologies

Hadoop Ecosystem
HDFS

Data Formats

Parquet
Avro
ORC

Development & Automation

APIs
CI/CD Pipelines
Version Control
Automation Frameworks

Job Information

Sector

Information Technology