Job Description
STRATEGIC STAFFING SOLUTIONS HAS AN OPENING!
This is a Contract Opportunity with our company that MUST be worked on a W2 basis only. No C2C eligibility for this position. Visa sponsorship is available! The details are below.
“Beware of scams. S3 never asks for money during its onboarding process.”
Job Title: Data Engineer
Contract Length: 24 months
Location: Irving, Texas
Job Overview
In this assignment, you may:
- Consult on complex initiatives with broad impact and large-scale planning for Software Engineering.
- Review and analyze complex, multi-faceted, larger-scale, or longer-term Software Engineering challenges that require in-depth evaluation of multiple factors, including intangibles or unprecedented factors.
- Contribute to the resolution of complex, multi-faceted situations requiring a solid understanding of the function, policies, procedures, and compliance requirements needed to meet deliverables.
- Strategically collaborate and consult with client personnel.
Day to Day
- Design and develop ETL/ELT workflows and data pipelines for batch and real-time processing.
- Build and maintain data pipelines for reporting and downstream applications using open source frameworks and cloud technologies.
- Implement operational and analytical data stores leveraging Delta Lake and modern database concepts.
- Optimize data structures for performance and scalability across large datasets.
- Collaborate with architects and engineering teams to ensure alignment with target state architecture.
- Apply best practices for data governance, lineage tracking, and metadata management, including integration with Google Dataplex for centralized governance and data quality enforcement.
- Develop, schedule, and orchestrate complex workflows using Apache Airflow, with strong proficiency in designing and managing Airflow DAGs (see the sketch after this list).
- Troubleshoot and resolve issues in data pipelines and ensure high availability and reliability.
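
For context, below is a minimal sketch of the kind of Airflow DAG this role would design and manage. It assumes Airflow 2.4+; the DAG name, task names, and schedule are hypothetical and illustrative only, not a prescribed implementation.

# Minimal sketch of a daily ETL DAG, assuming Airflow 2.4+.
# DAG id, task names, and schedule are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # Placeholder extract step; a real task would pull from a source system.
    print(f"Extracting orders for {context['ds']}")


def load_orders(**context):
    # Placeholder load step; a real task would write to the warehouse or lake.
    print(f"Loading orders for {context['ds']}")


with DAG(
    dag_id="orders_daily_etl",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    tags=["etl", "example"],
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)

    extract >> load                     # simple linear dependency

In practice, the placeholder callables would be replaced with operators or jobs targeting the actual source and target systems (for example, BigQuery or Spark workloads).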
Skills
- Strong Understanding of Data: Data structures, modeling, and lifecycle management.
- ETL/ELT Expertise: Hands-on experience designing and managing data pipelines.
- PySpark: Advanced skills in distributed data processing and transformation (an illustrative PySpark/Delta Lake sketch follows this list).
- Apache Iceberg: Experience implementing open table formats for analytics.
- Hadoop Ecosystem: Knowledge of HDFS, Hive, and related components.
- Cloud Platforms: GCP (BigQuery, Dataflow), Delta Lake, and Dataplex for governance and metadata management.
- Programming: Python, Spark, and SQL.
- Workflow Orchestration: Strong experience with Apache Airflow, including authoring and maintaining DAGs for complex workflows.
- Database & Reporting Concepts: Strong understanding of relational and distributed systems.
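
As a brief illustration of the PySpark and Delta Lake skills listed above, the following sketch shows a batch transform that aggregates raw order data and writes a partitioned Delta table. The paths, column names, and session configuration are hypothetical, and the write step assumes the delta-spark package is configured on the cluster.

# Minimal PySpark batch transform writing a Delta table.
# Input path, columns, and output location are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("orders_batch_transform")   # illustrative app name
    .getOrCreate()
)

# Read raw CSV data (hypothetical source path and schema).
raw = spark.read.option("header", True).csv("/data/raw/orders/")

# Basic cleansing and aggregation, typical of an ELT-style transform.
daily_totals = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount").isNotNull())
       .groupBy(F.to_date("order_ts").alias("order_date"), "region")
       .agg(F.sum("amount").alias("total_amount"),
            F.count("*").alias("order_count"))
)

# Write the result as a partitioned Delta table (requires delta-spark).
(
    daily_totals.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("/data/curated/daily_order_totals")   # hypothetical output path
)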