Job Description
The Senior Data Engineer serves as a primary development resource for the design, build, implementation, and support of enterprise application initiatives. The role requires working closely with data teams, frequently in a matrixed environment as part of a broader project team. As a senior-level position, the role requires self-starters who are proficient problem solvers and capable of bringing clarity to complex situations. The culture of the organization emphasizes teamwork, so social and interpersonal skills are as important as technical capability. Because Big Data/GCP technology and practice are emerging and fast-evolving, the position requires staying well informed of technological advancements and putting new innovations into effective practice.
In addition, this candidate will have a history of increasing responsibility on a small, multi-role team. The position requires a candidate who can analyze business requirements, perform design tasks, and construct, test, and implement solutions with minimal supervision. The candidate will have a track record of participation in successful projects in a fast-paced, mixed (consultant and employee) team environment, and must be willing to mentor other developers to prepare them to assume these responsibilities.
Core Competencies
The following are core entrepreneurial competencies and expectations for the role:
• Communication and interpersonal skills
• Problem-solving and critical-thinking skills
• Understanding of strategic imperatives
• Technology and business knowledge
This role provides application development for specific business environments, with a focus on setting technical direction for groups of applications and related technologies and taking responsibility for technically robust solutions that satisfy all business, architecture, and technology constraints.
• Build and support a GCP-based ecosystem designed for enterprise-wide analysis of structured, semi-structured, and unstructured data.
• Build APIs, data pipelines, and systems to access and process data.
• Implement and test data pipelines.
• Build analytics on raw data.
• Troubleshoot data issues.
• Communicate with other teams to identify and solve problems.
• Set coding standards and perform code reviews.
• Mentor junior developers.
• Apply experience with APIs, microservices, and modern software patterns in containerized environments (e.g., Kubernetes) or serverless compute services.
• Ensure service levels are maintained and any interruptions are resolved in a timely fashion.
• Collaborate closely with team members to execute development initiatives using Agile practices and principles.
• Collaborate with business analysts, project leads, management, and customers on requirements.
• Participate in deployment, change management, configuration, administration, and maintenance of deployment processes and systems.
• Effectively prioritize workload to meet deadlines and work objectives.
• Gather requirements, then design, construct, and deliver solutions with minimal supervision.
• Work in an environment with rapidly changing business requirements and priorities.
• Bring new data sources into GCP, then transform and load them into BigQuery and other databases.
• Work collaboratively with data scientists and business and IT leaders throughout the company to understand data needs and use cases.
What qualifications you will need:
Education:
• Bachelor's degree in computer science, a related technical field, or equivalent experience
• Master's degree in computer science or a related field
Experience:
• 3+ years of experience in data engineering
• 1+ years of experience in healthcare
• 5+ years of experience in information technology
Knowledge & Abilities
A successful candidate will have:
• Strong understanding of best practices and standards for GCP data process design and implementation.
• 2+ years of hands-on experience with the GCP platform, including many of the following components:
• Cloud Run, GKE, Cloud Functions
• Spark Streaming, Kafka, Pub/Sub
• Bigtable, Firestore, Cloud SQL, Cloud Spanner
• JSON, Avro, Parquet
• BigQuery, Dataflow, Data Fusion
• Cloud Composer, DataProc, CI/CD, Cloud Logging
• Vertex AI, NLP, GitHub
• 3+ years of hands-on experience with many of the following components:
• Spark Streaming, Kafka
• SQL, JSON, Avro, Parquet
• Java, Python, or Scala
• Ability to multitask and balance competing priorities.
• Ability to define and apply best-practice techniques and to bring order to a fast-changing environment; strong problem-solving skills are a must.
• Strong verbal, written, and interpersonal skills, including a desire to work in a highly matrixed, team-oriented environment.
• CI/CD deployment experience
A successful candidate may have:
• Experience in the healthcare domain
• Experience with API development and integration
Hardware/Operating Systems:
• Linux, UNIX
• GCP
• Distributed, highly scalable processing environments
Certifications (a plus, but not required):
• Google Cloud Professional Data Engineer