Data Engineer AWS EMR
Date: Dec 25, 2024
Location: Shanghai, SH, CN
Company: NTT DATA Services
Job Description: Data Engineer (Production Support) for AWS EMR / Alibaba Cloud services with Spark and Scala
Position Overview
We are seeking a highly skilled and motivated Data Engineer specializing in production support for AWS EMR (Elastic MapReduce) and Alibaba Cloud services, with Spark and Scala knowledge, to join our dynamic team. The ideal candidate will ensure the smooth operation, performance, and stability of large-scale distributed data processing pipelines and applications deployed on AWS EMR, alongside services such as Glue, Redshift, Lambda, Step Functions, S3, Kafka, KMS, and Secrets Manager. This role requires a mix of strong technical expertise, problem-solving skills, and operational excellence.
Key Responsibilities
1. Production Support:
- Monitor, troubleshoot, and resolve issues in real time for AWS EMR and Alibaba Cloud clusters and associated data pipelines.
- Investigate and debug data processing failures, latency issues, and performance bottlenecks.
- Provide support for mission-critical production systems as part of an on-call rotation.
- Apply analytical and problem-solving skills to the Big Data domain, drawing on strong experience with object-oriented concepts and implementation.
2. Cluster Management:
- Manage AWS EMR cluster lifecycle, including creation, scaling, termination, and optimization.
- Ensure effective resource utilization and cost optimization of clusters.
- Apply patches and upgrades to EMR clusters and software components as needed.
3. Data Pipeline Maintenance:
- Maintain and support ETL/ELT pipelines built on tools such as Apache Spark, the AWS Glue Data Catalog, Hive, or Presto running on EMR.
- Ensure data quality, consistency, and availability across pipelines and storage systems such as S3, Redshift, MySQL, or Snowflake.
- Implement and monitor automated workflows using tools such as Step Functions, Lambda, and CloudWatch or Datadog.
4. Performance Optimization:
- Analyze and optimize EMR job performance by tuning Spark/Hive configurations and improving query efficiency.
- Identify and address inefficiencies in data storage and access patterns.
- Provide solutions for performance enhancement and fine-tuning of existing applications.
- Optimize the performance and cost-effectiveness of AWS Lambda, Glue, and Redshift deployments through continuous monitoring and best practices.
5. Monitoring and Reporting:
- Set up and manage monitoring tools (e.g., AWS CloudWatch, Datadog, or Prometheus) to track system health and performance.
- Develop alerting mechanisms and dashboards for proactive issue identification.
- Provide daily/weekly monitoring reports on job status and alert on any long-running or resource-intensive jobs.
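The daily/weekly reporting duty above typically reduces to scanning job run records for outliers. A minimal sketch, assuming job runs arrive as dicts with hypothetical `name`, `runtime_min`, and `memory_gb` fields and illustrative thresholds (neither the field names nor the limits come from this posting):

```python
def flag_outliers(runs, max_runtime_min=120, max_memory_gb=64):
    """Return (job name, reason) pairs for runs that exceeded the
    runtime or memory thresholds, for inclusion in a status report."""
    alerts = []
    for run in runs:
        if run["runtime_min"] > max_runtime_min:
            alerts.append((run["name"], "long-running"))
        elif run["memory_gb"] > max_memory_gb:
            alerts.append((run["name"], "high-memory"))
    return alerts

# Hypothetical day's worth of job runs
runs = [
    {"name": "daily_sales_etl", "runtime_min": 45, "memory_gb": 32},
    {"name": "clickstream_agg", "runtime_min": 300, "memory_gb": 48},
    {"name": "feature_backfill", "runtime_min": 90, "memory_gb": 96},
]
alerts = flag_outliers(runs)
# → [("clickstream_agg", "long-running"), ("feature_backfill", "high-memory")]
```

In practice the records would come from a scheduler or monitoring API (e.g. CloudWatch or Datadog) rather than a hand-written list, but the report logic stays the same.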
6. Collaboration and Documentation:
- Collaborate with software developers, data scientists, and DevOps teams to resolve issues and optimize workflows.
- Work with cross-functional teams, including customers, project managers, and technical teams, to secure and execute project deliverables.
- Maintain comprehensive documentation for troubleshooting guides, operational workflows, and best practices.
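As an illustration of the Spark tuning work named in the performance-optimization responsibilities above, a common starting point is deriving executor settings from a cluster's node shape. The node shape, the one core and roughly 10% of memory reserved for YARN and OS daemons, and the 5-cores-per-executor rule of thumb below are illustrative assumptions, not values from this posting:

```python
def size_executors(nodes, vcores_per_node, mem_gb_per_node,
                   cores_per_executor=5):
    """Rough executor sizing for a YARN-backed EMR cluster.

    Reserves one core and 1 GB per node for OS/YARN daemons, packs
    fixed-width executors onto the rest, and holds back ~10% of each
    executor's memory for off-heap overhead.
    """
    usable_cores = vcores_per_node - 1            # 1 core for daemons
    executors_per_node = usable_cores // cores_per_executor
    total_executors = nodes * executors_per_node - 1  # 1 slot for the driver
    mem_per_executor = (mem_gb_per_node - 1) / executors_per_node
    executor_memory_gb = int(mem_per_executor * 0.9)  # ~10% overhead
    return {
        "spark.executor.instances": total_executors,
        "spark.executor.cores": cores_per_executor,
        "spark.executor.memory": f"{executor_memory_gb}g",
    }

# Example: 10 worker nodes with 16 vCPUs and 128 GB each
conf = size_executors(nodes=10, vcores_per_node=16, mem_gb_per_node=128)
# → {'spark.executor.instances': 29, 'spark.executor.cores': 5,
#    'spark.executor.memory': '38g'}
```

The resulting dict maps directly onto `spark-submit --conf` flags or an EMR configuration classification; real tuning would then iterate from this baseline against observed shuffle and GC behavior.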
Required Skills and Qualifications
Technical Expertise:
- Proficiency in managing AWS and Alibaba Cloud services, particularly EMR, S3, Lambda, Step Functions, PolarDB, OCC, and CloudWatch.
- Hands-on experience with distributed data processing frameworks like Apache Spark, Hive, or Presto.
- Experience with Kafka, NiFi, Amazon Web Services (AWS), Maven, Ambari, Tez, Stash, and Bamboo.
- Familiarity with data loading tools such as Sqoop, and with cloud databases such as Amazon Redshift, Aurora MySQL, PostgreSQL, or Alibaba Cloud PolarDB.
- Knowledge of workflow schedulers such as Oozie or Apache Airflow.
- Strong knowledge of shell scripting, Python, or Java for scripting and automation.
- Familiarity with SQL and query optimization techniques.
Operational Skills:
- Experience in production support for large-scale distributed systems or data platforms.
- Ability to analyze logs, diagnose issues, and implement fixes in high-pressure scenarios.
- Apply data modeling concepts and methodologies to optimize data warehouse solutions.
- Maintain detailed Standard Operating Procedures (SOPs) using flow diagrams, source-to-target mappings, system architecture diagrams, and use cases.
Problem-Solving:
- Strong analytical skills to debug complex systems and resolve performance bottlenecks.
Soft Skills:
- Effective communication skills to coordinate with cross-functional teams.
- A proactive and customer-focused attitude to provide excellent production support.
Preferred Skills
- Experience with CI/CD tools like Jenkins or GitLab for pipeline deployments.
- Familiarity with container orchestration tools (e.g., Kubernetes, Docker).
- Knowledge of data governance, security, and compliance in cloud environments.
- Certifications in AWS (e.g., AWS Certified Big Data – Specialty or AWS Certified Solutions Architect).
Education and Experience
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 3+ years of experience in data engineering, production support, or a similar role.
Job Segment:
Cloud, Solution Architect, Developer, Java, Database, Technology