Job Description

Role Overview

This position is for a Big Data Engineer focused on PySpark and Scala. The assignment is based in Singapore and calls for 6 to 12 years of experience. The work centers on building and tuning large-scale data processing platforms that support both batch and real-time analytics across enterprise data environments.

What You Will Do

You will design and implement distributed data solutions, create reliable data pipelines, and help build modern data lakes and streaming systems. The role also includes improving processing efficiency, supporting cloud-based data engineering initiatives, and contributing to data quality and governance efforts.

Core Responsibilities

Build scalable data pipelines using PySpark and Scala.
Work on big data architecture and platform design for enterprise-scale systems.
Improve data processing performance and overall pipeline efficiency.
Develop streaming and near-real-time data solutions.
Support cloud-based data engineering and modern data stack implementations.
Contribute to data quality controls and governance practices.
Collaborate with teams to deliver solutions end to end.

Technical Stack

The role involves working across the Hadoop ecosystem, including HDFS, Hive, and YARN. It also requires experience with ingestion tools such as Kafka and Sqoop, along with common data formats like Parquet, ORC, JSON, and Avro.

Programming work will include Python and Scala development, advanced SQL (joins, aggregations, and query optimization), shell scripting, and Unix basics.

Streaming and messaging work will be centered on Kafka and event-driven architectures, along with real-time data processing frameworks.

Day-to-day tooling may include Git, CI/CD pipelines, and scheduling tools such as Airflow or Control-M. Exposure to Docker and Kubernetes is considered useful.

Cloud experience with at least one major platform such as AWS, Azure, or GCP is preferred, and familiarity with Databricks or Snowflake would be an added advantage.

Preferred Domain Exposure

Experience in BFSI (banking, financial services, and insurance), particularly banking or payments, is preferred, especially with high-volume financial datasets and work involving data compliance or regulatory reporting.

Additional Requirements

The ideal candidate should bring strong PySpark and Scala coding ability, a solid understanding of big data architecture, and practical experience delivering cloud-based streaming solutions.

Big Data Analyst

Where you'll work