Big Data Analyst

Tata Consultancy Services

Singapore · Contract

Experience
6–12 yrs
Salary
Openings
1
Posted
4 hours ago
Work mode
On-site
Eligibility
Candidates with strong big data engineering experience, particularly in PySpark and Scala, are suitable for this role. Prior exposure to streaming systems, cloud data engineering, and BFSI/banking/payments datasets is preferred.
Resume
Required to apply

Job description

Role Overview

This position is for a Big Data Engineer focused on PySpark and Scala. The assignment is based in Singapore and calls for 6 to 12 years of experience. The work centers on building and tuning large-scale data processing platforms that support both batch and real-time analytics across enterprise data environments.

What You Will Do

You will design and implement distributed data solutions, create reliable data pipelines, and help build modern data lakes and streaming systems. The role also includes improving processing efficiency, supporting cloud-based data engineering initiatives, and contributing to data quality and governance efforts.

Core Responsibilities

  • Build scalable data pipelines using PySpark and Scala.
  • Work on big data architecture and platform design for enterprise-scale systems.
  • Improve data processing performance and overall pipeline efficiency.
  • Develop streaming and near-real-time data solutions.
  • Support cloud-based data engineering and modern data stack implementations.
  • Contribute to data quality controls and governance practices.
  • Collaborate with teams to deliver solutions end to end.

Technical Stack

The role involves working across the Hadoop ecosystem, including HDFS, Hive, and YARN. It also requires experience with ingestion tools such as Kafka and Sqoop, along with common data formats like Parquet, ORC, JSON, and Avro.

Programming work will include Python and Scala development, advanced SQL (joins, aggregations, and query optimization), shell scripting, and Unix basics.

Streaming and messaging work will be centered on Kafka and event-driven architectures, along with real-time data processing frameworks.

Day-to-day tooling may include Git, CI/CD pipelines, and scheduling tools such as Airflow or Control-M. Exposure to Docker and Kubernetes is considered useful.

Cloud experience with at least one major platform such as AWS, Azure, or GCP is preferred, and familiarity with Databricks or Snowflake would be an added advantage.

Preferred Domain Exposure

Experience in BFSI, banking, or payments is preferred, especially with high-volume financial datasets and work involving data compliance or regulatory reporting.

Additional Requirements

The ideal candidate should bring strong PySpark and Scala coding ability, a solid understanding of big data architecture, and practical experience delivering cloud-based streaming solutions.