Data Engineer (PySpark)
GSS Tech Group
Dubai, UAE
AED 7,000-18,000/mo
Posted: Yesterday
UAE
IT & Technology
Full Time
Skills Required
SQL, AWS, Azure, DevOps
Job Description
We are seeking a highly skilled Data Engineer with strong expertise in PySpark and the Cloudera Data Platform (CDP). The ideal candidate will design, develop, and maintain scalable data pipelines while ensuring high data quality, performance, and availability across the organisation.

This role requires hands-on experience in big data ecosystems, cloud-native technologies, and advanced data processing frameworks. You will collaborate with cross-functional teams to build reliable and high-performance data solutions that drive business insights.

Key Responsibilities

1. Data Pipeline Development
- Design, develop, and maintain scalable ETL/ELT pipelines using PySpark on CDP
- Ensure data integrity, reliability, and performance optimisation

2. Data Ingestion
- Develop ingestion frameworks to collect data from relational databases, APIs, streaming sources, and file systems
- Load structured and unstructured data into Data Lake/Data Warehouse environments

3. Data Transformation & Processing
- Process, cleanse, and transform large-scale datasets using PySpark
- Build reusable data processing components

4. Performance Optimisation
- Tune Spark jobs and Cloudera components for optimal performance
- Optimise memory, partitioning, and execution plans
- Reduce ETL runtime and improve cluster efficiency

5. Data Quality & Validation
- Implement data validation checks and monitoring mechanisms
- Ensure end-to-end data quality and governance standards

6. Automation & Orchestration
- Automate workflows using tools such as Apache Oozie, Apache Airflow, or similar orchestration frameworks
- Maintain CI/CD integration for data pipelines

7. Monitoring & Support
- Monitor pipeline health and troubleshoot failures
- Provide production support and continuous improvements

Required Skills & Qualifications
- 5+ years of experience in Data Engineering
- Strong hands-on experience in PySpark
- Experience working on Cloudera Data Platform (CDP)
- Strong knowledge of Hadoop ecosystem (HDFS, Hive, Impala, YARN)
- Proficiency in SQL and data modelling concepts
- Experience with workflow orchestration tools (Airflow, Oozie, etc.)
- Good understanding of data warehousing concepts
- Experience with performance tuning and optimisation

Good to Have
- Experience with cloud platforms (AWS, Azure, GCP)
- Knowledge of streaming tools (Kafka, Spark Streaming)
- Exposure to DevOps practices and CI/CD pipelines
- Banking/Financial Services domain experience