Data Engineer (PySpark) | Innovations Global | Dubai, UAE
Innovations Global
Dubai, UAE | AED 7,000-18,000/mo | Today
UAE | Finance & Accounting | Full Time
Skills Required
Python, SQL, Excel, DevOps
Job Description
Job Details
Job Title: Data Engineer (PySpark) | Innovations Global | Dubai, UAE
Recruiting Company: Innovations Global
Job Location: Dubai, United Arab Emirates (Onsite)
Job Type: Full-Time
Application Method: sp.hariharan@innovationsglobal.com
Notice Period: Immediate to 30 Days
Experience Required: 5+ Years
Industry Requirement: Banking Domain (Mandatory)

Position Summary
The Data Engineer will design, build, and optimize large-scale data pipelines using PySpark and Cloudera Data Platform within a fast-paced banking environment. This role is key to enabling reliable data processing, analytics, and ETL workflows that power mission-critical financial applications.

Job Description
As a Data Engineer, you will work with massive datasets on Cloudera Data Platform, leveraging advanced PySpark techniques to deliver efficient and scalable data transformations. You will collaborate with data architects, business analysts, and engineering teams to build robust ETL pipelines, optimize cluster and workload performance, and support key banking initiatives. This role requires strong experience with CDP components, big data frameworks, and orchestration tools, alongside excellent Linux scripting skills and a deep understanding of financial data operations.

Key Responsibilities
Develop, optimize, and maintain PySpark-based data pipelines for large-scale processing.
Work with Cloudera Data Platform components including Cloudera Manager, Hive, Impala, HDFS, and HBase.
Design and implement scalable ETL workflows aligned with banking data requirements.
Perform advanced data transformations using PySpark (RDDs, DataFrames, Spark SQL).
Support data warehousing initiatives and develop SQL-based analytics queries.
Integrate and work with big data tools such as Hadoop, Kafka, and distributed systems.
Use orchestration frameworks like Oozie or Airflow for workflow scheduling.
Develop automation scripts on Linux for deployments and workload management.
Ensure performance tuning, data quality, and compliance with financial industry standards.

Required Qualifications & Skills
Bachelor’s or Master’s in Computer Science, Data Engineering, IT, or related field.
5+ years of experience as a Data Engineer with strong banking domain exposure.
Advanced proficiency in PySpark (RDD, DataFrames, performance optimization).
Hands-on experience with Cloudera Data Platform (CDP).
Strong SQL skills and experience with Hive/Impala-based data warehousing.
Solid understanding of Hadoop ecosystem tools and Kafka.
Experience with Oozie, Airflow, or similar orchestration frameworks.
Strong Linux scripting (Bash/Python) for automation.

Nice-to-Have Skills
Exposure to CI/CD concepts or DevOps practices.
Understanding of data governance in financial institutions.
Knowledge of Spark performance profiling or tuning at cluster level.
Experience working with cloud-based big data platforms.

Recruitment Pro Tip
Showcase end-to-end PySpark pipeline projects, especially those deployed on Cloudera within banking environments, as employers prioritize candidates who demonstrate real performance tuning, ETL optimization, and large-scale financial data processing expertise.
Similar Opportunities
Remittance Product Lead: Growth & Strategy
ADIB Group
Dubai, UAE | AED 4,000-10,000/mo | Today
UAE | Finance & Accounting
Backlog AML Analyst – Transaction Monitoring (Contract)
Capitex
Dubai, UAE | AED 6,000-15,000/mo | Today
UAE | Finance & Accounting
Senior Reporter
ION
Dubai, UAE | AED 4,000-10,000/mo | Today
UAE | Finance & Accounting
HNW Private Banking Relationship Manager
Mashreqbank PSC
UAE | AED 8,000-20,000/mo | Today
UAE | Finance & Accounting
Head of Data Science (Finance)
Property Finder
Abu Dhabi, UAE | AED 18,000-50,000/mo | Today
UAE | Finance & Accounting
Head of Accounting - Middle East
ChainGPT
Sharjah, UAE | AED 18,000-50,000/mo | Today
UAE | Finance & Accounting