Deep Learning Engineer
Norconsult Telematics
Riyadh, Saudi Arabia
AED 7,000-18,000/mo (≈ SAR 7.1K-18.4K/mo)
Today
Saudi Arabia
IT & Technology
Full Time
Skills Required
AWS, Azure, DevOps, ERP, Arabic, English
Job Description
Lead fine-tuning, compression, and deployment of deep learning models, especially LLMs, using distributed multi-GPU frameworks (e.g., DeepSpeed, FSDP) to enhance performance, scalability, and efficiency. Optimize model inference through quantization, pruning, and distillation, and deploy using ONNX, TensorRT, OpenVINO, or platforms like ModelMesh and NVIDIA Triton. Support Generative AI and RAG pipelines by improving latency, throughput, and GPU resource utilization across modern AI infrastructures in alignment with MLOps/DevOps workflows.

Job Description & Responsibilities:
Fine-tune and optimize LLMs and deep learning models using distributed multi-GPU frameworks such as DeepSpeed, FSDP, and Hugging Face Accelerate.
Apply model compression techniques including quantization (e.g., INT8), pruning, and knowledge distillation to improve inference efficiency.
Convert and optimize models for low-latency inference using tools like ONNX Runtime, TensorRT, OpenVINO, and TF-Serving.
Deploy and serve models using high-performance platforms such as ModelMesh and NVIDIA Triton.
Collaborate with MLOps and DevOps teams on GPU resource planning, benchmarking, and building automated deployment pipelines.
Support and enhance RAG pipelines, embedding models, and inference tuning for Generative AI applications.
Ensure deployment readiness by focusing on model scalability, latency, and throughput, aligned with enterprise infrastructure standards.

Qualification & Experience:
Bachelor’s or Master’s degree in Computer Science, AI, Data Science, or a related field.
Minimum 4 years of hands-on experience in deep learning, including multi-GPU distributed training and high-performance model optimization.
Proficient in PyTorch, TensorFlow, and deep learning toolkits such as Hugging Face Transformers and ONNX Runtime.
Strong experience with model compression techniques (quantization, pruning, distillation) and inference optimization using ONNX, TensorRT, OpenVINO, or TF-Serving.
Knowledge of Transformer architectures, LLM internals, and generative AI workflows, including RAG.
Familiarity with model serving platforms (e.g., Triton, ModelMesh) and deployment on OpenShift AI environments.
Exposure to MLOps, GPU benchmarking, and cloud platforms (AWS, Azure, GCP) is preferred.
Fluent in English (mandatory); Arabic proficiency is an added advantage.
Certifications in AI/ML, DL, cloud platforms, or Red Hat/OpenShift AI are a plus.