JobsAisle

Web Developer

MeasureOne

Ahmedabad, India · ₹35,000–₹100,000/mo (AED 1.5K–4.4K/mo) · Posted today
India · Python · Web Scraping · Docker · Kubernetes · AWS · GCP · Azure · Selenium · Node.js · Puppeteer · Playwright · Scrapy · Full Time

Skills Required

JavaScript · Python · Java · AWS · Azure · Docker · Kubernetes · ERP

Job Description

Role Overview:

MeasureOne is seeking a candidate with hands-on expertise in designing and deploying advanced web scraping solutions, with a focus on overcoming bot detection challenges, building scalable and resilient scraping systems, and ensuring the efficiency and scalability of data acquisition pipelines.

Key Responsibilities:

- Develop and maintain high-performance scraping systems using Node.js, Python, or other relevant technologies.
- Handle JavaScript-heavy and asynchronous content using tools like Puppeteer, Playwright, or custom Node.js solutions.
- Solve CAPTCHAs using automation, AI/ML, or third-party services.
- Build robust error-handling mechanisms that adapt to changes in website structures or anti-scraping measures.
- Analyze and reverse-engineer advanced bot detection and anti-scraping mechanisms, including rate limiting, behavioral analysis, and fingerprinting.
- Design and implement techniques to bypass WAFs (Web Application Firewalls) and server-side protections using Node.js libraries and tools.
- Architect and maintain scalable infrastructure using containerization tools such as Docker and orchestration platforms such as Kubernetes.
- Leverage cloud platforms (AWS, GCP, Azure) for distributed scraping and data acquisition.
- Use Node.js and related tools to optimize network configurations for high-throughput scraping, including proxy and load-balancer configurations.
- Automate deployment and scaling of scraping systems using CI/CD pipelines.
- Keep scraping systems performant by reducing latency and optimizing resource utilization.
- Develop robust monitoring and logging to track and troubleshoot issues in real time.
- Ensure adherence to legal, ethical, and regulatory standards, safeguarding data acquisition systems from detection, blocking, and external threats.
- Respect websites' terms of service while implementing efficient scraping solutions.
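The "robust error-handling" responsibility above can be illustrated with a retry-with-exponential-backoff wrapper. This is a minimal stdlib-only sketch, not MeasureOne's implementation; the `fetch_with_retry` helper, its parameters, and the backoff schedule are assumptions for illustration.

```python
import time
import urllib.error
import urllib.request


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def fetch_with_retry(url: str, retries: int = 4, timeout: float = 10.0) -> bytes:
    """Fetch a URL, retrying on transient network errors with backoff."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError):
            if attempt == retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable")
```

A production version would additionally rotate proxies and user agents, and distinguish retryable responses (429, 5xx) from permanent failures so that broken selectors or blocked endpoints fail fast instead of retrying forever.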
Qualifications Required:

- 3+ years of hands-on experience in web scraping or data engineering.
- Expertise in Node.js for building and optimizing scraping systems.
- Strong knowledge of programming languages such as Python and JavaScript.
- Advanced understanding of networking concepts, including protocols, WebSockets, DNS, and API integrations.
- Experience with containerization tools (Docker) and orchestration platforms (Kubernetes).
- Proficiency in cloud platforms (AWS, GCP, Azure) for scalable data acquisition pipelines.
- Familiarity with tools like Puppeteer, Playwright, Scrapy, or Selenium.
- Strong debugging and optimization skills for network and scraping pipelines.
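As a rough illustration of the containerization requirement, a Node.js scraper worker might be packaged like this. This is a minimal sketch only; the base image, file layout, and `worker.js` entry point are assumptions, not part of the posting.

```dockerfile
# Minimal image for a Node.js scraping worker (illustrative only)
FROM node:20-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY package*.json ./
RUN npm ci --omit=dev

# Copy the scraper source and set the entry point
COPY . .
CMD ["node", "worker.js"]
```

An image like this can then be replicated and scaled by an orchestrator such as Kubernetes, which matches the Docker/Kubernetes experience the role asks for.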