Data Engineer

Web Scraping / Data Acquisition Engineer

Wissen Technology is hiring for Web Scraping / Data Acquisition Engineer. We are...

Location

India, Mumbai

Salary:

Not provided

Wissen

Expiration Date

Until further notice

Requirements

Strong hands-on experience with Python
Proven experience in web scraping and crawler development
Proficiency with browser automation and crawling tools: Playwright, Scrapy, or equivalent
Experience with PDF extraction tools (pdfplumber, PyMuPDF, Apache Tika, etc.)
Strong understanding of HTML parsing, pagination handling, and automated file downloads
Knowledge of anti-bot techniques (rate limiting, proxy handling, session rotation)
Experience processing structured and semi-structured documents

Job Responsibility

Design and develop web crawlers to extract data from public websites
Crawl listing pages and extract case metadata (case title, number, court, date, etc.)
Download judgments and maintain structured PDF/document storage
Build automated pipelines to monitor websites and detect new judgments
Extract structured data from documents and HTML pages
Store data in structured formats suitable for downstream processing or search
Handle pagination, anti-bot measures, and data cleaning workflows
Maintain scrapers for reliability, accuracy, and long-term scalability

Fulltime

Software Engineer – Web Data Extraction & API Development

Sybrant Technologies has been at the forefront of transforming its customers int...

Location

Salary:

Not provided

Sybrant Technologies

Expiration Date

Until further notice

Requirements

Strong proficiency in Python
Hands-on experience with web scraping tools (Requests, BeautifulSoup, Selenium, Scrapy)
Good understanding of HTML, DOM structure, XPath, and CSS selectors
Experience building REST APIs using FastAPI, Flask, or Django
Solid knowledge of SQL and relational databases (MySQL / PostgreSQL)
Experience handling proxies, cookies, headers, rate limits, and sessions
Familiarity with Git and basic CI/CD workflows

Job Responsibility

Develop and maintain web scraping scripts using Python (Requests, BeautifulSoup, Selenium, Scrapy)
Automate extraction workflows to ensure reliable and repeatable data collection
Handle anti-scraping mechanisms such as CAPTCHAs, rotating proxies, headers, and session management
Clean, transform, and load extracted data into internal databases
Design and build REST APIs to expose processed data from the database
Optimize scraping workflows for performance, reliability, and error handling
Monitor scraping jobs, troubleshoot failures, and ensure data freshness
Maintain documentation for scraping logic, API endpoints, and workflows
Collaborate with product and data teams to understand evolving data requirements

Web Scraping Engineer II

We are seeking a Web Scraping Engineer to join our growing engineering team. In ...

Location

India

Salary:

Not provided

YipitData

Expiration Date

Until further notice

Requirements

Effective communication in English with both technical and non-technical stakeholders
3+ years of experience with web scraping frameworks (e.g., Selenium, Playwright, or Puppeteer)
Strong understanding of HTTP, RESTful APIs, HTML parsing, browser rendering, and TLS/SSL mechanics
Expertise in advanced fingerprinting and evasion strategies (e.g., browser fingerprint spoofing, request signature manipulation)
Deep experience managing cookies, headers, session states, and proxy rotations, including the deployment of both residential and data center proxies
Experience with logging, metrics, and alerting to ensure high availability
Troubleshooting skills to optimize scraper performance for efficiency, reliability, and scalability

Job Responsibility

Refactor and Maintain Web Scrapers: Overhaul existing scraping scripts to improve reliability, maintainability, and efficiency
Implement best coding practices (clean code, modular architecture, code reviews, etc.) to ensure quality and sustainability
Implement Advanced Scraping Techniques: Utilize sophisticated fingerprinting methods (cookies, headers, user-agent rotation, proxies) to avoid detection and blocking
Handle dynamic content, navigate complex DOM structures, and manage session/cookie lifecycles effectively
Collaborate with Cross-Functional Teams: Work closely with analysts and other stakeholders to gather requirements, align on targets, and ensure data quality
Provide support, documentation, and best practices to internal stakeholders to ensure effective use of our web scraped data in critical reporting workflows
Monitor and Troubleshoot: Develop robust monitoring solutions and alerting frameworks to quickly identify and address failures
Continuously evaluate scraper performance, proactively diagnosing bottlenecks and scaling issues
Drive Continuous Improvement: Propose new tooling, methodologies, and technologies to enhance our scraping capabilities and processes
Stay up to date with industry trends, evolving bot-detection tactics, and novel approaches to web data extraction

What we offer

Our compensation package includes comprehensive benefits, perks, and a competitive salary
We care about your personal life and we mean it. We offer vacation time, parental leave, team events, learning reimbursement, and more!
Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust

Fulltime

Data Engineer

Our client is seeking a skilled Data Engineer to support the design, development...

Location

United States, Miami

Salary:

Not provided

Robert Half

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Information Systems, Engineering or related field
3+ years of experience in a Data Engineer or similar role
Strong hands-on experience with SQL, Python and Snowflake
Experience developing and maintaining ETL workflows
Knowledge of data modeling concepts and best practices
Experience with Selenium for web automation or web scraping support
Strong analytical, problem-solving and troubleshooting skills
Ability to work independently and collaboratively in a team environment

Job Responsibility

Design, build and maintain scalable ETL pipelines to support data integration and reporting needs
Develop and optimize complex queries using SQL
Use Python to support data processing, transformation and automation tasks
Work within Snowflake to manage, transform and optimize cloud-based data solutions
Assist with automation efforts, including Selenium web scripting for web-based data extraction and process automation
Create and maintain logical and physical data models
Ensure data quality, integrity and consistency across multiple data sources
Collaborate with business stakeholders, analysts and technical teams to gather requirements and deliver data solutions
Monitor and troubleshoot data workflows and automation scripts
Document processes, workflows and technical specifications

What we offer

Medical, vision, dental, life, and disability insurance
401(k) plan

Fulltime

Senior Software Engineer - Data Acquisition

Join TxODDS as a Senior Software Engineer and help build scalable, high-performa...

Location

United Kingdom, London

Salary:

Not provided

TXODDS

Expiration Date

Until further notice

Requirements

Strong experience with at least one core programming language (e.g. Python, Java, Scala)
Hands-on experience with Kubernetes, container orchestration, and Docker
Experience working with distributed systems and event‑driven technologies (e.g. Kafka)
Solid understanding of networking fundamentals (HTTP, APIs)
Experience with relational and NoSQL databases
Strong Git skills and familiarity with modern development practices (code reviews, testing, CI/CD)
Comfort working in a Linux/Unix command-line environment
Experience designing and debugging software from inception to deployment
Excellent problem‑solving skills and a proactive approach to improving systems and processes
Strong communication and collaboration skills, and the ability to work effectively across teams

Job Responsibility

Developing, testing, and deploying high‑quality software that processes data from diverse sources
Building, improving, and maintaining distributed systems and data pipelines (including Kafka-based services)
Deploying and supporting containerised workloads running in Kubernetes environments
Creating and maintaining clear, accurate documentation for the systems you build
Validating and monitoring data quality using internal tools and processes
Supporting data‑gathering workflows, including those involving web‑scraping or automated data acquisition
Investigating and resolving data‑related issues escalated from the Client Services team
Participating in an out‑of‑hours on‑call rotation to support critical data acquisition systems
Sharing knowledge widely and contributing to a positive, collaborative team culture
Mentoring junior engineers and helping raise the overall technical bar

What we offer

Competitive benefits package tailored to your location

Fulltime

Software Engineer – Web Crawling

Woflow is a technology startup creating products and solutions to support a high...

Location

Salary:

Not provided

Woflow

Expiration Date

Until further notice

Requirements

3+ years of experience in software engineering with a focus on web crawling and data extraction
Strong expertise in Node.js (preferred) for web crawling applications
Deep understanding of HTML, JavaScript, and reverse engineering techniques
Hands-on experience with Playwright, Puppeteer, and Cheerio for automation and scraping
Knowledge of security and performance best practices related to web crawling

Job Responsibility

Develop, enhance, and maintain web crawlers and scraping infrastructure
Optimize scraping techniques to handle anti-bot mechanisms, performance, and security challenges
Collaborate with a geographically distributed team to identify and resolve issues
Ensure high availability, efficiency, and reliability of crawling operations
Integrate AI solutions to enhance automation and data extraction accuracy

What we offer

Unlimited PTO
Comprehensive medical, dental, and vision insurance plans
STD, LTD, AD&D, and life insurance coverage
Free membership to TalkSpace, Teladoc and Health Advocate
Free annual membership to One Medical in participating regions
401(k) retirement plan with company matching
Pre-tax commuter benefits
Free equipment: laptop and home office stipend

Fulltime

Python Data Engineer

Arthur Lawrence is looking for a Python Data Engineer for one of our clients in Hous...

Location

United States, Houston

Salary:

Not provided

Arthur Lawrence

Expiration Date

Until further notice

Requirements

7+ years of professional Python development
Strong knowledge of OOP, design patterns, and SOA
Hands-on experience in data engineering, data pipeline development, and web scraping (Requests, BeautifulSoup, Selenium)
Oracle PL/SQL expertise, including stored procedures
Bachelor’s degree in Computer Science, MIS, or related field
Agile/Scrum experience

Fulltime

Data Engineer - Python

We are seeking a talented and motivated Python Data Engineer to join our global ...

Location

United States, Houston

Salary:

Not provided

Robert Half

Expiration Date

Until further notice

Requirements

6+ years of professional Python development experience at an enterprise level
Bachelor's degree in Computer Science, MIS, or a related technical field
Proven experience building and maintaining data pipelines and ETL processes
Proficiency with web scraping tools and techniques (e.g., Requests, BeautifulSoup, Selenium)
Hands-on experience with Oracle PL/SQL, including stored procedures
Strong knowledge of object-oriented programming, design patterns, and SOA architectures
Familiarity with Agile/Scrum methodologies and modern version control and issue tracking tools
Experience with Python libraries such as Pandas and NumPy
Excellent written and verbal communication skills

Job Responsibility

Develop modular and reusable Python components to connect external data sources with internal systems and databases
Work directly with business stakeholders to translate analytical requirements into technical implementations
Ensure the integrity and maintainability of the central Python codebase by adhering to existing design standards and best practices
Maintain and improve the in-house Python ETL toolkit, contributing to the standardization and consolidation of data engineering workflows
Partner with global team members to ensure efficient coordination and delivery
Actively participate in the internal Python development community and support ongoing business development initiatives with technical expertise

What we offer

Medical
Vision
Dental
Life and disability insurance
Company 401(k) plan
