Data Curator and Annotator

ACI Infotech

Contract Type:
Not provided

Salary:
Not provided

Job Description:

The Data Curator and Annotator will be responsible for curating, labeling, and maintaining high-quality datasets to support ML training, RAG pipelines, and evaluation. This role requires precision in annotation, strong attention to detail, and the ability to establish reliable guidelines and workflows. The ideal candidate will collaborate closely with engineers and data scientists to ensure datasets are accurate, secure, and aligned with business and research needs.

Job Responsibility:

  • Curate and label datasets for ML training and evaluation
  • Define annotation guidelines and quality control processes
  • Develop efficient labeling workflows with quality gates
  • Ensure privacy, security, and bias mitigation in datasets
  • Collaborate with engineers and data scientists to improve data utility
  • Build trusted evaluation datasets for ranking and RAG tasks

Requirements:

  • Experience labeling or curating datasets for NLP or search
  • Familiarity with annotation tools such as Label Studio or Prodigy
  • Strong attention to detail and commitment to labeling consistency
  • Comfort working with enterprise domain data
  • Experience with QA processes for annotation quality
  • Strong written communication for guideline creation
  • Respect for privacy, security, and ethical data principles

Nice to have:

  • Domain knowledge in BFSI, retail, or healthcare
  • Experience creating evaluation datasets for LLMs
  • Multi-lingual annotation experience
  • Comfort with basic Python scripting

Additional Information:

Job Posted:
December 14, 2025

Employment Type:
Full-time
Work Type:
Remote work

Similar Jobs for Data Curator and Annotator

AI Data Manager

This is not a standard data management role; it’s a rare opportunity to be at th...
Location:
United States, Palo Alto
Salary:
140000.00 - 260000.00 USD / Year
Luma AI
Expiration Date:
Until further notice
Requirements:
  • 2+ years of hands-on experience in AI data operations, human data annotation, or a similar data-centric role within a top-tier AI company
  • Direct experience translating complex researcher needs into effective data curation and annotation workflows
  • Highly adaptable, thriving on cross-functional collaboration, with a proven ability to work across a comprehensive data pipeline, not just within a single vertical like human annotation
  • Experience working with vision or multimodal data pipelines
  • A hands-on individual contributor driven by the work, not by people management
Job Responsibility:
  • Translate researcher needs into actionable data annotation and curation strategies for our SOTA vision, 3D, and audio models
  • Own and manage end-to-end data pipelines and annotation workflows, collaborating with external partners and labeling teams to ensure the highest quality data
  • Provide horizontal management across multiple data pipelines, ensuring consistency and quality as we expand into new modalities
  • Develop innovative data curation strategies, working with a diverse mix of human-annotated, raw, and synthetic data to solve complex model challenges
  • Partner directly with researchers to diagnose model performance issues and propose data-driven solutions to improve results
  • Define the standards for data quality and annotation excellence, establishing the foundation for how Luma scales its data operations
Employment Type:
Full-time

Senior Expert for Industry Foundation Models

You drive the development of domain specific AI and shape the next generation of...
Location:
Germany, Munich
Salary:
Not provided
BMW
Expiration Date:
Until further notice
Requirements:
  • Advanced degree in computer science, mathematics, data science, or a related field, or equivalent senior industry experience
  • Five to ten years of professional experience, including several years in advanced AI and technical leadership roles
  • Ability to translate complex technical concepts into AI strategies, products, roadmaps, and measurable impact
  • Deep expertise in foundation models, particularly multimodal and reasoning models, including adaptation and post training methods
  • Proven experience building large scale AI systems, including distributed training, high throughput inference, GPU acceleration, and cost optimization
  • Strong background in industrial or automotive data, processes, and IT systems
  • Excellent software engineering skills in Python and modern ML frameworks, with the ability to adapt to internal platforms and toolchains
Job Responsibility:
  • You define the technical and business product vision and system architecture for Large Industry Models, covering model, data, and platform layers
  • You lead and technically mentor an engineering team building multimodal foundation model stacks, including language, vision, and action models
  • You guide core technical and architectural decisions, selecting and adapting foundation models and designing scalable AI systems
  • You develop a comprehensive industrial data strategy, including data sourcing, curation, annotation, and feedback loops
  • You ensure reliable delivery of LIMs into cloud production environments with strong MLOps, evaluation, safety, and compliance standards
What we offer:
  • Challenging projects with which we shape the mobility of tomorrow together
  • Wide range of personal and professional development opportunities
  • Attractive, fair and performance-related remuneration
  • High level of job security
  • Annual special payments such as vacation pay, Christmas bonus, and profit sharing
  • Flexible working hours including six weeks annual leave and overtime compensation
  • Discounted BMW & MINI conditions
Employment Type:
Full-time

Research Intern - GenAI

Appen is seeking Research Interns to support innovative research in Generative A...
Location:
Australia, Chatswood, Sydney
Salary:
Not provided
Appen
Expiration Date:
Until further notice
Requirements:
  • Postgraduate students in Linguistics, Computer Science, AI, Data Science, or similar disciplines preferred; strong final-year and recent undergraduate candidates in these fields will also be considered
  • Familiarity with programming languages such as Python, R, or similar tools used in data analysis and machine learning
  • Experience with data annotation, model evaluation, or prompt engineering
  • Understanding of multilingual NLP, speech technologies, or agentic AI systems
  • Strong written communication skills, especially for summarizing research and drafting technical content
  • Ability to work independently and collaboratively in a remote research environment
Job Responsibility:
  • Conduct literature reviews on topics such as adversarial prompting, multilingual evaluation, and agentic AI
  • Assist in dataset curation, annotation, and quality assurance for speech, text, and multimodal data
  • Support model evaluation experiments, including prompt engineering and red teaming
  • Develop scripts and tools for data analysis, visualization, and automation
  • Contribute to internal documentation, research reports, and thought leadership content
  • Participate in team meetings and cross-functional collaborations
  • Help prepare materials for conferences, publications, and workshops
What we offer:
  • Hands-on experience in applied AI research with real-world impact
  • Mentorship from experienced researchers and exposure to industry workflows
  • Opportunities to contribute to publications, datasets, and thought leadership
  • A collaborative and inclusive research environment

Metabolomics Scientist

Join Enveda as a Metabolomics Scientist in Boulder, CO, and help us transform na...
Location:
United States, Boulder
Salary:
131000.00 - 140000.00 USD / Year
Enveda
Expiration Date:
Until further notice
Requirements:
  • PhD in chemistry or a related scientific discipline with 2+ years of experience in LC-MS-based metabolomics
  • Strong expertise in metabolite annotation and identification
  • Proficiency with one or more metabolomics data processing tools (e.g., MZmine, XCMS, MS-DIAL, MetaboScape)
  • Hands-on experience with large-scale metabolomics studies, ideally involving human biospecimens
Job Responsibility:
  • Serve as a subject-matter expert in metabolite annotation and identification from LC-MS-based metabolomics data
  • Lead follow-up investigations on mis-annotations and unknown features
  • Build, curate, and maintain internal spectral libraries to strengthen metabolite annotation capabilities
  • Contribute to data QC, review, and troubleshooting, helping to continuously improve robustness and reproducibility of the platform
What we offer:
  • 90% Medical, Dental, Vision
  • 401k Match
  • Flexible PTO
  • Adoption Assistance
Employment Type:
Full-time

Head of Delivery Operations, DaaS

We are seeking a Head of Delivery Operations to lead and scale our Delivery Oper...
Location:
United States, New York City; Redwood City; San Francisco
Salary:
200000.00 - 250000.00 USD / Year
Snorkel AI
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience in Delivery Operations, Data Operations, Program Management, or equivalent roles, ideally within high-growth B2B SaaS, DaaS, or services-led technology organizations
  • 3+ years managing people or teams
  • Demonstrated success building operational rigor in unstructured, fast-scaling environments, including designing SOPs, operating models, and execution frameworks from first principles
  • Proven people leadership experience, including building and managing high-performing teams, developing managers, and overseeing large contractor or distributed workforces
  • Strong vendor and partner management expertise, with a track record of scaling external suppliers while maintaining quality, cost efficiency, and delivery SLAs
  • Operational and analytical excellence, with hands-on experience defining KPIs, building pipeline and capacity visibility, and driving continuous process improvement through data
  • Exceptional cross-functional leadership and communication skills, with the ability to influence Delivery, Engineering, Product, Supply, and GTM stakeholders at senior levels
  • Comfort operating in ambiguity and zero-to-one contexts, with the ability to move quickly, make high-quality decisions, and iterate toward scalable systems
Job Responsibility:
  • Lead and scale the Data Operations function, building a high-performance team with clear career paths, accountability, and a strong culture of operational excellence
  • Design and operationalize end-to-end operating models for expert contributor onboarding, performance management, and quality assurance, ensuring consistency, compliance, and scalability as the business grows
  • Establish SOPs, including pipeline health monitoring, project execution tracking, and KPI-driven reporting to enable predictable, on-time delivery
  • Own operational enablement and documentation, creating playbooks and internal tooling that reduce execution friction and accelerate onboarding across delivery teams
  • Oversee planning and execution of complex data programs (labeling, annotation, curation), balancing cost, quality, and timelines across multiple customers and delivery motions
  • Partner cross-functionally with Delivery, Supply, Engineering, Product, and GTM leadership to align execution with commercial priorities and continuously improve the DaaS operating model at scale
Employment Type:
Full-time

Senior Human Data Operations Partner

As a Senior Human Data Operations Partner, you will play a pivotal role in ensur...
Location:
Mexico, Mexico City
Salary:
Not provided
Prolific
Expiration Date:
Until further notice
Requirements:
  • Experience in operations, ideally in marketplace or platform environments
  • Demonstrated ability to manage multiple projects and meet deadlines in a fast-paced environment
  • Strong project management skills, with attention to detail and operational execution
  • Experience in implementing and optimising processes to improve efficiency and scalability
  • Proficiency in operational tools and systems such as Metabase, Zapier, Asana, and HubSpot
  • Exceptional communication and documentation skills, with a track record of stakeholder enablement
  • Confidence with data analysis and operational metrics to support decision-making and continuous improvement
  • Familiarity with human data operations, product or data operations, participant recruitment, or managing global crowdsourcing/gig-economy platforms
  • Experience supporting AI/ML lifecycle projects, such as data labelling, annotation, or domain-specific task curation
  • Understanding of Prolific’s platform and its role in delivering high-quality human data for AI research
Job Responsibility:
  • Serve as a key resource for client-specific AI delivery teams, acting as the primary point of contact for participant-related operations
  • Support bespoke client projects by managing participant onboarding, verification, and activation processes to meet deliverables
  • Identify and address gaps in participant supply, ensuring alignment with AI task requirements and quality standards
  • Develop detailed project plans, workflows, and enablement materials to support efficient delivery
  • Maintain Prolific’s participant pools as high-quality, diverse, and optimised to meet customer demands
  • Implement and manage verification processes for domain experts and AI taskers, ensuring their skills and qualifications meet client-specific requirements
  • Continuously optimise participant onboarding and activation funnels to enhance retention and task performance
  • Monitor participant engagement and quality metrics, using insights to drive improvements
  • Contribute to upskilling programmes and certification processes for participants involved in complex AI tasks
  • Work closely with product, data, science, and customer success teams to align operational priorities with strategic objectives
What we offer:
  • Working for us will place you at the forefront of AI innovation, providing access to our unique human data platform and opportunities for groundbreaking research
  • Join us to enjoy a competitive salary, and our impactful, mission-driven culture

Senior Machine Learning Engineer, Data for Embodied AI

The goal of this role is to build, scale, and optimise next-generation world mod...
Location:
United Kingdom, London
Salary:
Not provided
Wayve
Expiration Date:
Until further notice
Requirements:
  • Experience in ML engineering, data engineering, or applied ML roles focused on large-scale data systems
  • Proven experience building and maintaining large-scale data pipelines for machine learning, including data ingestion, transformation, and validation
  • Strong Python fundamentals and experience with modern ML and data frameworks (e.g. PyTorch, Ray, Dask, Spark, or equivalent)
  • Solid understanding of multimodal data (video, lidar, sensor telemetry) and its challenges in large-scale training
  • Experience defining and tracking data quality metrics, conducting dataset analysis, and driving data-informed improvements in model performance
  • Demonstrated ability to work collaboratively with ML researchers, platform engineers, and product teams in a fast-paced, experimental environment
  • Strong problem-solving skills, a data-driven mindset, and the ability to translate research needs into reliable data solutions
Job Responsibility:
  • Design and implement large-scale data acquisition, processing, and curation pipelines, owning the full lifecycle of high-quality datasets used to train advanced robotics and foundation models
  • Continuously improve dataset quality and utility through sophisticated data analysis, debugging, and experimentation, developing metrics, tests, and monitoring mechanisms that directly drive model performance improvements
  • Develop and scale multimodal data pipelines for ingestion, preprocessing, filtering, annotation, and storage across video, LiDAR, and telemetry modalities
  • Run systematic experiments on data ablations and composition to assess their impact on model training dynamics, generalisation, and downstream performance
  • Collaborate with ML researchers and platform engineers to ensure datasets are fit for purpose and efficiently integrated into large-scale training workflows
  • Build internal tools and workflows for dataset auditing, visualization, and versioning to streamline iteration and reproducibility
  • Advance best practices for data governance, reliability, and scalability across the data lifecycle, ensuring data safety, privacy, and long-term maintainability