Multimodal AI Engineer, Document Understanding

LlamaIndex

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Join us and help shape the future of AI by redefining document workflows with AI agents. We are seeking exceptional AI engineers to join our core document understanding team. You will work at the intersection of computer vision, natural language processing, and production ML systems to push the boundaries of what's possible in document parsing and understanding.

Our document understanding team builds the intelligence behind LlamaParse, LlamaExtract, and our other processing products. These systems process millions of complex documents, including PDFs, PowerPoints, Word documents, and spreadsheets. Your work will directly impact thousands of developers building RAG applications and document agents, and will also contribute to the open-source frameworks that shape how the industry approaches document processing.

Depending on your background and interests, you might focus more on data curation and evaluation, model fine-tuning and experimentation, or ML infrastructure and production systems. We're hiring multiple people and will work with you to find the best fit.

Job Responsibility:

  • Develop, train, and optimize machine learning models for document structure understanding, table extraction, layout analysis, and multimodal content processing
  • Build robust data pipelines, evaluation frameworks, and experimentation infrastructure
  • Design and implement production ML systems that handle complex, real-world documents at scale
  • Stay current with the latest advances in vision-language models, document AI, and multimodal learning
  • Collaborate with engineering teams to integrate ML innovations into production APIs
  • Contribute to both our open-source frameworks and enterprise offerings
  • Drive technical decisions while balancing research exploration with product delivery

Requirements:

  • 3-7 years of experience in machine learning engineering or applied research
  • Strong software engineering fundamentals with production Python experience (modern tooling: uv, ruff, mypy, Pydantic)
  • Hands-on experience training, fine-tuning, or deploying ML models in production
  • Deep understanding of modern ML techniques, particularly in computer vision, NLP, or multimodal learning
  • Experience with at least one of: data pipeline development, model training/fine-tuning, or ML infrastructure
  • Ability to read and implement from research papers and technical specifications
  • Track record of executing with high intensity in fast-paced environments
  • Strong technical communication skills and comfort with open-source collaboration

Nice to have:

  • Experience with vision-language models, transformer architectures, or model fine-tuning (LoRA, QLoRA)
  • Experience building evaluation frameworks, benchmarks, or data quality pipelines
  • Experience with model serving frameworks (vLLM, TensorRT, ONNX) or MLOps tools
  • Experience specifically with document understanding, OCR, or layout analysis
  • Contributions to open-source ML projects or frameworks
  • Experience with LLM applications and RAG systems
  • Strong understanding of model optimization techniques (quantization, distillation, pruning)
  • Experience with Docker/Kubernetes and distributed systems
  • Active participation in the ML research community

What we offer:

  • Competitive base salary and equity compensation
  • Comprehensive medical/dental/vision coverage for you and your family
  • Unlimited paid time off policy
  • Daily catered lunch and snacks in the San Francisco office
  • Budget for conferences, research materials, and professional development
  • Access to cutting-edge compute resources and research tools

Additional Information:

Job Posted:
December 10, 2025

Employment Type:
Full-time

Work Type:
Hybrid work