Lead AI/ML Engineer

Fission Labs

Location:
India, Hyderabad & Pune

Contract Type:
Not provided

Salary:
Not provided

Job Responsibility:

  • Design, implement, and optimize end-to-end ML training workflows including infrastructure setup, orchestration, fine-tuning, deployment, and monitoring
  • Evaluate and integrate multi-cloud and single-cloud training options across AWS and other major platforms
  • Lead cluster configuration, orchestration design, environment customization, and scaling strategies
  • Compare and recommend hardware options (GPUs, TPUs, accelerators) based on performance, cost, and availability

Requirements:

  • Experience with cloud-based platforms (AWS, Azure), API integrations, and data models
  • Exposure to AI/ML-enabled platforms or decision-intelligence systems
  • Certifications: CBAP / PMI-PBA / Agile BA / SAFe Product Owner / Scrum Master
  • Experience in stakeholder training, change management, or workshop facilitation
  • At least 4-5 years in AI/ML infrastructure and large-scale training environments
  • Expert in AWS cloud services (EC2, S3, EKS, SageMaker, Batch, FSx, etc.) and familiar with Azure, GCP, and hybrid/multi-cloud setups
  • Strong knowledge of AI/ML training frameworks (PyTorch, TensorFlow, Hugging Face, DeepSpeed, Megatron, Ray, etc.)
  • Proven experience with cluster orchestration tools (Kubernetes, Slurm, Ray, SageMaker, Kubeflow)
  • Deep understanding of hardware architectures for AI workloads (NVIDIA, AMD, Intel Habana, TPU)
  • Expert knowledge of inference optimization techniques including speculative decoding, KV cache optimization (MQA/GQA/PagedAttention), and dynamic batching
  • Deep understanding of prefill vs decode phases, memory-bound vs compute-bound operations
  • Experience with quantization methods (INT4/INT8, GPTQ, AWQ) and model parallelism strategies
  • Hands-on experience with production inference engines: vLLM, TensorRT-LLM, DeepSpeed-Inference, or TGI
  • Proficiency with serving frameworks: Triton Inference Server, KServe, or Ray Serve
  • Familiarity with kernel optimization libraries (FlashAttention, xFormers)
  • Proven ability to optimize inference metrics: TTFT (time to first token), ITL (inter-token latency), and throughput
  • Experience profiling and resolving GPU memory bottlenecks and OOM issues
  • Knowledge of hardware-specific optimizations for modern GPU architectures (A100/H100)
  • Drive end-to-end fine-tuning of LLMs, including model selection, dataset preparation/cleaning, tokenization, and evaluation with baseline metrics
  • Configure and execute fine-tuning experiments (LoRA, QLoRA, etc.) on large-scale compute setups, ensuring optimal hyperparameter tuning, logging, and checkpointing
  • Document fine-tuning outcomes by capturing performance metrics (losses, BERT/ROUGE scores, training time, resource utilization) and benchmark against baseline models

What we offer:

  • Opportunities for continuous learning and certification support
  • Collaborative and growth-oriented work culture
  • Competitive compensation and comprehensive benefits
  • Exposure to modern cloud and integration technologies

Additional Information:

Job Posted:
March 04, 2026

Employment Type:
Full-time

Similar Jobs for Lead AI/ML Engineer

Senior Data & AI/ML Engineer - GCP Specialization Lead

We are on a bold mission to create the best software services offering in the wo...
Location:
United States, Menlo Park
Salary:
Not provided
techjays
Expiration Date:
Until further notice
Requirements:
  • GCP Services: BigQuery, Dataflow, Pub/Sub, Vertex AI
  • ML Engineering: End-to-end ML pipelines using Vertex AI / Kubeflow
  • Programming: Python & SQL
  • MLOps: CI/CD for ML, Model deployment & monitoring
  • Infrastructure-as-Code: Terraform
  • Data Engineering: ETL/ELT, real-time & batch pipelines
  • AI/ML Tools: TensorFlow, scikit-learn, XGBoost
  • Min Experience: 10+ Years
Job Responsibility:
  • Design and implement data architectures for real-time and batch pipelines, leveraging GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Vertex AI, and Cloud Storage
  • Lead the development of ML pipelines, from feature engineering to model training and deployment using Vertex AI, AI Platform, and Kubeflow Pipelines
  • Collaborate with data scientists to operationalize ML models and support MLOps practices using Cloud Functions, CI/CD, and Model Registry
  • Define and implement data governance, lineage, monitoring, and quality frameworks
  • Build and document GCP-native solutions and architectures that can be used for case studies and specialization submissions
  • Lead client-facing PoCs or MVPs to showcase AI/ML capabilities using GCP
  • Contribute to building repeatable solution accelerators in Data & AI/ML
  • Work with the leadership team to align with Google Cloud Partner Program metrics
  • Mentor engineers and data scientists toward achieving GCP certifications, especially in Data Engineering and Machine Learning
  • Organize and lead internal GCP AI/ML enablement sessions
What we offer:
  • Best in class packages
  • Paid holidays and flexible paid time away
  • Casual dress code & flexible working environment
  • Medical Insurance covering self & family up to 4 lakhs per person

Sales Engineering Lead

EvoluteIQ is seeking a Sales Engineering Lead to drive pre-sales and sales engin...
Location:
India, Bengaluru
Salary:
Not provided
EvoluteIQ
Expiration Date:
Until further notice
Requirements:
  • 8–10 years of experience in pre-sales, solution engineering, or consulting for enterprise software, automation, or AI-driven platforms
  • Hands-on knowledge of process automation, AI/ML, data integration, API orchestration, or low-code/no-code environments
  • Experience collaborating with global channel partners, system integrators, or technology alliances
  • Familiarity with one or more BPM platforms (Appian, Pega, Camunda), LCAP platforms (OutSystems, Mendix), automation stacks (e.g., ServiceNow, UiPath, Power Automate, Blue Prism, MuleSoft), and cloud platforms (AWS, Azure, or GCP), as well as Agentic AI and Generative AI technologies
  • Proven ability to connect technical capabilities with business outcomes and present to C-suite stakeholders
  • Strong communication, executive presentation, and solution storytelling abilities
  • Strategic thinker with the ability to influence joint go-to-market initiatives and co-create customer success outcomes
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • MBA preferred but not mandatory
Job Responsibility:
  • Lead all technical pre-sales engagements for new opportunities, from discovery, requirement scoping, and solution design to technical validation, proposal creation and proof-of-concept delivery
  • Collaborate with business, sales, and technical teams to understand customer objectives and expand solution footprints
  • Translate business objectives and challenges into robust technical architectures and solution proposals that clearly articulate value, ROI, and platform differentiation
  • Design and deliver customized product demonstrations, solution architectures, and joint proof-of-concepts highlighting the EIQ platform’s unified automation capabilities
  • Coordinate partner enablement workshops and provide ongoing knowledge transfer to enhance Eguardian’s sales and delivery maturity
  • Serve as a solution advisor by identifying and shaping new Agentic Automation use cases leveraging AI, ML, RPA, and orchestration
  • Act as a technical liaison between EvoluteIQ’s product, engineering, and partner management teams to ensure solution scalability and roadmap alignment
  • Design and present compelling solution demos, proofs of concept (POCs), and architecture blueprints tailored to client industries (banking, healthcare, insurance, telecom)
  • Build reusable demo assets, templates, and solution accelerators to support repeatable GTM success
  • Partner with Delivery teams to ensure a seamless handover from pre-sales to implementation
What we offer:
  • Opportunity to shape the strategy of a next-gen hyper-automation platform
  • Work with a cross-disciplinary team in a fast-growing, innovation-driven environment
  • Competitive compensation and growth opportunities
  • A culture of innovation, ownership, and continuous learning
Employment Type:
Full-time

Technical Lead – AI/ML & Data Platforms

We are seeking a Technical Lead with strong managerial capabilities to drive the...
Location:
United States, Sunnyvale
Salary:
Not provided
Thirdeye Data
Expiration Date:
Until further notice
Requirements:
  • Strong expertise in data pipelines, architecture, and analytics platforms (e.g., Snowflake, Tableau)
  • Experience reviewing and optimizing data transformations, aggregations, and business logic
  • Hands-on familiarity with LLMs and practical RAG implementations
  • Knowledge of AI/ML workflows, model lifecycle management, and experimentation frameworks
  • Proven experience in managing complex, multi-track projects
  • Skilled in project tracking and collaboration tools (Jira, Confluence, or equivalent)
  • Excellent communication and coordination skills with technical and non-technical stakeholders
  • Experience working with cross-functional, globally distributed teams
Job Responsibility:
  • Coordinate multiple workstreams simultaneously, ensuring timely delivery and adherence to quality standards
  • Facilitate daily stand-ups and syncs across global time zones, maintaining visibility and accountability
  • Understand business domains and technical architecture to enable informed decisions and proactive risk management
  • Collaborate with data engineers, AI/ML scientists, analysts, and product teams to translate business goals into actionable plans
  • Track project progress using Agile or hybrid methodologies, escalate blockers, and resolve dependencies
  • Own the task lifecycle, from planning through execution, delivery, and retrospectives
  • Perform technical reviews of data pipelines, ETL processes, and architecture, identifying quality or design gaps
  • Evaluate and optimize data aggregation logic while ensuring alignment with business semantics
  • Contribute to the design and development of RAG pipelines and workflows involving LLMs
  • Create and maintain Tableau dashboards and reports aligned with business KPIs for stakeholders
Employment Type:
Full-time

Senior AI/ML Engineer

Barbaricum is seeking a highly experienced Senior AI/ML Engineer to support Soft...
Location:
United States, Crane
Salary:
Not provided
Barbaricum
Expiration Date:
Until further notice
Requirements:
  • Active DoD Secret Clearance (Top Secret preferred)
  • Bachelor’s degree in Computer Science, Engineering, or related technical discipline (Master’s preferred)
  • 10+ years of progressive experience in AI/ML engineering, software development, or applied data science
  • Expertise in developing, deploying, and securing AI/ML applications within mission-critical or defense environments
  • Demonstrated experience with LLMs, MLOps pipelines, and modern ML frameworks (e.g., PyTorch, TensorFlow)
  • Strong background in software and cyber engineering principles, including system hardening, secure coding, and vulnerability mitigation
  • Proven ability to lead complex technical efforts, mentor junior engineers, and interface with government stakeholders
  • DoD 8570 Advanced certification (e.g., SecurityX, GCSA, CCSP, or equivalent) must be obtained and maintained
Job Responsibility:
  • Partner with project managers and engineering teams to define objectives for AI/ML systems in support of maneuver, surveillance, and engagement missions
  • Develop and prototype AI/ML systems to address mission-specific requirements, including computer vision, sensor fusion, and decision-support applications
  • Conduct rigorous testing and evaluation of AI/ML performance against operational datasets
  • Analyze test data to identify model strengths, weaknesses, and mission relevance
  • Refine and optimize systems to ensure robustness, scalability, and cyber resilience
  • Troubleshoot complex system challenges and provide technical guidance for deployed solutions
  • Deliver comprehensive documentation and technical reports to stakeholders
  • Maintain awareness of emerging AI/ML technologies, software engineering practices, and cyber defense techniques relevant to mission-critical systems

AIML Lead Engineer

We build breakthrough software products that power digital businesses. We are an...
Location:
India, Bangalore
Salary:
Not provided
3Pillar Global
Expiration Date:
Until further notice
Requirements:
  • 8+ years of total IT experience
  • At least 4 years in AI/ML development
  • Strong proficiency in Python and ML frameworks (TensorFlow, PyTorch, Scikit-Learn)
  • Experience with NLP libraries such as spaCy, Hugging Face Transformers, etc.
  • Solid understanding of AI/ML algorithms, data preprocessing, and model evaluation techniques
  • Hands-on experience with Generative AI, LLMs, and Agentic AI
  • Working knowledge of MLOps tools and CI/CD pipelines for AI model deployment
  • Familiarity with computer vision frameworks (OpenCV, etc.)
  • Excellent problem-solving and communication skills
  • Ability to lead and mentor junior engineers
Job Responsibility:
  • Lead the design and implementation of AI/ML models and solutions for complex business problems
  • Work on NLP, LLMs, and Generative AI to build intelligent systems and conversational agents
  • Develop and optimize models using Python, TensorFlow, PyTorch, and Scikit-Learn
  • Apply deep learning and transformer-based architectures (e.g., BERT, GPT, etc.) for NLP and vision tasks
  • Implement computer vision solutions using OpenCV and related tools
  • Collaborate with cross-functional teams to integrate AI models into production systems
  • Apply MLOps best practices and manage CI/CD pipelines for model deployment
  • Stay updated with the latest AI research, LLM, and Agentic AI trends, and drive innovation across teams
Employment Type:
Full-time

AI/ML Engineer - Public Sector

Unstructured is seeking an AI/Machine Learning Engineer to join our Public Secto...
Location:
United States
Salary:
Not provided
Unstructured
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field. Master’s or PhD a plus
  • 4+ years of experience in AI/ML engineering, MLOps, systems architecture, or similar technical roles
  • 2+ years of experience working with government networks and security requirements
  • An understanding of government security frameworks (FedRAMP, NIST 800-53, FISMA, DISA SRG) and how they apply to ML workloads
  • History of leading or delivering high-impact ML initiatives in enterprise or government environments
  • Preference for candidates with articulable experience assessing the performance of alternative models, architectures, and implementation strategies
  • A commitment to meeting the demanding engineering standards required to support national security and defense clients
  • A strong interest in being at the forefront of the AI revolution
  • Active TS clearance required for the role, plus the ability to travel
  • Familiar with AWS, Azure, and/or GCP services for ML workloads
Job Responsibility:
  • Develop evaluation and assessment tools and frameworks to measure newly developed models for performance against key metrics across a wide domain of tasks and knowledge sets
  • Identify, propose, and implement modifications of existing models and model implementation frameworks to optimize for new tasks
  • Lead conceptualization of both traditional and agentic implementation strategies for cloud and on-premises model deployments within broader system architectures
  • Lead and optimize distributed ML workloads on multiple government cloud and non-cloud infrastructures
  • Align AI/ML deployments with FedRAMP, NIST 800-53, FISMA, and DISA SRG, maintaining strict security standards
  • Create reference architectures and deployment patterns to streamline ML adoption across government agencies
  • Translate mission objectives into ML-focused technical specifications and project plans
  • Apply advanced security controls and zero-trust architectures to protect ML pipelines and data
  • Continuously assess ML workloads for performance, cost, and security improvements, driving ongoing refinement
What we offer:
  • Competitive compensation, equity, and benefits
Employment Type:
Full-time

Senior Engineering Manager - AI/ML

As the Senior Engineering Manager, you will lead by being a highly technical lea...
Location:
United States
Salary:
Not provided
Aledade, Inc.
Expiration Date:
Until further notice
Requirements:
  • BS/BTech (or higher) in Computer Science, Engineering or a related field required
  • 10+ years of production-level experience as an engineer and technical lead building highly scalable and reliable software
  • 5+ years of managerial experience building and leading technical engineering teams
  • 7+ years of experience in machine learning related technologies, with a strong preference for Python
  • Extensive experience in designing and implementing secure, scalable, and maintainable AI/ML platform architectures
  • Proficiency in distributed systems, microservices, containerization technologies (e.g., Docker, Kubernetes), model training infrastructure, orchestration tools, and MLOps principles
  • Sitting for prolonged periods of time
  • Extensive use of computers and keyboard
  • Occasional walking and lifting may be required
Job Responsibility:
  • Build a high performing team by hiring and nurturing engineering talent
  • Provide strong technical leadership: drive technical solutioning and build roadmaps
  • Set aggressive and clear goals and remove all roadblocks for the team to achieve them
  • Work seamlessly and collaboratively with stakeholders across Aledade to achieve business outcomes
  • Work closely with engineering leaders to drive engineering excellence in our processes and systems
Employment Type:
Full-time

AI/ML Engineer

Hewlett Packard Enterprise is seeking an AI/ML Engineer who will design, develop...
Location:
India, Hyderabad
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or master's degree in computer science, engineering, information systems, or closely related quantitative discipline
  • Typically 7-10 years' experience
  • Strong programming skills in Python, preferably with familiarity with Golang
  • Understanding of microservice architectures and how they can be built in a containerized, Kubernetes-managed environment
  • Experience designing and integrating software systems running on multiple platform types into the overall architecture
  • Experience evaluating forms and processes for software systems testing and methodology, including writing and executing test plans, debugging, and testing scripts and tools
  • Excellent written and verbal communication skills, with the ability to effectively communicate product architectures and design proposals at senior management levels
Job Responsibility:
  • Experiment, design, develop and maintain machine learning models and pipelines with a high potential for value and scale
  • Collaborate with other ML engineers, data scientists, product managers, and other engineers to ensure successful implementation of machine learning solutions
  • Perform research and testing to develop or customize machine learning algorithms
  • Conduct model training and evaluation as needed
  • Integrate, test, tune, and monitor the solutions developed
  • Research and evaluate new technologies and tools for machine learning
  • Maintain and update existing machine learning systems
  • Develop, productionize, and operate machine learning models and pipelines at scale in a hands-on capacity, including both batch and real-time use cases
  • Work with large-scale structured and unstructured data, and build and continuously improve cutting-edge machine learning models
  • Lead project teams for design and development of complex products and platforms, including solution design, analysis, coding, testing, and integration for building efficient, scalable, and robust cloud subsystems
What we offer:
  • Comprehensive suite of health and wellbeing benefits that supports physical, financial, and emotional wellbeing
  • Specific programs catered to career goals for personal and professional development
  • Unconditionally inclusive work environment where varied backgrounds are valued
Employment Type:
Full-time