Lead AI/ML Engineer

Fission Labs

Location:
India, Hyderabad & Pune

Contract Type:
Not provided

Salary:

Not provided

Job Responsibility:

  • Design, implement, and optimize end-to-end ML training workflows including infrastructure setup, orchestration, fine-tuning, deployment, and monitoring
  • Evaluate and integrate multi-cloud and single-cloud training options across AWS and other major platforms
  • Lead cluster configuration, orchestration design, environment customization, and scaling strategies
  • Compare and recommend hardware options (GPUs, TPUs, accelerators) based on performance, cost, and availability

Requirements:

  • Experience with cloud-based platforms (AWS, Azure), API integrations, and data models
  • Exposure to AI/ML-enabled platforms or decision-intelligence systems
  • Certifications: CBAP / PMI-PBA / Agile BA / SAFe Product Owner / Scrum Master
  • Experience in stakeholder training, change management, or workshop facilitation
  • At least 4-5 years in AI/ML infrastructure and large-scale training environments
  • Expert in AWS cloud services (EC2, S3, EKS, SageMaker, Batch, FSx, etc.) and familiar with Azure, GCP, and hybrid/multi-cloud setups
  • Strong knowledge of AI/ML training frameworks (PyTorch, TensorFlow, Hugging Face, DeepSpeed, Megatron, Ray, etc.)
  • Proven experience with cluster orchestration tools (Kubernetes, Slurm, Ray, SageMaker, Kubeflow)
  • Deep understanding of hardware architectures for AI workloads (NVIDIA, AMD, Intel Habana, TPU)
  • Expert knowledge of inference optimization techniques including speculative decoding, KV cache optimization (MQA/GQA/PagedAttention), and dynamic batching
  • Deep understanding of prefill vs decode phases, memory-bound vs compute-bound operations
  • Experience with quantization methods (INT4/INT8, GPTQ, AWQ) and model parallelism strategies
  • Hands-on experience with production inference engines: vLLM, TensorRT-LLM, DeepSpeed-Inference, or TGI
  • Proficiency with serving frameworks: Triton Inference Server, KServe, or Ray Serve
  • Familiarity with kernel optimization libraries (FlashAttention, xFormers)
  • Proven ability to optimize inference metrics: TTFT (first token latency), ITL (inter-token latency), and throughput
  • Experience profiling and resolving GPU memory bottlenecks and OOM issues
  • Knowledge of hardware-specific optimizations for modern GPU architectures (A100/H100)
  • Drive end-to-end fine-tuning of LLMs, including model selection, dataset preparation/cleaning, tokenization, and evaluation with baseline metrics
  • Configure and execute fine-tuning experiments (LoRA, QLoRA, etc.) on large-scale compute setups, ensuring optimal hyperparameter tuning, logging, and checkpointing
  • Document fine-tuning outcomes by capturing performance metrics (losses, BERT/ROUGE scores, training time, resource utilization) and benchmark against baseline models
What we offer:
  • Opportunities for continuous learning and certification support
  • Collaborative and growth-oriented work culture
  • Competitive compensation and comprehensive benefits
  • Exposure to modern cloud and integration technologies

Additional Information:

Job Posted:
March 04, 2026

Employment Type:
Fulltime

Similar Jobs for Lead AI/ML Engineer

Senior Data & AI/ML Engineer - GCP Specialization Lead

We are on a bold mission to create the best software services offering in the wo...
Location
United States, Menlo Park
Salary:
Not provided
techjays
Expiration Date
Until further notice
Requirements
  • GCP Services: BigQuery, Dataflow, Pub/Sub, Vertex AI
  • ML Engineering: End-to-end ML pipelines using Vertex AI / Kubeflow
  • Programming: Python & SQL
  • MLOps: CI/CD for ML, Model deployment & monitoring
  • Infrastructure-as-Code: Terraform
  • Data Engineering: ETL/ELT, real-time & batch pipelines
  • AI/ML Tools: TensorFlow, scikit-learn, XGBoost
  • Min Experience: 10+ Years
Job Responsibility
  • Design and implement data architectures for real-time and batch pipelines, leveraging GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Vertex AI, and Cloud Storage
  • Lead the development of ML pipelines, from feature engineering to model training and deployment using Vertex AI, AI Platform, and Kubeflow Pipelines
  • Collaborate with data scientists to operationalize ML models and support MLOps practices using Cloud Functions, CI/CD, and Model Registry
  • Define and implement data governance, lineage, monitoring, and quality frameworks
  • Build and document GCP-native solutions and architectures that can be used for case studies and specialization submissions
  • Lead client-facing PoCs or MVPs to showcase AI/ML capabilities using GCP
  • Contribute to building repeatable solution accelerators in Data & AI/ML
  • Work with the leadership team to align with Google Cloud Partner Program metrics
  • Mentor engineers and data scientists toward achieving GCP certifications, especially in Data Engineering and Machine Learning
  • Organize and lead internal GCP AI/ML enablement sessions
What we offer
  • Best in class packages
  • Paid holidays and flexible paid time away
  • Casual dress code & flexible working environment
  • Medical Insurance covering self & family up to 4 lakhs per person

Sales Engineering Lead

EvoluteIQ is seeking a Sales Engineering Lead to drive pre-sales and sales engin...
Location
India, Bengaluru
Salary:
Not provided
EvoluteIQ
Expiration Date
Until further notice
Requirements
  • 8–10 years of experience in pre-sales, solution engineering, or consulting for enterprise software, automation, or AI-driven platforms
  • Hands-on knowledge of process automation, AI/ML, data integration, API orchestration, or low-code/no-code environments
  • Experience collaborating with global channel partners, system integrators, or technology alliances
  • Familiarity with one or more BPM platforms (Appian, Pega, Camunda), LCAP platforms (OutSystems, Mendix), automation stacks (e.g., ServiceNow, UiPath, Power Automate, Blue Prism, MuleSoft), and cloud platforms (AWS, Azure, or GCP), as well as Agentic AI and Generative AI technologies
  • Proven ability to connect technical capabilities with business outcomes and present to C-suite stakeholders
  • Strong communication, executive presentation, and solution storytelling abilities
  • Strategic thinker with the ability to influence joint go-to-market initiatives and co-create customer success outcomes
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • MBA preferred but not mandatory
Job Responsibility
  • Lead all technical pre-sales engagements for new opportunities, from discovery, requirement scoping, and solution design to technical validation, proposal creation and proof-of-concept delivery
  • Collaborate with business, sales, and technical teams to understand customer objectives and expand solution footprints
  • Translate business objectives and challenges into robust technical architectures and solution proposals that clearly articulate value, ROI, and platform differentiation
  • Design and deliver customized product demonstrations, solution architectures, and joint proof-of-concepts highlighting the EIQ platform’s unified automation capabilities
  • Coordinate partner enablement workshops and provide ongoing knowledge transfer to enhance Eguardian’s sales and delivery maturity
  • Serve as a solution advisor by identifying and shaping new Agentic Automation use cases leveraging AI, ML, RPA, and orchestration
  • Act as a technical liaison between EvoluteIQ’s product, engineering, and partner management teams to ensure solution scalability and roadmap alignment
  • Design and present compelling solution demos, proofs of concept (POCs), and architecture blueprints tailored to client industries (banking, healthcare, insurance, telecom)
  • Build reusable demo assets, templates, and solution accelerators to support repeatable GTM success
  • Partner with Delivery teams to ensure a seamless handover from pre-sales to implementation
What we offer
  • Opportunity to shape the strategy of a next-gen hyper-automation platform
  • Work with a cross-disciplinary team in a fast-growing, innovation-driven environment
  • Competitive compensation and growth opportunities
  • A culture of innovation, ownership, and continuous learning
  • Fulltime

Senior AI/ML Engineer

Barbaricum is seeking a highly experienced Senior AI/ML Engineer to support Soft...
Location
United States, Crane
Salary:
Not provided
Barbaricum
Expiration Date
Until further notice
Requirements
  • Active DoD Secret Clearance (Top Secret preferred)
  • Bachelor’s degree in Computer Science, Engineering, or related technical discipline (Master’s preferred)
  • 10+ years of progressive experience in AI/ML engineering, software development, or applied data science
  • Expertise in developing, deploying, and securing AI/ML applications within mission-critical or defense environments
  • Demonstrated experience with LLMs, MLOps pipelines, and modern ML frameworks (e.g., PyTorch, TensorFlow)
  • Strong background in software and cyber engineering principles, including system hardening, secure coding, and vulnerability mitigation
  • Proven ability to lead complex technical efforts, mentor junior engineers, and interface with government stakeholders
  • DoD 8570 Advanced certification (e.g., SecurityX, GCSA, CCSP, or equivalent) must be obtained and maintained
Job Responsibility
  • Partner with project managers and engineering teams to define objectives for AI/ML systems in support of maneuver, surveillance, and engagement missions
  • Develop and prototype AI/ML systems to address mission-specific requirements, including computer vision, sensor fusion, and decision-support applications
  • Conduct rigorous testing and evaluation of AI/ML performance against operational datasets
  • Analyze test data to identify model strengths, weaknesses, and mission relevance
  • Refine and optimize systems to ensure robustness, scalability, and cyber resilience
  • Troubleshoot complex system challenges and provide technical guidance for deployed solutions
  • Deliver comprehensive documentation and technical reports to stakeholders
  • Maintain awareness of emerging AI/ML technologies, software engineering practices, and cyber defense techniques relevant to mission-critical systems

AIML Lead Engineer

We build breakthrough software products that power digital businesses. We are an...
Location
India, Bangalore
Salary:
Not provided
3Pillar Global
Expiration Date
Until further notice
Requirements
  • 8+ years of total IT experience
  • At least 4 years in AI/ML development
  • Strong proficiency in Python and ML frameworks (TensorFlow, PyTorch, Scikit-Learn)
  • Experience with NLP libraries such as spaCy, Hugging Face Transformers, etc.
  • Solid understanding of AI/ML algorithms, data preprocessing, and model evaluation techniques
  • Hands-on experience with Generative AI, LLMs, and Agentic AI
  • Working knowledge of MLOps tools and CI/CD pipelines for AI model deployment
  • Familiarity with computer vision frameworks (OpenCV, etc.)
  • Excellent problem-solving and communication skills
  • Ability to lead and mentor junior engineers
Job Responsibility
  • Lead the design and implementation of AI/ML models and solutions for complex business problems
  • Work on NLP, LLMs, and Generative AI to build intelligent systems and conversational agents
  • Develop and optimize models using Python, TensorFlow, PyTorch, and Scikit-Learn
  • Apply deep learning and transformer-based architectures (e.g., BERT, GPT, etc.) for NLP and vision tasks
  • Implement computer vision solutions using OpenCV and related tools
  • Collaborate with cross-functional teams to integrate AI models into production systems
  • Apply MLOps best practices and manage CI/CD pipelines for model deployment
  • Stay updated with the latest AI research, LLM, and Agentic AI trends, and drive innovation across teams
  • Fulltime

AI/ML Engineer - Public Sector

Unstructured is seeking an AI/Machine Learning Engineer to join our Public Secto...
Location
United States
Salary:
Not provided
Unstructured
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field. Master’s or PhD a plus
  • 4+ years of experience in AI/ML engineering, MLOps, systems architecture, or similar technical roles
  • 2+ years of experience working with government networks and security requirements
  • An understanding of government security frameworks (FedRAMP, NIST 800-53, FISMA, DISA SRG) and how they apply to ML workloads
  • History of leading or delivering high-impact ML initiatives in enterprise or government environments
  • Preference for candidates with demonstrable experience assessing the performance of alternative models, architectures, and implementation strategies
  • A commitment to meeting the demanding engineering standards required to support national security and defense clients
  • A strong interest in being at the forefront of the AI revolution
  • Active TS clearance and the ability to travel are required for the role
  • Familiar with AWS, Azure, and/or GCP services for ML workloads
Job Responsibility
  • Develop evaluation and assessment tools and frameworks to measure newly developed models for performance against key metrics across a wide domain of tasks and knowledge sets
  • Identify, propose, and implement modifications of existing models and model implementation frameworks to optimize for new tasks
  • Lead conceptualization of both traditional and agentic implementation strategies for cloud and on-premises model deployments within broader system architectures
  • Lead and optimize distributed ML workloads on multiple government cloud and non-cloud infrastructures
  • Align AI/ML deployments with FedRAMP, NIST 800-53, FISMA, and DISA SRG, maintaining strict security standards
  • Create reference architectures and deployment patterns to streamline ML adoption across government agencies
  • Translate mission objectives into ML-focused technical specifications and project plans
  • Apply advanced security controls and zero-trust architectures to protect ML pipelines and data
  • Continuously assess ML workloads for performance, cost, and security improvements, driving ongoing refinement
What we offer
  • Competitive compensation, equity, and benefits
  • Fulltime

Senior Engineering Manager- AI/ML

As the Senior Engineering Manager, you will lead by being a highly technical lea...
Location
United States
Salary:
Not provided
Aledade, Inc.
Expiration Date
Until further notice
Requirements
  • BS/BTech (or higher) in Computer Science, Engineering or a related field required
  • 10+ years of production-level experience as an engineer and technical lead building highly scalable and reliable software
  • 5+ years of managerial experience building and leading technical engineering teams
  • 7+ years of experience in machine learning related technologies, with a strong preference for Python
  • Extensive experience in designing and implementing secure, scalable, and maintainable AI/ML platform architectures
  • Proficiency in distributed systems, microservices, containerization technologies (e.g., Docker, Kubernetes), model training infrastructure, orchestration tools, and MLOps principles
  • Sitting for prolonged periods of time
  • Extensive use of computers and keyboard
  • Occasional walking and lifting may be required
Job Responsibility
  • Build a high performing team by hiring and nurturing engineering talent
  • Provide strong technical leadership: drive technical solutioning and build roadmaps
  • Set aggressive and clear goals and remove all roadblocks for the team to achieve them
  • Work seamlessly and collaboratively with stakeholders across Aledade to achieve business outcomes
  • Work closely with engineering leaders to drive engineering excellence in our processes and systems
  • Fulltime

AI/ML Engineer

Hewlett Packard Enterprise is seeking an AI/ML Engineer who will design, develop...
Location
India, Hyderabad
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • Bachelor's or master's degree in computer science, engineering, information systems, or closely related quantitative discipline
  • typically, 7-10 years’ experience
  • strong programming skills in Python and, preferably, familiarity with Golang
  • understanding microservice architecture and how they can be built in a containerized, Kubernetes-managed environment
  • designing and integrating software systems running on multiple platform types into the overall architecture
  • evaluating forms and processes for software systems testing and methodology, including writing and executing test plans, debugging, and testing scripts and tools
  • excellent written and verbal communication skills with the ability to effectively communicate product architectures and design proposals at senior management levels.
Job Responsibility
  • Experiment, design, develop and maintain machine learning models and pipelines with a high potential for value and scale
  • collaborate with other ML engineers, data scientists, product managers, and other engineers to ensure successful implementation of machine learning solutions
  • perform research and testing to develop or customize machine learning algorithms
  • conduct model training and evaluation as needed
  • integrate, test, tune and monitor the solutions developed
  • research and evaluate new technologies and tools for machine learning
  • maintain and update existing machine learning systems
  • hands-on develop, productionize, and operate machine learning models and pipelines at scale, including both batch and real-time use cases
  • work with large scale structured and unstructured data, build and continuously improve cutting-edge machine learning models
  • lead project teams for design and development of complex products and platforms, including solution design, analysis, coding, testing, and integration for building efficient, scalable, and robust cloud subsystems
What we offer
  • Comprehensive suite of health and wellbeing benefits that supports physical, financial, and emotional wellbeing
  • specific programs catered to career goals for personal and professional development
  • unconditionally inclusive work environment where varied backgrounds are valued.
  • Fulltime

AI/ML Engineer

The Applications Development Intermediate Programmer Analyst is an intermediate ...
Location
India, Chennai
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • 5-9 years of experience with Python, Machine Learning & exposure to Gen AI
  • Knowledge of Java will be an added advantage
  • Proficiency in Python for building machine learning models and developing LLM-based applications in a professional environment
  • Knowledge of Kofax will be a plus
  • SQL skills sufficient to perform data interrogation are a must
  • Professional experience developing Java applications
  • Develop LLM solutions for querying structured data with natural language, including RAG architectures on enterprise knowledge bases
  • Build, scale, and optimize data science workloads, applying best MLOps practices for production
  • Lead the design and development of LLM-based tools to increase data accessibility, focusing on text-to-SQL platforms
  • Train and fine-tune LLM models to accurately interpret natural language queries and generate SQL queries
Job Responsibility
  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements, including using script tools and analyzing/interpreting code
  • Consult with users, clients, and other technology groups on issues, and recommend programming solutions, install, and support customer exposure systems
  • Apply fundamental knowledge of programming languages for design specifications
  • Analyze applications to identify vulnerabilities and security issues, as well as conduct testing and debugging
  • Serve as advisor or coach to new or lower level analysts
  • Identify problems, analyze information, and make evaluative judgements to recommend and implement solutions
  • Resolve issues by identifying and selecting solutions through the applications of acquired technical experience and guided by precedents
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
  • Fulltime