Machine Learning Hardware Architect

United States, Sunnyvale · 212000.00 - 294000.00 USD / Year · Job Posted January 23, 2026

Job Description

In this position, you will work with Machine Learning Hardware Architects, Digital Designers, and Software Engineers to develop custom Machine Learning Hardware accelerators for delivery into multiple SoCs. You will collaborate with a world-class group of researchers and architects to develop and optimize low-power machine learning accelerators and state-of-the-art SoCs.

Job Responsibility

  • Technical lead for ML Hardware engineers, driving design from Architecture through to Product for AR/VR optimized silicon
  • Lead designs to surpass state of the art for metrics such as compute, bandwidth and power consumption
  • Work across disciplines, brainstorm big ideas, work in new technology areas, juggle/coordinate multiple initiatives, drive a concept into a prototype and ultimately guide the transition into a high-volume consumer product
  • Travel both domestically and internationally

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 12+ years of experience as a Hardware Design Engineer or Silicon Architect for production silicon shipped in volume
  • Experience in Machine Learning IPs Silicon development
  • Experience in digital design µArchitecture, RTL coding
  • Experience with methods for partitioning a solution across hardware and software, and evaluating trade-offs among speed, performance, power, and area
  • Results-oriented and proactive, with demonstrated creative and critical thinking

Nice to have

  • Experience in deep learning algorithms and techniques, e.g., convolutional neural networks, transformers, LLMs
  • Experience with SoC Architecture and subsystem Integration
  • Knowledge of industry trends and disruptive technologies
  • Experience with Firmware, DSP coding and optimization
  • Knowledge of Physical Design and Low power implementation
  • Ability to collaborate and/or lead in a team environment
  • Master's or PhD degree in EE/CS or equivalent areas

What we offer

  • bonus
  • equity
  • benefits

Similar Jobs for Machine Learning Hardware Architect

8 matching positions

Machine Learning Performance Modeling Architect

Reality Labs focuses on delivering Meta's vision through On-device AI. The compu...
Location: United States, Sunnyvale
Salary: 178000.00 - 250000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 8+ years of experience in IP/SoC/System performance modeling and workload analysis/optimization for low-power/high-performance accelerators
  • 8+ years of experience with programming languages (C/C++ and Python), script automation and data visualization
  • Experience evaluating architectural tradeoffs in performance, power and image quality
  • Experience employing scientific methods to debug, diagnose and drive the resolution of complex, cross-disciplinary system issues
Job Responsibility
  • Lead power/performance modeling and analysis of machine learning software-hardware components and use cases
  • Capture machine learning workloads from applications and system usages
  • Support all phases of Silicon SoC development
  • Contribute to execution of our silicon technology / machine learning roadmap to make beyond state-of-the-art advances in performance, power consumption and form factor
  • Work across disciplines, build new methodologies, juggle/coordinate multiple initiatives
What we offer
  • bonus
  • equity
  • benefits

Search Machine Learning Research Engineer

Perplexity is seeking an experienced Senior Machine Learning Engineer to help bu...
Location: Germany, Berlin
Salary: Not provided
Company: Perplexity
Expiration Date: Until further notice
Requirements
  • Deep understanding of search and retrieval systems, including quality evaluation principles and metrics
  • Proven track record with large-scale search or recommender systems
  • Strong proficiency with PyTorch, including experience in distributed training techniques and performance optimization for large models
  • Expertise in representation learning, including contrastive learning and embedding space alignment for multilingual and multimodal applications
  • Strong publication record in AI/ML conferences or workshops (e.g., NeurIPS, ICML, ICLR, ACL, CVPR, SIGIR)
  • Self-driven, with a strong sense of ownership and execution
  • Minimum of 3 years (preferably 5+) working on search, recommender systems, or closely related research areas
Job Responsibility
  • Relentlessly push search quality forward — through models, data, tools, or any other leverage available
  • Architect and build core components of the search platform and model stack
  • Design, train, and optimize large-scale deep learning models using frameworks like PyTorch, leveraging distributed training (e.g., PyTorch Distributed, DeepSpeed, FSDP) and hardware acceleration, with a focus on retrieval and ranking models
  • Conduct advanced research in representation learning, including contrastive learning, multilingual, and multimodal modeling for search and retrieval
  • Deploy models — from boosting algorithms to LLMs — in a scalable and performant way
  • Build and optimize RAG pipelines for grounding and answer generation
  • Collaborate with Data, AI, Infrastructure, and Product teams to ensure fast and high-quality delivery
  • Full-time

Principal Machine Learning Engineer

With Prisma AIRS, Palo Alto Networks is building the world's most comprehensive ...
Location: United States, Santa Clara
Salary: 185200.00 - 299475.00 USD / Year
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • BS/MS or Ph.D. in Computer Science, a related technical field, or equivalent practical experience
  • Extensive professional experience in software engineering with a deep focus on MLOps, ML systems, or productionizing machine learning models at scale
  • Expert-level programming skills in Python are required
  • Deep, hands-on experience designing and building large-scale distributed systems on a major cloud platform (GCP, AWS, Azure, or OCI)
  • Proven track record of leading the architecture of complex ML systems and MLOps pipelines using technologies like Kubernetes and Docker
  • Mastery of ML frameworks (TensorFlow, PyTorch) and extensive experience with advanced inference optimization tools (ONNX, TensorRT)
  • A strong understanding of popular model architectures (e.g., Transformers, CNNs, GNNs) is a must
  • Demonstrated expertise with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required
Job Responsibility
  • Lead the architectural design of a highly scalable, low-latency, and resilient ML inference platform capable of serving a diverse range of models for real-time security applications
  • Define technical approaches to less-defined product requirements, ensuring the best fit between product features and technical implementation
  • Explore new product opportunities by maintaining a deep understanding of LLM and Generative AI research trends
  • Provide technical leadership and mentorship to the team, driving best practices in MLOps, software engineering, and system design
  • Drive the strategy for model and system performance, guiding research and implementation of advanced optimization techniques like custom kernels, hardware acceleration, and novel serving frameworks
  • Establish and enforce engineering standards for automated model deployment, robust monitoring, and operational excellence for all production ML systems
  • Act as a key technical liaison to other principal engineers, architects, and product leaders to shape the future of the Prisma AIRS platform and ensure end-to-end system cohesion
  • Tackle the most ambiguous and challenging technical problems in large-scale inference, from mitigating novel security threats to achieving unprecedented performance goals
What we offer
  • restricted stock units
  • bonus
  • employee benefits
  • Full-time

Lead Machine Learning Engineer

Machine Learning Engineers specializing in Inference Optimization focus on maxim...
Location: Singapore, Singapore
Salary: Not provided
Company: Thoughtworks
Expiration Date: Until further notice
Requirements
  • Deep practical expertise in model and runtime optimization techniques (quantization, pruning, distillation, batching, caching)
  • Proven experience optimizing inference workloads using frameworks such as vLLM, NVIDIA Triton/Dynamo
  • Strong proficiency in deep learning frameworks (e.g. PyTorch, TensorFlow) with production deployment experience
  • Ability to diagnose and optimize performance using profiling tools (e.g. Nsight, PyTorch/TensorFlow profilers)
  • Solid understanding of GPU and accelerator architectures, and experience tuning workloads for cost and performance efficiency
  • Experience designing and benchmarking scalable inference systems across heterogeneous environments (GPU clusters, serverless, edge)
  • Familiarity with observability stacks, telemetry, and cost instrumentation for AI workloads
  • Demonstrated ability to lead small-to-medium engineering teams or technical workstreams
  • Skilled at balancing hands-on delivery with architectural oversight and mentorship
  • Strong communication and stakeholder engagement skills, with the ability to connect low-level optimizations with business impact
Job Responsibility
  • Lead the design and implementation of advanced model optimization pipelines, including quantization, pruning, and distillation
  • Architect and tune inference runtimes and serving frameworks to achieve optimal performance across deployments
  • Guide teams in implementing high-throughput serving strategies (continuous batching, KV caching, speculative decoding, asynchronous scheduling)
  • Develop benchmarks and performance dashboards to measure and communicate system-level efficiency improvements (throughput, latency, GPU utilization, cost)
  • Evaluate trade-offs across accuracy, performance, and cost, and design architectures to meet target SLAs across varied hardware environments (cloud, on-prem, edge)
  • Collaborate with infrastructure, MLOps, and product teams to embed inference optimization into production workflows and platform designs
  • Provide technical leadership and mentorship to engineers, fostering a culture of experimentation, rigor, and continuous performance improvement
  • Contribute to the development of internal frameworks, reference architectures, and playbooks for scalable and cost-efficient inference
  • Engage with clients to translate optimization outcomes into business value and articulate the ROI of technical improvements
What we offer
  • Learning & Development
  • Interactive tools
  • Numerous development programs
  • Teammates who want to help you grow
  • Empowering our employees in their career journeys
  • Full-time

Applied Scientist

Amazon Devices is an inventive research and development company that designs and...
Location: United Kingdom, Cambridge
Salary: Not provided
Company: Amazon
Expiration Date: Until further notice
Requirements
  • Master's degree, or a PhD and experience in CS, CE, ML or related field
  • Experience programming in Java, C++, Python or related language
  • Experience in patents or publications at top-tier peer-reviewed conferences or journals
  • Experience in state-of-the-art deep learning model architecture design, training, optimization, and pruning
Job Responsibility
  • Apply and extend compression recipes (knowledge distillation, structured pruning, and post-training and quantization-aware quantization including low-bit and mixed-precision) to assigned models, achieving 20x to 100x compression while preserving model quality
  • Design and run healing recipes (fine-tuning and distillation that recover accuracy lost to compression), iterating on data mixes, objectives, and training settings until the compressed model meets its quality bar
  • Track emerging model architectures and dissect how they work internally, so you can choose where to compress, anticipate where accuracy will break, and design recovery strategies grounded in the model's actual structure
  • Build a library of compression-ready model entries: reference implementations, compression recipes, model cards, and benchmark results that partner teams can run self-service to produce deployment-ready artifacts for edge and cloud targets
  • Define the datasets, benchmarks, and KPIs that matter for your models, and build evaluation methodology that makes accuracy, latency, memory, and cost trade-offs explicit
  • Run fast feasibility gates on new model families and modalities before committing to long efforts, and pivot early when a candidate does not clear the bar
  • Capture platform friction as high-signal feedback: minimal reproductions and tracked fix requests that help platform and compression-science partners root-cause issues, so partner teams never rediscover the same blockers
  • Write reproducible, testable, well-documented code that meets the SDE I bar, so your recipes and results can be reproduced and built on by others
  • Collaborate with Applied Scientists, platform and compiler engineers, hardware architects, and partner teams
  • Mentor interns and help newer teammates ramp up
  • Full-time

Staff Software Engineer – Secondary Driving System

At General Motors, our Embodied AI teams are redefining what’s possible in drive...
Location: United States, Sunnyvale
Salary: 218800.00 - 335300.00 USD / Year
Company: General Motors
Expiration Date: Until further notice
Requirements
  • BS, MS, or PhD in Computer Science, Robotics, Electrical/Mechanical Engineering, or a related field, or equivalent practical experience
  • 8+ years of professional software engineering experience building production systems in robotics, autonomous vehicles, or other complex real‑time/control systems, including significant experience in perception and/or prediction
  • Strong proficiency in modern C++ (e.g., C++14/17 or later) in large, multi‑contributor codebases; experience using Python for tooling, data analysis, and ML experimentation
  • Demonstrated experience leading technical design and delivery of perception, tracking, or prediction systems in real‑time environments, including:
      • Multi‑sensor fusion across camera, radar, and/or lidar (e.g., object‑level fusion, occupancy/freespace fusion, early/late fusion architectures)
      • Classical computer vision and geometric algorithms (feature extraction, multi‑view geometry, stereo, SfM, SLAM/visual odometry)
      • Multi‑object tracking (Kalman/extended/unscented filters, track‑to‑track fusion, track lifecycle management)
      • Motion prediction for road users (analytical kinematic models, maneuver‑based prediction, or learned trajectory forecasting models)
  • Proven track record of delivering reliable, high‑quality robotics or autonomous driving software to production, including testing strategies (simulation, HIL, scenario‑based testing, regression suites)
Job Responsibility
  • Serve as a technical lead for SDS software across multiple components of the stack, setting direction for algorithms, architectures, and system interfaces across features and releases
  • Own the end‑to‑end technical strategy for key SDS behaviors and features, spanning perception/prediction integration, planning, controls, and system‑level interactions
  • Balance hands‑on technical work with cross‑team leadership: you will still design and implement critical components in modern C++, while also guiding other senior and mid‑level engineers to deliver at scale
  • Collaborate closely with experts in perception, tracking, prediction, state estimation, localization, mapping, planning, controls, systems engineering, and safety to deliver robust, fail‑operational behaviors for Super Cruise and future products
  • Define technical vision & architecture
  • Set the technical direction for SDS software components with a focus on correctness, robustness, and predictable runtime behavior under tight latency and compute budgets
  • Architect scalable, modular multi‑sensor perception pipelines for camera, radar, and lidar, including detection, classification, lane/road feature extraction, freespace/occupancy, and environmental context
  • Establish and evolve interfaces and contracts between perception/prediction and upstream/downstream components (state estimation, localization, mapping, planning, controls, autonomy management)
  • Lead high‑impact projects
  • Lead design and delivery of multi‑object tracking systems (e.g., Kalman/extended/unscented filters, IMM, probabilistic data association, track lifecycle management) that provide stable, high‑quality tracks under real‑world noise and edge cases
What we offer
  • medical
  • dental
  • vision
  • Health Savings Account
  • Flexible Spending Accounts
  • retirement savings plan
  • sickness and accident benefits
  • life insurance
  • paid vacation & holidays
  • tuition assistance programs
  • Full-time

Member of Technical Staff, Microsoft Robotics (Software Systems)

Microsoft's Discovery and Quantum (MDQ) division develops and delivers advanced ...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Job Responsibility
  • Architect and implement core platform components, including robotics SDKs, cloud-hosted Application Programming Interfaces (APIs), edge runtimes, and agent orchestration frameworks that enable developers and partners to compose interoperable autonomy capabilities (perception, planning, control, multi-agent coordination) into deployable mission workflows
  • Design the platform's extensibility and integration architecture, defining how first-party autonomy capabilities, first- and third-party models, partner hardware systems, and customer-specific logic are composed, versioned, tested, and deployed across cloud and edge environments
  • Build production-grade data infrastructure spanning the full robotics lifecycle including instrumentation libraries, data acquisition services, human-in-the-loop workflows, dataset versioning and curation pipelines, and data quality governance supporting both real-world and synthetic/simulated data at scale
  • Own cross-cutting platform concerns including authentication and authorization across cloud-edge boundaries, API versioning and backward compatibility, multi-tenant isolation, and performance at the latencies required by real-time robotic control loops
  • Drive the developer experience for the Microsoft Robotics platform, to include defining the Command Line Interface (CLI), SDK patterns, documentation strategy, sample code, and inner-loop development workflow that make it fast and reliable for internal teams and external partners to build on the platform
  • Collaborate with autonomy, simulation, and evaluation teams to ensure that platform primitives (compute orchestration, data routing, model serving, experiment tracking) meet the performance, reliability, and reproducibility requirements of Machine Learning (ML) training, sim-to-real transfer, and online evaluation workloads
  • Lead technical design reviews, write architecture decision records, and establish engineering practices for the platform team, mentoring senior engineers and raising the bar for code quality, testing, and operational readiness across the organization.
What we offer
  • You may be eligible for benefits and other compensation
  • Find additional benefits and pay information at the provided link.
  • Full-time

AI/ML Engineer

AI/ML Engineer | Mangalore / Hybrid India | Full-Time. About the Opportunity: Ab...
Location: India, Mangalore
Salary: Not provided
Company: Abotts
Expiration Date: Until further notice
Requirements
  • BS/MS/PhD in Computer Science, Engineering, or a related field
  • 2+ years of experience designing and deploying AI/GenAI solutions in production environments
  • Proficiency in Python; hands-on experience with TensorFlow, PyTorch, and NLP frameworks
  • Demonstrated experience training large language models and managing compute infrastructure
  • Familiarity with NoSQL and Vector databases
  • Working knowledge of DevOps practices: Git, CI/CD, Linux scripting
  • Experience with GenAI evaluation and testing frameworks
  • Prior experience leading or mentoring engineers
Job Responsibility
  • Architect and execute the roadmap for AI, GenAI, and Agentic AI capabilities
  • Lead and mentor a small engineering team — translating vision into actionable plans with clear ownership
  • Deliver rapid, high-quality implementations in a fast-paced, dynamic startup environment
  • Collaborate cross-functionally with hardware and product teams to accelerate go-to-market timelines
  • Contribute to the company's thought leadership through white papers and patent filings
What we offer
  • Competitive compensation and potentially meaningful early-stage equity (ESOPs)
  • The chance to shape AI strategy at a company redefining an entire industry
  • High-trust, high-accountability culture that rewards initiative
  • Flexible hybrid work environment
  • Full company details shared at first conversation stage
  • Full-time