Senior Software Engineer, Cloud Platform

United States, San Francisco · 150000.00 - 240000.00 USD / Year · Job Posted December 07, 2025

Job Description

As a Senior Software Engineer, Cloud Platform at Chef Robotics, you'll be responsible for designing, implementing, and maintaining the cloud infrastructure and deployment systems that power our robotics platform. You'll focus on provisioning robots for seamless deployment, enabling remote software updates to enhance performance and reliability, and developing cloud infrastructure that supports real-time robotics operations. This role requires expertise in cloud platforms, containerization, and infrastructure automation to ensure scalable and reliable deployment of our robotics systems across customer environments.

Job Responsibility

  • Design and implement cloud infrastructure to support robotics platform deployment and operations
  • Provision robots for seamless deployment across diverse customer environments
  • Enable remote software updates to enhance performance and reliability of deployed systems
  • Implement containerization (Docker) and orchestration (Kubernetes) for scalable deployments
  • Manage cloud infrastructure across AWS, GCP, or Azure platforms
  • Improve the performance and reliability of cloud services supporting the Chef system
  • Implement fault-tolerant design patterns to ensure reliability in production environments
  • Establish performance benchmarks and optimize systems to meet latency requirements for robotics operations
  • Implement comprehensive logging, monitoring, and alerting for cloud infrastructure
  • Create diagnostic tools and dashboards for operational visibility
  • Implement CI/CD practices and infrastructure-as-code principles
  • Develop automated deployment pipelines for robotics software and services
  • Create and maintain deployment scripts and configuration management systems
  • Establish version control and rollback mechanisms for system updates
  • Develop tools for consistent environment provisioning and management
  • Maintain secure and efficient data pipelines between on-device and cloud services
  • Implement systems to support real-time communication between robotics systems and cloud infrastructure
  • Design cloud architecture that supports telemetry data collection and processing
  • Ensure secure authentication and authorization mechanisms across cloud services

Requirements

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
  • 5+ years of professional experience in cloud infrastructure and DevOps roles
  • Expert knowledge of cloud infrastructure and deployment (AWS, GCP, or Azure)
  • Strong proficiency with containerization (Docker) and orchestration (Kubernetes) technologies
  • Extensive experience with CI/CD practices and infrastructure-as-code principles
  • Experience with system monitoring, logging, and performance optimization
  • Understanding of secure data pipeline design and implementation
  • Understanding of infrastructure requirements for robotics or automation systems
  • Experience with real-time or near-real-time systems and cloud architecture
  • Background in developing reliable systems with high availability requirements
  • Knowledge of deployment automation and configuration management
  • Familiarity with system performance optimization including latency and scalability
  • Strong problem-solving skills with a systematic approach to infrastructure challenges
  • Excellence in technical communication and documentation
  • Proactive mindset in identifying potential issues and implementing solutions
  • Comfort with working in a fast-paced startup environment
  • Passion for robotics and automation technology
  • Collaborative approach to cross-functional engineering teams

Nice to have

  • Experience with robotics hardware deployment and management
  • Knowledge of ROS (Robot Operating System) or similar frameworks
  • Experience with time-series databases for telemetry data
  • Familiarity with message queue systems for distributed systems
  • Background in manufacturing, food production, or industrial automation
  • Experience with WebSockets for real-time communication
  • Knowledge of Redis for caching and distributed system patterns

What we offer

  • medical insurance
  • dental insurance
  • vision insurance
  • commuter benefits
  • flexible paid time off (PTO)
  • catered lunch
  • 401(k) matching
  • early-stage equity

Similar Jobs for Senior Software Engineer, Cloud Platform

8 matching positions

Software Engineer II or Senior Software Engineer - Simulation Platform

The AI Frameworks team at Microsoft develops AI software that enables running AI...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C++, C, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility
  • Developing a hardware simulator for next-generation AI chips
  • Technical contribution to design, implementation, verification, and documentation of code, ensuring on-time deliveries of simulator releases used daily by partner teams (C++ and Python)
  • Collaborate broadly across multiple disciplines and with various partner teams, from hardware designers to AI model developers
  • Identify requirements, scope solutions, estimate work, schedule deliverables
Employment type: Full-time

Software Engineer II and Senior Software Engineer - Microsoft Security - Platform Team

We have multiple positions open for Software Engineers and Senior Software Engin...
Location: Israel, Tel Aviv, Herzliya
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • B.Sc. or M.Sc. in computer science, software engineering, or equivalent experience
  • 3+ years of professional hands-on software development experience, primarily focused on developing and designing backend services in cloud or on-premises environments
  • Experience working with Kubernetes and Containers
  • Experience in working with cloud infrastructure and services
Job Responsibility
  • Contribute to business-critical initiatives in Microsoft Security
  • Apply deep technical skills and adapt quickly to new areas
  • Improve the end-to-end lifecycle of services
  • Analyze complex system behavior, and apply modern engineering practices to streamline deployments and reduce costs
  • Working on high-end technologies and collaborating across disciplines to deliver impactful features
  • Collaborate with multiple teams across Microsoft to deliver key customer solutions and support technology
Employment type: Full-time

Senior Cloud Platform Software Engineer

We are seeking a Senior Cloud Platform Software Engineer to join our team and be...
Location: United Kingdom, London
Salary: Not provided
Company: Zenobē (zenobe.com)
Expiration Date: Until further notice
Requirements
  • Strong hands-on experience with AWS services such as EC2, S3, IAM, RDS, Control Tower etc.
  • Hands-on experience and daily management of Kafka
  • A working knowledge of Kubernetes
  • Proficiency in Terraform for managing cloud infrastructure at scale
  • Familiarity with monitoring/logging tools (e.g., Prometheus, Grafana, ELK, CloudWatch)
  • Strong automation skills (e.g., Ansible, GitHub Actions) for reliability and operational tasks
  • Solid understanding and practical experience with GitOps principles and tools, CI/CD pipelines and DevOps best practices
  • Proficient with version control using Git and collaboration via Git-based workflows
  • Excellent communication skills, able to present technical information clearly to non-technical stakeholders
  • Experience mentoring junior engineers and leading others by example
Job Responsibility
  • Designing, implementing, and managing scalable, secure, and highly available cloud infrastructure
  • Help develop our AWS cloud architecture using automation and DevOps practices
  • Collaborating closely with development teams to troubleshoot complex issues, optimise performance, and enforce compliance with industry standards
  • Evaluating emerging cloud technologies to align with business goals and drive innovation
  • Mentoring other engineers, helping your team grow, and taking on some team and project leadership activities
  • Being a go-to person when another team or another Cloud team member is facing an unknown issue with a production or pre-production workload
  • Planning, leading and executing on our ideas for a more reliable and scalable usage of AWS
  • Collaborate across teams to deliver scalable, real-time and batch data pipelines that support our products and analytics
  • Support and mentor teammates, sharing knowledge and reviewing designs and code
  • Contribute to the architecture and evolution of our data platform
What we offer
  • Up to 33% annual bonus
  • 25 days holiday, increasing with length of service up to 30 days, plus bank holidays
  • Private Medical Insurance
  • £1,500 training budget per year
  • EV Salary Sacrifice Scheme
  • Pension scheme, up to 8% matched contributions
  • Enhanced parental leave
  • Cash back health plan

Senior Staff Engineer Software (Cloud Platform, Production & Reliability – Machine Identity Security)

The Production Engineering team is responsible for building, scaling, and operat...
Location: United States, Santa Clara
Salary: 126000.00 - 203500.00 USD / Year
Company: Palo Alto Networks (paloaltonetworks.com)
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in DevOps, Platform Engineering, or Site Reliability Engineering (SRE)
  • Strong experience designing and operating cloud infrastructure on AWS, Azure, or GCP
  • Deep expertise managing and scaling Kubernetes environments (EKS, AKS, or GKE)
  • Strong experience with Infrastructure as Code tools (Terraform, Ansible, or Pulumi)
  • Proven experience designing and maintaining complex CI/CD systems (Jenkins, GitLab CI, ArgoCD, GitHub Actions)
  • Strong programming/scripting skills (Python, Go, or similar) for automation and tooling
  • Experience operating in high-scale, 24/7 production environments with ownership of incident response and reliability
  • Solid understanding of Linux systems and networking fundamentals (DNS, TCP/IP, load balancing, VPC, mTLS)
  • Strong problem-solving skills and ability to work across teams
Job Responsibility
  • Design, build, and evolve highly available cloud infrastructure platforms with a focus on scalability, resilience, and reliability
  • Lead improvements across production systems, including performance, availability, and incident response
  • Drive and standardize Infrastructure as Code (IaC) practices to improve consistency and reduce operational overhead
  • Design and optimize CI/CD pipelines to support fast, secure, and reliable software delivery at scale
  • Partner with development teams to improve system reliability, observability, and cloud-native design patterns
  • Define and implement monitoring, alerting, and observability strategies across distributed systems
  • Lead incident response efforts, including root cause analysis and long-term remediation strategies
  • Identify and eliminate operational toil through automation and system improvements
  • Mentor engineers and contribute to raising the bar for production engineering practices
What we offer
  • restricted stock units
  • bonus
Employment type: Full-time

Senior+ Software Engineer - Cloud Availability Platform Engineering (Observability)

We are looking for a highly skilled engineer with deep expertise in building and...
Location: United States, San Francisco
Salary: 166000.00 - 201000.00 USD / Year
Company: Crusoe (crusoe.ai)
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in infrastructure or platform engineering, with a focus on observability and monitoring systems
  • Deep expertise with metrics systems (Prometheus, Thanos, Mimir, Cortex), logging pipelines (Fluent Bit, Vector, Loki, ELK/OpenSearch), and tracing platforms (Jaeger, Tempo, OpenTelemetry)
  • Strong programming skills in Go or Python for automation, operators, and custom integrations
  • Experience running observability platforms on Kubernetes and operating them at scale across multi-datacenter environments
  • Proven ability to design, optimize, and scale telemetry pipelines handling high cardinality and high throughput data
  • Solid understanding of distributed systems, performance engineering, and debugging complex workloads
  • Strong collaboration skills and the ability to influence engineering teams to adopt observability best practices
Job Responsibility
  • Designing and operating scalable observability systems (metrics, logging, tracing) across multi-datacenter Kubernetes environments
  • Architecting end-to-end telemetry pipelines, including ingestion, storage, querying, and visualization
  • Extending monitoring and alerting with Prometheus, Alertmanager, Thanos/Cortex, Grafana, and OpenTelemetry
  • Building scalable log collection and processing pipelines with Fluent Bit, Vector, Loki, or ELK/OpenSearch stacks
  • Implementing distributed tracing platforms (Tempo, Jaeger, OpenTelemetry) and integrating with service meshes, load balancers, and APIs
  • Defining and driving adoption of SLOs, SLIs, and error budgets across services and teams
  • Automating provisioning and scaling of observability infrastructure with Kubernetes, Terraform, and custom tooling (Go, Python)
  • Ensuring reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure)
  • Embedding security best practices into observability platforms, including RBAC, TLS, secret management, and multi-tenant access controls
  • Partnering with engineering teams to embed observability into applications, services, and infrastructure
What we offer
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
Employment type: Full-time

Senior Software Engineer - Together Cloud Platform

About the Role: Together AI is building the AI Acceleration Cloud, an end-to-end...
Location: United States, San Francisco
Salary: 160000.00 - 230000.00 USD / Year
Company: Together AI (together.ai)
Expiration Date: Until further notice
Requirements
  • 5+ years of demonstrated experience in building large scale, fault tolerant, distributed systems and API microservices
  • Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources
  • Excellent communication skills – able to write clear design docs and work effectively with both technical and non-technical team members
  • Demonstrated experience with building and operating high-performance and/or globally distributed microservice architectures across one or more cloud providers (AWS, Azure, GCP)
  • Strong systems knowledge across compute, networking, and storage, including concurrency, memory management, performant I/O, and scale
  • Experience developing against and managing a relational database, such as PostgreSQL
  • Expert-level programmer in one or more programming languages (Golang preferred)
  • Proficiency in version control practices and integrating IaC with CI/CD pipelines
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
Job Responsibility
  • Identify, design, and develop foundational backend services that power Together’s cloud platform
  • Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
  • Partner with product teams to understand functional requirements and deliver solutions that meet business needs
  • Write clear, well-tested, and maintainable software and IaC for both new and existing systems
  • Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
  • Participate in an on-call rotation to address critical incidents when necessary
What we offer
  • competitive compensation
  • startup equity
  • health insurance
  • flexibility in terms of remote work
Employment type: Full-time

Senior Software Engineer II - Cloud Compute Platform

As a Software Engineer on the Compute Platform team, you will be a key technical...
Location: United States
Salary: 197400.00 - 232000.00 USD / Year
Company: Confluent (confluent.io)
Expiration Date: Until further notice
Requirements
  • 8+ years of experience delivering scalable software solutions
  • Proven track record of leading the delivery of large-scale, highly available, low-latency systems
  • Deep expertise in Kubernetes including controller development, operator patterns, and multi-cluster architectures
  • Strong proficiency in Go with experience building production-grade distributed systems
  • Experience with multi-tenant platform architectures and security isolation patterns
  • Familiarity with gRPC, Protobuf, and API design for internal platform services
  • Experience with observability tools and operational excellence practices
  • Experience with multi-cloud environments (AWS, GCP, Azure) and cloud-provider integrations
  • Track record of providing technical leadership and mentorship
  • Track record of working collaboratively across teams including product management, SRE, and other engineering teams
Job Responsibility
  • Drive the overall technical charter for the Compute Platform, including multi-cluster orchestration, workload placement, and security architecture
  • Design and implement platform APIs and Kubernetes operators using Go to support evolving workload requirements
  • Work closely with product management and engineering leadership to build and drive the roadmap for Confluent's Compute Platform, enabling new business opportunities across Confluent
  • Deliver high-impact initiatives in areas such as workload scheduling, disruption management, network isolation, rolling update strategies, and cross-cluster resource management
  • Lead technical design reviews and drive architectural decisions across organizational boundaries
  • Mentor and grow other engineers on the team through code reviews, pairing, and technical guidance
  • Own operational aspects including availability, reliability, performance monitoring, emergency response, and disaster recovery for our global compute infrastructure
What we offer
  • Remote-First Work
  • Robust Insurance Benefits
  • Flexible Time Away
  • The Best Teammates
  • Experience Ambassadors
  • Open and Honest Culture
  • Well-Being and Growth
  • Offers Equity
Employment type: Full-time

Senior Principal Software Engineer (Cloud Infrastructure and Platform Engineering)

Your Career At Palo Alto Networks, Secure Cloud and AI infrastructure is the fou...
Location: United States, Santa Clara
Salary: Not provided
Company: Palo Alto Networks (paloaltonetworks.com)
Expiration Date: Until further notice
Requirements
  • BS, MS, or PhD in Computer Science or a related technical field, or equivalent experience
  • 9+ years of relevant software engineering experience, with a proven track record of technical leadership and innovation
  • Demonstrated experience defining and leading large-scale, cross-organizational technical initiatives from concept to completion
  • Experience building and scaling platforms that serve thousands of engineers in complex environments
  • Strong foundation in application and infrastructure security, including secrets management, supply chain security, and secure-by-default platform design
  • Recognized expertise in developer platforms, cloud-native infrastructure, container orchestration technologies (e.g., Kubernetes), and CI/CD
  • Deep proficiency with a major cloud platform (GCP preferred), including IAM, managed databases, networking, and Workload Identity
  • Experience designing and maintaining Infrastructure as Code (e.g. Terraform) at scale, including module architecture and state management
  • Expertise in authentication/authorization systems: OAuth 2.0, OIDC, token lifecycle management, and zero-trust patterns
  • Hands-on experience applying AI/ML/GenAI to solve complex software engineering problems
Job Responsibility
  • Define the Vision: Architect and own the technical roadmap for AI-enhanced developer tools and infrastructure in CIPE at Palo Alto Networks
  • Evaluate and Execute Solutions: Lead the design and implementation of novel systems that leverage Large Language Models (LLMs), static/dynamic analysis, and machine learning to create a world-class, intelligent developer experience
  • Drive Organization-Wide Impact: You are a builder, so you won't just stop at ideation. Beyond concepts, ensure your builds show step-change improvements in key engineering metrics, including code velocity, review cycle time, test effectiveness, incident reduction, and overall feature launches
  • Lead Cross-Functional Initiatives: Spearhead complex, cross-functional projects that require influencing and aligning multiple engineering organizations and their leadership
  • Enable Secure Innovation: Develop foundational AI platforms that empower teams to prototype, deploy, and scale threat-intelligent cloud features, embedding Palo Alto Networks' security natively
  • Serve as Technical Authority: Act as the go-to expert on AI-augmented cloud platforms, mentoring senior engineers and infusing industry-leading practices into our high-stakes ecosystem
  • Innovate at Enterprise Scale: Address intricate challenges in multi-cloud environments (AWS, Azure, GCP, and OCI) supporting thousands of microservices, secure workloads, and global threat detection pipelines
What we offer
  • restricted stock units
  • bonus
Employment type: Full-time