Software Engineer - Reliability Job at Linux Recruit (North West)

Software Engineer - Reliability

We are looking for a hands-on, first-principles engineer who is fluent in Linux,...

Location

United States, Palo Alto

Salary:

170000.00 - 360000.00 USD / Year

Luma AI

Expiration Date

Until further notice

Requirements

8+ years of experience as an SRE, production engineer, or infrastructure engineer in a fast-paced, large-scale environment
Deep, hands-on expertise in Linux, containerized systems, and debugging low-level system performance
Strong experience with providers like AWS or OCI
Thrive on solving complex, low-level problems where hardware and software intersect
Energetic and thrive in a less structured, fast-paced environment
Working knowledge of security best practices and familiarity with compliance frameworks, such as SOC 2 and ISO
Practical experience with InfiniBand, RDMA, or RoCE and an understanding of how to optimize throughput for massive distributed training jobs

Job Responsibility

Architect for Reliability & Scale: Participate in critical re-architecture sessions to redesign our systems for higher efficiency and scale
Own Multi-Cloud GPU Clusters: Take end-to-end ownership of our production clusters for training and inference across AWS and OCI, ensuring high availability and peak performance
Drive Security & Compliance: Assist in achieving and maintaining security certifications (SOC 2 Type 1 & 2, ISO standards) by implementing robust infrastructure security practices
Deep Linux Performance Tuning: Use your mastery of Linux systems to troubleshoot and optimize performance at the OS and kernel level
Build Robust Automation: Write high-quality tools and automation in Python, Go, or Bash to manage, monitor, and heal our infrastructure
Debug Complex Hardware/Software Failures: Serve as the final escalation point for the most challenging GPU, networking (InfiniBand/RDMA), and system-level issues

Fulltime

Senior Software Engineer / Principal Software Engineer - Copilot CLI

Within GitHub and Microsoft CoreAI, the Copilot CLI team builds GitHub's coding ...

Location

United States, Redmond

Salary:

119800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years

Job Responsibility

Take ownership of critical product and platform areas of the Copilot CLI and shared agent runtime
Set a high technical and quality bar for agentic systems and developer-facing tooling
Design and ship performant, reliable terminal experiences that developers depend on for daily work
Use data, benchmarks, and direct user feedback to guide iteration and investment
Collaborate across org boundaries to enable other teams to build agentic products on top of a shared foundation
Influence architecture, technical direction, and engineering standards beyond your immediate team

What we offer

Certain roles may be eligible for benefits and other compensation

Fulltime

Backend Software Engineer / Senior Software Engineer - Kusto

Are you excited by the challenge of redefining how people explore and analyze ma...

Location

Israel, Tel Aviv, Herzliya

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

4+ years of technical engineering experience with coding in languages including, but not limited to, C#, Python or Java
2+ years building and running services in a cloud environment (Azure, AWS, or GCP)
Experience in designing and operating large-scale distributed systems with high availability and reliability

Job Responsibility

Design, develop, and improve cloud-native services that are scalable, secure, and easy to operate
Drive architectural decisions and lead the development of major components in a distributed, high-SLA system
Collaborate with cross-functional teams in ILDC and abroad to deliver end-to-end solutions
Conduct code and design reviews and mentor junior engineers to grow technical excellence across the team
Help shape the future of real-time analytics in Microsoft Fabric RTI, with customer impact as your north star

Fulltime

Senior Software Engineer and Software Engineer II

OneDrive and SharePoint are rapidly growing services at the center of Microsoft'...

Location

United States, Redmond

Salary:

100600.00 - 199000.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Experience related to cloud-scale distributed design and patterns
The ability to deliver informed designs and plans ahead of production and execution
Knowledge of others' expertise and the ability to involve multiple players (within and outside the organization) in the creation or development of novel products, processes, or research streams

Job Responsibility

Design and deliver systems that enable partners and ISVs to migrate from other cloud providers, improve core systems performance and efficiencies, and ensure zero customer impact throughout the change management cycle
Deliver systems to meet our business continuity planning goals, provide telemetry for optimizing the service, and drive down our response time for detecting and resolving service issues
Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI)
Contribute to the identification of dependencies and the development of design documents for a product area with little oversight
Help identify other teams and technologies that will be leveraged, how they will interact, and when one's system may provide support to others
Contribute to determining back-end dependencies associated with product, application, service, or platform functionality for product features
Understand downstream effects of solutions and work provided
Help identify areas of dependency and overlap with other teams or team members and drive coordination
Remain current in skills by investing time and effort into staying abreast of current developments that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale
Review work items to deepen knowledge of product features in partnership with appropriate stakeholders (e.g., project managers) and execute project plans, release plans, and work items

Fulltime

Software Engineer II/Sr. Software Engineer

Join Microsoft’s Core AI team and help shape the future of intelligent software ...

Location

United States, Redmond

Salary:

100600.00 - 199000.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Job Responsibility

Design and ship AI-assisted features in Visual Studio that help developers generate, explain, and refactor code—measured by adoption, reliability, and user satisfaction
Bring intelligence into the IDE by integrating GitHub Copilot/MCP tools into core IDE workflows with strong attention to performance, privacy, and safety-by-default
Collaborate with partner teams across Microsoft and GitHub to deliver secure, performant solutions and iterate quickly based on real developer feedback
Contribute to designs (APIs, data flows, extensibility points) and participate in code/design reviews to maintain quality and scalability for a large codebase
Instrument and learn using telemetry, experimentation, and diagnostics to improve latency, reliability, and relevance over time

Fulltime

Software engineer 2 / Senior Software engineer - Azure Data

Microsoft's Azure Data engineering team is leading the transformation of analyti...

Location

India, Bangalore

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Experience with the Azure stack including Storage, Compute, Networking, Fabric, Purview, Synapse, AKS, DevOps, Data Factory, or Power BI
Experience with big data technologies such as Spark, Kafka, Hadoop, or HBase
Experience building data lake or data engineering products, tools, or pipelines
Familiarity with container-based architectures (Docker, Kubernetes)
Ability to debug complex distributed systems on Linux and/or Windows platforms

Job Responsibility

Write extensible, maintainable code in C#, Java, Scala, or Python for Fabric Materialized Lake View services and HDInsight components
Use AI tools and coding best practices across the development lifecycle
Design data refresh, scheduling, and query optimisation features with minimal supervision
Review code from teammates for correctness, test coverage, security risks, and adherence to team standards
Coach junior engineers through code reviews
Debug complex issues in distributed systems running on Azure, Linux, and Windows
Run live site operations on a rotational, on-call basis
Integrate logging and instrumentation to gather telemetry on system health, performance, reliability, and security
Work with product managers, technical leads, and partners across geographies to define customer requirements for Materialized Lake View features

Fulltime

Software Engineer II & Senior Software Engineer

Security represents the most critical priorities for our customers in a world aw...

Location

United States, Redmond

Salary:

100600.00 - 199000.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including C, C++, C#, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check
Master's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 5+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Experience troubleshooting and optimizing automation, reliability, and monitoring for Live Site as part of an on-call rotation owned by the engineering team
Experience with distributed systems, messaging systems such as Kafka, and large-scale system design

Job Responsibility

Lead the architecture, design and implementation of services for extremely high scale, throughput, durability, and low latency
Innovate to make service deployment and maintenance an efficient, well-oiled machine that provides excellent reliability with minimal manual engineer intervention
Conduct in-depth triage, troubleshooting, and forensics across all facets of the cloud stack while executing processes for corrective action and continual service improvement
Drive infrastructure security improvements for mission-critical, high-scale workloads
Lead the definition of requirements, KPIs, priorities and planning of engineering deliverables
Mentor and grow an energetic, diverse, and driven team with a good mix of senior and mid-level engineers

Fulltime

Senior Software Engineer and Principal Software Engineer - Power Point AI Team

The PowerPoint team is embarking on an exciting new chapter - evolving a product...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
8+ years of experience in backend service engineering, including work on high-scale infrastructures
Proficiency in one or more systems programming languages such as C#, C++
1+ years of experience in software engineering, designing and developing systems (and APIs) that deploy and integrate with AI models
2+ years of experience working with rich telemetry, making data driven decisions, and carrying out rapid experimentation
2+ years of experience building software for scale, performance, and reliability
Academic or industry experience building, fine-tuning, or deploying models (any category), or building eval-driven systems that use them

Job Responsibility

Lead design and delivery of complex, scalable AI features ensuring resilience and exceptional user experience
Drive technical strategy and architecture decisions across multiple services, influencing partner teams and aligning with compliance and security requirements
Champion modern engineering practices, including AI-driven approaches, automation, and cloud-native patterns, across the full development lifecycle
Mentor and guide engineers, fostering technical excellence and continuous improvement in security, reliability, and performance
Collaborate cross-org to solve challenging technical problems, streamline processes, and reduce operational costs while improving live-site health
Design and implement scalable backend services optimized for machine learning workflows and large language model integration
Develop and maintain evaluation-driven systems that leverage text and multimodal inputs (e.g., images) to power visual-creation experiences
Build and optimize APIs and infrastructure to support high-performance model inference and experimentation at scale
Collaborate with product, ML, and design teams to integrate models into user-facing features, ensuring seamless functionality and performance
Conduct model evaluations and experiments, analyze results, and iterate on improvements to enhance accuracy and user experience

Fulltime
