Data Engineer - Pyspark United States, San Francisco Jobs (On-site work)

16 Job Offers

New
Infrastructure Engineer, Data Platform
Own and scale the critical AWS data infrastructure powering analytics, billing, and AI platforms at Together AI. You'll design secure, reliable systems using Terraform, partnering with data and security teams. This San Francisco role requires 5+ years of infrastructure/DevOps expertise with AWS, ...
Location: United States, San Francisco
Salary: 160000.00 - 260000.00 USD / Year
Together AI
Expiration Date: Until further notice
New
Sr. Distinguished Data Engineer
Seeking a Sr. Distinguished Data Engineer to shape our technology trajectory at Capital One. This role requires deep expertise in AWS, data architecture, and engineering with 9+ years of experience. You will act as a visionary and trusted advisor, driving innovation and mentoring next-generation ...
Location: United States, Cambridge; San Francisco; San Jose; McLean; Richmond
Salary: 286200.00 - 392000.00 USD / Year
Capital One
Expiration Date: Until further notice
New
Software Engineer II - Data (Backend)
Location: United States, San Francisco; Sunnyvale
Salary: 171000.00 - 190000.00 USD / Year
Uber
Expiration Date: Until further notice
Senior Data Engineer (Graph)
Seeking a Senior Data Engineer with expertise in graph databases like Neo4j to architect scalable data pipelines in San Francisco. You will leverage Python, Airflow, and AWS to transform complex data structures, supporting our Core Data Platform. This role requires 5+ years of experience building...
Location: United States, San Francisco
Salary: 90.00 - 93.00 USD / Hour
Software Resources
Expiration Date: Until further notice
Backend Engineer, Growth and Data
Join Hebbia's Growth and Data team as a Backend Engineer in New York City or San Francisco. You will architect high-scale backend systems, APIs, and infrastructure using Python/Java/Go and AWS. Build solutions for universal indexing and performance optimization while enjoying top benefits like un...
Location: United States, New York City; San Francisco
Salary: 160000.00 - 300000.00 USD / Year
Hebbia
Expiration Date: Until further notice
Data Engineer
Join as our first Data Engineer in NYC or SF. Architect end-to-end data solutions, build scalable ETL pipelines, and manage our central data lake. We seek 5+ years of experience with Python, SQL, and cloud data stacks. Enjoy unlimited PTO, comprehensive benefits, and a competitive equity package.
Location: United States, New York City; San Francisco
Salary: 190000.00 - 250000.00 USD / Year
Hebbia
Expiration Date: Until further notice
Frontend Engineer, Growth and Data
Join Hebbia's Growth and Data team as a Frontend Engineer in New York or San Francisco. You will build innovative React/TypeScript interfaces to unlock unique customer value and drive platform growth. Collaborate cross-functionally to own product experiences from ideation to launch. Enjoy top ben...
Location: United States, New York City; San Francisco
Salary: 160000.00 - 300000.00 USD / Year
Hebbia
Expiration Date: Until further notice
Machine Learning Engineer - Data Foundation and AI
Join Plaid's Data Foundation & AI team as a Machine Learning Engineer in San Francisco. Design, build, and scale advanced ML/AI systems that power products for millions. You'll need 1-3 years of production ML experience with PyTorch and distributed systems. Enjoy full benefits, equity, and a role...
Location: United States, San Francisco
Salary: 186000.00 - 236400.00 USD / Year
Plaid
Expiration Date: Until further notice
Senior Software Engineer - Data Infrastructure
Join Plaid's Data Infrastructure team in San Francisco as a Senior Software Engineer. You will build and scale core data and ML platforms using Spark, Data Warehouses, and orchestration tools. Lead key projects, mentor others, and enable product innovation. We offer comprehensive benefits, equity...
Location: United States, San Francisco
Salary: 180000.00 - 270000.00 USD / Year
Plaid
Expiration Date: Until further notice
Data Engineer
Join Plaid's Data Engineering team in San Francisco to build robust golden datasets that power data-driven products. You'll leverage SQL, Python, and tools like DBT and Airflow to design pipelines on petabyte-scale data. Enjoy full benefits while solving complex data challenges to create insights...
Location: United States, San Francisco
Salary: 163200.00 - 223200.00 USD / Year
Plaid
Expiration Date: Until further notice
Robotics Data Infrastructure Engineer
Join Verne as a founding Robotics Data Infrastructure Engineer in San Francisco. Architect and deploy critical AWS and edge data pipelines for real-world robots. You'll manage large-scale multimodal datasets and build MLOps tooling, directly impacting production systems. Requires strong Python, A...
Location: United States, San Francisco
Salary: 110000.00 - 175000.00 USD / Year
YC Work at a Startup
Expiration Date: Until further notice
Senior Data Engineer
Join Crusoe Energy as a Senior Data Engineer in San Francisco. Architect the foundational data platform powering AI and cloud operations. We seek expertise in Python, distributed systems, SQL, and data infrastructure. Enjoy competitive pay, equity, comprehensive health benefits, and a 401(k) match.
Location: United States, San Francisco
Salary: Not provided
Crusoe
Expiration Date: Until further notice
Data Engineer
Join Suno's founding team as a Data Engineer in a key US tech hub. Design and scale core data foundations using SQL, Python, and modern tools like Airflow and Snowflake. Enjoy equity, unlimited PTO, and a culture passionate about music and engineering excellence.
Location: United States, Boston, NYC, Los Angeles, San Francisco
Salary: 170000.00 - 240000.00 USD / Year
Suno
Expiration Date: Until further notice
Software Engineer, Data Engine
Join our team in San Francisco as a Software Engineer for the Data Engine. You will build robust systems and tools to collect, process, and manage large-scale robotic training datasets. This role requires expertise in Rust/C++ and involves hands-on work across hardware, software, and data pipelin...
Location: United States, San Francisco
Salary: 120000.00 - 160000.00 USD / Year
YC Work at a Startup
Expiration Date: Until further notice
Data Engineer Co-op Intern
Join Amazon as a Data Engineer Co-op Intern in a full-time, in-office role. Design automated data pipelines, optimize data warehouses, and utilize SQL and Python. This 12-week internship is for students in a US co-op program, with multiple location options across the United States.
Location: United States, Seattle; Bellevue; Redmond; San Francisco; Sunnyvale; Santa Clara; DC; MD; VA; Austin; New York City; Minneapolis
Salary: 101300.00 - 160000.00 USD / Year
Amazon Pforzheim GmbH
Expiration Date: Until further notice
Senior Data Integration Engineer
Join Crusoe Cloud as a Senior Data Integration Engineer. Design scalable ETL/ELT pipelines using Fivetran, Workato, and DBT on GCP. Enable critical data flow for analytics and AI infrastructure from our Sunnyvale or San Francisco offices. Enjoy competitive pay, equity, and comprehensive benefits.
Location: United States, Sunnyvale; San Francisco
Salary: 147000.00 - 178000.00 USD / Year
Crusoe
Expiration Date: Until further notice
Are you a data architect with a passion for building robust, scalable systems? Your search for Data Engineer - PySpark jobs ends here. A Data Engineer specializing in PySpark is a pivotal role in the modern data ecosystem, responsible for constructing the foundational data infrastructure that powers analytics, machine learning, and business intelligence. These professionals are the master builders of the data world, transforming raw, unstructured data into clean, reliable, and accessible information for data scientists, analysts, and business stakeholders. If you are seeking jobs where you can work with cutting-edge big data technologies to solve complex data challenges at scale, this is your domain.

In this profession, typical responsibilities revolve around the entire data pipeline lifecycle. Data Engineers design, develop, test, and maintain large-scale data processing systems. A core part of their daily work involves writing efficient, scalable code using PySpark, the Python API for Apache Spark, to perform complex ETL (Extract, Transform, Load) or ELT processes. They build and orchestrate data pipelines that ingest data from diverse sources (such as databases, APIs, and log files) into data warehouses like Snowflake or data lakes on cloud platforms like AWS, Azure, and GCP. Ensuring data quality and reliability is paramount; they implement robust data validation, monitoring, and observability frameworks to guarantee that data is accurate, timely, and trusted. Furthermore, they are tasked with optimizing the performance and cost of these data systems, fine-tuning Spark jobs for maximum efficiency, and automating deployment processes through CI/CD and Infrastructure as Code (IaC) practices.

To excel in Data Engineer - PySpark jobs, a specific and powerful skill set is required. Mastery of Python and PySpark is non-negotiable, as they are the primary tools for distributed data processing. Profound knowledge of SQL is essential for data manipulation and querying. Experience with workflow orchestration tools like Apache Airflow is a common requirement to manage complex pipeline dependencies. A deep understanding of cloud data solutions (AWS, GCP, Azure) and platforms like Databricks is highly valued. Beyond technical prowess, successful candidates possess strong problem-solving abilities to debug and optimize data flows, a keen eye for system design and architecture, and excellent collaboration skills to work with cross-functional teams, including data scientists and business analysts. They are often expected to mentor junior engineers and contribute to establishing data engineering best practices and standards across an organization.

If you are ready to build the future of data, explore the vast array of Data Engineer - PySpark jobs available and take the next step in your impactful career.
