We're Blue River, a team of innovators driven to create intelligent machinery that solves monumental problems for our customers. We empower our customers – farmers, construction crews, and foresters – to implement safer and more sustainable solutions, driving increased profitability with less reliance on scarce labor. We believe that focusing on the small stuff – pixel-by-pixel and task-by-task – leads to big gains. Blue River Technology is based in Santa Clara, CA.
Job Responsibilities
Define, curate, and manage datasets of images, sensor data, and scenarios that are designed to increase the trust and safety of autonomy
Work closely with data engineers and field data capture technicians to mine fleet data and identify open needs
Define frameworks for cataloging and searching scenario-based data to serve multiple stakeholders, including computer vision and robotics teams
Monitor, investigate, and fix data ingestion issues related to dataset curation for training and testing computer vision algorithms
Investigate data quality and actively participate in conceptualizing and developing short and long-term solutions
Provide data and infrastructure support to internal teams
Provide guidance to improve the stability, security, efficiency, and scalability of image data pipelines
Improve code quality through writing unit tests, automation, and performing code reviews
Examine the correlation between customer experience and virtual performance in like scenarios; adjust as needed
Ensure that defined safety and productivity test cases are adequately covered with curated scenarios
Requirements
Master's degree in Math, Physics, Data Science, or related field plus 5 years of related experience
Implement and deploy computer vision and machine learning-based data pipeline systems using semantic segmentation, image & video classification, object detection, supervised, and unsupervised learning (5 yrs)
Experience working with data engineers, data scientists, software engineers, and field staff through the lifecycle of developing and deploying a machine learning system (4 yrs)
Perform non-parametric statistical tests and analysis on large image-based data sets using sklearn, scikit-image, scipy, and OpenCV (3 yrs)
Write technical documentation, tutorials, and summaries to train data collection teams and conduct on-site training (3 yrs)
Deploy scalable cloud-based solutions to mine, preprocess, resize, crop, rectify, and filter image-based data sets (5 yrs)
Implement code using Python libraries, including NumPy, SciPy, OpenCV, Pandas, Seaborn, Matplotlib, CUDA, PyTorch, and TensorFlow (5 yrs)
Design, implement, debug, and deploy stereo image-based data pipelines using Apache TeamCity, AWS Airflow, Redis, Google AppSheet, Databricks data tables, Celery, and advanced search solutions on Labelbox with open-source models such as CLIP and BLIP (6 mos)
Design, build, and debug custom Python pipelines using Python functools for processing large image datasets, and deploy these pipelines using Docker and Docker Compose (1 yr)
Use statistical sampling algorithms to design efficient data collection methods for large stereo camera-based image datasets and coordinate data collection (6 mos)
10% domestic travel required
What we offer
Eligibility for Blue River's bonus and benefit programs