Zensors is the spatial intelligence platform for the physical world. Our AI platform provides real-time insights—from airport queue times to office utilization—helping organizations make smarter operational decisions. Zensors processes massive streams of video data 24/7 with human-level accuracy. To do this at scale, we rely on cutting-edge optimization to ensure our vision transformer and detection models run efficiently on both cloud and edge compute resources.

The AI Infrastructure team at Zensors builds the engine that powers our visual sensing platform. We provide the tools to automate the lifecycle of our AI workflow, including model development, evaluation, optimization, deployment, and monitoring across thousands of video streams. As a Machine Learning Engineer in ML Runtime & Optimization, you will develop technologies to accelerate the training and inference of computer vision models that power smart spaces and cities.
Job Responsibilities:
Optimizing Core ML Pipelines: Identifying key bottlenecks in our current video analytics pipeline and performing in-depth analysis to ensure the best possible performance on current server and edge compute architectures
Cross-Stack Collaboration: Collaborating closely with AI research and platform engineering teams to optimize core parallel algorithms and influence the design of our next-generation inference infrastructure
Model Acceleration: Applying advanced model optimization techniques—such as quantization (Int8/FP16), pruning, and layer fusion—to our Vision Transformers (ViTs) and CNNs to maximize throughput and minimize latency
Building Efficient Operators: Working across the entire ML framework/compiler stack (e.g., PyTorch, CUDA, TensorRT, and NVIDIA DeepStream) to write custom optimized ML operator libraries
Resource Efficiency: Reducing the compute cost per video stream to enable massive scalability of our SaaS product
Data Management: Building, improving, maintaining, and operating systems to facilitate the collection, labeling, and use of visual data for ML training
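To give candidates a concrete sense of the model-acceleration work above, here is a minimal sketch of symmetric Int8 post-training quantization in plain Python. It is illustrative only and not Zensors' actual pipeline; production work of this kind is typically done with frameworks such as TensorRT or PyTorch's quantization toolkit.

```python
def quantize_int8(weights):
    """Map float weights to int8 values using a single symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)     # q = [50, -127, 3, 100]
restored = dequantize(q, scale)       # close to the original weights
```

The core trade-off the role deals with is visible even here: the int8 representation is 4x smaller than float32, at the cost of quantization error bounded by half the scale.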
Requirements:
BS/MS or Ph.D. in Computer Science, Electrical Engineering, or a related discipline
Strong programming skills in C/C++ and Python
Experience with model optimization, quantization, and efficient deep learning techniques (e.g., knowledge distillation, pruning)
Deep understanding of GPU hardware performance, including execution models, thread hierarchy, memory/cache management, and the cost/performance trade-offs of video processing
Experience with profiling and benchmarking tools (e.g., Nsight Systems, Nsight Compute) to validate performance on complex architectures
Experience identifying and resolving compute and data flow bottlenecks, particularly in high-bandwidth video processing pipelines
Strong communication skills and the ability to work cross-functionally between research and infrastructure teams
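As an illustration of the profiling and benchmarking skills listed above, here is a minimal stdlib-only micro-benchmark harness. It is a sketch, not a substitute for the GPU-level tools the role actually requires (Nsight Systems, Nsight Compute); the warmup/median pattern it shows is the same discipline applied at that level.

```python
import time
import statistics

def benchmark(fn, *args, warmup=3, iters=20):
    """Return the median wall-clock latency of fn(*args) in milliseconds.

    Warmup iterations are discarded so caches and any lazy
    initialization do not skew the measured samples; the median
    is reported because it is robust to scheduler noise.
    """
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(samples)

latency_ms = benchmark(sum, range(100_000))
```

Comparing such medians before and after an optimization (e.g., after fusing two pipeline stages) is the basic loop behind the bottleneck-resolution work described in this posting.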
Nice to have:
Familiarity with relational (SQL) and graph database systems (e.g., Neo4j)
Prior work in Computer Vision, Deep Learning, and Vision Transformers
Experience with video processing frameworks such as NVIDIA DeepStream, DALI, or FFmpeg
Familiarity with ML compilers (e.g., TVM, MLIR) or inference engines like TensorRT or ONNX Runtime
Knowledge of distributed training systems or cloud-scale inference serving (e.g., Triton Inference Server)