Xometry is seeking a Senior Data Scientist to join our Generative AI team. The candidate will focus on training and fine-tuning Visual Language Models (VLMs) for multimodal document understanding. The ideal candidate will leverage their expertise in machine learning and computer vision to advance Xometry's capabilities in processing and extracting structured data from complex documents and images. This is a 1-year contract.
Job Responsibilities:
Develop, fine-tune, and evaluate Visual Language Models (VLMs) to enhance document understanding, focusing on multimodal data such as text, images, and technical drawings
Design and implement data preparation, cleaning, and augmentation processes tailored to multimodal model training, ensuring high-quality data pipelines for VLMs
Leverage transfer learning and pre-trained models to accelerate model development and optimize performance on Xometry’s specific data
Use cloud resources (e.g., Amazon Web Services) to scale training and fine-tuning processes for VLMs efficiently
Collaborate with data engineering and machine learning operations (MLOps) teams to deploy VLMs into production and monitor their performance
Interpret model outputs and improve model accuracy and robustness by applying data analysis and visualization tools (such as Python, Jupyter Notebooks, and SQL)
Experiment with and implement state-of-the-art model architectures, continuously optimizing VLM performance in a fast-paced, iterative environment
Work within a team-oriented setting, participating in peer reviews, sharing insights, and contributing to an environment of continuous learning and improvement
Requirements:
A bachelor’s degree is required; an advanced degree (M.S. or PhD) in computer science, data science, machine learning, or a related field is highly preferred
5+ years of experience in data science and machine learning, with expertise in Visual Language Models or multimodal machine learning
Strong experience with machine learning libraries and frameworks such as PyTorch, TensorFlow, or Hugging Face
Proficiency in Python, including libraries like pandas, numpy, and scikit-learn
Solid understanding of deep learning techniques and experience with transfer learning, fine-tuning, and model evaluation
Experience with cloud platforms (e.g., AWS SageMaker) for model training and deployment
Familiarity with data processing and visualization tools (SQL, Jupyter Notebooks, Looker, etc.) and basic database knowledge (e.g., Snowflake, MongoDB)
Excellent analytical and problem-solving skills, with a strong ability to work in an environment that values teamwork, innovation, and continuous learning
Nice to have:
Familiarity with computer vision tasks and frameworks, and experience with multimodal data