Job description

About Prior Labs

Prior Labs is building the next major AI modality shift: foundation models for structured data. While foundation models have already transformed text and images, tables and other structured data remain largely underserved despite being central to clinical research, finance, science, and business decision-making. The company is focused on changing that by developing tabular foundation models that truly understand structured data.

The organization has already made major progress in this area, including pioneering tabular foundation models and establishing itself as a leading name in structured-data machine learning. Its TabPFN v2 model appeared as a Nature cover story and reached a new state of the art for tabular ML. Since launch, the model has scaled by more than 20x, surpassed 3.5M downloads, earned 7,500+ GitHub stars, and gained traction in both research and applied settings, including use cases in lung disease detection, train failure prevention, and clinical-trial decision support.

The company is now focused on the harder stage of the journey: scaling tabular foundation models to millions of rows, thousands of features, real-time inference, and new data modalities, while also building the production infrastructure needed for high-stakes industries. The team is small and highly selective, with more than 30 engineers, researchers, and go-to-market specialists, and includes people with backgrounds from Google, Apple, Amazon, DeepMind, Meta, Microsoft Research, G-Research, Jane Street, Goldman Sachs, and CERN. The work is guided by leading researchers, and the company recently raised €9m in pre-seed funding, creating a strong window for joining at an early growth stage.

Role overview

This is a foundational data science role focused on advancing the core capabilities behind tabular foundation models. The work is split between inventing new frontier tools for TFMs and building the dataset and benchmark foundation that supports them. The position is best suited to someone who enjoys deep technical ownership, open-ended research, and setting the direction of work rather than only executing against a predefined scorecard.

What you will do

Create and develop frontier tools that extend TabPFN, including thinking, scaling, and agentic capabilities, as well as methods that help one model generalize across a broad range of data-science tasks.
Help define the research agenda by selecting which model abilities and benchmarks matter most and deciding which problems are worth solving.
Incorporate outside research findings and real customer needs into model and tooling direction, and produce results that advance the field.
Design reliable benchmarks using structured data from real high-impact problems so model evaluation reflects actual performance rather than leaderboard optimization.
Accurately reproduce baseline systems and competitor models that represent applied data science standards, helping identify where TabPFN is ahead and where further improvement is needed.
Develop an automated, agent-assisted pipeline with human oversight to scale the data and benchmark foundation to much larger volumes while preserving rigor.

What we are looking for

Proven ability to solve data-science problems across many domains and datasets with consistently strong results across a broad set of tasks, not just a single top score.
A practical, non-dogmatic approach to the ML stack, including strong results with gradient-boosted trees such as XGBoost as well as deep learning methods.
Solid understanding of common dataset quality issues such as leakage, label noise, distribution shift, duplication, and mislabeled targets, and how they affect training or evaluation signals.
Genuine interest in foundational work, with as much appreciation for dataset and benchmark infrastructure as for frontier model development.
Comfort operating as a senior individual contributor in an ambiguous, early-stage, low-process environment, with strong judgment on data science best practices and complex trade-offs.

Nice to have

Experience creating or expanding evaluation harnesses, benchmark suites, or experiment frameworks used by others.
Experience building LLM- or agent-assisted workflows with human-in-the-loop oversight to scale manual processes.
Experience connecting external research or customer needs with an internal model or product roadmap.
Prior involvement with tabular data, structured-data modeling, foundation models, or community efforts that helped shape an emerging research area.

Life at Prior Labs

You will be joining a compact, ambitious team working on one of the hardest open problems in AI. The environment emphasizes technical excellence, rigorous thinking, speed, and high craft standards. The company values people who care deeply about the quality and impact of their work and the people they work with.

The organization believes strong collaboration matters for a challenge like this, so most roles are based in one of the offices. Teams are being built in Berlin, Freiburg, and New York. In exceptional cases, remote arrangements are possible, but they generally involve frequent travel to one of the offices, and the full company gathers regularly for offsites to plan, build, and celebrate together.

Commitments and hiring approach

Prior Labs welcomes applicants from all backgrounds and identities, including people who may not meet every listed criterion. The company is committed to equal opportunity and to maintaining a safe, inclusive environment regardless of gender, sexual orientation, origin, disability, or other personal characteristics.

The company also places importance on data privacy during hiring and provides a Recruiting Privacy Notice explaining what information is collected, why it is collected, and how long it is retained.

Additional information

This role is for full-time employment. The location listed is Freiburg, Baden-Württemberg, Germany, with remote work possible in exceptional cases. The job is based around deep research and data-science infrastructure work for tabular foundation models.

Research Scientist, Foundational Data Science