Data & Methodology

AI Research Talent Data

How we track the world's top AI researchers: data sources, methodology, and analytical framework.

  • 15K+ researchers tracked
  • 4,622 top-tier researchers (NeurIPS 2024)
  • 8 years of data
  • 75+ countries covered

Identifying Top AI Researchers

Measuring "AI talent" is inherently challenging. We use publication at elite machine learning conferences as our primary filter, focusing on researchers who have published at:

  • NeurIPS (Conference on Neural Information Processing Systems)
  • ICML (International Conference on Machine Learning)
  • ICLR (International Conference on Learning Representations)

These conferences maintain rigorous peer review, with acceptance rates that have typically fallen in the range of roughly 20% to 30%. Publishing at these venues suggests a researcher is operating at the frontier of AI research.
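
As an illustration of this filter, the following is a minimal Python sketch. The three venue names come from the list above; the `Paper` record layout and the `top_researchers` function are assumptions made for this example and are not the tracker's actual schema.

```python
from dataclasses import dataclass

# Venue names are taken from the list above.
ELITE_VENUES = {"NeurIPS", "ICML", "ICLR"}


@dataclass(frozen=True)
class Paper:
    """Hypothetical publication record (layout assumed for illustration)."""
    title: str
    venue: str
    year: int
    authors: tuple


def top_researchers(papers):
    """Return the set of author names with at least one elite-venue paper."""
    return {author
            for paper in papers
            if paper.venue in ELITE_VENUES
            for author in paper.authors}


# Usage with made-up placeholder records.
sample = [
    Paper("Paper A", "NeurIPS", 2024, ("Researcher X", "Researcher Y")),
    Paper("Paper B", "Other Venue", 2023, ("Researcher Z",)),
]
print(sorted(top_researchers(sample)))
```

Matching on an exact venue string is a simplification; real conference metadata often needs name normalization before a comparison like this is reliable.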

Tracking Origin and Location

For each researcher, we collect:

Educational Background

Where did they receive their undergraduate degree? This serves as a proxy for "country of origin" since it reflects where the researcher was trained during their formative years. We source this data from LinkedIn profiles, institutional websites, and academic CVs.

Current Affiliation

Where do they currently work? This is derived from the affiliation listed on their most recent publications. We categorize affiliations as academic (universities, research institutes) or industry (tech companies, startups).
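
The two steps described here (taking the affiliation from the most recent publication, then labeling it academic or industry) can be sketched roughly as follows. The keyword list, the function names, and the `(year, affiliation)` input format are assumptions made for this example, not the tracker's actual rules; a production pipeline would more likely rely on a curated institution lookup.

```python
# Hypothetical keyword heuristic: names containing these words are
# treated as academic; everything else is treated as industry.
ACADEMIC_HINTS = ("university", "institute", "college", "school")


def current_affiliation(publications):
    """Pick the affiliation from the most recent publication.

    `publications` is a non-empty list of (year, affiliation) pairs.
    On ties, the first entry with the latest year is used.
    """
    return max(publications, key=lambda pub: pub[0])[1]


def categorize_affiliation(name):
    """Label an affiliation string as 'academic' or 'industry'."""
    lowered = name.lower()
    if any(hint in lowered for hint in ACADEMIC_HINTS):
        return "academic"
    return "industry"


# Usage with placeholder affiliations.
history = [(2021, "Example University"), (2024, "Example Labs Inc.")]
latest = current_affiliation(history)
print(latest, categorize_affiliation(latest))
```

A keyword match like this will misclassify some organizations (for example, a company with "Institute" in its name), which is why the sketch should be read as a starting point only.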

Graduate Training

Where did they receive their PhD? This helps identify talent pipelines — which universities are training the researchers who go on to lead AI labs and publish breakthrough research.
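
One way to picture the per-researcher record and the pipeline count is the sketch below. The `Researcher` fields mirror the three attributes described in this section, but the field names and the `phd_pipeline` function are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    """Hypothetical record holding the three attributes collected above."""
    name: str
    undergrad_country: str    # proxy for country of origin
    phd_institution: str      # graduate training
    current_affiliation: str  # from the most recent publication


def phd_pipeline(researchers):
    """Count researchers per PhD-granting institution, largest first."""
    return Counter(r.phd_institution for r in researchers).most_common()


# Usage with placeholder records.
people = [
    Researcher("R1", "Country A", "Univ A", "Lab X"),
    Researcher("R2", "Country B", "Univ A", "Univ C"),
    Researcher("R3", "Country A", "Univ B", "Lab X"),
]
print(phd_pipeline(people))
```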

Analytical Framework

Our analysis tracks talent flows across several dimensions:

  • Production vs. Employment: Which countries train AI researchers vs. which countries employ them
  • Academic vs. Industry: Where do top researchers end up — universities or tech companies?
  • Migration patterns: How do researchers move between countries over their careers?
  • Concentration: How clustered is AI talent at specific institutions or regions?
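
Two of the dimensions above, production vs. employment and concentration, lend themselves to a short worked sketch. The function names and the `(origin_country, current_country)` input format are assumptions for this example, and the Herfindahl-style index is one common concentration measure chosen here for illustration; the source does not state which measure the tracker uses.

```python
from collections import Counter


def net_flows(pairs):
    """Net talent flow per country: researchers employed minus trained.

    `pairs` is a list of (origin_country, current_country) tuples.
    Positive values mean a country employs more researchers than it trained.
    """
    trained = Counter(origin for origin, _ in pairs)
    employed = Counter(current for _, current in pairs)
    countries = trained.keys() | employed.keys()
    return {c: employed[c] - trained[c] for c in countries}


def concentration_index(counts):
    """Herfindahl-style index: sum of squared shares.

    Ranges from 1/n (evenly spread over n groups) up to 1.0 (one group).
    """
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())


# Usage with placeholder country codes.
pairs = [("A", "B"), ("A", "A"), ("C", "B"), ("B", "B")]
print(net_flows(pairs))
print(concentration_index(Counter(current for _, current in pairs)))
```

The same `concentration_index` function can be applied to counts per institution or per region instead of per country.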

Data Limitations

Our methodology has important limitations:

  • We only capture researchers who publish at top conferences — this excludes important work in industry labs that isn't published
  • Undergraduate institution is an imperfect proxy for "origin" — some researchers study abroad from an early age
  • Researchers can change affiliations faster than new publications appear, so our location data may lag actual moves
  • We focus on machine learning specifically, which underrepresents AI researchers in robotics, NLP, and other adjacent fields who publish mainly at different venues
