I'm working to make machines more helpful through unsupervised learning that scales.
I developed the Sparse Transformer with Scott Gray, and also coauthored work showing the emergent capabilities of large language models in a variety of settings (GPT-2, GPT-3, Image GPT, and more).
More recently I've worked on addressing the limitations of those techniques (very deep VAEs) while continuing to apply them on larger supercomputers (MT-NLG and PaLM).
I am half-Japanese, and my Japanese name is 石井興元 (nice to meet you). I live in San Francisco.
- 2022 - Present: Founding team, Member of Technical Staff at Inflection
- 2021 - 2022: Research on large models at Google Brain/Search
- 2018 - 2020: Algorithms team at OpenAI
- 2016 - 2017: Speech team at Andrew Ng's lab at Baidu Research
- 2015 - 2016: Enlitic, a startup focused on applying deep learning to medical imaging
Please see my Google Scholar page (left) for an updated list of publications.