I'm working to make machines more intelligent and helpful to people.
I research simple methods that leverage increases in computational resources. I developed the Sparse Transformer with Scott Gray, and also coauthored work showing the emergent capabilities of scaled-up autoregressive models in a variety of settings (GPT-2, GPT-3, Image GPT, and more).
Recently I've been interested in developing generative models with fewer limitations than autoregressive models, and I have shown that very deep VAEs can potentially scale better to high-dimensional data.
I am half-Japanese, and my Japanese name is 石井興元. よろしくお願いします (pleased to meet you). I lived for a year and a half in Japan and China during my undergraduate studies at Yale. I now live in San Francisco.