Neil Pagarkar Rathi

Hi. I am an undergraduate at Stanford, advised by Dan Jurafsky, and a safety fellow at Anthropic with Alec Radford.

I work on basic science for AI safety: why misalignments occur, and how we can prevent them during model training.

In a past life, I did computational psycholinguistics. I also consume media and do math outreach.

Selected Publications

You can reach me at lastname [at] stanford [dot] edu.

google scholar / twitter / last.fm