Institution

Northeastern University

A US research university with work spanning AI, systems, and machine learning interpretability.

Interpretability · Northeastern University

Position-Aware Circuit Discovery for Language Models

This work addresses a blind spot in automatic circuit discovery: a model component can matter only at specific token positions, so circuits that treat every position the same can miss real mechanisms.