Persona vectors: Monitoring and controlling character traits in language models
A paper from Anthropic describing persona vectors and their applications to monitoring and controlling model behavior