Getting here: Enter the lobby at 100 University Ave (right next to St Andrew subway station), and message Giles Edkins on the meetup app or call him on 647-823-4865 to be let up to room 6H.
Neural networks, and transformers in particular, exhibit an irritating phenomenon. When we try to interpret individual neurons, we find they are often polysemantic: a single neuron activates in two or more seemingly unrelated contexts.
It’s been theorized that specific linear combinations of neurons, known as “features”, might provide a better map into the workings of a transformer’s mind. Joshua Carpeggiani will guide us through one particular technique for finding these features: sparse autoencoders (SAEs). We’ll see how they help with interpretability!
We welcome a variety of backgrounds, opinions and experience levels.