Acknowledgments
We are deeply grateful to Sandhini Agarwal, Daniela Amodei, Dario Amodei, Tom Brown, Jeff Clune, Steve Dowling, Gretchen Krueger, Brice Menard, Reiichiro Nakano, Aditya Ramesh, Pranav Shyam, Ilya Sutskever, and Martin Wattenberg.
Author Contributions
Gabriel Goh: Research lead. First discovered multimodal neurons, sketched out the project direction and paper outline, and did much of the conceptual and engineering work that allowed the team to investigate the models in a scalable way. This included developing tools for understanding how concepts are built up and decomposed (later applied to emotion neurons), developing zero-shot neuron search (which made neurons easy to discover), and working with Michael Petrov on porting CLIP to Microscope. Subsequently developed faceted feature visualization and text feature visualization.
Chris Olah: Worked with Gabe on the overall framing of the article, actively mentored each member of the team through their work, providing both high- and low-level contributions to their sections, and contributed to the text of much of the article, setting its stylistic tone. He also worked with Gabe on understanding the relevant neuroscience literature. Additionally, he wrote the section on region neurons and developed diversity feature visualization, which Gabe used to create faceted feature visualization.
Alec Radford: Developed CLIP. First observed that CLIP was learning to read. Advised Gabriel Goh on project direction on a weekly basis. Upon the discovery that CLIP was using text to classify images, proposed typographical adversarial attacks as a promising research direction.
Shan Carter: Worked on the initial investigation of CLIP with Gabriel Goh. Created multimodal activation atlases to understand the space and geometry of multimodal representations, and neuron atlases, which informed the arrangement and display of neurons. Provided much useful advice on the visual presentation of ideas, and helped with many aspects of visual design.
Michael Petrov: Worked on the initial investigation of multimodal neurons by implementing and scaling dataset examples. Discovered, with Gabriel Goh, the original “Spider-Man” multimodal neuron in the dataset examples, along with many more multimodal neurons. Assisted extensively with the engineering of Microscope, both early on and at the end, including helping Gabriel Goh with the difficult technical challenges of porting Microscope to a different backend.
Chelsea Voss†: Investigated the typographical attack phenomenon, both via linear probes and zero-shot classification, confirming that the attacks were indeed real and state of the art. Proposed and successfully found “in-the-wild” attacks against the zero-shot classifier. Subsequently wrote the “Typographical Attacks” section. Upon completing this part of the project, investigated the responses of neurons to text rendered from dictionary words. Also assisted with the organization of neurons into neuron cards.
Nick Cammarata†: Drew the connection between multimodal neurons in neural networks and multimodal neurons in the brain, which became the overall framing of the article. Created the conditional probability plots (regional, Trump, mental health), labeling more than 1,500 images; discovered that negative pre-ReLU activations are often interpretable; and discovered that neurons sometimes exhibit a distinct regime change between medium and strong activations. Wrote the identity and emotion sections, building on Gabriel’s discovery of emotion neurons and discovering that “complex” emotions can be broken down into simpler ones. Edited the overall text of the article and built infrastructure allowing the team to collaborate in Markdown with embeddable components.
Ludwig Schubert: Helped with general infrastructure.
† equal contributors
Discussion and Review
Review 1 – Anonymous
Review 2 – Anonymous
Review 3 – Anonymous
Updates and Corrections
If you see mistakes or want to suggest changes, please create an issue on GitHub.
Reuse
Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.
Citation
For attribution in academic contexts, please cite this work as:
Goh, et al., "Multimodal Neurons in Artificial Neural Networks", Distill, 2021.
@article{goh2021multimodal,
  author = {Goh, Gabriel and Cammarata, Nick and Voss, Chelsea and Carter, Shan and Petrov, Michael and Schubert, Ludwig and Radford, Alec and Olah, Chris},
  title = {Multimodal Neurons in Artificial Neural Networks},
  journal = {Distill},
  year = {2021},
  note = {https://distill.pub/2021/multimodal-neurons},
  doi = {10.23915/distill.00030}
}