Outline How to Apply Attention

Outline how we can gate information in Monty's routing (votes and inputs) to apply attention filters to regions of space.

As we create Monty systems with more LMs, it will become increasingly important to be able to emphasize the representations in certain LMs over others, as a form of "covert" attention. This will complement the current ability to explicitly attend to a point in space through motor actions.

For example, in human children, learning new language concepts benefits significantly from shared attention with adults ("Look at the -"). Combining attention to a point in space (overt attention) with a narrower scope of active representations (covert attention) is likely to be important for efficient associative learning.

Implementation-wise, this will likely consist of a mixture of top-down feedback and lateral competition.
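
As a rough illustration of how these two ingredients could combine, the sketch below computes a gain per LM and uses it to gate votes. It is not Monty's API: the function names (`attention_gains`, `gate_votes`), the per-LM `salience` and `top_down_bias` dictionaries, and the `temperature` and `min_gain` parameters are all hypothetical. Top-down feedback is modeled as an additive bias (for instance, raised for LMs whose inputs fall in the attended region of space), and lateral competition is modeled as a softmax across LMs, which is only one of several possible choices.

```python
import math


def attention_gains(salience, top_down_bias, temperature=1.0):
    """Return a gain per LM that sums to 1 (hypothetical helper).

    salience: dict of LM id -> bottom-up score.
    top_down_bias: dict of LM id -> additive bias from a higher level;
        LMs missing from this dict receive no bias.
    The softmax acts as lateral competition: raising one LM's score
    lowers the gain of all the others.
    """
    scores = {
        lm: (salience[lm] + top_down_bias.get(lm, 0.0)) / temperature
        for lm in salience
    }
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {lm: math.exp(s - peak) for lm, s in scores.items()}
    total = sum(exps.values())
    return {lm: e / total for lm, e in exps.items()}


def gate_votes(votes, gains, min_gain=0.2):
    """Scale each vote's confidence by its LM's gain (hypothetical helper).

    Votes from LMs whose gain falls below min_gain are dropped entirely.
    """
    return {
        lm: confidence * gains[lm]
        for lm, confidence in votes.items()
        if gains.get(lm, 0.0) >= min_gain
    }


# Three LMs with equal bottom-up salience; top-down feedback favors lm_a.
salience = {"lm_a": 1.0, "lm_b": 1.0, "lm_c": 1.0}
top_down_bias = {"lm_a": 2.0}
gains = attention_gains(salience, top_down_bias)

votes = {"lm_a": 0.9, "lm_b": 0.8, "lm_c": 0.7}
gated = gate_votes(votes, gains)
```

In this example only the vote from `lm_a` survives gating, because the top-down bias lets it win the competition. A softer variant could keep every vote and only rescale its confidence (`min_gain=0.0`), which may be preferable if suppressed LMs should still contribute weak evidence.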

