Inquiry About Neuron Activation Processing #8
Hi, thanks for reaching out! We do indeed normalize activations by their top or bottom quantile; see here for where it's done on the frontend. The quantiles are precomputed and stored in the database. (We're looking for neurons that fire highly relative to their own distribution, not in terms of absolute magnitude.)
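Roughly speaking, the scaling works like this (a minimal sketch, not the actual frontend code; the exact quantile levels and edge-case handling here are assumptions):

```python
def normalize_activation(act: float, q_bottom: float, q_top: float) -> float:
    """Scale a raw activation by the neuron's own precomputed quantiles.

    q_top / q_bottom are the top and bottom quantiles (e.g. the 1 - 1e-4 and
    1e-4 quantiles) of this neuron's activation distribution, computed offline
    over a large corpus and stored in the database. A normalized value near
    +/-1 means the neuron is firing about as strongly as it ever does --
    i.e. high relative to its own distribution, not in absolute magnitude.
    """
    if act >= 0:
        return act / q_top if q_top > 0 else 0.0
    return act / abs(q_bottom) if q_bottom < 0 else 0.0


# Example: a raw activation of 8.0 for a neuron whose top quantile is 10.0
# normalizes to 0.8.
print(normalize_activation(8.0, q_bottom=-4.0, q_top=10.0))
```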
Not quite - each value is the activation of a neuron on the corresponding token in that specific prompt.
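Concretely, for one prompt you can think of each layer's activations as a [num_tokens, num_neurons] matrix. A hypothetical way to capture them (the model name and module path are placeholders for a LLaMA-style architecture, and which tensor counts as the "neuron activation" is an assumption):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

acts = {}  # layer index -> tensor of shape [num_tokens, num_mlp_neurons]

def make_hook(layer_idx):
    def hook(module, inputs, output):
        # Input to down_proj = the post-nonlinearity MLP activations,
        # one row per token of this specific prompt.
        acts[layer_idx] = inputs[0].detach()[0]  # drop the batch dimension
    return hook

handles = [
    layer.mlp.down_proj.register_forward_hook(make_hook(i))
    for i, layer in enumerate(model.model.layers)
]

inputs = tok("The quick brown fox", return_tensors="pt")
with torch.no_grad():
    model(**inputs)
for h in handles:
    h.remove()

# acts[layer][t, n] is the raw activation of neuron n on token t for this prompt;
# normalization by the stored per-neuron quantiles happens afterwards.
```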
@kmeng01 Thank you very much! I understand how to do the normalization now. Let me double-check that my understanding is correct:
And one more question: did you also apply normalization in attribution mode?
I hope this message finds you well. I am reaching out to share my experience using the tool available at https://monitor.transluce.org/dashboard/chat and to seek your assistance regarding some challenges I have encountered during my own implementation.
I have found the neuron search in activation mode on your platform to be extremely helpful. However, when I attempted to replicate it in my own code, I had difficulty obtaining the corresponding neuron activations: despite using the same input text, the activation values I obtained were inconsistent with those produced by your tool.
After looking at the code you have published on GitHub, I noticed that you appear to take the top and bottom 1e-4 quantiles of the activation values (is that correct?).
I implemented the same method in my code, yet I am unable to find the corresponding neurons in my outputs.
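For context, my attempt looks roughly like this (a simplified sketch; the activations below are random stand-ins for the values I actually capture with forward hooks):

```python
import torch

# Stand-ins for my captured per-layer activations, shape [num_tokens, num_neurons]
num_layers, num_tokens, num_neurons = 32, 16, 14336
acts = {i: torch.randn(num_tokens, num_neurons) for i in range(num_layers)}

top5_per_layer = {}
for layer, a in acts.items():
    a = a.float()
    # My reading of the published code: clip at the bottom/top 1e-4 quantiles.
    # Note these quantiles are computed over this single prompt, which may be
    # the wrong reference distribution.
    lo = torch.quantile(a, 1e-4)
    hi = torch.quantile(a, 1 - 1e-4)
    clipped = a.clamp(lo.item(), hi.item())
    # Max activation per neuron over the prompt, then the top 5 neurons.
    per_neuron_max = clipped.max(dim=0).values
    vals, idx = per_neuron_max.topk(5)
    top5_per_layer[layer] = list(zip(idx.tolist(), vals.tolist()))

print(top5_per_layer[num_layers - 1])
```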
One observation I made is that the highest neuron activations in my results tend to concentrate in the last few layers of the model. The top 5 neuron activations per layer are as follows:
This pattern may be indicative of an issue within my code. As such, I would like to inquire whether there are any additional processing steps performed on the neuron activations in your implementation. For instance, do you apply any normalization or other techniques that might affect the activation values?
Your guidance on this matter would be greatly appreciated as it would significantly aid my understanding and facilitate the accurate replication of your results.
Thank you for your time and assistance. I look forward to your response.