
About the softmax layer of the model architecture #2

Open
sunshineYin opened this issue Dec 5, 2021 · 3 comments
@sunshineYin

Hello, I am also a researcher in GIScience and human mobility, and I have read your paper on "DeepGravity" published in Nature Communications. It is excellent and innovative work, and I am trying to reproduce it for similar tasks. However, I have some questions about your source code and would be grateful for your help.

In your paper, Figure 1 shows the model architecture: a feed-forward neural network with 15 hidden layers. Each three-tuple (xi, xj, rij) is fed into this same feed-forward network for training, right? The output for each three-tuple is a score sij, and a softmax layer then normalizes these scores into probabilities pij, which indicate the probability of interaction between locations i and j. My question is how this softmax layer is implemented. For each OD-specific three-tuple, the output is a single value, not a vector, so how can softmax be applied to a single value? Moreover, since the softmax computes the probability of going to each destination from a fixed origin, it seems necessary to ensure that all OD samples in a batch fed into the softmax share the same origin; otherwise the normalization would not make sense. I could not find the corresponding implementation in the source code.
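For concreteness, this is the kind of per-origin normalization I have in mind (a minimal, hypothetical sketch with made-up shapes, not your actual code):

```python
import torch

# Hypothetical: scores s_ij produced by the network for all candidate
# destinations j of a single origin i.
scores = torch.randn(128)             # one scalar score per destination
probs = torch.softmax(scores, dim=0)  # p_ij: a distribution over destinations
assert torch.isclose(probs.sum(), torch.tensor(1.0))
```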

Thank you in advance, and I hope to hear from you soon.

@jonpappalord added the help wanted label on Dec 6, 2021
@sunshineYin (Author)

Any reply? Thanks.

@MassimilianoLuca (Member)

Dear sunshineYin, I am sorry for the late reply.
The probabilities of the model are computed within the GLM_multinomialRegression class implemented in the file od_models.py. In particular, the function predict_proba uses a softmax to compute the probability of a vector x passed to the model. Note that everything is invoked from the get_cpc function (in the same file) when we call average_OD_model.
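In simplified form, the idea looks roughly like this (a minimal sketch with made-up dimensions, not the actual od_models.py code):

```python
import torch

class ODModelSketch(torch.nn.Module):
    """Hypothetical, simplified stand-in for GLM_multinomialRegression."""

    def __init__(self, dim_w):
        super().__init__()
        self.linear1 = torch.nn.Linear(dim_w, 1)  # one scalar score per row of x

    def forward(self, vX):
        return self.linear1(vX)                   # shape (n, 1)

    def predict_proba(self, x):
        sm = torch.nn.Softmax(dim=-1)
        # (n, 1) -> (n,), then softmax normalizes over the n rows of x.
        return sm(torch.squeeze(self.forward(x), dim=-1))

model = ODModelSketch(dim_w=8)
x = torch.randn(5, 8)                # 5 candidate destinations, 8 features each
print(model.predict_proba(x).sum())  # sums to ~1: a distribution over the 5 rows
```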

Please let me know if this is clearer and whether there is anything else we can do.

MassimilianoLuca added a commit that referenced this issue Dec 29, 2021
@sunshineYin (Author)

Dear Luca,

Thanks, I have found the code based on your reply, but I still have some questions.

In your predict_proba function in od_models.py, you have:

```python
probs = sm(torch.squeeze(self.forward(x), dim=-1))
```

The input to the softmax comes from self.forward(x), right? But the forward function is a single linear layer:

```python
out = self.linear1(vX)
```

and that layer has output dimension 1:

```python
self.linear1 = torch.nn.Linear(dim_w, 1)
```

So my question is: how can softmax be applied to a single value (1-dim), since it is not a vector? I still don't understand how it works :)
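To make the shapes concrete, here is a small check with hypothetical dimensions; if x is a batch, the squeeze turns the (n, 1) output into an n-vector, so the softmax ends up normalizing across the batch entries:

```python
import torch

linear1 = torch.nn.Linear(4, 1)          # hypothetical dim_w = 4
x = torch.randn(10, 4)                   # a batch of 10 inputs
out = torch.squeeze(linear1(x), dim=-1)  # (10, 1) -> (10,)
probs = torch.nn.Softmax(dim=-1)(out)    # normalizes over the 10 batch entries
print(probs.shape, probs.sum())          # torch.Size([10]), sums to ~1.0
```

If that is the intended behavior, is each batch then supposed to contain exactly the candidate destinations of a single origin? That is the part I could not find in the code.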

Thanks again for your kindness, and I hope to hear from you soon!
