Skip to content

Latest commit

 

History

History
98 lines (47 loc) · 2.01 KB

README.md

File metadata and controls

98 lines (47 loc) · 2.01 KB

LSTM_Attention

X = Input Sequence of length n.
H = LSTM(X); Note that here the LSTM has return_sequences = True,
    so H is a sequence of vectors of length n.
s is the hidden state of the LSTM (h and c)

h is a weighted sum over H: 加权和
h = sigma(j = 0 to n-1)  alpha(j) * H(j)

weight alpha[i, j] for each hj is computed as follows:
H = [h1,h2,...,hn]
M = tanh(H)
alhpa = softmax(w.transpose * M)

h# = tanh(h)
y = softmax(W * h# + b)

J(theta) = negative_log_likelihood + regularity

attModel1

GitHub 项目

datalogue/keras-attention

1

Attention_Recurrent

GitHub 项目

ningshixian/LSTM_Attention

attModel2

GitHub 项目

Keras Attention Mechanism

Example: Attention block*

Attention defined per time series (each TS has its own attention)

attModel3

Github 项目

keras-language-modeling

https://github.com/roebius/deeplearning_keras2/blob/master/nbs2/attention_wrapper.py

attModel4

Github 项目

CDRextraction

hierarchical-attention-networks

Github:

synthesio/hierarchical-attention-networks

self-attention-networks

参考