Initial commit
manzilz committed Dec 13, 2020
0 parents commit 2171b3d
Showing 1 changed file with 12 additions and 0 deletions.
# Big Bird: Transformers for Longer Sequences

We propose BigBird, a sparse attention mechanism that reduces the quadratic
dependency of full attention on sequence length to linear. We show that
BigBird is a universal approximator of sequence functions and is Turing
complete, thereby preserving these properties of the quadratic, full
attention model. The proposed sparse attention can handle sequences up to 8×
longer than was previously possible on similar hardware. As a consequence of
this longer context, BigBird substantially improves performance on various
NLP tasks such as question answering and summarization.
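Since the code has not yet been released, the following is only an illustrative sketch of the kind of sparsity pattern the paper describes: each token attends to a local sliding window, a few global tokens, and a handful of random positions. The function name `bigbird_mask` and all parameter values are hypothetical choices for this example, not the official API.

```python
import numpy as np

def bigbird_mask(seq_len, window=3, num_global=2, num_random=2, seed=0):
    """Boolean attention mask combining the three sparse patterns:
    sliding window, global tokens, and random attention."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    # Sliding window: each token attends to `window` neighbours on each side.
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True
    # Global tokens: the first `num_global` tokens attend everywhere
    # and are attended to by every token.
    mask[:num_global, :] = True
    mask[:, :num_global] = True
    # Random attention: each token also attends to a few random positions.
    for i in range(seq_len):
        mask[i, rng.choice(seq_len, size=num_random, replace=False)] = True
    return mask

mask = bigbird_mask(seq_len=16)
print(f"attended entries: {mask.sum()} of {mask.size}")
```

Because the window, global, and random components each contribute O(n) entries per token, the total number of attended pairs grows linearly with sequence length, rather than quadratically as in full attention.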

Code release in progress.
