Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip optimization on internal/block_reader.go can lead to incorrect data read from stream #312

Open
mpaliwal3 opened this issue Jan 24, 2023 · 1 comment

Comments

@mpaliwal3
Copy link

11bad6d
The above commit introduced skip logic in internal/block_reader.go. The logic has 2 issues.

Issue 1

The skip logic manipulates the data stream without updating the offset of the stream. This causes the stream and the offset to deviate from each other in specific cases, leading to incorrect bytes being served from the hdfs data read stream.

Repro
Reproing the issue requires a pathological read case. It requires 3 reads in a specific sequence to trigger the issue.
t0: read 10 bytes from file offset 0. File offset: 10, br.offset: 10, br.stream: 10
t1: read 10 bytes from file offset 8192. File offset: 8202, br.offset: 20, br.stream: 8202 ← stream and offset have diverged.
t2: read 10 bytes from file offset 4096. File offset: 4106, br.offset: 30, br.stream: 8212 ← Incorrect Read
If the skip is greater than 64K the skip logic is not invoked.

Fix Proposal:

  1. update the stream offset as well.
  2. refactor the block_reader.go stream to stream struct / interface so that the stream read and offset are atomic.

Issue 2:

amountToSkip is double decremented. Both by caller and in the skip function.

// Skip attempts to discard bytes in the stream in order to skip forward. This
// is an optimization for the case that the amount to skip is very small. It
// returns an error if skip was not attempted at all (because the BlockReader
// isn't connected, or the offset is out of bounds or too far ahead) or the seek
// failed for some other reason.
func (br *BlockReader) Skip(off int64) error {
 blockSize := int64(br.Block.GetB().GetNumBytes())
 
 amountToSkip := off - br.Offset --> here the code is double decrementing the 
 amountToskip

 if br.stream == nil || off < 0 || off >= blockSize ||
 amountToSkip < 0 || amountToSkip > maxSkip { // maxSkip : 65536
 return errors.New("unable to skip")
 }

 ///// -----
 // Code is reading from br.stream but NOT updating br.Offset.
 ///// -----
 _, err := io.CopyN(io.Discard, br.stream, amountToSkip)
 if err != nil {
 if err == io.EOF {
 err = io.ErrUnexpectedEOF
 }

 br.stream = nil
 br.datanodes.recordFailure(err)
 }

 return err
}
@colinmarc
Copy link
Owner

Hi @mpaliwal3, thanks for the report. Do you think you could write a test that reproduces at least the first issue? That would make debugging much easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants