Bugfix for "Unable to find startxref" exception #764

Draft: wants to merge 2 commits into base: master

@unixnut unixnut commented Feb 27, 2025

Type of pull request

  • [Y] Bug fix (involves code and configuration changes)
  • New feature (involves code and configuration changes)
  • Documentation update
  • Something else

About

Fixes incomplete handling of the startxref statement that precedes the %%EOF marker.

Currently, getXrefData() uses a regex that requires a newline before the offset value.

This fix allows:

  • space before the reference offset.
  • keyword and offset on the same line (e.g. containing "startxref 1746580").
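
The difference between the two behaviours can be sketched as follows. This is a minimal illustration, not the code from this pull request: both patterns, the helper findStartxrefOffset(), and the sample strings are assumptions made for the example, and the actual regex in RawDataParser.php may differ.

```php
<?php
// Illustrative patterns only; NOT copied from RawDataParser.php.
// Strict: a line break must come directly before the offset digits.
const STRICT_PATTERN = '/startxref[\r\n]+([0-9]+)\s*%%EOF/';
// Relaxed: any whitespace (space, tab, line break) may precede the offset.
const RELAXED_PATTERN = '/startxref\s+([0-9]+)\s*%%EOF/';

/**
 * Hypothetical helper: returns the offset that follows "startxref",
 * or null when the pattern does not match.
 */
function findStartxrefOffset(string $data, string $pattern): ?int
{
    return preg_match($pattern, $data, $m) === 1 ? (int) $m[1] : null;
}

// Hypothetical PDF tails covering the cases listed above.
$samples = [
    'newline'            => "startxref\n1746580\n%%EOF",
    'same line'          => "startxref 1746580\n%%EOF",
    'newline then space' => "startxref\n 1746580\n%%EOF",
];

foreach ($samples as $label => $tail) {
    printf(
        "%-20s strict=%s relaxed=%s\n",
        $label,
        var_export(findStartxrefOffset($tail, STRICT_PATTERN), true),
        var_export(findStartxrefOffset($tail, RELAXED_PATTERN), true)
    );
}
```

In this sketch the strict pattern only finds the offset in the first sample, while the relaxed pattern finds 1746580 in all three.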

For more detail, see #756.

TO DO

  • tests
  • example files

unixnut and others added 2 commits February 27, 2025 10:07
Error location: src/Smalot/PdfParser/RawData/RawDataParser.php:952

Fix: in getXrefData(), the regex passed to the preg_match_all() call that
sets $startxrefPreg now accepts e.g. "startxref 1746580". (Previously it
required a newline before the offset value.)

PDF details:
Creator:        DocuPrint CM225/228
PDF version:    1.7
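
For context, here is a rough sketch of what a preg_match_all() call of this kind returns. The pattern and the sample data are assumptions made for illustration, not taken from RawDataParser.php, and the sketch says nothing about which match the parser goes on to use.

```php
<?php
// Hypothetical data containing two startxref statements, one per style.
$pdfData = "...\nstartxref\n116\n%%EOF\n...\nstartxref 1746580\n%%EOF\n";

// Illustrative pattern; NOT copied from RawDataParser.php.
$pattern = '/startxref\s+([0-9]+)\s*%%EOF/';

// preg_match_all() returns the number of full matches, or false on error.
// With PREG_SET_ORDER, $matches[i][1] is the captured offset of match i.
$count = preg_match_all($pattern, $pdfData, $matches, PREG_SET_ORDER);

if ($count === false || $count === 0) {
    // Mirrors the error named in the PR title.
    throw new \Exception('Unable to find startxref');
}

foreach ($matches as $match) {
    printf("startxref offset: %d\n", (int) $match[1]);
}
```

A value captured this way is a byte offset into the file, so it is cast to an integer before use.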
Updated regex in src/Smalot/PdfParser/RawData/RawDataParser.php:getXrefData()