Defer UTF-8 validation in struct deserialization #1

Open
5 tasks
alexrutar opened this issue Jan 25, 2024 · 0 comments
Labels: enhancement (New feature or request)

alexrutar commented Jan 25, 2024

Struct deserialization should be reworked so that it performs fewer UTF-8 validation checks.

  • Convert Read::identifier into an identifier_bytes method.
  • Match the string, comment, and preamble entry types directly from the bytes using to_ascii_lowercase.
  • For any other identifier, perform UTF-8 validation (skipping it if the input is already a str).
  • Expose the raw bytes to any Deserialize impl so that, when deserializing fields into a struct, the struct's field names can be compared against the raw bytes directly.
  • Implement Deserialize in an example or in the entry module. Since all standard biblatex entry field keys are ASCII and normalized to lowercase, comparisons can be done directly from bytes using to_ascii_lowercase.
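
As a rough illustration of the tasks above, here is a minimal std-only sketch. Everything in it is hypothetical and not taken from the crate: the `EntryKind` and `Field` enums, the `classify` and `field_from_bytes` functions, and the choice of `author` and `title` as example fields. It assumes an `identifier_bytes`-style method has already produced the raw identifier as a `&[u8]`.

```rust
use std::str::Utf8Error;

/// Hypothetical classification of the identifier that follows `@`.
#[derive(Debug, PartialEq)]
enum EntryKind<'a> {
    String,
    Comment,
    Preamble,
    /// Any other entry type; only this path pays for UTF-8 validation.
    Regular(&'a str),
}

/// Classify raw identifier bytes, deferring UTF-8 validation until
/// none of the ASCII keywords matched.
fn classify(identifier_bytes: &[u8]) -> Result<EntryKind<'_>, Utf8Error> {
    // The keywords are pure ASCII, so an ASCII-lowercased copy of the
    // bytes can be compared without decoding them as UTF-8 first.
    match identifier_bytes.to_ascii_lowercase().as_slice() {
        b"string" => Ok(EntryKind::String),
        b"comment" => Ok(EntryKind::Comment),
        b"preamble" => Ok(EntryKind::Preamble),
        _ => std::str::from_utf8(identifier_bytes).map(EntryKind::Regular),
    }
}

/// Hypothetical field selector for a struct with `author` and `title` fields.
#[derive(Debug, PartialEq)]
enum Field {
    Author,
    Title,
    Other,
}

/// Select a struct field from the raw key bytes, with no UTF-8 check at all.
fn field_from_bytes(key: &[u8]) -> Field {
    match key.to_ascii_lowercase().as_slice() {
        b"author" => Field::Author,
        b"title" => Field::Title,
        _ => Field::Other,
    }
}
```

In a serde-based implementation, the body of `field_from_bytes` would most likely live in the `visit_bytes` method of a field-identifier `Visitor`. Note that `to_ascii_lowercase` on a byte slice allocates a new `Vec<u8>`; if that matters, the standard `eq_ignore_ascii_case` method on `[u8]` is an allocation-free alternative for each comparison.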
alexrutar self-assigned this Jan 25, 2024
alexrutar added the enhancement (New feature or request) label Jan 25, 2024