BUG: [map_rust] 'utf-8' codec can't decode byte 0xf8 in position 38: invalid start byte #1564

Open
tdruez opened this issue Jan 27, 2025 · 0 comments

tdruez commented Jan 27, 2025

Running the map_deploy_to_develop pipeline on the following inputs:

The pipeline fails during the [map_rust] step:

'utf-8' codec can't decode byte 0xf8 in position 38: invalid start byte

Traceback:
  File "/opt/scancodeio/aboutcode/pipeline/__init__.py", line 199, in execute
    step(self)
  File "/opt/scancodeio/scanpipe/pipelines/deploy_to_develop.py", line 212, in map_rust
    d2d.map_rust_paths(project=self.project, logger=self.log)
  File "/opt/scancodeio/scanpipe/pipes/d2d.py", line 1926, in map_rust_paths
    symbols.collect_and_store_tree_sitter_symbols_and_strings(
  File "/opt/scancodeio/scanpipe/pipes/symbols.py", line 149, in collect_and_store_tree_sitter_symbols_and_strings
    _collect_and_store_tree_sitter_symbols_and_strings(resource)
  File "/opt/scancodeio/scanpipe/pipes/symbols.py", line 159, in _collect_and_store_tree_sitter_symbols_and_strings
    result = symbols_tree_sitter.get_treesitter_symbols(resource.location)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scancodeio/.venv/lib/python3.12/site-packages/source_inspector/symbols_tree_sitter.py", line 85, in get_treesitter_symbols
    symbols, strings = collect_symbols_and_strings(location=location)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scancodeio/.venv/lib/python3.12/site-packages/source_inspector/symbols_tree_sitter.py", line 105, in collect_symbols_and_strings
    traverse(tree.root_node, symbols, strings, language_info)
  File "/opt/scancodeio/.venv/lib/python3.12/site-packages/source_inspector/symbols_tree_sitter.py", line 145, in traverse
    traverse(child, symbols, strings, language_info, depth + 1)
  File "/opt/scancodeio/.venv/lib/python3.12/site-packages/source_inspector/symbols_tree_sitter.py", line 145, in traverse
    traverse(child, symbols, strings, language_info, depth + 1)
  File "/opt/scancodeio/.venv/lib/python3.12/site-packages/source_inspector/symbols_tree_sitter.py", line 145, in traverse
    traverse(child, symbols, strings, language_info, depth + 1)
  [Previous line repeated 2 more times]
  File "/opt/scancodeio/.venv/lib/python3.12/site-packages/source_inspector/symbols_tree_sitter.py", line 142, in traverse
    if source_string := node.text.decode():
                        ^^^^^^^^^^^^^^^^^^
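
The error itself can be reproduced outside the pipeline. The snippet below is a minimal sketch, not the actual input from this run: the byte string `raw` is made up for illustration and only places a `0xf8` byte at offset 38. In UTF-8, `0xf8` is never a valid first byte of a character, so a strict `bytes.decode()` (the default mode, as used in `traverse`) raises.

```python
# Hypothetical input: 38 ASCII bytes followed by a single 0xf8 byte.
raw = b"a" * 38 + b"\xf8"

message = None
position = None
try:
    # bytes.decode() defaults to encoding="utf-8" and errors="strict".
    raw.decode()
except UnicodeDecodeError as exc:
    message = str(exc)    # human-readable error text
    position = exc.start  # offset of the first undecodable byte

print(message)
print(position)
```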

The pipeline should not fail on this error; instead, a project message should be created.
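
One possible shape for that behaviour is sketched below, assuming the decode failure is caught where the node text is read. The names `decode_node_text` and `on_error` are hypothetical and are not part of source_inspector or ScanCode.io; `on_error` stands in for whatever mechanism records a project message.

```python
from typing import Callable


def decode_node_text(raw: bytes, on_error: Callable[[str], None]) -> str:
    """Decode raw bytes as UTF-8 without letting bad bytes abort the caller.

    Hypothetical helper: on failure, report the problem through ``on_error``
    and fall back to a lossy decode instead of raising.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Report the problem instead of propagating the exception.
        on_error(f"Could not decode node text as UTF-8: {exc}")
        # Undecodable bytes become U+FFFD (the Unicode replacement character).
        return raw.decode("utf-8", errors="replace")


# Usage: collect messages in a list in place of real project messages.
messages = []
text = decode_node_text(b"caf\xf8", messages.append)
```

Whether to keep the lossy text or skip the node (or the whole resource) is a design choice left open here; the sketch only shows that the failure can be reported without stopping the step.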

@tdruez tdruez added the bug Something isn't working label Jan 27, 2025