Commit

Update image regex to handle new page format (#254)
Fix test that fails due to being unable to parse image pages, and add a
new test file for this case.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
randomnetcat and pre-commit-ci[bot] authored Sep 23, 2024
1 parent 701c6e4 commit fb18862
Showing 2 changed files with 613 additions and 1 deletion.
2 changes: 1 addition & 1 deletion wikiteam3/dumpgenerator/dump/image/html_regexs.py
@@ -29,6 +29,6 @@
 r'(?im)<td class="TablePager_col_img_name">\s*<a href[^>]*?>(?P<filename>[^>]+)</a>[^<]*?<a href="(?P<url>[^>]+)">[^<]*?</a>[^<]*?</td>\s*'
 r'<td class="TablePager_col_thumb">[^\n\r]*?</td>\s*'
 r'<td class="TablePager_col_img_size">[^<]*?</td>\s*'
-r'<td class="(?:TablePager_col_img_user_text|TablePager_col_img_actor)">\s*(<a href="[^>]*?" title="[^>]*?">)?(?P<uploader>[^<]+?)(</a>)?\s*</td>'
+r'<td class="(?:TablePager_col_img_user_text|TablePager_col_img_actor)">\s*(?:<a href="[^>]*?" title="[^>]*?">)?(?:<bdi>)?(?P<uploader>[^<]+?)(?:</bdi>)?(?:</a>)?\s*(?:<span class="mw-usertoollinks">(?:(?!</span>)(?!</td>).)*?</span>)?</td>'
 ),
 ]
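
To illustrate what the changed line accepts, here is a minimal sketch that isolates only the uploader-cell fragment of the pattern, before and after this commit, and runs both against two sample table cells. The sample HTML strings, the constant names, and the `extract_uploader` helper are hypothetical and written for this illustration; they are not taken from the repository or from the new test file. The flags mirror the `(?im)` prefix of the full pattern.

```python
import re

# Uploader-cell fragment before this commit (copied from the removed diff line).
OLD_UPLOADER_CELL = (
    r'<td class="(?:TablePager_col_img_user_text|TablePager_col_img_actor)">\s*'
    r'(<a href="[^>]*?" title="[^>]*?">)?(?P<uploader>[^<]+?)(</a>)?\s*</td>'
)

# Uploader-cell fragment after this commit (copied from the added diff line).
# It additionally tolerates an optional <bdi> wrapper around the name and an
# optional trailing <span class="mw-usertoollinks">...</span>.
NEW_UPLOADER_CELL = (
    r'<td class="(?:TablePager_col_img_user_text|TablePager_col_img_actor)">\s*'
    r'(?:<a href="[^>]*?" title="[^>]*?">)?(?:<bdi>)?(?P<uploader>[^<]+?)(?:</bdi>)?(?:</a>)?\s*'
    r'(?:<span class="mw-usertoollinks">(?:(?!</span>)(?!</td>).)*?</span>)?</td>'
)

# Hypothetical sample cells, assumed for illustration only.
OLD_FORMAT_CELL = (
    '<td class="TablePager_col_img_user_text">'
    '<a href="/wiki/User:Example" title="User:Example">Example</a></td>'
)
NEW_FORMAT_CELL = (
    '<td class="TablePager_col_img_actor">'
    '<a href="/wiki/User:Example" title="User:Example"><bdi>Example</bdi></a> '
    '<span class="mw-usertoollinks">'
    '(<a href="/wiki/User_talk:Example" title="User talk:Example">talk</a>)'
    '</span></td>'
)


def extract_uploader(pattern, html):
    """Return the named 'uploader' group, or None when the cell does not match."""
    match = re.search(pattern, html, re.IGNORECASE | re.MULTILINE)
    return match.group("uploader") if match else None


if __name__ == "__main__":
    for label, cell in (("old format", OLD_FORMAT_CELL), ("new format", NEW_FORMAT_CELL)):
        print(label, "| old regex:", extract_uploader(OLD_UPLOADER_CELL, cell))
        print(label, "| new regex:", extract_uploader(NEW_UPLOADER_CELL, cell))
```

On these samples the old fragment fails on the `<bdi>`-wrapped cell because `[^<]+?` cannot start at a `<`, while the new fragment matches both cells. The new groups are non-capturing, so `uploader` remains the only group this fragment contributes.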