Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix list index out of range for comicrack #604

Closed
wants to merge 7 commits into from

Conversation

Ger4ldK
Copy link

@Ger4ldK Ger4ldK commented Oct 7, 2023

Fixes #582
Fixes #586

The purpose of this PR is to fix the issues encountered in #582, from what I was able to find based on the OP's file and my own files when using the tool, it seems that the source files we have contain 2-in-1 spread pages, which affects how the index counts up when working with the Comicrack XML Pages/Bookmarks tag. I have added a checkbox under Manga mode(as it seems quite common in manga series to have 2-page spreads) to toggle this option and adjust the math in the loop when converting files. If this checkbox is checked, the delta is adjusted when the Spreat splitter is either Unchecked or Indeterminate (state 0 or 1), and in the case where 'Rotate' is chosen (state 2), the pageid is decremented by 1 each time it encounters a spread to move the bookmark to the correct location. In all 3 states, I did not encounter the error again and my bookmarks were also in the correct positions

I am not sure if this is something that everyone is going to be using, but I will make this PR as a reference for anyone who is encountering the issue faced in #582, I've also tried to maintain the previous behavior as best I can. I am quite new to open source work so feel free to provide any feedback!

@axu2
Copy link
Collaborator

axu2 commented Oct 7, 2023

Thank you for your work! This is a long standing KCC issue over the years no one has figured out. Also fixes #586 and many other issues that were closed due to lack of feedback.

Though I'm personally against making users need to check a box to avoid ComicRack related errors. Non comic rack users shouldn't need to check if their source files are merged are not.

Is it possible to do something like:

try:
    original_method()
except IndexError as e:
    your_method()

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 7, 2023

Yep, I also don't think its the best idea to have this extra checkbox, will try the except IndexError method while I'm converting some more of my library, thanks for the work you guys put into the project!

@Ger4ldK Ger4ldK changed the title fix #582 - comicrack xml list oob fix list index out of range for comicrack Oct 8, 2023
@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 8, 2023

updated the title and first post, but in terms of the fixing, i think i found that the IndexError isn't raised on some other books I was converting - these books didn't have a bookmark in the last few pages, which i think is what causes the calculation to go OOR. The error is no longer showing, but I'm getting inconsistent bookmarks again for books that do not have a bookmark near the final pages

@axu2
Copy link
Collaborator

axu2 commented Oct 8, 2023

Oh hmmm, so the error isn't guaranteed, that's good information.

Maybe go back to original implementation? Can you determine options.alreadymerged based on the filenames by looking for kcc-a in any of the files?

If it's not enough information, can you change the filenames so it does have sufficient information? Or somehow pass this information up via a return?

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 8, 2023

Yep, currently working and testing on determining the conditions for the different method by comparing the original file count against the resulting pages after pages are split - a 'true' page count, will update soon

@axu2
Copy link
Collaborator

axu2 commented Oct 10, 2023

Thank you for the cleanup, much easier to review.

Am I missing something or does this code not preserve the previous behavior? I thought this was an issue related to large files joined by comic rack but these changes appear to apply to all files with comic rack bookmarks. Unfortunately I have no comicrack files to test myself.

@vinhtq115 can you provide a test file as well? I want to make sure we don't break your use case. I unfortunately already deleted the one you provided in #558 I believe the file was a complete english official volume with spreads joined. The current code placed the bookmarks correctly for that file.

@vinhtq115
Copy link
Contributor

@vinhtq115 can you provide a test file as well? I want to make sure we don't break your use case. I unfortunately already deleted the one you provided in #558 I believe the file was a complete english official volume with spreads joined. The current code placed the bookmarks correctly for that file.

Here is the test file.

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 11, 2023

Thank you for the cleanup, much easier to review.

Am I missing something or does this code not preserve the previous behavior? I thought this was an issue related to large files joined by comic rack but these changes appear to apply to all files with comic rack bookmarks. Unfortunately I have no comicrack files to test myself.

Oh, I might have misunderstood the splitcheck code then, but my assumption is that when a split is needed, that likely meant that a spread was joined in the source file, which is why i set the options.hasspreadpages flag to True. It's going to run on all CR files that detected a split, but if they don't, it should behave as it did previously. I could duplicate the code and maintain the old version too for easier readability. Let me know if I should do that.

@axu2
Copy link
Collaborator

axu2 commented Oct 11, 2023

@Ger4ldK Be aware that the provided test file is NSFW. Be careful when you open it, though please report how it interacts with your changes.

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 11, 2023

@Ger4ldK Be aware that the provided test file is NSFW. Be careful when you open it, though please report how it interacts with your changes.

thanks for the heads up 😆 my own test files are NSFW too, so no worries, will report back with my results

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 11, 2023

Tested with @vinhtq115's file, and it seems to be following the pages defined in comicinfo.xml, which means it's working, but the bookmarks defined by them are not the same as the ones in the ToC, which is what I did for my own files, so I'm not sure if this is working for them. Might have to redesign this a bit. I don't have ComicRack and just wrote the XML based on the example in the docs, which might make a huge difference in how this all behaves now

@vinhtq115 was the page numbers adjusted in the test file due to the issues in #558?

@vinhtq115
Copy link
Contributor

vinhtq115 commented Oct 11, 2023

Tested with @vinhtq115's file, and it seems to be following the pages defined in comicinfo.xml, which means it's working, but the bookmarks defined by them are not the same as the ones in the ToC, which is what I did for my own files, so I'm not sure if this is working for them. Might have to redesign this a bit. I don't have ComicRack and just wrote the XML based on the example in the docs, which might make a huge difference in how this all behaves now

@vinhtq115 was the page numbers adjusted in the test file due to the issues in #558?

I bookmarked it as I had done in #558 (no differences). If you mean the ToC in page 3 or 4 then yeah, I didn't bookmark using it.

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 11, 2023

Tested with @vinhtq115's file, and it seems to be following the pages defined in comicinfo.xml, which means it's working, but the bookmarks defined by them are not the same as the ones in the ToC, which is what I did for my own files, so I'm not sure if this is working for them. Might have to redesign this a bit. I don't have ComicRack and just wrote the XML based on the example in the docs, which might make a huge difference in how this all behaves now
@vinhtq115 was the page numbers adjusted in the test file due to the issues in #558?

I bookmarked it as I had done in #558 (no differences).

hmm, might rework my design slightly then, will work with your test file since it helps me understand. Thanks for the info!

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 11, 2023

Update, managed to get it working with the testing file, and it does not have issues with mine as well

Summarizing the fix as best I can here, this should fix the error of the index going OOR, and additionally it also fixed issues with bookmark drift that I encountered due to my manual editing of my ComicInfo.xml file. Based on my testing with both files and all 3 possible states, the bookmarks all appear correctly for me, and I'm no longer encountering issues with the index going OOR when the bookmark is near the end, where ((pageid + global_diff) > len(filelist))

@axu2
Copy link
Collaborator

axu2 commented Oct 11, 2023

@Ger4ldK Please confirm the results with the LARGE comicrack joined file from #582

(Too bad we don't have a test suite.)

Once you confirm everything is functionally correct feel free to use GitHub Actions on your fork to generate Windows/macOS binaries and share them here so normal users can test too.

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 15, 2023

Sorry for the delay, tested with the file in #582 and it seems to be fine now, no index error. Will generate the binaries and let the OP of the issue test as well to confirm

Update - binaries can be found here

Mac - https://github.com/Ger4ldK/kcc/actions/runs/6522315537
Windows - https://github.com/Ger4ldK/kcc/actions/runs/6522315199

@axu2
Copy link
Collaborator

axu2 commented Oct 15, 2023

Make sure you tick Stretch/Upscale in the test file so it gets split after 400 mb. Replicate his exact settings to trigger 400 mb split.

Copy link
Collaborator

@axu2 axu2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small nitpicks.

@@ -525,16 +526,26 @@ def buildEPUB(path, chapternames, tomenumber):
elif options.splitter == 2:
diff_delta = 2

if options.spreadadjust:
if (options.splitter == 0 or options.splitter == 2):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if options.splitter != 1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to if options.spreadadjust and options.splitter != 1:

kindlecomicconverter/comic2ebook.py Outdated Show resolved Hide resolved
@@ -581,10 +592,13 @@ def imgFileProcessingTick(output):
workerOutput.append(output)
workerPool.terminate()
else:
for page in output:
for page in output[0]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use this python feature somewhere instead of indexing?

item1, item2 = some_function_that_returns_tuple_of_two_items()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh, thanks for this, let me rewrite it

@Ger4ldK
Copy link
Author

Ger4ldK commented Oct 17, 2023

You just link the individual runs.

Make sure you tick Stretch/Upscale in the test file so it gets split after 400 mb. Replicate his exact settings to trigger 400 mb split.

was wondering why i wasn't getting a 400mb split even with the same settings, realized i had to select a Kindle device and got this now
image
Any idea what this is caused by? The line at 218 and the error message isn't giving me much info

@axu2
Copy link
Collaborator

axu2 commented Oct 17, 2023

The error is more easy to see if you revert your changes. Search for text error during conversion in the repo. The repo hard codes a lot of error handling...

  1. @vinhtq115 where did you get the bookmarks for your test file? Did you manually add? Your test file already worked prior to this PR. edit: Looking at the code it's handling the DoublePage Attribute. I can confirm vinh's old test file still works.

  2. the large test file had bookmarks generated from comicrack directly from List index out of range #582. Fixing that test file is very important. Honestly we should probably just skip the bookmark related code if a 400 mb split is detected.

@vinhtq115
Copy link
Contributor

  1. @vinhtq115 where did you get the bookmarks for your test file? Did you manually add? Your test file already worked prior to this PR.

I manually created it using ComicTagger.

@@ -513,6 +513,7 @@ def buildEPUB(path, chapternames, tomenumber):
tomenumber), options.uuid))
# Overwrite chapternames if tree is flat and ComicInfo.xml has bookmarks
Copy link
Collaborator

@axu2 axu2 Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this entire code section should be skipped if a 400 mb split is detected (as if the xml doesn't exist?)? If it can be logically separated, make it a separate PR.

global_diff -= 1

if pageid >= filelen:
pageid = filelen - 1
Copy link
Collaborator

@axu2 axu2 Oct 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pageid = min(pageid, len(filelist) -1)

I assume this is the code related to bookmarks near the end? Add a clarifying comment.

Moreover, probably should just use len(filelist) since you are using the var less often.

@@ -513,6 +513,7 @@ def buildEPUB(path, chapternames, tomenumber):
tomenumber), options.uuid))
# Overwrite chapternames if tree is flat and ComicInfo.xml has bookmarks
if not chapternames and options.chapters:
filelen = len(filelist)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use len(filelist) per comment below.

@@ -609,7 +624,7 @@ def imgFileProcessing(work):
if opt.forcepng and not opt.forcecolor:
img.quantizeImage()
output.append(img.saveToDir())
return output
return (output, spreadadjust)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for parentheses to return a tuple

@axu2
Copy link
Collaborator

axu2 commented Nov 28, 2023

Closed due to inactivity. Reopen as needed. May already be fixed by #620

@axu2 axu2 closed this Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

list index out of range List index out of range
3 participants