fix list index out of range for comicrack #604

Ger4ldK · 2023-10-07T14:44:42Z

Fixes #582
Fixes #586

The purpose of this PR is to fix the issues encountered in #582, from what I was able to find based on the OP's file and my own files when using the tool, it seems that the source files we have contain 2-in-1 spread pages, which affects how the index counts up when working with the Comicrack XML Pages/Bookmarks tag. I have added a checkbox under Manga mode(as it seems quite common in manga series to have 2-page spreads) to toggle this option and adjust the math in the loop when converting files. If this checkbox is checked, the delta is adjusted when the Spreat splitter is either Unchecked or Indeterminate (state 0 or 1), and in the case where 'Rotate' is chosen (state 2), the pageid is decremented by 1 each time it encounters a spread to move the bookmark to the correct location. In all 3 states, I did not encounter the error again and my bookmarks were also in the correct positions

I am not sure if this is something that everyone is going to be using, but I will make this PR as a reference for anyone who is encountering the issue faced in #582, I've also tried to maintain the previous behavior as best I can. I am quite new to open source work so feel free to provide any feedback!

…ed bookmark pages and IndexErrors

axu2 · 2023-10-07T16:22:50Z

Thank you for your work! This is a long standing KCC issue over the years no one has figured out. Also fixes #586 and many other issues that were closed due to lack of feedback.

Though I'm personally against making users need to check a box to avoid ComicRack related errors. Non comic rack users shouldn't need to check if their source files are merged are not.

Is it possible to do something like:

try:
    original_method()
except IndexError as e:
    your_method()

Ger4ldK · 2023-10-07T16:34:56Z

Yep, I also don't think its the best idea to have this extra checkbox, will try the except IndexError method while I'm converting some more of my library, thanks for the work you guys put into the project!

…exError

Ger4ldK · 2023-10-08T03:41:44Z

updated the title and first post, but in terms of the fixing, i think i found that the IndexError isn't raised on some other books I was converting - these books didn't have a bookmark in the last few pages, which i think is what causes the calculation to go OOR. The error is no longer showing, but I'm getting inconsistent bookmarks again for books that do not have a bookmark near the final pages

axu2 · 2023-10-08T04:05:50Z

Oh hmmm, so the error isn't guaranteed, that's good information.

Maybe go back to original implementation? Can you determine options.alreadymerged based on the filenames by looking for kcc-a in any of the files?

If it's not enough information, can you change the filenames so it does have sufficient information? Or somehow pass this information up via a return?

Ger4ldK · 2023-10-08T04:56:32Z

Yep, currently working and testing on determining the conditions for the different method by comparing the original file count against the resulting pages after pages are split - a 'true' page count, will update soon

axu2 · 2023-10-10T22:38:25Z

Thank you for the cleanup, much easier to review.

Am I missing something or does this code not preserve the previous behavior? I thought this was an issue related to large files joined by comic rack but these changes appear to apply to all files with comic rack bookmarks. Unfortunately I have no comicrack files to test myself.

@vinhtq115 can you provide a test file as well? I want to make sure we don't break your use case. I unfortunately already deleted the one you provided in #558 I believe the file was a complete english official volume with spreads joined. The current code placed the bookmarks correctly for that file.

vinhtq115 · 2023-10-11T00:27:37Z

@vinhtq115 can you provide a test file as well? I want to make sure we don't break your use case. I unfortunately already deleted the one you provided in #558 I believe the file was a complete english official volume with spreads joined. The current code placed the bookmarks correctly for that file.

Here is the test file.

Ger4ldK · 2023-10-11T01:17:20Z

Thank you for the cleanup, much easier to review.

Am I missing something or does this code not preserve the previous behavior? I thought this was an issue related to large files joined by comic rack but these changes appear to apply to all files with comic rack bookmarks. Unfortunately I have no comicrack files to test myself.

Oh, I might have misunderstood the splitcheck code then, but my assumption is that when a split is needed, that likely meant that a spread was joined in the source file, which is why i set the options.hasspreadpages flag to True. It's going to run on all CR files that detected a split, but if they don't, it should behave as it did previously. I could duplicate the code and maintain the old version too for easier readability. Let me know if I should do that.

axu2 · 2023-10-11T01:27:52Z

@Ger4ldK Be aware that the provided test file is NSFW. Be careful when you open it, though please report how it interacts with your changes.

Ger4ldK · 2023-10-11T02:20:44Z

@Ger4ldK Be aware that the provided test file is NSFW. Be careful when you open it, though please report how it interacts with your changes.

thanks for the heads up 😆 my own test files are NSFW too, so no worries, will report back with my results

Ger4ldK · 2023-10-11T03:14:37Z

Tested with @vinhtq115's file, and it seems to be following the pages defined in comicinfo.xml, which means it's working, but the bookmarks defined by them are not the same as the ones in the ToC, which is what I did for my own files, so I'm not sure if this is working for them. Might have to redesign this a bit. I don't have ComicRack and just wrote the XML based on the example in the docs, which might make a huge difference in how this all behaves now

@vinhtq115 was the page numbers adjusted in the test file due to the issues in #558?

vinhtq115 · 2023-10-11T03:16:51Z

Tested with @vinhtq115's file, and it seems to be following the pages defined in comicinfo.xml, which means it's working, but the bookmarks defined by them are not the same as the ones in the ToC, which is what I did for my own files, so I'm not sure if this is working for them. Might have to redesign this a bit. I don't have ComicRack and just wrote the XML based on the example in the docs, which might make a huge difference in how this all behaves now

@vinhtq115 was the page numbers adjusted in the test file due to the issues in #558?

I bookmarked it as I had done in #558 (no differences). If you mean the ToC in page 3 or 4 then yeah, I didn't bookmark using it.

Ger4ldK · 2023-10-11T03:19:01Z

Tested with @vinhtq115's file, and it seems to be following the pages defined in comicinfo.xml, which means it's working, but the bookmarks defined by them are not the same as the ones in the ToC, which is what I did for my own files, so I'm not sure if this is working for them. Might have to redesign this a bit. I don't have ComicRack and just wrote the XML based on the example in the docs, which might make a huge difference in how this all behaves now
@vinhtq115 was the page numbers adjusted in the test file due to the issues in #558?

I bookmarked it as I had done in #558 (no differences).

hmm, might rework my design slightly then, will work with your test file since it helps me understand. Thanks for the info!

Ger4ldK · 2023-10-11T18:10:51Z

Update, managed to get it working with the testing file, and it does not have issues with mine as well

Summarizing the fix as best I can here, this should fix the error of the index going OOR, and additionally it also fixed issues with bookmark drift that I encountered due to my manual editing of my ComicInfo.xml file. Based on my testing with both files and all 3 possible states, the bookmarks all appear correctly for me, and I'm no longer encountering issues with the index going OOR when the bookmark is near the end, where ((pageid + global_diff) > len(filelist))

axu2 · 2023-10-11T19:16:14Z

@Ger4ldK Please confirm the results with the LARGE comicrack joined file from #582

(Too bad we don't have a test suite.)

Once you confirm everything is functionally correct feel free to use GitHub Actions on your fork to generate Windows/macOS binaries and share them here so normal users can test too.

Ger4ldK · 2023-10-15T05:43:15Z

Sorry for the delay, tested with the file in #582 and it seems to be fine now, no index error. Will generate the binaries and let the OP of the issue test as well to confirm

Update - binaries can be found here

Mac - https://github.com/Ger4ldK/kcc/actions/runs/6522315537
Windows - https://github.com/Ger4ldK/kcc/actions/runs/6522315199

axu2 · 2023-10-15T21:44:12Z

Make sure you tick Stretch/Upscale in the test file so it gets split after 400 mb. Replicate his exact settings to trigger 400 mb split.

axu2

Some small nitpicks.

axu2 · 2023-10-15T21:44:54Z

kindlecomicconverter/comic2ebook.py

@@ -525,16 +526,26 @@ def buildEPUB(path, chapternames, tomenumber):
        elif options.splitter == 2:
            diff_delta = 2

+        if options.spreadadjust:
+            if (options.splitter == 0 or options.splitter == 2):


if options.splitter != 1

updated to if options.spreadadjust and options.splitter != 1:

kindlecomicconverter/comic2ebook.py

axu2 · 2023-10-15T21:51:14Z

kindlecomicconverter/comic2ebook.py

@@ -581,10 +592,13 @@ def imgFileProcessingTick(output):
        workerOutput.append(output)
        workerPool.terminate()
    else:
-        for page in output:
+        for page in output[0]:


Is it possible to use this python feature somewhere instead of indexing?

item1, item2 = some_function_that_returns_tuple_of_two_items()

ooh, thanks for this, let me rewrite it

Ger4ldK · 2023-10-17T13:05:48Z

You just link the individual runs.

Make sure you tick Stretch/Upscale in the test file so it gets split after 400 mb. Replicate his exact settings to trigger 400 mb split.

was wondering why i wasn't getting a 400mb split even with the same settings, realized i had to select a Kindle device and got this now

Any idea what this is caused by? The line at 218 and the error message isn't giving me much info

axu2 · 2023-10-17T20:03:20Z

The error is more easy to see if you revert your changes. Search for text error during conversion in the repo. The repo hard codes a lot of error handling...

@vinhtq115 where did you get the bookmarks for your test file? Did you manually add? Your test file already worked prior to this PR. edit: Looking at the code it's handling the DoublePage Attribute. I can confirm vinh's old test file still works.
the large test file had bookmarks generated from comicrack directly from List index out of range #582. Fixing that test file is very important. Honestly we should probably just skip the bookmark related code if a 400 mb split is detected.

vinhtq115 · 2023-10-17T22:21:12Z

@vinhtq115 where did you get the bookmarks for your test file? Did you manually add? Your test file already worked prior to this PR.

I manually created it using ComicTagger.

axu2 · 2023-10-19T15:21:08Z

kindlecomicconverter/comic2ebook.py

@@ -513,6 +513,7 @@ def buildEPUB(path, chapternames, tomenumber):
                                                   tomenumber), options.uuid))
    # Overwrite chapternames if tree is flat and ComicInfo.xml has bookmarks


Maybe this entire code section should be skipped if a 400 mb split is detected (as if the xml doesn't exist?)? If it can be logically separated, make it a separate PR.

axu2 · 2023-10-21T03:06:32Z

kindlecomicconverter/comic2ebook.py

+                        global_diff -= 1
+
+            if pageid >= filelen:
+                pageid = filelen - 1


pageid = min(pageid, len(filelist) -1)

I assume this is the code related to bookmarks near the end? Add a clarifying comment.

Moreover, probably should just use len(filelist) since you are using the var less often.

axu2 · 2023-10-22T05:33:48Z

kindlecomicconverter/comic2ebook.py

@@ -513,6 +513,7 @@ def buildEPUB(path, chapternames, tomenumber):
                                                   tomenumber), options.uuid))
    # Overwrite chapternames if tree is flat and ComicInfo.xml has bookmarks
    if not chapternames and options.chapters:
+        filelen = len(filelist)


Just use len(filelist) per comment below.

axu2 · 2023-10-22T05:39:52Z

kindlecomicconverter/comic2ebook.py

@@ -609,7 +624,7 @@ def imgFileProcessing(work):
            if opt.forcepng and not opt.forcecolor:
                img.quantizeImage()
            output.append(img.saveToDir())
-        return output
+        return (output, spreadadjust)


no need for parentheses to return a tuple

axu2 · 2023-11-28T15:28:23Z

Closed due to inactivity. Reopen as needed. May already be fixed by #620

Ger4ldK added 2 commits October 7, 2023 11:23

wip: temporary workaround for index going OOB due to split diff

d2f060d

fix (ciromattia#582): added option for 2-page spread causing mismatch…

d33decd

…ed bookmark pages and IndexErrors

Ger4ldK changed the title ~~fix #582 - comicrack xml list oob~~ fix list index out of range for comicrack Oct 8, 2023

wip: maintaining previous behavior, only attempting new method on Ind…

6131e23

…exError

Ger4ldK added 2 commits October 8, 2023 14:36

feat: reverted changes and added method for detection of spread pages

d105f93

refactor: marking comic as split on splitcheck for pages

bcb2d31

fix: tested with proper ComicInfo.xml file, added additional checks

521a3dd

axu2 requested changes Oct 15, 2023

View reviewed changes

refactor: based on feedback

204deec

axu2 reviewed Oct 19, 2023

View reviewed changes

axu2 reviewed Oct 21, 2023

View reviewed changes

axu2 reviewed Oct 22, 2023

View reviewed changes

axu2 closed this Nov 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix list index out of range for comicrack #604

fix list index out of range for comicrack #604

Ger4ldK commented Oct 7, 2023 •

edited

Loading

axu2 commented Oct 7, 2023 •

edited

Loading

Ger4ldK commented Oct 7, 2023

Ger4ldK commented Oct 8, 2023

axu2 commented Oct 8, 2023 •

edited

Loading

Ger4ldK commented Oct 8, 2023

axu2 commented Oct 10, 2023 •

edited

Loading

vinhtq115 commented Oct 11, 2023

Ger4ldK commented Oct 11, 2023

axu2 commented Oct 11, 2023 •

edited

Loading

Ger4ldK commented Oct 11, 2023

Ger4ldK commented Oct 11, 2023

vinhtq115 commented Oct 11, 2023 •

edited

Loading

Ger4ldK commented Oct 11, 2023

Ger4ldK commented Oct 11, 2023 •

edited

Loading

axu2 commented Oct 11, 2023 •

edited

Loading

Ger4ldK commented Oct 15, 2023 •

edited

Loading

axu2 commented Oct 15, 2023 •

edited

Loading

axu2 left a comment

axu2 Oct 15, 2023

Ger4ldK Oct 17, 2023

axu2 Oct 15, 2023

Ger4ldK Oct 17, 2023

Ger4ldK commented Oct 17, 2023 •

edited

Loading

axu2 commented Oct 17, 2023 •

edited

Loading

vinhtq115 commented Oct 17, 2023

axu2 Oct 19, 2023 •

edited

Loading

axu2 Oct 21, 2023 •

edited

Loading

axu2 Oct 22, 2023

axu2 Oct 22, 2023

axu2 commented Nov 28, 2023

		@@ -513,6 +513,7 @@ def buildEPUB(path, chapternames, tomenumber):
		tomenumber), options.uuid))
		# Overwrite chapternames if tree is flat and ComicInfo.xml has bookmarks

fix list index out of range for comicrack #604

fix list index out of range for comicrack #604

Conversation

Ger4ldK commented Oct 7, 2023 • edited Loading

axu2 commented Oct 7, 2023 • edited Loading

Ger4ldK commented Oct 7, 2023

Ger4ldK commented Oct 8, 2023

axu2 commented Oct 8, 2023 • edited Loading

Ger4ldK commented Oct 8, 2023

axu2 commented Oct 10, 2023 • edited Loading

vinhtq115 commented Oct 11, 2023

Ger4ldK commented Oct 11, 2023

axu2 commented Oct 11, 2023 • edited Loading

Ger4ldK commented Oct 11, 2023

Ger4ldK commented Oct 11, 2023

vinhtq115 commented Oct 11, 2023 • edited Loading

Ger4ldK commented Oct 11, 2023

Ger4ldK commented Oct 11, 2023 • edited Loading

axu2 commented Oct 11, 2023 • edited Loading

Ger4ldK commented Oct 15, 2023 • edited Loading

axu2 commented Oct 15, 2023 • edited Loading

axu2 left a comment

Choose a reason for hiding this comment

axu2 Oct 15, 2023

Choose a reason for hiding this comment

Ger4ldK Oct 17, 2023

Choose a reason for hiding this comment

axu2 Oct 15, 2023

Choose a reason for hiding this comment

Ger4ldK Oct 17, 2023

Choose a reason for hiding this comment

Ger4ldK commented Oct 17, 2023 • edited Loading

axu2 commented Oct 17, 2023 • edited Loading

vinhtq115 commented Oct 17, 2023

axu2 Oct 19, 2023 • edited Loading

Choose a reason for hiding this comment

axu2 Oct 21, 2023 • edited Loading

Choose a reason for hiding this comment

axu2 Oct 22, 2023

Choose a reason for hiding this comment

axu2 Oct 22, 2023

Choose a reason for hiding this comment

axu2 commented Nov 28, 2023

Ger4ldK commented Oct 7, 2023 •

edited

Loading

axu2 commented Oct 7, 2023 •

edited

Loading

axu2 commented Oct 8, 2023 •

edited

Loading

axu2 commented Oct 10, 2023 •

edited

Loading

axu2 commented Oct 11, 2023 •

edited

Loading

vinhtq115 commented Oct 11, 2023 •

edited

Loading

Ger4ldK commented Oct 11, 2023 •

edited

Loading

axu2 commented Oct 11, 2023 •

edited

Loading

Ger4ldK commented Oct 15, 2023 •

edited

Loading

axu2 commented Oct 15, 2023 •

edited

Loading

Ger4ldK commented Oct 17, 2023 •

edited

Loading

axu2 commented Oct 17, 2023 •

edited

Loading

axu2 Oct 19, 2023 •

edited

Loading

axu2 Oct 21, 2023 •

edited

Loading