support batch processing #62

Open
wants to merge 5 commits into main
Conversation

@tianshanghong commented Mar 5, 2023

Support batch processing for translation.

  • It keeps the epub format the same as the previous version, avoiding the issue mentioned in add batch func #2
  • It speeds up translation significantly. In my local test, it shortened the whole translation time of test_books/animal_farm.epub to around 4.5 to 6 minutes with --batch_size=20.
    • I have not tuned the batch size yet. It should run faster if we increase the param.
  • It adds a new flag, --batch_size. Users can set their own batch size according to their OpenAI API plan (details in OpenAI API rate limits).
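The batching idea above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `translate_one` is a hypothetical stand-in for the real API call (the real implementation calls OpenAI's chat completion API), and `batch_size` plays the role of the new `--batch_size` flag, firing that many requests concurrently instead of one at a time.

```python
import asyncio

# Hypothetical stand-in for the real translation API call;
# here it just echoes its input after yielding to the event loop.
async def translate_one(text):
    await asyncio.sleep(0)  # a real network call would await here
    return f"translated: {text}"

async def translate_batch(paragraphs, batch_size=20):
    """Translate paragraphs in concurrent chunks of batch_size."""
    results = []
    for i in range(0, len(paragraphs), batch_size):
        chunk = paragraphs[i:i + batch_size]
        # All requests in a chunk run concurrently; order is preserved,
        # which keeps the epub paragraph order intact.
        results.extend(await asyncio.gather(*(translate_one(p) for p in chunk)))
    return results

print(asyncio.run(translate_batch(["a", "b", "c"], batch_size=2)))
```

Because `asyncio.gather` returns results in submission order, the translated paragraphs can be written back into the epub in their original positions.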

@yihong0618 (Owner)

Can you document it in both README and README-CN?

@yihong0618 (Owner) commented Mar 5, 2023

I think this kind of batch will also hit the OpenAI API rate limit?
(screenshot)

@tianshanghong tianshanghong mentioned this pull request Mar 5, 2023
@yihong0618 (Owner)

And it will exit in my env.

I added -b 10. (screenshot)

@yihong0618 (Owner)

Same problem as #2; I wonder if it may be an issue for some users.

@yihong0618 (Owner) commented Mar 5, 2023

> And it will exit in my env.
>
> I added -b 10. (screenshot)

This may be caused by my using multiple keys.

(screenshot)

@yihong0618 (Owner)

After testing, it is probably a multiple-keys problem.

@tianshanghong (Author) commented Mar 5, 2023

Thanks for the reply! I have not tested it with multiple keys yet. I only have one key, set in the global OPENAI_API_KEY env variable on an Ubuntu 22.04 server.

Just ran a benchmarking test again to get more precise data:

```
$ rm -rf __pycache__/ && rm -rf test_books/.animal_farm.temp.bin && time python3 make_book.py --book_name test_books/animal_farm.epub --no_limit --language "Simplified Chinese" --batch_size 20

......
real    6m6.450s
user    0m4.659s
sys     0m0.297s
```

I will update the README files later today.

@zengzzzzz commented Mar 6, 2023

If you want to use async IO, asyncio.sleep is better than time.sleep. Also, most users still have the rate limit, so that's not good news.
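The asyncio.sleep vs. time.sleep point matters because time.sleep blocks the entire event loop, serializing every in-flight request, while asyncio.sleep suspends only the current coroutine. A small self-contained demonstration (independent of the PR's code): three workers each "sleep" 0.1 s; run concurrently with asyncio.sleep, the total is roughly 0.1 s, not 0.3 s.

```python
import asyncio
import time

async def worker(i, delay):
    # asyncio.sleep yields to the event loop, so other coroutines run.
    # time.sleep(delay) here would block the loop and serialize everything.
    await asyncio.sleep(delay)
    return i

async def main():
    start = time.monotonic()
    results = await asyncio.gather(*(worker(i, 0.1) for i in range(3)))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))  # three workers, but ~0.1 s total
```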

@tianshanghong (Author) commented Mar 6, 2023

@yihong0618 I tested with two API keys in my global env variable, but did not have the issue in your screenshot. Could you verify your API keys are valid?

@zengzzzzz

> if you want to use async io, asyncio.sleep is better than time.sleep

Sounds good. Will update this.

> and for most users they still have the limit, that's not good news

I'm a bit confused here. I assume every user with an API key gets the same rate limit from OpenAI. Did I miss anything?

@yihong0618 (Owner) commented Mar 6, 2023

Yes, one of my keys has a problem...
Can we ignore the error so the code does not fail?

@tianshanghong (Author)

@yihong0618

I did not change the error handling part of the code. But I guess I got your point after reading #14 - to skip the malformed API key automatically. Is that correct?

@tianshanghong (Author) commented Mar 6, 2023

@yihong0618 I added a lock for def get_key and fixed the retry mechanism for malformed keys. Here is my screenshot from testing.

Screenshot 2023-03-05 at 10 04 33 PM
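The lock-plus-retry idea can be sketched like this. Everything here is hypothetical illustration, not the PR's code: `KeyRing`, `discard`, and the key strings are made-up names; the point is only that concurrent batch workers must not race inside the key-rotation function, and that a key rejected by the API can be dropped so the request is retried with a remaining key.

```python
import itertools
import threading

class KeyRing:
    """Hypothetical sketch: rotate multiple API keys behind a lock."""

    def __init__(self, keys):
        self._lock = threading.Lock()
        self._keys = list(keys)
        self._cycle = itertools.cycle(self._keys)

    def get_key(self):
        # Serialize rotation so concurrent workers never interleave
        # inside the generator and each call gets exactly one key.
        with self._lock:
            return next(self._cycle)

    def discard(self, bad_key):
        # Drop a key the API rejected; callers then retry get_key().
        with self._lock:
            if bad_key in self._keys:
                self._keys.remove(bad_key)
                self._cycle = itertools.cycle(self._keys)

ring = KeyRing(["key-good", "key-bad"])
ring.discard("key-bad")   # e.g. after the API reports it as invalid
print(ring.get_key())     # remaining keys keep serving requests
```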

@yihong0618 (Owner)

Cool, will test tonight (timezone +8).

@yihong0618 (Owner) commented Mar 6, 2023

The code hangs in my env...
(screenshot)

@GOWxx commented Mar 10, 2023

Feedback

Summary

I merged #62 into the main branch on my fork at https://github.com/GOWxx/bilingual_book_maker/tree/test_batch_processing, resolved the conflicts, and used the --batch_size=100 parameter. The effect was very good, and the lemo.epub was translated in just a few seconds.

However, the same code runs normally on macOS 13.1 but not on CentOS 8.6.

Normal operation on macOS: (screenshot: macos_bilingual_book)

Not working on CentOS: (screenshot: centos_bilingual_book)

Status: (screenshot)

@GOWxx commented Mar 11, 2023

It seems to be an issue with openai.ChatCompletion.acreate. The same code can run on macOS, but not on CentOS.

(screenshots)

wayhome pushed a commit to wayhome/bilingual_book_maker that referenced this pull request Aug 29, 2024