Need help with understanding JWPUB format #1

Open
MrCyjaneK opened this issue Apr 24, 2021 · 148 comments

Comments

@MrCyjaneK
Owner

I have no idea how to get the words out of Content in the .db file located in the jwpub archive. I've written down what I know so far. Any help is welcome.

@orangethewell

Hi! I had this idea (scraping jwpub files) some days ago and was searching for anything about these JW Library files. Apparently these files are linked directly to jw.org in some way, but even so, I haven't found anything about how this linking works. By the way, I was also thinking that the bytecode should be an id from the Words table, but I don't think it is a direct id; maybe JW made some custom instructions for it.

Are you one of Jehovah's Witnesses?

@MrCyjaneK
Owner Author

I even sent a couple of emails with a request for documentation, but got a response saying that they are unable to answer my question from this email address. So my idea was to call the number from https://www.jw.org/en/jehovahs-witnesses/contact/united-states/, but I haven't had much time recently, so I didn't do that.

And yes, I am.

@MrCyjaneK
Owner Author

I've put a lot of time into understanding this format, but still no results worth showing.

It's sad that most of the new publications are PDF/JWPUB only. PDF just doesn't scale well, and JWPUB is, ugh.

I still have one idea: scraping wol.jw.org. But I'm against sending hundreds or thousands of requests (every image, article, quote, source) to get one publication.

@orangethewell

orangethewell commented Jun 15, 2021

Haha, I don't think they would simply give us their code, sadly. Anyway, scraping wol.jw.org would actually work, but it's the worst option, considering that we would have to add a lot more code and change almost everything. (It would also make the project a lot heavier for low-end systems.)

After all, all we can do is trial and error. We have at least one hint: the files work like an EPUB, with XML files inside, the difference being that it's heavily modified and for some reason there's binary code that doesn't match a list of words.

I will try doing something with my Python knowledge. I don't know whether I will be of any help, but at least I will try for fun. I really like being able to use JW Library on PC, and it's sad that Watchtower hasn't ported it to any Linux distro. I don't think it will stay that way for long; maybe some day they will release a version for a popular Linux distro.

A final question... I saw that your project uses the app API from JW, but is this allowed? Isn't it a violation of some of the app's terms of use?

@orangethewell

orangethewell commented Jun 16, 2021

Hello! So, I made some experiments with the JWPUB file to learn how it works, and I think I got some interesting things working! First of all, the content is directly related to the page and doesn't accept anything new (maybe because the content has a fixed size in bytes and I inserted more than that? I don't know). Furthermore, the Words table doesn't work the way we thought: I changed a word in this table, and all that changed is how I find it in the book. Now I need to search for "subjecters" instead of "subject", while the word in the documents stays the same.

So, in the end, I got a "How to Remain in God's Love" book with the subjects section titled "Edited Subjects" and a blank "Letter from the Governing Body".

EDIT: I read the documentation that you gave. Maybe begin and end are the first and last byte to be converted, but the question remains: converted into what, if it's not an index into the Words table?

@MrCyjaneK
Owner Author

MrCyjaneK commented Jun 16, 2021

Wow! That's great! I lost so much time with the Words table... So you are saying that Contents is directly related to the displayed content, and doesn't just reference the Words table?

Have you seen the part below the sentence "Huh It's quite short." in the docs? https://raw.githubusercontent.com/MrCyjaneK/jwapi/master/docs/jwpub/index.md

You also need to have there:

  • Some heading/subheadings/fonts etc...
  • Images (probably by ID)
  • Links to other publications
  • And the content itself

What is the news from God? translates to:

Decimal 1246 616 1131 758 474 499
Hex 4de 268 46b 2f6 1da 1f3

Which is quite short, so my guess was that it uses the Words table. Maybe the app stores rendered publications somewhere in a cache, and that's why changing the table didn't change the content?

Or, another scenario: the Contents is compressed in some way.
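
As a quick sanity check on the numbers quoted above, this small sketch converts the decimal values to hex and pairs them with the words of the sentence. The one-value-per-word pairing is only the guess made in this comment, not a confirmed property of the format, and `to_hex` is a hypothetical helper name.

```python
# Values quoted in the comment above for "What is the news from God?"
decimal_values = [1246, 616, 1131, 758, 474, 499]
sentence = "What is the news from God?"


def to_hex(values):
    """Format each integer as lowercase hex without the 0x prefix."""
    return [format(value, "x") for value in values]


hex_values = to_hex(decimal_values)

# Split the sentence into words; there happen to be as many words as values.
words = sentence.rstrip("?").split()
pairs = list(zip(words, hex_values))

for word, hex_value in pairs:
    print(f"{word:>5} -> {hex_value}")
```

Six values for six words is what makes the Words-table theory tempting, but it does not rule out compression or some other encoding.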

@orangethewell

Okay, I think I have a problem with the customized JWPUB and I don't know what exactly the app was loading.

I saw what was in the jwpub conversion doc before, and yeah, that could be it, but... there's something strange about it and I don't know exactly what happened.

I changed a lot of things in the original db because I thought my code was packing it into a new jwpub file, but it wasn't. By the time I fixed that, I had changed a lot in the DB, and I think I ended up with a corrupted publication (or one modified so much that it doesn't load anything), even though I changed just one content column. This is really strange; I didn't see it yesterday.

After all, there are a lot of things going on behind the jwpub specification: there's even a schema specification for the publication view and, alongside the Words table, some strange tables that look like a precompiled search. I keep thinking about what a forum for some reading program replied to a request to support the JW files: they said these files make requests to the JW API. I don't trust everything, but this really stuck in my mind. Even so, it doesn't make sense: why would a file of 100 MB or more need anything from JW? And if it does, how are the pioneers' books distributed?

@MrCyjaneK
Owner Author

I'll check the network thing tonight. I'll download a publication and just watch the traffic in Burp Suite; that should clarify whether the requests are sent or not.

@MrCyjaneK
Owner Author

So first of all, I had some problems with Android Studio, then it got late and I forgot to reply.
After downloading publications there were no requests (except for a few images that were unrelated to the publication).

@MrCyjaneK
Owner Author

Haha, I don't think they would simply give us their code, sadly. Anyway, scraping wol.jw.org would actually work, but it's the worst option, considering that we would have to add a lot more code and change almost everything. (It would also make the project a lot heavier for low-end systems.)

Yeah... but if we fail, that's the only option.

After all, all we can do is trial and error. We have at least one hint: the files work like an EPUB, with XML files inside, the difference being that it's heavily modified and for some reason there's binary code that doesn't match a list of words.

Not really. It can be converted on the fly and then just kept in some HTML format.

I will try doing something with my Python knowledge. I don't know whether I will be of any help, but at least I will try for fun. I really like being able to use JW Library on PC, and it's sad that Watchtower hasn't ported it to any Linux distro. I don't think it will stay that way for long; maybe some day they will release a version for a popular Linux distro.

That's sad :( I wish there were a decent Watchtower Library app made with GTK ;p

A final question... I saw that your project uses the app API from JW, but is this allowed? Isn't it a violation of some of the app's terms of use?

Since I don't re-upload the content, it is legal, but I'm not a lawyer.

https://www.jw.org/en/terms-of-use/

And even if it's against the terms... sigh. I'm not switching back to Android, so I'll continue to develop this app.

(Sorry for the late reply, I missed this comment.)

@orangethewell

No no! It's okay, brother! I don't have much time for research these days either; after all, I'm still 15 years old and have school homework to do. ^^

But I will keep following the project; if I find something new, I will post a new reply on this issue.

And if you can't get anything new from the JWPUB conversion, you still have easier tasks to do, like the video player :) (I really like how the PC JW Library app can be easily "hacked" to add a new video, lol)

@MrCyjaneK MrCyjaneK mentioned this issue Jun 24, 2021
8 tasks
@MrCyjaneK
Owner Author

After spending hours on this thing, I won't continue reverse engineering the JWPUB format until somebody does that... for me.

For now I'll move to flooding the wol.jw.org APIs and getting the publications page by page (thanks for abandoning epub, btw).

image

@MrCyjaneK MrCyjaneK linked a pull request Aug 13, 2021 that will close this issue
@mjacobus

mjacobus commented Jan 3, 2023

@MrCyjaneK did you figure out how to read Document.Content?

@MrCyjaneK
Owner Author

MrCyjaneK commented Jan 3, 2023

I'm not working on this app anymore. I don't want to spend time on an open source alternative to something that is clearly using DRM when it shouldn't (can somebody give me one single reason why it is worth encrypting such content when it is freely available?).
Also, I don't feel like playing some sort of cat-and-mouse game when somebody can just change the way the API sends publications and cut support for earlier versions.

And the elephant in the room: WHY isn't the app open source in the first place?

Until somebody gives me answers to those questions, I'm not going to work on this project. wol.jw.org is enough for me.

</project>

@darioragusa

Security? If anyone could get a publication and easily edit it the risk of spreading misleading information would be very high.

@MrCyjaneK
Owner Author

@darioragusa As they can do with .epub, .mobi, and .pdf.

Also, there is a tool for that which is widely used on the internet: you can sign things with PGP. That would allow 3rd party apps to be developed and would carry less risk (currently we can edit the publications; DRM is defectivebydesign.org).

@darioragusa

@MrCyjaneK I know you can edit the other formats without problems, but most of us use the JW Library app. I download a jwpub knowing that it comes from jw.org or the app, and I trust the content. It's not a random txt file sent by a random guy and opened with Word or Adobe Reader, which may or may not contain the correct information. An example: if I send my grandma an EPUB she may not be able to open it, but if I send a jwpub she taps the file, a trusted app she always uses shows up, and for her it's all OK: a normal article with the reliable content that is supposed to be there. A jwpub can still be edited, but it's not something that anyone with basic knowledge of Word can do: less editors -> less edited files. Perhaps I'm totally wrong, but those are my two cents.

@MrCyjaneK
Owner Author

less editors -> less edited files. Perhaps I'm totally wrong but those are my two cents.

The thing is, the current method allows editing, whereas signing would make undetected editing impossible while allowing modders like us to easily read the content.

@darioragusa

I don't know much about signing files, but I guess that the app should have a key and using this key with (something, idk) they should get a value. It's like checking the hash? If a bit changes the value is different?

@MrCyjaneK
Owner Author

It's a way of checking whether the content was modified: the content can be signed to prove that it was created by somebody, and after modding it the signature will no longer match. It's like encrypting, except that you can still see the content; you just can't modify it without the change being detected.

@darioragusa

Ok, but this way wouldn't they have to store the signatures for every version of every article in every publication in every language?

@MrCyjaneK
Owner Author

PGP signatures do not add a lot of extra size to a publication, so I don't consider this a problem (you could also just sign a sha512sum of the publication and get a similar result). Plus, you can sign them as they are served for download.
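
To illustrate the "if a bit changes, the value is different" part of this exchange, here is a minimal sketch using only Python's standard `hashlib`. It shows the digest step only: a real signature would additionally sign this digest with a private key (that is what PGP adds), which this sketch does not attempt. The sample byte strings and the `sha512_hex` helper name are made up for illustration.

```python
import hashlib


def sha512_hex(data: bytes) -> str:
    """Return the SHA-512 digest of data as a 128-character hex string."""
    return hashlib.sha512(data).hexdigest()


# Stand-ins for the bytes of a publication file (hypothetical content).
original = b"This is a publication."
tampered = b"This is a publicatioN."  # a single character changed

original_digest = sha512_hex(original)
tampered_digest = sha512_hex(tampered)

# Any modification, however small, yields a different digest.
digests_match = original_digest == tampered_digest
print("unchanged" if digests_match else "content was modified")
```

A bare digest stored next to the file can be recomputed by whoever edits the file, which is why the digest has to be signed rather than just shipped alongside.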

@darioragusa

If the signature is stored with the publication, what stops me from changing it?

@MrCyjaneK
Owner Author

You can change it. You can even sign it with your own key, but then it will not verify against the original signer's key.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

This is a message I have decided to sign, try to mess with it and it will no longer be signed.
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCAAdFiEE0gTaRRUXZfyrr8PQPD6SA9PleeEFAmO36L8ACgkQPD6SA9Pl
eeERiAv+MXm2VjIZMvOgXwKT5bDmwMpfK8liOdT/IhoFvNsTwMiWUQRHzp12OJtz
U+V26gq6lmBJsKsyij6AAvefy048mAzGnAMRR5c9uqkYs2R66jqUIRNERCE2XKdu
uiJAhmpMqNughA0/h19/As1xCrepZpo+W1SEE8yEPZp13eZ0gylmS0pBqXR5QcHB
JNAIMV84xOAntQNe2dzs6lBhhWdF3EvE5L50so2EiXGulr5mIdPwIkaCUSQIZYRd
2aWLwcA4j8ZN/UfY6YbCyhSyH5Fm4WXZ17tsPSuOqBE7QhW100gPiQjPDGc5ZUwN
SjvRIxrvCZ9rPg/PQnAOIgxALilBW3y6Jaq73XTBFaOArkOxmWh8rFhL7OkMdyW7
ewpAVjU90ChYEJ17BZpM+cSYIYcRwsYdtNQcQVl1fViBFlBFY1PEm4mvbbHK4GLQ
aRBnSsTbabNQQLij3hk/Wc9RLEe49pk/tmeDlqrtF5ELbFWRBtM0R63H3qfXkEL7
vFqGsJB4
=gBZ4
-----END PGP SIGNATURE-----

@livrasand

Thanks I'll give it a try. Though, my Spanish to English translator sometimes gives me something crazy.

Maybe I'm missing something, but at least from what I understood, Reviw wants me to send them files so they create JWPub file for me, via GH issues. That's not how I want to approach the issue. But maybe I didn't understand their wiki.

If you need more specific help, I can help you. Me and the Reviw community have created all these JWPUBs: livrasand.github.io

Send me an email and we can talk about how to help you

@in-Load

in-Load commented Oct 6, 2024

Thanks I'll give it a try. Though, my Spanish to English translator sometimes gives me something crazy.
Maybe I'm missing something, but at least from what I understood, Reviw wants me to send them files so they create JWPub file for me, via GH issues. That's not how I want to approach the issue. But maybe I didn't understand their wiki.

It is something like that, but I use Reviw to create the .db file, and after that I build my own .jwpub (I use CyberChef to create my blobs). He helps with the things I don't know, so I believe he can help you with your file and exchange ideas.

This is my .jwpub. I have been working on it for 2 weeks and it works well. He helped me with some things, so I believe it is a good place to get answers to your questions.

I created a RefDocument table myself; it is complicated but cool.

I will add other talks; for now I have only made 1. Rename the file to .jwpub.

am_T (1.0).zip

Hi everyone, what's up?
Apologies, I'm French, I don't speak English, and the translator is not good enough to understand or learn everything, you know.
Thank you so much to all of you for your contributions and repositories, and thanks @gokusander for your test file. But how did you create the design interface (Ui) of your file, please?

@gokusander

design interface (Ui)

Sorry for the late reply, I was on vacation. What is "design interface (Ui)"? I'm not a dev, just a normal guy.

@orangethewell

design interface (Ui)

Sorry for the late reply, I was on vacation. What is "design interface (Ui)"? I'm not a dev, just a normal guy.

It's User Interface. They were probably referring to the Graphical User Interface, the window in which the application shows the text and images.

@orangethewell

orangethewell commented Nov 12, 2024

Good news, guys! I discovered how to extract all the styles from publications. I wasted some time trying to figure this out in the past, but now I understand how they set up the styles.

Basically, I did this:

  1. Create the root element for the publication chapter content with these classes: "jwac docClass-13 docId-1102023301 ms-ROMAN ml-T dir-ltr pub-lmd layout-reading layout-sidebar". NOTE: The most important class is jwac, which represents a jw-article; every other rule checks whether the element is inside that class.
  2. Copy the colector.css from the JW website, which contains every class for (I think) every publication, even the printable publication data from the website.
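
The two steps above can be sketched roughly as follows. This is only an illustration: the `div` root element, the `wrap_article` helper, and the assumption that `colector.css` sits next to the generated page are mine, not something confirmed in this thread. Only the class list and the stylesheet name come from the comment.

```python
from html import escape

# Class list quoted in the comment above; "jwac" is the class the other rules depend on.
ROOT_CLASSES = (
    "jwac docClass-13 docId-1102023301 ms-ROMAN ml-T dir-ltr "
    "pub-lmd layout-reading layout-sidebar"
)


def wrap_article(body_html: str, stylesheet: str = "colector.css") -> str:
    """Wrap already-extracted chapter HTML in a root element carrying the classes.

    body_html is inserted as-is, so it must already be trusted HTML.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<link rel="stylesheet" href="{escape(stylesheet, quote=True)}">\n'
        "</head><body>\n"
        f'<div class="{escape(ROOT_CLASSES, quote=True)}">\n'
        f"{body_html}\n"
        "</div>\n"
        "</body></html>\n"
    )


page = wrap_article("<p>Hello</p>")
print(page)
```

The docClass and docId values are specific to the publication the classes were copied from, so they would presumably need to change per document.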

image

Another thing I discovered (someone probably knows this already) is that backup files hold some metadata for publication markups, like color index, paragraph index, and token start and end index. If I'm not mistaken, there are 3 tables that hold markup data: one with the markup start, end and paragraph, one with the location/publication, and another with the colorIndex.

image

I still have to sort out the path matching, since I'm back on Windows and compiling the code for Linux would not work at all. But I will figure out how to make it more readable. For now, I just ported the code from Rust to JavaScript with React, so it still looks a mess. When done, I will push a commit to my repo.

@orangethewell

orangethewell commented Nov 21, 2024

There is a Languages table in mepsunit.db, that's where you find the respective language mnemonic for each MepsLanguageId.

I couldn't find it. I found it once but not the second time, neither on Windows nor on Android, only in the .apk data, and there as .jwdat.

EDIT: Nah, never mind, I just found it in the msixbundle from the Windows Store version.

@GeiserX

GeiserX commented Nov 25, 2024

Hey @orangethewell
Would you please share that mepsunit.db somewhere, perhaps over a GitHub repository? I'd need the relationship between langcode and MepsLanguageId in my python scripts. I'm no desktop developer so I'd highly appreciate it, as it would take long for me to learn how to unbundle that from the Windows app. Thank you!

@orangethewell

Hey @orangethewell
Would you please share that mepsunit.db somewhere, perhaps over a GitHub repository? I'd need the relationship between langcode and MepsLanguageId in my python scripts. I'm no desktop developer so I'd highly appreciate it, as it would take long for me to learn how to unbundle that from the Windows app. Thank you!

Sadly I can't, since it could count as copyright infringement, but it isn't that hard to get: download the Windows edition, unzip the file, rename the msixbundle to a .zip extension and unzip it, then do the same with any of the packages inside the msixbundle (preferably the one suffixed with x64). After unzipping that, mepsunit.db is available in the Data folder.
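
The unzip steps described above could be automated along these lines, since Python's `zipfile` can open the bundle without renaming it. This is a sketch under assumptions: it starts from the bytes of the msixbundle itself, it picks the inner package by looking for "x64" in its name, and `find_in_bundle` is a hypothetical helper, not part of any existing tool.

```python
import io
import zipfile


def find_in_bundle(bundle_bytes, package_hint="x64", target="mepsunit.db"):
    """Return the bytes of `target` from the first inner package matching `package_hint`.

    Both the bundle and the packages inside it are treated as plain zip archives.
    Returns None if no matching package or file is found.
    """
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as bundle:
        for package_name in bundle.namelist():
            if package_hint.lower() not in package_name.lower():
                continue
            # Open the inner package straight from memory.
            inner_bytes = bundle.read(package_name)
            with zipfile.ZipFile(io.BytesIO(inner_bytes)) as package:
                for member in package.namelist():
                    if member.lower().endswith(target.lower()):
                        return package.read(member)
    return None
```

Typical use would be to read the bundle from disk with `open(path, "rb").read()`, pass the bytes to `find_in_bundle`, and write the result to a local `mepsunit.db` if it is not `None`.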

@geimist

geimist commented Nov 25, 2024

Perfect - thank you very much. I had searched in vain for a long time for the relationship between LanguageID and symbol in the installation directory (library) on the Mac and couldn't find anything. Now I see that I should have looked directly in the installation package.
On the Mac you can find the DB here: /Applications/JW Library.app/Wrapper/JWLibrary.app/JWLResources/mepsunit.db. The values are in the table language.
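
Since the exact column names of that table are not spelled out in this thread, a cautious first step is to list the matching tables and their columns before writing any query. The sketch below uses only the standard `sqlite3` module; `describe_tables` is a hypothetical helper, and the default fragment "language" is taken from the table name mentioned above.

```python
import sqlite3
from contextlib import closing


def describe_tables(db_path, name_fragment="language"):
    """Map each table whose name contains name_fragment to its list of column names."""
    result = {}
    with closing(sqlite3.connect(db_path)) as conn:
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        for (table,) in tables:
            if name_fragment.lower() not in table.lower():
                continue
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            result[table] = [column[1] for column in columns]
    return result
```

Calling `describe_tables("mepsunit.db")` should then show which columns hold the language id and the symbol, so the actual lookup query can be written against real names instead of guessed ones.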

@in-Load

in-Load commented Jan 23, 2025

design interface (Ui)

Sorry for the late reply, I was on vacation. What is "design interface (Ui)"? I'm not a dev, just a normal guy.

It's User Interface. They were probably referring to the Graphical User Interface, the window in which the application shows the text and images.

Hi man, thanks for your reply, and indeed I was talking about that. I don't understand everything in the different posts, but I need help please: I want to try to help a small congregation in Guadeloupe; the elders are a bit old, and there are not many young ones and not many ministerial servants. I would mainly like to learn 2 things: how to retrieve the content of the publications (mainly the Bible and the mwb) in JSON format from JW.org, and how to create a jwpub file. If someone can help me that would be really nice, brothers 🙏 (Pr 15:22)

@orangethewell

Hi man, thanks for your reply, and indeed I was talking about that. I don't understand everything in the different posts, but I need help please: I want to try to help a small congregation in Guadeloupe; the elders are a bit old, and there are not many young ones and not many ministerial servants. I would mainly like to learn 2 things: how to retrieve the content of the publications (mainly the Bible and the mwb) in JSON format from JW.org, and how to create a jwpub file. If someone can help me that would be really nice, brothers 🙏 (Pr 15:22)

I don't see a reason to fetch the content and make a jwpub file from a publication that already has one. But even if that is your objective, you would have to parse the HTML from JW.org into the JSON structure you need and then convert it back into HTML to bundle it into the jwpub database.

@gokusander

Hi man, thanks for your reply, and indeed I was talking about that. I don't understand everything in the different posts, but I need help please: I want to try to help a small congregation in Guadeloupe; the elders are a bit old, and there are not many young ones and not many ministerial servants. I would mainly like to learn 2 things: how to retrieve the content of the publications (mainly the Bible and the mwb) in JSON format from JW.org, and how to create a jwpub file. If someone can help me that would be really nice, brothers 🙏 (Pr 15:22)

You would have to explain your objective. As @orangethewell mentioned, downloading the Bible and editing it doesn't make much sense. You mentioned helping the older elders, but how would that help them? Isn't it enough to download it from the official jw.org website? What is the main idea? Knowing that, we can help you.

@in-Load

in-Load commented Jan 25, 2025

Thanks @gokusander and @orangethewell . These are two different projects. We'll start with the main one: making the work of the Life and Ministry meeting manager easier, by retrieving the week's schedule, to make it easier for him to schedule participants. I've already had the opportunity to create a private Chrome extension that parses HTML, but for scalability reasons it's not very practical, so I'm looking for a way to be able to retrieve a Json directly from the site JW.org, rather than having to generate it myself.

@Fuseteam

Hi man, thanks for your reply, and indeed I was talking about that. I don't understand everything in the different posts, but I need help please: I want to try to help a small congregation in Guadeloupe; the elders are a bit old, and there are not many young ones and not many ministerial servants. I would mainly like to learn 2 things: how to retrieve the content of the publications (mainly the Bible and the mwb) in JSON format from JW.org, and how to create a jwpub file. If someone can help me that would be really nice, brothers 🙏 (Pr 15:22)

I believe you can download jwpub files directly from jw.org.

Thanks @gokusander and @orangethewell . These are two different projects. We'll start with the main one: making the work of the Life and Ministry meeting manager easier, by retrieving the week's schedule, to make it easier for him to schedule participants.

If it is just the schedule, https://github.com/AntonyCorbett/OnlyT has a way to retrieve the times for each participant.

@in-Load

in-Load commented Jan 29, 2025

If it is just the schedule, https://github.com/AntonyCorbett/OnlyT has a way to retrieve the times for each participant.

Thanks, but that is not what I need.

@Fuseteam

If it is just the schedule, https://github.com/AntonyCorbett/OnlyT has a way to retrieve the times for each participant.

Thanks, but that is not what I need.

So what do you actually need?

@in-Load

in-Load commented Jan 30, 2025

If it is just the schedule, https://github.com/AntonyCorbett/OnlyT has a way to retrieve the times for each participant.

Thanks, but that is not what I need.

So what do you actually need?

First of all, what I want is to retrieve (in JSON) the content of the mwb from the JW.org site.

Example: for the text of the day, we can do it with wol.jw.org/wol/dt/r30/lp-e/2025/1/30

There must be something like that for the mwb, I suppose... do you have any suggestions please, brothers?
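
For reference, the daily text URL quoted above can be built from a date like this. It is a sketch under assumptions: the `https://` scheme is assumed, the parameter names `r_code` and `lp_code` are hypothetical labels for the `r30` and `lp-e` path segments, and nothing here says what the endpoint returns.

```python
from datetime import date


def daily_text_url(day: date, r_code: str = "r30", lp_code: str = "lp-e") -> str:
    """Build the daily-text URL for the given date.

    Month and day are not zero-padded, matching the example .../2025/1/30.
    """
    return (
        f"https://wol.jw.org/wol/dt/{r_code}/{lp_code}/"
        f"{day.year}/{day.month}/{day.day}"
    )


print(daily_text_url(date(2025, 1, 30)))
```

Whether an equivalent path exists for the mwb is exactly the open question in this comment, so this only covers the known daily text case.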

@livrasand

livrasand commented Jan 30, 2025

Hi @in-Load, I have such a function in Kingdom Hall Attendant, you can take it and use it as you please.

https://github.com/livrasand/Kingdom-Hall-Attendant/blob/main/app.py#L33420

import datetime

import requests
from bs4 import BeautifulSoup


def extract_data_from_WOL(year, week):
    url = f"https://wol.jw.org/es/wol/meetings/r4/lp-s/{year}/{week}"
    try:
        # Set a 10-second timeout
        response = requests.get(url, timeout=10)
        # Raise an error if the response is not 200 OK
        response.raise_for_status()
    except requests.exceptions.Timeout:
        print("The request timed out.")
        return None, {}  # Return None and an empty dict on timeout
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")
        return None, {}  # Return None and an empty dict on any other request error
    else:
        soup = BeautifulSoup(response.content, 'html.parser')
        heading = soup.find('h1')
        # Guard against pages without an <h1> instead of raising AttributeError
        week_info = heading.text if heading else None
        data = {}
        current_h2 = None

        # Group every <h3> title under the <h2> section that precedes it
        for element in soup.find_all(['h2', 'h3']):
            if element.name == 'h2':
                current_h2 = element.text.strip()
                data[current_h2] = []
            elif element.name == 'h3' and current_h2:
                data[current_h2].append(element.text.strip())

        return week_info, data


def get_previous_and_next_urls(year, week):
    # Monday of the requested week, assuming year/week are ISO week numbers
    # so that the input matches the isocalendar() values used below
    monday = datetime.date.fromisocalendar(int(year), int(week), 1)

    # Compute the previous week
    previous_year, previous_week, _ = (monday - datetime.timedelta(weeks=1)).isocalendar()
    url_previous = f"/nuevo-vida-ministerio?year={previous_year}&week={previous_week}"

    # Compute the next week
    next_year, next_week, _ = (monday + datetime.timedelta(weeks=1)).isocalendar()
    url_next = f"/nuevo-vida-ministerio?year={next_year}&week={next_week}"

    return url_previous, url_next
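
If the goal is JSON, the `(week_info, data)` pair returned by `extract_data_from_WOL` can be serialized with the standard `json` module. A minimal sketch follows; the `schedule_to_json` helper and the key names `week` and `sections` are hypothetical, and the sample values are made up.

```python
import json


def schedule_to_json(week_info, data):
    """Serialize the parsed week title and section dict as a JSON string."""
    payload = {"week": week_info, "sections": data}
    # ensure_ascii=False keeps accented characters readable in the output.
    return json.dumps(payload, ensure_ascii=False, indent=2)


# Made-up sample in the same shape that extract_data_from_WOL returns.
sample = schedule_to_json("Semana de ejemplo", {"Sección": ["Parte 1", "Parte 2"]})
print(sample)
```

This still relies on parsing the HTML first; it only changes the output format.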

@in-Load

in-Load commented Jan 30, 2025

Great 👍, thank you very much @livrasand 🙏. I just did a test, and I see how to modify your code to retrieve the data the way I need. On the other hand, it forces me to parse the HTML myself... 🤔 Do you know if it is possible to retrieve the content directly as JSON, please?

@gokusander

gokusander commented Feb 2, 2025

I am creating a personal study .jwpub. I would like to add the search engine to my .jwpub file, could someone help?

I have already created the entire .db with images, videos, notes and references. All that is missing is the word search.

@in-Load

in-Load commented Feb 3, 2025

I am creating a personal study .jwpub. I would like to add the search engine to my .jwpub file, could someone help?

I have already created the entire .db with images, videos, notes and references. All that is missing is the word search.

Hello my brother @gokusander 🤗 I won't be able to help you unfortunately but I would like you to teach me please, I would also like to do that! Would you prefer that we discuss it in private?

@gokusander

I am creating a personal study .jwpub. I would like to add the search engine to my .jwpub file, could someone help?
I have already created the entire .db with images, videos, notes and references. All that is missing is the word search.

Hello my brother @gokusander 🤗 I won't be able to help you unfortunately but I would like you to teach me please, I would also like to do that! Would you prefer that we discuss it in private?

It could be right here; I'm not an expert on the subject. For what purpose do you plan to create it?

JWPUB is a .db database whose content has to be edited as HTML and decrypted to be read by the app. I used Reviw for that (by @livrasand).

@in-Load

in-Load commented Feb 3, 2025

> It could be right here, I'm not an expert on the subject. For what purpose do you plan to create it?
>
> Jwpub is a .db database that needs to be edited in html and decrypted to be read by the app. I used Reviw for that (by @livrasand )

Well, just like you: I want to be able to create a file for my own notes or my own assembly summaries, for example.

@orangethewell

> I am creating a personal study .jwpub. I would like to add the search engine to my .jwpub file, could someone help?
>
> I have already created the entire .db with images, videos, notes and references. All that is missing is the word search.

Word search uses some bitwise magic to get the words and their associated data; take a look at livrasand/Reviw#106.

@gokusander

> Word search use some bitwise magic to get words and associated data, take a look on livrasand/Reviw#106

Did you manage to create this search system? I really don't understand development. (I'm Brazilian too.)

@darioragusa

> I am creating a personal study .jwpub. I would like to add the search engine to my .jwpub file, could someone help?
>
> I have already created the entire .db with images, videos, notes and references. All that is missing is the word search.

Hi, maybe this could help.
You can try to reverse what my script JWPubExtractor.swift does.

@orangethewell

> Did you manage to create this search system? I really don't understand dev. (I'm from Brazil too)

No, still no success, and I'm a bit short on time. Actually, the last thing I was working on in my Linux version was the text highlighting system, which is rather complex because JW Library uses a text tokenization system. I believe this tokenization works in a similar way for the search system, but I'm not sure. So far, all that @livrasand and I have found are queries against the Search table and other tables related to it, on which the app does some bit-level arithmetic.

I hope to be able to study this a bit more soon, but lately I've really been pressed for time 😔
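
As a starting point for anyone who wants to look at those tables, here is a small Python sketch that lists every table whose name contains "search", along with its columns. It assumes the .db has already been extracted and is a regular SQLite file; `search_tables` is a hypothetical helper name. It only shows the schema and does not decode the bitwise format, which nobody in this thread has worked out yet.

```python
import sqlite3


def search_tables(db_path: str) -> dict[str, list[str]]:
    """Map each table whose name contains 'search' to its column names."""
    con = sqlite3.connect(db_path)
    try:
        # SQLite's LIKE is case-insensitive for ASCII, so this also
        # matches names such as "Search..." or "...SEARCH...".
        names = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name LIKE '%search%' "
                "ORDER BY name"
            )
        ]
        # PRAGMA table_info returns one row per column; index 1 is the name.
        return {
            name: [col[1] for col in con.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
    finally:
        con.close()
```

Printing the result for a real publication should show which columns are BLOBs, which is probably where the packed word and position data lives.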

@gokusander

> No, still no success, and I'm a bit short on time. Actually, the last thing I was working on in my Linux version was the text highlighting system, which is rather complex because JW Library uses a text tokenization system. I believe this tokenization works in a similar way for the search system, but I'm not sure. So far, all that @livrasand and I have found are queries against the Search table and other tables related to it, on which the app does some bit-level arithmetic.
>
> I hope to be able to study this a bit more soon, but lately I've really been pressed for time 😔

Ah yes, no problem, bro. You're already building something more complex; I'm just creating my personal .jwpub for study and shepherding. The search would only be for finding something I wrote down in my .jwpub.

livrasand has more or less abandoned the project, but luckily I learned, more or less, how to make one from scratch, with only this feature missing. Anyway, if you need anything and I can help, just ping me. I'm not a programmer, I work in healthcare, so what I know is the bare basics, haha.

Thanks anyway, cheers.

@gokusander

> Hi, maybe this could help. You can try to reverse my script JWPubExtractor.swift

I really don't know anything about programming. What I learned to do came from the manual he created a while back. I've been doing my own stuff based on that.

I'll try to read what he sent me about the search. I don't know how to reverse a script; my job is in the health field, hahaha. The .jwpub is just for personal study and shepherding calls.

17 participants