lib: rewrite manifest2po in Python #21532

jelly · 2025-01-17T16:49:50Z

Extracting translatable strings from manifests is easy enough to do in pure Python and removes a dependency on node_modules.

The differences in the Python implementation versus the JavaScript version:

- no longer relies on global state to gather the msgid's
- doesn't encode duplicate filenames for the same msgid

Next up would be porting pkg/lib/html2po to Python, which would allow us to drop html-parser from node_modules.


allisonkarlitskaya

Looks good! I like the structure in general. See the usual bag of small nags.

allisonkarlitskaya · 2025-01-17T19:18:08Z

pkg/lib/manifest2po

+import json
+import pathlib
+import re
+from typing import Any, DefaultDict, Dict, Iterable, List, Set


This is part of the build system, so we don't have to worry about compatibility with old Python versions:

for Dict, List, and Set, use the built-in equivalents (dict, list, set).

for Iterable use collections.abc.Iterable

for DefaultDict use collections.defaultdict

...I'm sure you have a good reason for Any, so I'll keep reading ;)
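
A minimal sketch of what the imports and annotations could look like after these suggestions, assuming Python 3.9 or newer. The `collect` helper is hypothetical and only there to exercise the annotations; it is not part of the patch.

```python
import collections
from collections.abc import Iterable


def collect(pairs: Iterable[tuple[str, str]]) -> dict[str, set[str]]:
    """Group filenames by msgid (hypothetical helper for illustration)."""
    # Built-in generics (dict, set, tuple) replace typing.Dict/Set/Tuple,
    # and collections.defaultdict replaces typing.DefaultDict.
    strings: collections.defaultdict[str, set[str]] = collections.defaultdict(set)
    for msgid, filename in pairs:
        strings[msgid].add(filename)
    return dict(strings)
```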

allisonkarlitskaya · 2025-01-17T19:18:39Z

pkg/lib/manifest2po

+import re
+from typing import Any, DefaultDict, Dict, Iterable, List, Set
+
+PO_HEADER = """msgid ""


Make this an r""" string to avoid needing to double-escape the \\n.
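
To illustrate the difference, a small sketch. The header here is a shortened stand-in, not the actual PO_HEADER from the patch.

```python
# Regular triple-quoted string: the backslash has to be doubled so that a
# literal backslash followed by "n" ends up in the .po output.
ESCAPED_HEADER = """msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"
"""

# Raw triple-quoted string: backslashes are kept exactly as written.
RAW_HEADER = r"""msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"""

# Both spellings produce the same text.
assert ESCAPED_HEADER == RAW_HEADER
```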

allisonkarlitskaya · 2025-01-17T19:19:42Z

pkg/lib/manifest2po

+"""
+
+
+def get_docs_strings(docs: List[Dict[str, str]]) -> Iterable[str]:


This looks more like an Iterable[Mapping[str, str]] but it's not so important...
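
A sketch of the looser signature. The function body is not shown in this hunk, so the body below and the 'label' key are assumptions made purely for illustration.

```python
from collections.abc import Iterable, Mapping


def get_docs_strings(docs: Iterable[Mapping[str, str]]) -> Iterable[str]:
    # Assumption: each docs entry carries its translatable text under 'label'.
    for doc in docs:
        if 'label' in doc:
            yield doc['label']
```

With this signature, callers can pass any iterable of read-only mappings (a tuple of dicts, a generator), not only a list of dicts.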

allisonkarlitskaya · 2025-01-17T19:20:18Z

pkg/lib/manifest2po

+
+
+def get_menu_strings(menu: Dict[str, Any]) -> Iterable[str]:
+    for _, entry in menu.items():


You can use .values() to just get the entries.
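
A quick comparison of the two spellings; the `menu` contents are made up for the example.

```python
# Hypothetical menu section of a manifest.
menu = {'index': {'label': 'Overview'}, 'logs': {'label': 'Logs'}}

# Unpacking and discarding the key...
with_items = [entry['label'] for _, entry in menu.items()]

# ...is equivalent to iterating over the values directly.
with_values = [entry['label'] for entry in menu.values()]
```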

allisonkarlitskaya · 2025-01-17T19:21:46Z

pkg/lib/manifest2po

+            yield match
+
+
+def get_menu_strings(menu: Dict[str, Any]) -> Iterable[str]:


Not crazy about the Any here and the lack of checking of the actual values of the things you're accessing, but I guess there's room for pragmatism here...

allisonkarlitskaya · 2025-01-17T19:22:59Z

pkg/lib/manifest2po

+
+
+def get_manifest_strings(manifest: Dict[str, Any]) -> Iterable[str]:
+    if 'menu' in manifest:


Walrus is your friend.

if menu := manifest.get('menu'):
    yield from get_menu_strings(menu)
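
Spelled out as a self-contained sketch. The `get_menu_strings` stand-in and its 'label' key are assumptions, since the real body isn't shown in this hunk.

```python
from collections.abc import Iterable
from typing import Any


def get_menu_strings(menu: dict[str, Any]) -> Iterable[str]:
    # Stand-in for the real helper: assumes each entry has a 'label'.
    for entry in menu.values():
        yield entry['label']


def get_manifest_strings(manifest: dict[str, Any]) -> Iterable[str]:
    # One lookup instead of "'menu' in manifest" plus "manifest['menu']".
    # Note: .get() also skips a 'menu' that is present but empty or None.
    if menu := manifest.get('menu'):
        yield from get_menu_strings(menu)
```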

allisonkarlitskaya · 2025-01-17T19:26:19Z

pkg/lib/manifest2po

+            fp.write(f"""
+#: {manifest_filenames}
+msgid "{msgid}"
+msgstr ""
+""")


I feel like it might be easier to read if you just print the lines one at a time. Definitely a matter of taste, though.
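
Roughly what the line-at-a-time variant could look like. The `write_entry` wrapper is hypothetical and only exists to make the sketch runnable; it should produce the same text as the f-string above.

```python
import io
from typing import TextIO


def write_entry(fp: TextIO, manifest_filenames: str, msgid: str) -> None:
    # Each print() emits exactly one line of the .po entry.
    print(file=fp)  # blank separator line before the entry
    print(f'#: {manifest_filenames}', file=fp)
    print(f'msgid "{msgid}"', file=fp)
    print('msgstr ""', file=fp)


buf = io.StringIO()
write_entry(buf, 'pkg/example/manifest.json', 'Overview')
```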

allisonkarlitskaya · 2025-01-17T19:26:54Z

pkg/lib/manifest2po

+    parser.add_argument('files', nargs='+', help='One or more input files', type=pathlib.Path, metavar='FILE')
+
+    args = parser.parse_args()
+    strings: DefaultDict[str, Set[str]] = collections.defaultdict(set)


You can use this syntax instead:

strings = collections.defaultdict[str, set[str]](set)
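
A short demonstration that the subscripted form works at runtime (Python 3.9 or newer) and yields an ordinary defaultdict. The msgid and file paths are invented for the example.

```python
import collections

# Calling the parameterized alias creates a plain defaultdict(set);
# the type parameters only matter to the type checker.
strings = collections.defaultdict[str, set[str]](set)

strings['Storage'].add('pkg/a/manifest.json')
strings['Storage'].add('pkg/b/manifest.json')
strings['Storage'].add('pkg/a/manifest.json')  # duplicate filename is ignored by the set
```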

allisonkarlitskaya · 2025-01-17T19:29:04Z

pkg/lib/manifest2po

+    args = parser.parse_args()
+    strings: DefaultDict[str, Set[str]] = collections.defaultdict(set)
+
+    file: pathlib.Path


Does this cause file inside of the loop to be pathlib.Path instead of Any?
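
For reference, PEP 526 does not allow annotating a for loop target inline and suggests declaring the variable beforehand; type checkers such as mypy should then use the declared type for the loop variable. A small sketch, where the `names` function and its `files: Any` parameter are hypothetical stand-ins for iterating over `args.files`:

```python
import pathlib
from typing import Any


def names(files: Any) -> list[str]:
    # 'files' is Any (as an argparse Namespace attribute would be), so
    # without the declaration below the loop variable would also be Any.
    file: pathlib.Path
    result = []
    for file in files:
        result.append(file.name)  # checked as pathlib.Path.name
    return result
```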

allisonkarlitskaya · 2025-01-17T19:32:51Z

pkg/lib/manifest2po

+            # There are variables which when not substituted can cause JSON.parse to fail
+            # Dummy replace them. None variable is going to be translated anyway
+            safe_data = re.sub(r"@.+?@", "1", data)
+            manifest = json.loads(safe_data)
+            for msgid in get_manifest_strings(manifest):
+                strings[msgid].add(str(file))


The last traces of that got removed in 473db0e.
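
For context, a sketch of what the substitution in the hunk above does to an unsubstituted placeholder. The input string and the `@VERSION@` placeholder name are made up for the example.

```python
import json
import re

# Hypothetical manifest text with an unsubstituted @...@ placeholder,
# which is not valid JSON as-is.
data = '{"version": @VERSION@, "name": "demo"}'

# Replace every @...@ placeholder with a dummy value so the text parses.
safe_data = re.sub(r"@.+?@", "1", data)
manifest = json.loads(safe_data)
```

If no manifest contains such placeholders any more, the substitution is a no-op and can be dropped.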

jelly added the no-test label ("For doc/workflow changes, or experiments which don't need a full CI run") Jan 17, 2025

jelly requested review from martinpitt and allisonkarlitskaya January 17, 2025 16:49

allisonkarlitskaya requested changes Jan 17, 2025
