i18n extract from .twig-files #91

Open
fabian-mcfly opened this issue Jun 29, 2023 · 10 comments

@fabian-mcfly
Contributor

I'm not quite sure if this belongs here or if it should be part of the cakephp repo.

Currently cakephp's i18n extract-command only looks for and reads files with the php-extension (https://github.com/cakephp/cakephp/blob/8b5d4b65ca63478bb525042eb95071d6749b5c30/src/Command/I18nExtractCommand.php#L839)

This leads to missing translatable strings from .twig-files when using this twig-view-package.

Simply adding the twig-extension in the lookup is not sufficient because the _extractTokens()-method of that command uses php's token_get_all()-function (see https://github.com/cakephp/cakephp/blob/8b5d4b65ca63478bb525042eb95071d6749b5c30/src/Command/I18nExtractCommand.php#L431)
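To illustrate why the tokenizer is the blocker: PHP's token_get_all() only recognizes PHP syntax inside PHP tags, so a raw Twig snippet comes back as a single inline-HTML token. The snippet below is a minimal sketch with a made-up template string, not code taken from the command itself.

```php
<?php
declare(strict_types=1);

// A Twig snippet as it might appear in a .twig template (made up for illustration).
$twigSource = "{{ __('Hello') }}";

// There is no opening PHP tag, so the tokenizer treats the whole input as inline HTML.
$tokens = token_get_all($twigSource);

foreach ($tokens as $token) {
    if (is_array($token)) {
        // Prints the token type followed by the raw text it covers.
        echo token_name($token[0]), ': ', $token[1], PHP_EOL;
    }
}
```

Since no T_STRING token for __ is ever produced, an extractor that scans for function-name tokens has nothing to match.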

The two main questions I have:

  1. How to parse the twig templates files and look for calls to the i18n functions?
  2. How to add that functionality to the command line?
    Should this package provide a standalone extract command? Would it be possible to extend cakephp's I18nExtractCommand?
@fabian-mcfly
Contributor Author

fabian-mcfly commented Jun 29, 2023

And since I don't just want to point my finger, I did some testing.

1. Using Twig directly

I tinkered a bit with Twig since there are already multiple token parsers that could give a good result. But it turns out that this might be a little bit too complicated.

$filePath = '/path/to/a/twig/file';
/** @var \Cake\TwigView\View\TwigView $view */
$view = $this->viewBuilder()->build();
$twig = $view->getTwig();

$templateWrapper = $twig->load($filePath);
$moduleNode = $twig->parse($twig->tokenize($templateWrapper->getSourceContext()));

function listFunctionCalls ($node, array &$list, $twig) {
  if (!$node) {
    return;
  }

  if ($node instanceof \Twig\Node\Expression\FunctionExpression) {
    $name = $node->getAttribute('name');
    if (in_array($name, ['__', '__d'], true)) {
      $parameters = [];
      $list[] = [
        'name' => $name,
        'parameters' => listFunctionParameters($node, $parameters),
      ];
    }
  }

  foreach ($node as $child) {
    listFunctionCalls($child, $list, $twig);
  }
}

function listFunctionParameters($node, &$arguments) {
  /** @var \Twig\Node\Node $child */
  foreach ($node as $child) {
    if ($child instanceof \Twig\Node\Expression\ConstantExpression) {
      $arguments[] = $child->getAttribute('value');
    }

    if ($child::class === 'Twig\Node\Node') {
      listFunctionParameters($child, $arguments);
    }
  }

  return $arguments;
}

$functions = [];
listFunctionCalls($moduleNode, $functions, $twig);

dd($functions);

(based on https://stackoverflow.com/questions/32614432/how-can-i-analyze-twig-templates-without-rendering-them)

A file with contents

{{ __('Übersetzung von "Erstellen"') }}
{{ __d('menus', 'Übersetzung von "Erstellen" in Domain "menus"') }}

will result in

[
  0 => [
    'name' => '__',
    'parameters' => [
      0 => 'Übersetzung von "Erstellen"',
    ],
  ],
  1 => [
    'name' => '__d',
    'parameters' => [
      0 => 'menus',
      1 => 'Übersetzung von "Erstellen" in Domain "menus"',
    ],
  ],
]

It will ignore parameters that use concatenation or other expressions, like {{ __d('menus', 'Übersetzung von "Erstellen" ' ~ variable ~ ' in Domain "menus"') }}. This could lead to issues and/or unexpected results.

@fabian-mcfly
Contributor Author

2. Overwriting the functions & loading the view

I also tried this approach where I use a special view class

class I18nView extends TwigView {
  protected array $functions = [
    '__',
    '__d',
  ];

  public function initialize (): void {
    parent::initialize();

    $twig = $this->getTwig();

    foreach ($this->functions as $functionName) {
      $twigFunction = new \Twig\TwigFunction($functionName, [$this, $functionName]);
      $twig->addFunction($twigFunction);
    }
  }

  public function __ (string $singular, ...$args) {
    //Do something with $singular
  }

  public function __d (string $domain, string $msg, ...$args) {
    //Do something with $domain and $msg
  }
}

Those methods could be used to remember all passed arguments so that another method, used in the extract command, could access them.

I first thought that would be a smart approach, but it has too many potential issues.
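The recording idea described above could look roughly like the following. This is a minimal sketch, not part of cakephp/twig-view: TranslationCallRecorder and its method names are hypothetical, and it is shown without Twig so it runs on its own. In a view class, the two methods would be registered as callables, e.g. new \Twig\TwigFunction('__', [$recorder, 'singular']), as in the snippet above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical collector that stands in for the translation functions
 * and remembers every message it is called with.
 */
final class TranslationCallRecorder
{
    /** @var list<array{domain: string, message: string}> */
    private array $calls = [];

    // Stand-in for __(): records the message under the default domain.
    public function singular(string $message, mixed ...$args): string
    {
        $this->calls[] = ['domain' => 'default', 'message' => $message];

        return $message;
    }

    // Stand-in for __d(): records the message under the given domain.
    public function domain(string $domain, string $message, mixed ...$args): string
    {
        $this->calls[] = ['domain' => $domain, 'message' => $message];

        return $message;
    }

    /** @return list<array{domain: string, message: string}> */
    public function getCalls(): array
    {
        return $this->calls;
    }
}

$recorder = new TranslationCallRecorder();
$recorder->singular('Create');
$recorder->domain('menus', 'Create in domain "menus"');
```

One likely drawback, possibly among the issues alluded to above: only calls that are actually executed while rendering get recorded, so strings inside branches that are not taken would be missed.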

@fabian-mcfly
Contributor Author

3. Don't care

You all got your pitchforks and torches?
Get the file contents and just replace {{ and {% with <?php and do the same for the closing tags. After that, just run token_get_all like before. It works! :D
(I'm sorry)

For those who dare: https://onlinephp.io/c/fca0b
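For reference, the delimiter swap described above can be sketched like this. It is an illustration written for this thread, not the code behind the link: the function name extractTwigTranslationCalls is hypothetical, quotes are stripped naively, and escape sequences are not decoded.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: rewrites Twig delimiters to PHP tags, tokenizes the
 * result with token_get_all() and collects string-literal arguments of the
 * given translation functions.
 *
 * @param list<string> $functionNames
 * @return list<array{name: string, parameters: list<string>}>
 */
function extractTwigTranslationCalls(string $twigSource, array $functionNames = ['__', '__d']): array
{
    $phpLike = str_replace(['{{', '{%', '}}', '%}'], ['<?php ', '<?php ', ' ?>', ' ?>'], $twigSource);
    $tokens = token_get_all($phpLike);
    $count = count($tokens);
    $calls = [];

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if (!is_array($token) || $token[0] !== T_STRING || !in_array($token[1], $functionNames, true)) {
            continue;
        }

        // Skip whitespace between the function name and the opening parenthesis.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        if ($j >= $count || $tokens[$j] !== '(') {
            continue;
        }

        // Collect top-level string literals until the matching closing parenthesis.
        $parameters = [];
        $depth = 0;
        for (; $j < $count; $j++) {
            $current = $tokens[$j];
            if ($current === '(') {
                $depth++;
            } elseif ($current === ')') {
                if (--$depth === 0) {
                    break;
                }
            } elseif (is_array($current) && $current[0] === T_CONSTANT_ENCAPSED_STRING && $depth === 1) {
                // Naive unquoting: drops the surrounding quotes only.
                $parameters[] = substr($current[1], 1, -1);
            }
        }

        $calls[] = ['name' => $token[1], 'parameters' => $parameters];
    }

    return $calls;
}

$calls = extractTwigTranslationCalls("{{ __('Hello') }}\n{{ __d('menus', 'Create') }}");
```

Note that this sketch collects every top-level string literal, so a concatenated argument would show up as several fragments; a real extractor would have to detect and skip such calls.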

@markstory
Member

Thanks for putting together a proposal and doing the homework. It is greatly appreciated. 👏 While the tokenization and parsing solution is complex, it is the most durable long-term solution, as Twig's tokenization API is quite stable. If you put together a pull request with how far you've gotten, we can add tests and get a collection of usage scenarios with tested support. If we miss a scenario in the future, it can be fixed without regressions.

@markstory
Member

Just noticed I didn't answer all your questions.

I'm not quite sure if this belongs here or if it should be part of the cakephp repo.

I think this repository is a better fit as it has the twig dependency. I think having a separate command name is reasonable given this is a plugin.

@ADmad
Member

ADmad commented Jun 30, 2023

It will ignore parameters that use concatenation or other expression..

This limitation exists for the parsing of PHP templates too, so I don't think it's a problem. Accounting for expressions would be too complicated.

@fabian-mcfly
Contributor Author

fabian-mcfly commented Jun 30, 2023

Oh thank you! 😄
I will try to provide a PR!

I came across an issue when using a standalone extract command: it's very likely that a domain.pot-file already exists (created by the main i18n extract-command).
Simply overwriting it will make people lose translatable strings from their controllers. Merging/appending isn't really an option, is it?

In my head, the best case scenario would be to have the base command offer some way to add additional parser classes, which would return items to be used in _addTranslation.
This could be used by other plugins to provide extractors for other filetypes as well.
Does something similar exist anywhere in cakephp already?
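To make the idea concrete, a pluggable extractor could be sketched as below. Everything here is hypothetical: TranslationExtractorInterface, RegexTwigExtractor and extractFrom() do not exist in cakephp, and the regex is deliberately naive (single-quoted literals without escapes, __() only). It only shows the shape of the extension point.

```php
<?php
declare(strict_types=1);

// Hypothetical extension point: one extractor per file type.
interface TranslationExtractorInterface
{
    public function supports(string $path): bool;

    /** @return list<array{domain: string, message: string}> */
    public function extract(string $contents): array;
}

// Hypothetical Twig extractor; a real one would use Twig's parser instead of a regex.
final class RegexTwigExtractor implements TranslationExtractorInterface
{
    public function supports(string $path): bool
    {
        return str_ends_with($path, '.twig');
    }

    public function extract(string $contents): array
    {
        // Matches __('literal') only; anything more complex is ignored.
        preg_match_all("/\b__\(\s*'([^']*)'\s*\)/u", $contents, $matches);

        return array_map(
            static fn (string $message): array => ['domain' => 'default', 'message' => $message],
            $matches[1]
        );
    }
}

/**
 * Hands the file to the first extractor that claims it.
 *
 * @param list<TranslationExtractorInterface> $extractors
 * @return list<array{domain: string, message: string}>
 */
function extractFrom(string $path, string $contents, array $extractors): array
{
    foreach ($extractors as $extractor) {
        if ($extractor->supports($path)) {
            return $extractor->extract($contents);
        }
    }

    return [];
}

$found = extractFrom('templates/Pages/home.twig', "{{ __('Hello') }}", [new RegexTwigExtractor()]);
```

The returned items would then be fed into whatever the base command uses to register a translation, such as the _addTranslation method mentioned above.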

@markstory
Member

Simply overwriting it will make people lose translatable strings from their controllers. Merging/appending isn't really an option, is it?

Merging pot files should be possible as message strings act as identifiers.
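A minimal sketch of such a merge, assuming entries have already been parsed into a map from msgid to file references (the function name mergePotEntries and this data shape are assumptions for illustration; in gettext an entry's identity also includes msgctxt when one is present, which is left out here):

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical merge of two sets of POT entries keyed by msgid.
 *
 * @param array<string, list<string>> $existing  msgid => file references already in the .pot file
 * @param array<string, list<string>> $extracted msgid => file references found by the new extractor
 * @return array<string, list<string>>
 */
function mergePotEntries(array $existing, array $extracted): array
{
    $merged = $existing;
    foreach ($extracted as $msgid => $references) {
        // Same msgid means same entry: combine the references and drop duplicates.
        $merged[$msgid] = array_values(array_unique(array_merge($merged[$msgid] ?? [], $references)));
    }

    return $merged;
}

$fromPhp = ['Create' => ['src/Controller/PagesController.php:12']];
$fromTwig = [
    'Create' => ['templates/Pages/home.twig:3'],
    'Delete' => ['templates/Pages/home.twig:7'],
];
$merged = mergePotEntries($fromPhp, $fromTwig);
```

Entries that exist only in the old file are kept as they are, which is why stale strings can linger after a merge.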

This could be used by other plugins to provide extractors for other filetypes as well.
Does something similar exist anywhere in cakephp already?

Not yet.

@fabian-mcfly
Contributor Author

Merging pot files should be possible as message strings act as identifiers.

It could lead to old, unused strings still being part of the *.pot-files. Should I care about that?

@markstory
Member

It could lead to old, unused strings still being part of the *.pot-files. Should I care about that?

I don't think so. Users can remove old pot strings manually, or regenerate the pot files from scratch if they want to remove old content.

3 participants