Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parser for pkg-config files #94

Merged
merged 4 commits into from
Feb 11, 2025
Merged

Add parser for pkg-config files #94

merged 4 commits into from
Feb 11, 2025

Conversation

lunacd
Copy link
Contributor

@lunacd lunacd commented Jul 22, 2024

Changes

This commit implements a simple parser for pkg-config files. This
implementation supports variable assignments and interpolation. It also
supports comments. The grammar documented here [1] is used as a
reference.

This implementation has the following intentional incorrectness and
omissions:

  1. Version requirements in dependency lists are intentionally omitted
    and left as future work.
  2. Also as future work, the ability to parse pc files should probably be
    optional in the build system.

[1] https://manpages.ubuntu.com/manpages/jammy/man5/pc.5.html

Current State

Implementation and tests finished. Let's review and get a minimal implementation merged. We could follow up with more comprehensive support and test cases.

@lunacd lunacd force-pushed the pc-parser branch 2 times, most recently from 3339fc9 to 93bb888 Compare July 22, 2024 02:49
@lunacd
Copy link
Contributor Author

lunacd commented Jul 22, 2024

@dcbaker Do you happen to know how to convince meson to generate header files into the needed directories? Or maybe there's a more appropriate way to make includes work without forcing a particular file location?

src/meson.build Outdated Show resolved Hide resolved
@dcbaker
Copy link
Collaborator

dcbaker commented Jul 22, 2024

When you get a little farther let me know, I converted all of the pkgconf tests to not use kyua and I can pull those over to cps-config as well.

src/CMakeLists.txt Outdated Show resolved Hide resolved
Comment on lines +17 to +19
${BISON_PcParser_OUTPUTS}
${FLEX_PcScanner_OUTPUTS}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh

It seems probable that this is how the CMake modules for Bison and Flex were designed, but for the record, it's gross to expand strings into add_library calls instead of having a target_bison_codegen function that just adds the source files to your target for you. The function should also be adding the needed include directory for you so you don't have to add "magic" include paths as you do below.

Let's circle back with Kitware about whether the flex and bison codegen need another look, especially now that CMake 3.31 will support code generation better than before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to close the feedback loop, this is still something that needs cleaned up if someone's looking for a good upstream CMake contribution to start with. But it wasn't scoped into any work yet. Using the API like this is the best thing we can do for now.

src/cps/pc_compat/meson.build Show resolved Hide resolved
src/cps/pc_compat/meson.build Outdated Show resolved Hide resolved
src/cps/pc_compat/pc_loader.hpp Outdated Show resolved Hide resolved
src/cps/pc_compat/pc_loader.hpp Outdated Show resolved Hide resolved
src/cps/utils.cpp Outdated Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
#include <gtest/gtest.h>

namespace cps::utils::test {
namespace {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some "negative tests", too. Files that aren't valid pc files are important to test to ensure we give back helpful error messages instead of crashing or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current goal, at least for this PR, is for this to consume correctly formatted PC files. If someone is troubleshooting their handwritten pc file, probably cps-config is not their best tool. Maybe we can follow up with a PR that turns on location tracking and give more readable error messages, but I think for this PR, let's first work with known-good pc files, which I believe is by far the majority use case.

@lunacd lunacd marked this pull request as ready for review July 28, 2024 02:10
@lunacd
Copy link
Contributor Author

lunacd commented Jul 28, 2024

Oops I messed up rebasing. Main and this branch happens to be heavily modifying exactly the same section of code

@lunacd lunacd marked this pull request as draft July 28, 2024 02:20
@lunacd lunacd force-pushed the pc-parser branch 3 times, most recently from 39e20f7 to 908145a Compare July 28, 2024 16:24
@lunacd lunacd marked this pull request as ready for review July 28, 2024 16:28
@lunacd lunacd force-pushed the pc-parser branch 3 times, most recently from 408f6e1 to 926b874 Compare August 16, 2024 01:30
@lunacd
Copy link
Contributor Author

lunacd commented Aug 16, 2024

This hasn't seen a whole lot of movement so I went ahead and implemented "full" support of the pc file syntax. This should be able to parse most of pc files, for the rather limited testing I did. It currently doesn't do anything with the parsed version requirements, because support for that hasn't been implemented in cps-config yet IIUC. Let's get back to that when we have support for CPS version field in requirements.

@lunacd lunacd force-pushed the pc-parser branch 2 times, most recently from 6ccc7bf to 27e9db1 Compare August 19, 2024 21:47
@lunacd lunacd requested review from bretbrownjr and dcbaker August 19, 2024 21:47
@lunacd lunacd force-pushed the pc-parser branch 6 times, most recently from 3a1505c to feba330 Compare September 28, 2024 00:53
@lunacd
Copy link
Contributor Author

lunacd commented Sep 28, 2024

Ah finally! I got it working for all platforms and both build systems again! :)

src/cps/loader.hpp Outdated Show resolved Hide resolved
src/cps/pc_compat/pc.l Outdated Show resolved Hide resolved
@@ -0,0 +1,173 @@
// SPDX-License-Identifier: MIT

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an existing BNF grammar for pkg config files we can refer to or quote?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of. But there probably is? I was using https://manpages.ubuntu.com/manpages/jammy/man5/pc.5.html as my reference

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see one either, and pkg-config uses a hand-rolled recursive descent parser.
https://gitlab.freedesktop.org/pkg-config/pkg-config/-/blob/master/parse.c?ref_type=heads#L908

One thing I note from that is CFlags can also be spelled Cflags. 🤷

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkgconf also uses a hand rolled recursive descent parser, but that man page is from pkgconf. My preference would be to emulate pkgconf rather than pkg-config if possible, since it actually has developers, and most distros are shipping pkgconf with a symlink to pkg-config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm fine with landing in the current form, and if we encounter compatibility issues then we can make tweaks as needed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a branch of pkgconf (that probably needs some additional work) that ports the kyua tests to a python runner. I can try to get those around for cps-config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. When you do, feel free to ping me to fix any issues (I bet there would be a lot)

src/cps/pc_compat/pc_loader.cpp Show resolved Hide resolved
@lunacd lunacd requested a review from steve-downey October 8, 2024 15:18
Copy link

@steve-downey steve-downey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator

@bretbrownjr bretbrownjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that any of my comments are worth blocking the PR on. If you'd rather turn them into issues for later, that's fine. Assuming that's all sorted, happy to see this merge. Or to merge it myself.

Comment on lines +17 to +19
${BISON_PcParser_OUTPUTS}
${FLEX_PcScanner_OUTPUTS}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to close the feedback loop, this is still something that needs cleaned up if someone's looking for a good upstream CMake contribution to start with. But it wasn't scoped into any work yet. Using the API like this is the best thing we can do for now.

src/cps/pc_compat/pc_base.hpp Outdated Show resolved Hide resolved
.cps_version = std::string{loader::CPS_VERSION},
.components = components,
// TODO: treat PREFIX in pc file specially and translate it to @prefix@
.cps_path = std::nullopt,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do know the pc file path. I feel like we should model that to assist in debugging.

Options:

  • Make an issue, land this PR, and come back to that
  • Use cps_path and point at a *.pc file and expect we'll figure it out
  • Add an extension [1] field for pc_path.

[1] CPS doesn't specify how to add a nonstandard extension, but we could just make something up and we'll clean it up later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Did that and added a TODO to refactor the interface of the load function. Currently filename is actually the parent directory path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No actually, I totally forgot why I left that TODO in there. Prefix calculating logic on the CPS side is currently expecting something like this:

"cps_path": "@prefix@/${libdir}/cps/",
. So yeah, setting it to null works best for this PR, I think. Let's add a follow up issue to figure out how to calculate prefix for pc files

} // namespace cps::pc_compat

// Marking maybe_unused because scanner does not user this parameter
#define YY_DECL yy::parser::symbol_type yylex([[maybe_unused]] cps::pc_compat::PcLoader & loader)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an odd idiom. What's going on?

If we're keeping this around, should we name it CPS_CONFIG_YY_DECL? If this specific #define is needed, an explainer comment would be helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the recommended, if not the only way, flex takes a custom loader/driver/whatever during scanning. See https://ftp.gnu.org/old-gnu/Manuals/flex-2.5.4/html_node/flex_10.html. Same thing with YY_INPUT, but I can't remember if I had to use it or not

@@ -21,6 +23,17 @@ namespace cps::utils {
return out;
}

std::string_view trim(std::string_view input) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's semi-bonkers this isn't a thing in fmt or the standard library.

I added a Beman Project issue for someone to think about this for standard C++:
bemanproject/beman#51

tests/cps-files/lib/pkgconfig/pc-comments.pc Show resolved Hide resolved
@lunacd
Copy link
Contributor Author

lunacd commented Oct 25, 2024

Ok I think I got through everything. Rebasing fixup commits

@lunacd lunacd force-pushed the pc-parser branch 2 times, most recently from bb34166 to e31c416 Compare October 25, 2024 15:27
@lunacd lunacd requested a review from bretbrownjr October 26, 2024 00:16
Copy link
Collaborator

@dcbaker dcbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking over this, hoping to get it merged soon. I had a few small things.

I commented on the issue about it, but I think 0.13 has given us a good path for compatibility with pkg-config prefixes, so I think with my 0.13 changes merged that issue should be pretty much solved.

tests/CMakeLists.txt Show resolved Hide resolved
src/cps/pc_compat/meson.build Show resolved Hide resolved
@@ -34,7 +34,7 @@ test(
],
protocol : 'tap',
env : {
'CPS_PREFIX_PATH' : meson.current_source_dir() / 'cps-files',
'CPS_PREFIX_PATH' : meson.current_build_dir() / 'cps-files',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change to build dir, and why in this patch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No clue. I'll try taking the change out and see if anything screams

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, meson tests failed without this change... So yes it's needed. I'll need a bit more time than skimming through the code tho. I hope to get to this tomorrow

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I don’t object to the change, but does it belong in the previous patch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? Seems like it's needed because the fully configured test files are available only in the build dir: https://github.com/cps-org/cps-config/blob/main/tests/cps-files/lib/cps/meson.build. I agree it probably don't belong in this PR, but I have trouble understanding why main is passing prefix test with meson?

@lunacd lunacd force-pushed the pc-parser branch 3 times, most recently from 3c8dbb8 to 4c2ab19 Compare February 8, 2025 23:59
@dcbaker
Copy link
Collaborator

dcbaker commented Feb 9, 2025

Also, if you're interested, the WIP branch I have for using an easier to configure test runner is here: https://gitea.treehouse.systems/dcbaker/pkgconf/commit/9e6df8e3f934f107680cec3dcdce073482f20705

There are still some tests that don't seem to be ported correctly, and some tests that aren't ported at all, but it's a start.

@lunacd
Copy link
Contributor Author

lunacd commented Feb 9, 2025

It seems that the clang build is having issues after rebasing on main

@dcbaker
Copy link
Collaborator

dcbaker commented Feb 9, 2025

I think that's all the Meson jobs because it doesn't look like the CPS_TEST_DIR environment variable is set for Meson.

@lunacd
Copy link
Contributor Author

lunacd commented Feb 9, 2025

Direct leak of 32 byte(s) in 1 object(s) allocated from:
...
#6 0x55c4b7a2ac49 in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector(std::initializer_list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&) /usr/include/c++/11/bits/stl_vector.h:629
#7 0x55c4b7a2ac49 in cps::pc_compat::PcLoader::load(std::istream&, std::filesystem::__cxx11::path const&) /workdir/builddir/../src/cps/pc_compat/pc_loader.cpp:134
#8 0x55c4b78e27ac in TestBody /workdir/builddir/../tests/pc_parser.cpp:40

Am I crazy to think that RAII should have deallocated that?

@lunacd
Copy link
Contributor Author

lunacd commented Feb 10, 2025

Ah okay this is a weird one. It seems that returning using GNU's "Statements and Declarations in Expressions" extension within a designated initializer does not call destructors for the objects constructed within the designated initializer call. All of that happens to be true in this initialization in that one test case:

https://github.com/lunacd/cps-config/blob/b63b58a210334d91cc58d48b0b31c42168a0611e/src/cps/pc_compat/pc_loader.cpp#L128-L139

This might be a miscompile? I don't know the defined semantics of either that GNU extension or desginated initializer well enough to say for sure. I extracted that CPS_TRY out of the initializer and the memory leak went away

@dcbaker
Copy link
Collaborator

dcbaker commented Feb 10, 2025

Tyler pointed out that is what would happen and fixed search.cpp to do exactly what you did. So I think that is how the behavior works.

@dcbaker
Copy link
Collaborator

dcbaker commented Feb 11, 2025

I think the implementation is good at this point, I wouldn't complain if the git history got cleaned up, but this has been waiting so long I'm not going to force it. The only real question I have left is whether merging #112 before this would make it easier, since the 0.13 prefix behavior more closely matches the behavior of pkg-config?

This commit implements a simple parser for pkg-config files. This
implementation supports variable assignments and interpolation. It also
supports comments. The grammar documented here [1] is used as a
reference.

This implementation has the following intentional incorrectness and
omissions:
1. All consecutive whitespaces in fragment lists and literal values are
replaced by a single string. I'm not sure whether this will affect
anything at all, so I made this simplification to allow for more
succinct grammar.
2. Version requirements in dependency lists are intentionally omitted
and left as future work.
3. This implementation only contains a class to load PC files, but that
is not used anywhere. To keep this PR minimal, the logic to fall back
to pc files when cps files are not found is also left as future work.
4. Also as future work, the ability to parse pc files should probably be
optional.

[1] https://manpages.ubuntu.com/manpages/jammy/man5/pc.5.html
This commit adds unit tests for the pkg-config parser. In the future,
this could benefit from some smoke tests using pc files shipped by major
distributions.
This hooks PC file parsing into existing logic by translating PC file
content into a "bad" CPS file. To a limited extent, this enables
cps-config to replicate pkg-config behavior when operating on pc files.

This implementation has the following known
issues/omissions/imperfections:
1. Cflags and Libs are taken as-is.
2. Library location is assumed to be a possibly non-existent
`@prefix@/lib/{name}.a`.
3. Requires are not parsed and not respected.
4. Libs.private and Cflags.private is not respected.
This commit enables pc file properties of these forms to be parsed:
libfoo >= 1.0.0, libbar < 4.0.1

Note that CPS does not fully replicate what pc files support because
only a single version field exist for requirements, which indicates the
minimum version, i.e. a >= in pc files. Support for this has not yet
been implemented in cps-config so this implementation only parses
those values successfully without using them in any way.

Also fixed compilation errors on Windows mingw

MSSTL does not provide an implicit conversion from `fs::path` to
`std::string`.
@lunacd
Copy link
Contributor Author

lunacd commented Feb 11, 2025

Okay I cleaned up the history by merging the "fix something something" ones into other commits where it made the most sense. As for merge order, I have no preference. This implementation intentionally left out the prefix handling part of it. Otherwise this would be a truly gigantic PR. I expect efforts to make this pass pkgconf's tests would involve bridging that gap.

@dcbaker
Copy link
Collaborator

dcbaker commented Feb 11, 2025

Alright, if we've punted on that then I'm happy to merge this next and I can rebase if necessary.

I'll also try to start pulling over the test runner stuff I wrote for the pkgconf tests here.

@dcbaker dcbaker merged commit de8f24c into cps-org:main Feb 11, 2025
8 checks passed
@lunacd lunacd deleted the pc-parser branch February 11, 2025 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants