Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid includes within extern C namespace #960

Open
carlopi opened this issue Jul 31, 2024 · 9 comments
Open

Avoid includes within extern C namespace #960

carlopi opened this issue Jul 31, 2024 · 9 comments
Assignees

Comments

@carlopi
Copy link

carlopi commented Jul 31, 2024

Line 244 of xxhash.h has a:

#if defined (__cplusplus)
extern "C" {
#endif

that is followed up by includes of system-provided headers, that are not guaranteed to work within an extern C linkage rules.

In particular, while compiling apache-arrow with Emscripten, I get:

/Users/carlo/emsdk/upstream/emscripten/cache/sysroot/include/emscripten/em_asm.h:151:1: error: templates must have C++ linkage
  151 | template<typename, typename = void> struct __em_asm_sig {};
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/carlo/duckdb-wasm/submodules/arrow/cpp/src/arrow/vendored/xxhash/xxhash.h:173:1: note: extern "C" language linkage specification begins here
  173 | extern "C" {
      | ^
In file included from /Users/carlo/duckdb-wasm/submodules/arrow/cpp/src/arrow/compute/kernels/vector_hash.cc:27:
In file included from /Users/carlo/duckdb-wasm/submodules/arrow/cpp/src/arrow/array/dict_internal.h:37:
In file included from /Users/carlo/duckdb-wasm/submodules/arrow/cpp/src/arrow/util/hashing.h:49:
In file included from /Users/carlo/duckdb-wasm/submodules/arrow/cpp/src/arrow/vendored/xxhash.h:18:
In file included from /Users/carlo/duckdb-wasm/submodules/arrow/cpp/src/arrow/vendored/xxhash/xxhash.h:3434:
In file included from /Users/carlo/emsdk/upstream/emscripten/cache/sysroot/include/compat/arm_neon.h:293:
In file included from /Users/carlo/emsdk/upstream/emscripten/cache/sysroot/include/emscripten.h:1:

that make compilation fail.

Current workaround, very much ad-hoc, is explicitly including <emscripten.h> (conditionally on EMSCRIPTEN being defined) before including xxhash.h. This works, but it's not great and solves only this specific case. PR is at duckdb/duckdb-wasm#1803.

Would it be possible to move the includes outside of the extern C namespace, given I believe not to be compliant with C++ compilers?

Thanks a lot.

@Cyan4973
Copy link
Owner

possibly ?
The "generic" headers (like <stddef.h>) can most likely be pushed outside,
but the "specific" ones (like <emscriptem.h>) are only included if some specific conditions are met, and it would be necessary to replicate these conditions outside of extern "C" block, which now becomes must less desirable for code maintenance.

To be investigated.

@Cyan4973
Copy link
Owner

What would make sense is to fold this objective into the planned effort to separate the source code into multiple small units.
There it will be easier to pay attention to #include outside of the extern "C" block

@Cyan4973 Cyan4973 self-assigned this Jul 31, 2024
@carlopi
Copy link
Author

carlopi commented Jul 31, 2024

Thanks for the reply!
Possibly a way out, altought very verbose, is closing and then reopening the extern C brackets, something like:

#if defined(__cplusplus)
extern `C`
#endif

// some code

#if something
#include <x>
#endif

// some more code

#if defined(__cplusplus)
}
#endif

to

#if defined(__cplusplus)
extern `C`
#endif

// some code

#if defined(__cplusplus)
}
#endif

#if something
#include <x>
#endif

#if define(__cplusplus)
extern `C`
#endif

// some more code

#if defined(__cplusplus)
}
#endif

unsure if doing this via macro or something would make sense.

@Cyan4973
Copy link
Owner

There are limits (one cannot generate macros with macros), but yeah, something like that is indeed possible.
It just will fit more naturally in a multi-units design, from which a single amalgamated xxhash.h could still be generated.

@t-mat
Copy link
Contributor

t-mat commented Aug 1, 2024

As another possible solution, we'll be able to use the following inline extern C style

extern "C" void my_public_func(void);

We can also define it as a macro

#if defined(__cplusplus)
#define MY_PUBLIC_API extern "C"
#endif

MY_PUBLIC_API void my_public_func(void);

Msvc's dllexport is able to be supported in the same style.

@Cyan4973
Copy link
Owner

I was wondering if there was some kind of relatively easy scenario that could be played locally in order to produce this issue ?

@carlopi
Copy link
Author

carlopi commented Dec 28, 2024

It's not super trivial, I have not been able to reproduce locally now, possibly this has been fixed in Emscripten headers (I tried 3.1.71 and 3.1.61).

I would say it's not safe to include headers that might be implemented arbitrarily within extern C namespaces, since those headers might have branches on cplusplus being there for feature detection or so.

But given I can't provide a reproduction AND this might be very minor, your call on whether this makes sense looking into or not.

@Cyan4973
Copy link
Owner

I understand that it's generally preferable to avoid using #include "..." statements within an extern "C" block, and I would prefer to eliminate such cases.

The challenge, however, lies in the current structure of the code. Attempting this change would be a significant undertaking, deeply affecting the source code and potentially introducing new risks in the process.

Additionally, without any testing framework in place, there's no way to confirm that the change would actually address any tangible issue. This introduces considerable uncertainty, as the benefits are unclear and it's difficult to identify or reproduce scenarios where this modification would make a meaningful difference.

Of course, if the change were straightforward, it would be an entirely different matter, and I would simply implement it. I believe that, in the future, as the source code is refactored into smaller, more manageable units, this goal will naturally become more feasible and much easier to achieve.

@carlopi
Copy link
Author

carlopi commented Dec 29, 2024

Agree, then probably closing this, if this pops back for some other weird libc implementation then maybe there will be a better shot at reproducing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants