allow multiple parallel linuxcnc instances #2722
Set environment variable LINUXCNC_INSTANCE to some numeric value. This value times 16 is added to shared memory keys and to tcp port numbers. If unset it is treated as 0.
/* get instance env variable */
const char* instance = getenv("LINUXCNC_INSTANCE");
if (instance) {
    long offset = strtol(instance, NULL, 10) * 16;
https://cplusplus.com/reference/cstdlib/strtol/ indeed returns 0 if the value cannot be converted, and the patch treats an unset variable as 0 as well. How would you like to constrain LINUXCNC_INSTANCE to a value of 1 or higher? That way we could detect an error in the specification of the instance number.
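A minimal sketch of the stricter parsing suggested here, assuming that an unset variable still means instance 0 and that any explicitly set value must be a whole number of 1 or higher. The function name `parse_instance_offset` and the upper bound `kMaxInstance` are hypothetical and not part of the patch; only the factor 16 comes from the PR description.

```cpp
#include <cerrno>
#include <cstdlib>

// Factor from the PR description: instance number times 16.
constexpr long kInstanceStride = 16;
// Hypothetical upper bound, chosen only for this sketch.
constexpr long kMaxInstance = 255;

// Returns true and stores the offset on success.
// text == nullptr (variable unset) yields offset 0.
// Any set value must be a plain decimal number in [1, kMaxInstance].
bool parse_instance_offset(const char* text, long& offset) {
    if (text == nullptr) {
        offset = 0;
        return true;
    }
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    // No digits consumed, or trailing garbage after the number.
    if (end == text || *end != '\0') {
        return false;
    }
    // Overflow reported by strtol, or value outside the accepted range.
    if (errno == ERANGE || value < 1 || value > kMaxInstance) {
        return false;
    }
    offset = value * kInstanceStride;
    return true;
}
```

Passing the end pointer instead of `NULL` is what makes it possible to tell a genuine `0` apart from text that could not be converted at all.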
    s += getenv("LINUXCNC_INSTANCE");
}
if (s.size() + 1 > sizeof(sockaddr_un::sun_path)) {
    rtapi_print_msg(RTAPI_MSG_ERR,
I feel like the instance number should appear in all the error messages, too.
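One way this could look, as a sketch only: the helper below appends the instance string to a socket path, repeats the length check from the hunk above, and puts the instance into the error text. The function name `build_socket_path` and the message wording are hypothetical; the real code reports through `rtapi_print_msg`, which is left out here so the example stays self-contained.

```cpp
#include <string>
#include <sys/un.h>

// Appends the instance string (if any) to base and stores it in path_out.
// Returns an empty string on success, otherwise an error message that
// names the instance, so logs from parallel runs can be told apart.
std::string build_socket_path(const std::string& base,
                              const char* instance,
                              std::string& path_out) {
    path_out = base;
    if (instance != nullptr) {
        path_out += instance;
    }
    // Same check as in the patch: the path plus its terminating NUL
    // must fit into sockaddr_un::sun_path.
    if (path_out.size() + 1 > sizeof(sockaddr_un::sun_path)) {
        const std::string shown = instance ? instance : "0";
        return "instance " + shown + ": socket path '" + path_out +
               "' is too long";
    }
    return std::string();
}
```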
Just some questions coming to mind:
Ooops, was not aware of #2716 (comment).
They are supposed to be completely isolated. When they are, you could then, for example, parallelize tests; reducing runtime from 17 minutes to approx. 2 minutes on my machine. This I would like very much.
No, you cannot share hardware. But, you can control multiple machines if you partition hardware between instances. But with cheap small boards available (like rpi5), it may be too late for this option. There are also several instances of shared memory IDs missed in the patch set. I think the change should modify the rtapi_shmem_new() function instead of local changes (and libnml should be updated to use it).
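The idea of one central place for the offset could be sketched roughly as follows. The names `instance_offset_from_env` and `instance_key` are hypothetical helpers for illustration, not existing rtapi functions, and the sketch does no input validation; in the real tree the adjustment would presumably live inside `rtapi_shmem_new()` itself.

```cpp
#include <cstdlib>

// Reads LINUXCNC_INSTANCE and returns instance * 16, or 0 if unset
// (behaviour as described in the PR text; no validation in this sketch).
long instance_offset_from_env() {
    const char* text = std::getenv("LINUXCNC_INSTANCE");
    return text ? std::strtol(text, nullptr, 10) * 16 : 0;
}

// Single point where a base shared memory key is turned into the
// per-instance key, so no caller can forget to apply the offset.
int instance_key(int base_key) {
    return base_key + static_cast<int>(instance_offset_from_env());
}
```

With every key routed through one function, a missed shared memory ID can no longer collide silently between instances.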
I would like that too. But people will start multiple instances on a single machine with some real mill attached to it. When these all read the very same config then I just dare to assume that may trigger some sort of mischief.
I have not yet given up on low-latency communication between those instances.
I kind of sensed that not too much effort would be made to extend its functionality. IIRC, ZeroMQ was at some point considered as a possible substitute. The prospect of adding communication between instances, likely also across a network, may in my view warrant a transition to something novel. I had just asked the friendly AI folks about possibly preferable alternatives and they pointed me to NNG (https://nng.nanomsg.org/) or DDS (https://www.dds-foundation.org/who-is-using-dds-2/), the latter being already used by ROS 2. My vision is to share a series of cameras that monitor a milling process and help coordinate a robot or detect a broken end mill.
If there is interest in this I can certainly develop it further and polish it to a usable state. Maybe even get rid of legacy SYS V shared memory segments on the way.
If you want to take a look at https://github.com/rmu75/linuxcnc/tree/rs/zmq-experiments, it is certainly possible to replace all that NML stuff with zeromq. I used flatbuffers as serialisation format, others are certainly possible too. It's been more than a year, so caveat emptor. IIRC DDS is a real behemoth, using CORBA IDL, openDDS even includes TAO and ACE -- I wouldn't want to touch that, it has a very strong "committee", "military procurement" and "enterprise" smell. Communication between the "relaxed" and the "harder" realtime part of linuxcnc doesn't involve NML (or zeromq in my experiment) and probably never will. That's just going through shared memory and an ad-hoc message queue. Synchronisation between multiple instances will probably involve some HAL components, and those are free to communicate with each other in any way they like/need, e.g. shared memory, zeromq or something completely different. The question is why you need two instances in parallel and what kind of operation needs to be synchronised. Doing synchronised thread cutting with two instances will probably induce some headaches.
ROS2 has a bridge to ZeroMQ, so this would be fine. For the moment, synchronisation to me only means an exchange of material between some cobots and the mill. Harmless. My motivation for ZeroMQ would be my current understanding that I do not ultimately need to distinguish local and remote instances - they would just have different addresses, right? I still fail to see use cases for hobbyists, I must admit. But maybe those surface as soon as the technology is out?
Please, we must keep CORBA at the other side of the universe. No need to invite any committee to determine the next communication strategy that would need to be filed and verified on Alpha Centauri. I'd rather invite the Vogons for some poetry readings. The prime motivation to allow multiple instances, for me, is significantly speeding up testing. Getting rid of SysV IPC and converting to POSIX IPC may be beneficial too, but that is less important. Replacing NML, well, that may be an option (done separately from parallelizing). There are probably several bugs in LCNC's NML use (following indications from cppcheck messages) that need to be fixed anyway. I'm not sure it matters much using another library, though. It would need to be argued for why the change is necessary and beneficial and cannot be done with the current implementation.
Set environment variable LINUXCNC_INSTANCE to some numeric value. This value times 16 is added to shared memory keys and to tcp port numbers. If unset it is treated as 0. issue #2716
This is a draft and probably has unresolved issues.