Adding an edge leads to an unrelated node never entering runtime #52
Thanks for the bug report. I will try to reproduce and investigate this problem. One question: does this error also occur if N is not using SPORE?
I replaced N with the following node:
and the same problem occurs.
Hi Michael,
I have seen your bug report and will look into it ASAP. I suspect that this
might be a MUSIC bug.
The barrier should not be a problem (if everybody calls it before entering
runtime) even though you are not really supposed to do anything with
COMM_WORLD since it is owned by the MUSIC library.
One idea: Could this have something to do with the particular combination
of port types and topology? I wonder what would happen if the "red
edge" from R to N were spiking rather than continuous. Would that also
cause a hang?
Is it difficult for you to test that?
Best regards,
Mikael
On 9 Feb 2018 at 14:41, "Michael Hoff" <[email protected]> wrote:
I replaced N with the following node:
#!/usr/bin/env python3
import numpy as np
import music
from mpi4py import MPI

setup = music.Setup()

event_out = setup.publishEventOutput("motor")
event_in = setup.publishEventInput("visual")

def event_func(d, t, i):
    print("{} {} {}".format(d, t, i))

event_in.map(event_func, music.Index.GLOBAL, base=0, size=16384, maxBuffered=0)
event_out.map(music.Index.GLOBAL, base=0, size=8, maxBuffered=0)

cont_in = setup.publishContInput("reward")
cont_in_buffer = np.array([0], dtype=np.int)
cont_in.map(cont_in_buffer, base=0)

MPI.COMM_WORLD.Barrier()
print("entering runtime")

times = setup.runtime(0.02)
for time in times:
    print(time)
and the same problem occurs.
Note that I use the MPI barrier to synchronize with the ros_music_adapters
and that I'm using python3 (in case either of these aspects relates to the
issue).
Yes, indeed! With the edge being event-based, everything appears to work just fine.
OK, I'm afraid I have to debug this. I need to get something finished
by next week. Then I'll look into this.
Best regards,
Mikael
No hurry, but if it is possible for you to create an archive with a failing
example for me, that would be appreciated.
Mikael
I think I found the problem while constructing a minimal failing example: the source port of the red edge was not an output port but an input port. Sorry for the inconvenience. However, the resulting behavior is still very unintuitive. Is there a simple way to change MUSIC so that it fails fast, or at least reports an error message, in such cases of obvious misconfiguration?
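To illustrate the kind of fail-fast check asked about here, the following is a minimal sketch of a pre-launch validation of the edges in a topology. It is not part of MUSIC and does not use the MUSIC API: `Port`, `TopologyError`, and `check_connections` are hypothetical names, the port names for R are made up for the example, and the direction and type of each port are assumed to be known up front.

```python
from dataclasses import dataclass


class TopologyError(ValueError):
    """Raised when an edge in the (hypothetical) topology is invalid."""


@dataclass(frozen=True)
class Port:
    app: str        # node name, e.g. "R" or "N"
    name: str       # port name as published by the node
    direction: str  # "in" or "out"
    kind: str       # "event" or "cont"


def check_connections(ports, connections):
    """Raise TopologyError on the first misconfigured edge.

    connections is a list of ((src_app, src_port), (dst_app, dst_port)).
    """
    by_key = {(p.app, p.name): p for p in ports}
    for src_key, dst_key in connections:
        for key in (src_key, dst_key):
            if key not in by_key:
                raise TopologyError("unknown port {}.{}".format(*key))
        src, dst = by_key[src_key], by_key[dst_key]
        if src.direction != "out":
            raise TopologyError(
                "source {}.{} is not an output port".format(*src_key))
        if dst.direction != "in":
            raise TopologyError(
                "target {}.{} is not an input port".format(*dst_key))
        if src.kind != dst.kind:
            raise TopologyError(
                "port type mismatch: {} -> {}".format(src.kind, dst.kind))


if __name__ == "__main__":
    # Hypothetical ports: only N's "reward" input appears in this thread.
    ports = [
        Port("R", "reward_out", "out", "cont"),
        Port("R", "state_in", "in", "cont"),
        Port("N", "reward", "in", "cont"),
    ]
    try:
        # The edge starts at an input port, as in the misconfiguration above.
        check_connections(ports, [(("R", "state_in"), ("N", "reward"))])
    except TopologyError as err:
        print("configuration rejected:", err)
```

A check like this would reject the misconfigured red edge before any node is launched, instead of leaving an unrelated node stuck before runtime.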
Good that this is resolved!
Yes, error reporting should be improved. Here I actually would have expected
MUSIC to complain. Please send your minimal failing example!
Weird. Even though properly configuring the edge does apparently indeed resolve the error, D can also be "fixed" by replacing its binary with a dummy substitute which essentially does the same in terms of MUSIC communication. As a consequence, you need the ros_music_adapter project to run the "minimal" failing example attached to this comment. [edit]use |
Thanks!
On Mon, Feb 12, 2018 at 10:55 AM, Michael Hoff ***@***.***> wrote:
Weird. Even though properly configuring the edge does apparently indeed
resolve the error, D can also be "fixed" by replacing its binary with a
dummy substitute which essentially does the same in terms of MUSIC
communication. As a consequence, you need the ros_music_adapter project to
run the "minimal" failing example attached
<https://github.com/INCF/MUSIC/files/1715896/music-issue52.zip> to this
comment.
Consider the following MUSIC topology:
V, P, C, D are nodes from the ros_music_adapters, written in C++; R is a Python (pymusic) node; and N is a PyNEST node using SPORE.
The problem arises when the red edge (R, N) is inserted into the MUSIC configuration file. With that edge in place, all nodes function normally except for D, which gets stuck somewhere before entering the runtime.
This bug appears to be related to #37 (#35), because essentially the same error symptoms occur. However, --disable-isend does not resolve the issue, and the MPI version is not 1.6.5.
Let me emphasize: slightly different topologies are functional in the sense that no single node gets stuck before runtime.
Working example 1:
Working example 2:
Debug information: