[Bug] MAVLink periodically experiences missing complete data (ARK Jetson PAB with the ARKV6X). #24276

Open
strongco223 opened this issue Jan 30, 2025 · 7 comments

@strongco223

Describe the bug

When integrating the ARK Jetson PAB, ARKV6X, and Gremsy Zio, we observed periodic loss of complete MAVLink data under the following conditions:
Image

  • If the gimbal is not installed or fails to initialize successfully, complete MAVLink data periodically stops transmitting.
  • The issue occurs inconsistently during the first 1–2 cycles after boot.
  • Eventually, the issue stabilizes into a pattern where MAVLink data is available for ~6 minutes and then missing for ~3 minutes in a repeating cycle.
  • During the complete data loss period, no MAVLink-configured ports receive complete MAVLink messages; only MISSION_CURRENT data is collected.

Image

To Reproduce

To simplify the issue, it can be consistently reproduced by powering on the system with the provided parameters.

(No peripherals are connected; only the onboard computer and the flight controller are linked.)

  1. Power on.
  2. Connect to any port configured for MAVLink (USB UART-to-TTL converter).
  3. Wait a few minutes.
  4. Only MISSION_CURRENT data is received.
  5. Wait a few minutes.
  6. Complete data is received again.
  7. Return to step 3.
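
To make the on/off pattern in these steps easier to quantify from a capture, here is a minimal sketch in C. It assumes a hypothetical log format of (receive time, MAVLink message ID) pairs. The struct, function name, and gap threshold are illustrative and not part of PX4 or any MAVLink library. The only external fact used is that MISSION_CURRENT has message ID 42 in the MAVLink common dialect.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MISSION_CURRENT message ID in the MAVLink common dialect. */
#define MSG_ID_MISSION_CURRENT 42u

/* Hypothetical capture record: one entry per received MAVLink message. */
typedef struct {
    double   t_s;     /* receive time in seconds */
    uint32_t msg_id;  /* MAVLink message ID */
} rx_sample_t;

/*
 * Count outage windows: stretches longer than gap_s seconds in which no
 * message other than MISSION_CURRENT arrived. A window is only counted
 * once the regular streams resume, so an outage still in progress at the
 * end of the capture is not included.
 */
size_t count_stream_outages(const rx_sample_t *rx, size_t n, double gap_s)
{
    size_t outages = 0;
    bool   have_last = false;
    double last_stream_t = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (rx[i].msg_id == MSG_ID_MISSION_CURRENT) {
            continue; /* still sent during the outage, so ignore it */
        }
        if (have_last && (rx[i].t_s - last_stream_t) > gap_s) {
            outages++;
        }
        last_stream_t = rx[i].t_s;
        have_last = true;
    }
    return outages;
}
```

With a threshold of a few seconds, each roughly 3-minute gap described above would count as one outage.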

BugParams.params.txt

Expected behavior

We expected to be able to control the gimbal using RC without periodic loss of MAVLink data.

Screenshot / Media

MAVLink periodically has no data Video

Image

Flight Log

PX4 Flight Review

Software Version

1.15.0

Flight controller

ARK Jetson PAB with the ARKV6X

Vehicle type

Multicopter

How are the different components wired up (including port information)

Image

Additional context

  • If the gimbal is installed (and successfully initialized) or if MNT_MODE_IN is set to -1, the issue does not occur.

  • It can be reproduced on two different ARK Jetson PAB boards with the ARKV6X.

  • The issue occurs in both 1.15.0 and 1.16 tests.

  • Current Log Collection Method:
    Image

Please let us know if you need any additional data to help diagnose the issue. We would also appreciate any insights or discussions on potential causes of this issue.

@dakejahl
Contributor

dakejahl commented Jan 30, 2025

Hmm, it's very regular at 365 s. I am able to reproduce this as well. It looks like it stops outputting MAVLink for some reason. The MISSION_CURRENT message is still sent because it is sent by the _mission_manager, which lives inside the mavlink_receiver thread. All of the "streams" in mavlink_main stop being sent.

Image

@dakejahl
Contributor

dakejahl commented Jan 31, 2025

Looks like it fails shortly after the Ethernet interface broadcast address is found:

INFO  [mavlink] using network interface eth0, IP: 192.168.0.3
INFO  [mavlink] with netmask: 255.255.255.0
INFO  [mavlink] and broadcast IP: 192.168.0.255

But this shouldn't be occurring at all, since the Ethernet interface of the MCU is not connected on the ARK Jetson carrier. The failure mode and the warnings/errors also only occur when the gimbal is enabled (MNT_MODE_OUT and MNT_MODE_IN), but oddly enough the gimbal driver itself isn't the issue: I can stop the driver with gimbal stop and the issue still occurs.

@dakejahl
Contributor

dakejahl commented Feb 1, 2025

This issue occurs when the network is restarted. It also fixes itself some time later when the network is restarted again. Looks like NuttX is not catching the missing PHY. Still unsure why this only happens with the gimbal enabled...

netinit_monitor: eth0: devup=1 PHY address=00 MSR=ffff
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
netinit_monitor: eth0: devup=1 PHY address=00 MSR=ffff
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
netinit_monitor: eth0: devup=1 PHY address=00 MSR=ffff
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
netinit_monitor: eth0: devup=1 PHY address=00 MSR=ffff
stm32_txtimeout_expiry: ERROR: Timeout!
stm32_ifdown: Taking the network down
stm32_ifup: Bringing up: 192.168.0.4
stm32_ethconfig: Reset the Ethernet block
stm32_ethconfig: Initialize the PHY
stm32_phyinit: Phy reset in 10 ms
stm32_phyinit: PHYSR[31]: ffff
stm32_phyinit: Duplex: FULL Speed: 100 MBps
stm32_ethconfig: Initialize the MAC and DMA
stm32_ethconfig: Enable normal operation
stm32_macaddress: eth0 MAC: ce:f1:92:f9:9c:2b
udp_callback: flags: 0010
sendto_eventhandler: flags: 0010
sendto_eventhandler: wrb=0x2401013c sndlen=41
udp_send: UDP payload: 41 (0) bytes
udp_send: Outgoing UDP packet length: 69
stm32_transmit: d_len: 83 d_buf: 0x2400bfa0 txhead: 0x2400dda0 tdes3: 00000000
stm32_transmit: txhead: 0x2400ddc0 txtail: 0x2400dda0 inflight: 1
psock_udp_sendto: Queued WRB=0x2401013c pktlen=41 write_q(0x240100b4,0x2401013c)
ERROR [mavlink] COMMAND_LONG vehicle_command lost, generation 234 -> 293
ERROR [mavlink] COMMAND_LONG vehicle_command lost, generation 234 -> 293
ERROR [mavlink] COMMAND_LONG vehicle_command lost, generation 234 -> 293
100b4 sndlen=10
udp_send: UDP payload: 10 (0) bytes
udp_send: Outgoing UDP packet length: 38
stm32_transmit: d_len: 52 d_buf: 0x2400cba0 txhead: 0x2400ddc0 tdes3: 00000000
stm32_transmit: txhead: 0x2400dde0 txtail: 0x2400dda0 inflight: 2
psock_udp_sendto: Queued WRB=0x240100b4 pktlen=10 write_q(0x2401002c,0x240100b4)
stm32_txavail_work: ifup: 1
udp_callback: flags: 0010
sendto_eventhandler: flags: 0010
sendto_eventhandler: wrb=0x2401002c sndlen=41
udp_send: UDP payload: 41 (0) bytes
udp_send: Outgoing UDP packet length: 69
stm32_transmit: d_len: 83 d_buf: 0x2400d7a0 txhead: 0x2400dde0 tdes3: 00000000
stm32_transmit: txhead: 0x2400de00 txtail: 0x2400dda0 inflight: 3
psock_udp_sendto: Queued WRB=0x2401002c pktlen=41 write_q(0x2400ffa4,0x2401002c)
stm32_txavail_work: ifup: 1
udp_callback: flags: 0010
sendto_eventhandler: flags: 0010
sendto_eventhandler: wrb=0x2400ffa4 sndlen=10
udp_send: UDP payload: 10 (0) bytes
udp_send: Outgoing UDP packet length: 38
stm32_transmit: d_len: 52 d_buf: 0x2400d1a0 txhead: 0x2400de00 tdes3: 00000000
stm32_transmit: txhead: 0x2400dda0 txtail: 0x2400dda0 inflight: 4
stm32_txpoll: No tx descriptors available
psock_udp_sendto: Queued WRB=0x2400ffa4 pktlen=10 write_q(0x2401035c,0x2400ffa4)
stm32_txavail_work: ifup: 1
stm32_dopoll: No tx descriptors
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
netinit_monitor: eth0: devup=1 PHY address=00 MSR=ffff
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
netinit_monitor: eth0: devup=1 PHY address=00 MSR=ffff

@strongco223
Author

Hello @dakejahl, thank you for pointing out Ethernet as a potential cause of the issue!

Over the past few days, we ran several tests related to the Ethernet settings. One effective change was setting MAV_2_CONFIG to disabled (the default is Ethernet, but the ARK Jetson PAB pinout does not seem to expose it?). After making this change, the periodic loss of complete MAVLink data did not occur again (tested twice over a 1-hour period).

Image

I'm not entirely sure if this setting is directly related to the issue. Could you provide any insights on this?

The parameter settings are attached for your reference.
Ethernet_Disabled.params.txt

Thanks again for your help!

@dakejahl
Contributor

dakejahl commented Feb 4, 2025

Yes, it is directly related. The root issue is that the NuttX Ethernet driver initialization is succeeding and creating the eth0 interface when it should be failing. The Jetson carrier does not have an Ethernet PHY connected to the FC Ethernet signals.

Image

@dakejahl
Contributor

dakejahl commented Feb 4, 2025

stm32_ifup: Bringing up: 192.168.0.4
stm32_ethconfig: Reset the Ethernet block
stm32_ethconfig: Initialize the PHY
stm32_phyinit: stm32_phyinit stm32_phyinit stm32_phyinit stm32_phyinit stm32_phyinit
stm32_phyinit: stm32_phyread: ret: 0
stm32_phyinit: Phy reset in 10 ms
stm32_phyinit: stm32_phyread: MII_PHYID1: 0 ret: 0
stm32_phyinit: link status: phyval: 65535

When there is no PHY, the register reads return all high, since the line is pulled up internally. We should probably check that the PHYID is non-zero and/or that the status register is not 0xffff.
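
As a rough illustration of the check suggested above, here is a minimal sketch in C. It is not the NuttX driver code and not necessarily what any upstream fix does. The function name and the way the register values are passed in are hypothetical. It only encodes the two conditions mentioned: a zero PHY ID and an all-ones status register are both treated as "no PHY".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a NuttX API): decide whether values read over
 * MDIO look like they came from a real PHY.
 *
 *   phyid1 - value read from the PHY identifier register (MII_PHYID1)
 *   status - value read from the PHY status register
 */
bool phy_looks_present(uint16_t phyid1, uint16_t status)
{
    /* A PHY ID of zero matches the "MII_PHYID1: 0" read in the log. */
    if (phyid1 == 0x0000) {
        return false;
    }

    /* All bits high matches the "phyval: 65535" read on a floating,
     * pulled-up line. */
    if (status == 0xffff) {
        return false;
    }

    return true;
}
```

Under this sketch, the values from the log above (PHYID1 of 0, status of 65535) would be rejected, so interface bring-up could fail early instead of creating eth0.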

@dakejahl
Contributor

dakejahl commented Feb 5, 2025

I submitted a NuttX PR which fixes this issue:
apache/nuttx#15757
