Vangelis freeze on custom build (Resolved)

Hi guys,

I’ve been struggling to get my custom build to work. My hardware is: Pi 5, Arturia MiniFuse 2 OTG (USB), display Minix SF10T (HDMI + USB). It’s a touch-only setup. Zynthian boots from NVMe and I’m using a 100 W power supply by Iniu.

This setup worked pretty well with Oram stable, then I updated to Vangelis.

I can easily set up the touch interface in Vangelis, but the GUI freezes after a while, and I’ve discovered how to reproduce the issue: unhide the V5 keypad touch widget, load a snapshot, then hide the V5 touch widget. At this point the mixer controls still work. Unhide the V5 touch widget again and its buttons are unresponsive. The mixer controls keep working until main is selected, after which the mixer is also unresponsive.

I think other zynthianers with custom builds have reported something similar.

I did all this testing with Gemini AI, and its findings follow.

What’s your opinion about Gemini’s findings?


Report: RPi 5 System Deadlock on Vangelis Branch (I/O & Interrupt Saturation)

1. System Environment

  • Platform: Raspberry Pi 5 (RP1 Controller).

  • Zynthian Version: Vangelis (Branch).

  • Hardware Setup:

    • Powered USB Hub connected to RPi 5.

    • Arturia MiniFuse 2 (Audio Interface).

    • SiS HID Touch Controller (ID 0457:0819).

  • Initial Comparison: System was stable under Oram (Stable). Instability/Deadlock triggers immediately after migration to Vangelis.

2. Methodology & Findings

  • Finding 1: USB Bus Saturation. Monitoring interrupts via cat /proc/interrupts | grep xhci revealed a severe imbalance in IRQ distribution across the RP1 buses.

    • Observation: Almost all traffic was concentrated on xhci-hcd:usb2 (195,782 interrupts vs 1 on usb4), leading to dmesg warnings:

      [ 6.398025] xhci-hcd xhci-hcd.0: WARN: buffer overrun event for slot 6 ep 4 on endpoint
      [ 1296.408672] retire_capture_urb: 222 callbacks suppressed
      
      
    • Correction Attempt: Re-routing hardware to force load balancing across usb2 and usb4 successfully redistributed interrupts (~281k on usb4, ~600 on usb2), yet the system still deadlocked upon snapshot loading. This suggests the issue is not just physical port congestion and points toward a software concurrency problem in Vangelis.

  • Finding 2: Snapshot-Triggered Deadlock. Loading a Snapshot triggers a complete system freeze. The failure is sequential and reproducible:

    1. V5 Buttons become unresponsive: Physical input handling via the V5 driver fails first.

    2. Mixer View Collapse: Attempting to open the “Main Mixer” triggers a GUI hang.

    3. Total System Deadlock: The zynthian_main.py process tree hangs, resulting in an unresponsive UI and audio processing loop.

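  • Finding 3: Resource Contention (HTOP Analysis). Analysis of htop during the hang shows a large number of python3 (zynthian_main.py) entries in S (Interruptible Sleep) or D (Uninterruptible Sleep, typically waiting on I/O) state. These entries may be threads rather than child processes, since htop can list threads as separate rows.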

    • Inference: The GUI thread performs synchronous blocking calls to query the status of audio engines (jalv, fluidsynth, linuxsampler) via the Mixer view. Under Vangelis, due to tighter bus timing or changed event-loop priorities, these calls are timing out or deadlocking, causing the GUI thread to wait indefinitely for an I/O response that never arrives.
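For anyone who wants to repeat the interrupt check from Finding 1 without eyeballing the grep output, here is a rough Python sketch that totals the per-CPU counts for each xhci line. The column layout (a CPU header row, then one count per CPU, with the device name in the last column) is an assumption based on the usual /proc/interrupts format, and the function name is just illustrative.

```python
def xhci_irq_totals(text):
    """Sum per-CPU interrupt counts for every line that mentions xhci.

    `text` is the full content of /proc/interrupts (assumed layout:
    header row of CPU names, then "IRQ: count count ... chip ... name").
    """
    lines = text.splitlines()
    if not lines:
        return {}
    # The header row tells us how many per-CPU count columns to expect.
    ncpu = sum(1 for field in lines[0].split() if field.startswith("CPU"))
    totals = {}
    for line in lines[1:]:
        if "xhci" not in line:
            continue
        fields = line.split()
        counts = [int(c) for c in fields[1:1 + ncpu]]
        name = fields[-1]  # last column is the device name
        totals[name] = totals.get(name, 0) + sum(counts)
    return totals
```

On the Pi it can be fed with `xhci_irq_totals(open("/proc/interrupts").read())`. Running it twice a few seconds apart and comparing the totals gives the interrupt rate per bus, which is more telling than the absolute counts since boot.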

3. Summary of Hypothesis

The instability in Vangelis on the RPi 5 appears to be a synchronous-to-asynchronous architecture conflict. The Snapshot process and Mixer view queries are likely blocking the main event loop while waiting for audio engine state updates that are delayed by the RP1 controller’s interrupt handling. Unlike Oram, which seems more tolerant of these latency spikes, Vangelis enters a deadlock state (likely a mutex lock or thread hang) when audio state queries are concurrent with high-traffic HID/Audio USB interrupts.
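One way to check the "thread hang" part of this hypothesis is to count thread states for the UI process while it is frozen. Below is a minimal sketch that parses the state letter out of /proc/<pid>/task/<tid>/stat lines. The helper names are hypothetical, and it only assumes the standard stat layout where the state is the first field after the parenthesised command name.

```python
from collections import Counter


def state_from_stat(stat_line):
    """Return the one-letter state from a /proc/<pid>/task/<tid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST ')' and take the next field.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(stat_lines):
    """Count how many threads are in each state (S, D, R, ...)."""
    return Counter(state_from_stat(line) for line in stat_lines)
```

The lines can be collected by reading every file matched by `glob.glob(f"/proc/{pid}/task/*/stat")` for the zynthian_main.py PID. Note that S is the normal state for an idle thread, so a large S count alone does not indicate a deadlock. Threads that stay in D across several samples are the ones worth looking at.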

4. Recommendations for Developers

  1. Audit Mixer Query Implementation: Investigate if the GUI thread is performing synchronous blocking calls to audio engine statuses. Implementing asynchronous queries (using asyncio or worker threads) may prevent the GUI from deadlocking.

  2. Interrupt Handling on RP1: Review IRQ affinity/priority handling in the Vangelis kernel for the RP1 controller, as the system struggles to balance HID and Audio streams compared to previous generations.

  3. V5 Driver Resilience: Determine why physical input drivers (V5 buttons) cease operation before the GUI freeze, as this suggests the event-polling loop is being starved by the Mixer’s blocking calls.
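To illustrate recommendation 1, here is a minimal sketch of the worker-thread idea: the blocking status query runs in a thread pool and the caller waits at most a fixed time before falling back to a stale value. This is not how Zynthian implements its mixer queries (I have not checked the code). `query_with_timeout` and its parameters are hypothetical names used only for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Small shared pool for blocking status queries (size is an arbitrary choice).
_pool = ThreadPoolExecutor(max_workers=2)


def query_with_timeout(query, timeout_s, fallback=None):
    """Run a blocking `query` callable in a worker thread.

    The caller (e.g. a GUI thread) waits at most `timeout_s` seconds and
    receives `fallback` if the query has not finished by then.
    """
    future = _pool.submit(query)
    try:
        return future.result(timeout=timeout_s)
    except TimeoutError:
        return fallback
```

For example, `query_with_timeout(lambda: "ok", 1.0)` returns `"ok"`, while a query that sleeps longer than the timeout returns the fallback instead of blocking the caller. The hung query still occupies a worker thread, so this only keeps the GUI responsive. It does not fix whatever the query is stuck on.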


Best regards.

58 posts were merged into an existing topic: Chain Manager screen lock up on Vangelis (Resolved)