Draft: Try to fix build on mips64el
- Add patch to avoid error-prone ELF header parsing, fixing build on mips64el
Helps: #1042980
- d/rules: Increase arbitrary test timeouts
The default test timeout is 30 seconds, but the perf-* tests take more like 45 seconds on mips64el.
Helps: #1042980
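As a rough illustration of the arithmetic behind the timeout change: with a 30 second default and tests that take about 45 seconds, the smallest integer multiplier that covers the observed runtime is 2. The sketch below assumes the tests are driven by `meson test` and its `--timeout-multiplier` option; the actual change made in d/rules is not shown in this thread, and `timeout_multiplier` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the actual d/rules change):
# print the smallest integer multiplier such that
#   default_timeout * multiplier >= observed_runtime
# Both arguments are in whole seconds.
timeout_multiplier() {
    default_timeout=$1
    observed_runtime=$2
    # Integer ceiling division
    echo $(( (observed_runtime + default_timeout - 1) / default_timeout ))
}

# 30 s default timeout, perf-* tests taking about 45 s on mips64el
multiplier=$(timeout_multiplier 30 45)

# Only print the command; actually running it needs a configured build tree.
echo "meson test --timeout-multiplier $multiplier"
```

In practice a larger multiplier than the bare minimum leaves some headroom for slow or heavily loaded build machines.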
/cc @syq
This isn't fully working yet. The tests are still failing for me on eller, with:
GNOME Shell-Message: 11:41:00.189: Registering session with GDM
(EE) failed to write to Xwayland fd: Broken pipe
Please could you investigate further? It would also be useful to know whether this gnome-shell version works or crashes on real mips hardware (build with DEB_BUILD_OPTIONS=nocheck if necessary).
It would also be useful to know whether this gnome-shell version works or crashes on real mips hardware
It would also be useful to know whether gnome-shell 43.x (unstable) works or crashes on real mips hardware. The failing tests are new in version 44, so they don't actually tell us whether there is a regression when compared with version 43.
added 1 commit
- 84be34c2 - Add patch to fix ELF header parsing on mips64el and riscv64
Thanks, your patch seems much better than mine. Let's consider upstreaming it.
And for the test:
- the test can pass on a mips64 machine without MSA (MIPS SIMD), with softpipe instead of llvmpipe. So, it is a bug in LLVM. I will try to fix it. For now, can we run the test with softpipe on mips?
- and on MIPS with MSA, mesa tries to use it, which triggers some problems. It is still a bug in LLVM. So maybe we should revert the change to mesa until the LLVM MSA JIT is fixed. https://gitlab.freedesktop.org/mesa/mesa/-/commit/88b234d7a7cd71fcb4955428010f238ec9530431
Edited by YunQiang Su
Thanks, your patch seems much better than mine. Let's consider upstreaming it.
The patch from Daniel van Vugt (which replaced the one I wrote) has in fact been applied upstream, although probably only for v45.
the test can pass on a mips64 machine without MSA (MIPS SIMD), with softpipe instead of llvmpipe. So, it is a bug in LLVM. I will try to fix it. For now, can we run the test with softpipe on mips?
We probably could, but are the resulting gnome-shell binaries going to be broken on mips machines?
You didn't answer my questions about whether gnome-shell works (as a Wayland/X11 user interface, not just at build time) on mips machines:
- if you install GNOME from unstable (gnome-shell 43) on a real mips(64)el machine, does gnome-shell work, or does it crash like this?
- if that works, and then you upgrade to gnome-shell 44 (built from this branch but with DEB_BUILD_OPTIONS=nocheck), does it still work, or does it crash like this?
on MIPS with MSA, mesa tries to use it, which triggers some problems. It is still a bug in LLVM. So maybe we should revert the change to mesa until the LLVM MSA JIT is fixed
If this is a LLVM bug, please could you open a bug in llvm-toolchain-15 and mark it as affecting mesa and gnome-shell? I think you understand the situation a lot better than I do!
Edited by Simon McVittie
We probably could, but are the resulting gnome-shell binaries going to be broken on mips machines?
The answer is yes and no. In most cases, MIPS/Loongson machines running gnome-shell have an AMD graphics card, so llvmpipe won't be used at all. In that case, gnome-shell works well.
If this is a LLVM bug, please could you open a bug in llvm-toolchain-15 and mark it as affecting mesa and gnome-shell? I think you understand the situation a lot better than I do!
Yes. I will submit this bug report to Debian and upstream. I will try to fix them upstream.
I didn't see a bug report opened against either llvm-toolchain-15 or mesa. For now I have opened https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1049404 for the issue involving llvmpipe on mips(64)el: please follow up there with any further information.
Is this actually the same as https://bugs.debian.org/993550, originally reported against gtk4?
We work around that in gtk4 by using GALLIUM_DRIVER=softpipe and LIBGL_ALWAYS_SOFTWARE=true on mips% CPUs. I'll see whether the same thing works for gnome-shell.
Or perhaps the same as https://bugs.debian.org/1010838, also originally from gtk4.
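The gtk4 workaround mentioned above could be sketched roughly as follows. This is only an illustration, not the actual gtk4 or gnome-shell packaging: `software_gl_env` is a hypothetical helper, the `mips*` glob stands in for the `mips%` pattern used in d/rules, and the default CPU value is assumed so that the snippet runs on its own. The two environment variables are the ones named in the comment.

```shell
#!/bin/sh
# Hypothetical helper: print the environment assignments to use for a given
# CPU name (as reported in DEB_HOST_ARCH_CPU). Prints an empty line for CPUs
# that need no workaround.
software_gl_env() {
    case "$1" in
        mips*)
            # Force Mesa's software path and pick softpipe rather than llvmpipe
            echo "GALLIUM_DRIVER=softpipe LIBGL_ALWAYS_SOFTWARE=true"
            ;;
        *)
            echo ""
            ;;
    esac
}

# Assumption: fall back to mips64el when DEB_HOST_ARCH_CPU is not set,
# purely so that this sketch is self-contained.
cpu=${DEB_HOST_ARCH_CPU:-mips64el}
env_prefix=$(software_gl_env "$cpu")

# Only print what would be run; the real test command depends on the package.
echo "would run: env $env_prefix dh_auto_test"
```

Note that this only changes how the tests are rendered at build time; it says nothing about whether the resulting binaries work with llvmpipe on real hardware.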
Ohh, I can reproduce the same failure of perf-closeWithActiveWindows on ARM64 with softpipe.
The reason is that ffi_arg_pointers[1] needs to be dereferenced one more time with softpipe than with llvmpipe... So it should be a bug in softpipe.
Edited by YunQiang Su
If the crash seen with softpipe involves ffi_arg_pointers from gjs/gi/function.cpp, then that seems more likely to be a bug in gobject-introspection, gjs, gnome-shell or mutter than a bug in LLVM or Mesa.
My guess would be that there's some fallback rendering path that is rarely tested and therefore contains bugs, because all real-world GNOME Shell users are using either a hardware GPU or llvmpipe, and nobody uses softpipe in practice.
Ohh, it is not about one more dereference; I guess it is a multithreading problem.
On my ARM64 machine, if no breakpoint is set, the segfault always happens. If 2 breakpoints are set on both: b function.cpp:1050 if function=shell_wm_completed_map shell_wm_completed_map then the test always passes.
So I guess some other thread changes the data to shell_wm_completed_map.
please report it as a separate gnome-shell bug to start with
I didn't see a gnome-shell bug report, so I have opened https://bugs.debian.org/1049407 for the crash seen with softpipe. Please send any follow-ups there.
Bug reported to mesa upstream: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9514
Sleeping for some time with nanosleep (1<<23 ns on my arm64 server) before the ffi_call makes the test pass.
Using taskset also increases the chance of the test passing.
Edited by YunQiang Su
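Since the failure appears to be timing-dependent, one way to compare the pass rate with and without CPU pinning is to run the test repeatedly and count successes. The following is a minimal sketch under stated assumptions: `count_passes` is a hypothetical helper, and `./run-test` is a placeholder for whatever command runs perf-closeWithActiveWindows locally; neither comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs
# exited successfully. Usage: count_passes N command [args...]
count_passes() {
    runs=$1
    shift
    passes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only the exit status matters here
        if "$@" >/dev/null 2>&1; then
            passes=$((passes + 1))
        fi
        i=$((i + 1))
    done
    echo "$passes"
}

# Example usage (./run-test is a placeholder, not a real script):
#   count_passes 20 ./run-test                # unpinned
#   count_passes 20 taskset -c 0 ./run-test   # pinned to CPU 0
```

If the pinned runs pass noticeably more often than the unpinned ones, that would be consistent with a race between threads rather than a rendering bug.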