-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenGL 2.x support #1
Comments
adreno a2xx lacks integer support for shaders, so it would never be able to support enough for gl2 glsl.. a3xx does support integers so that should be possible, although I'm not entirely sure about occlusion query.. that does seem to be supported in gles3 and the hw is gles3 capable but I don't have gles3 drivers yet to see how that works. I'm not sure if there might be something else missing for full gl2 support. |
I'm going to assume there's no point doing a float<->int conversion hack (if such a thing is possible) for a2xx since that type of thing is usually very expensive... As for the other features, I guess I'll find out when I tell the driver to lie about its capabilities :P |
the problem is no bitwise operations, so float<->int conversion isn't sufficient. You can lie about the caps, it will work up until the point where a shader triggers use of some TGSI instructions that can't be supported. But I ran into a problem w/ some intrinsics compiling into things I can't support without int if I advertise a higher GLSL version. |
Well I managed to get glxinfo to report OpenGL 2.1 by adding stub occlusion query support (pretty much copying from the i915 drivers). It should surprise no one that glxgears rendered crescents instead of gears at under 15FPS. |
hmm, at least for GL1.4, glxgears needed TGSI_OPCODE_LIT, which I hadn't implemented yet (although es2gears which uses glsl shaders works ok). 15fps seems like it might be software fallback? For es2gears, on HP touchpad, iirc it was around ~100+ fps, nexus4 (a320) is somewhat faster at ~400fps. And I'm sure in both cases there is lots of room for optimization. |
I'm pretty sure it's running freedreno and not swrast (OpenGL vendor: freedreno, OpenGL render string: Gallium 0.4 on FD001)... Either way, it seems to be doing most of the work on the CPU, so there's definitely a software fallback happening somewhere. |
fwiw, here are the configure flags that I use: ./autogen.sh --prefix=/usr --with-dri-drivers= --with-gallium-drivers=freedreno,swrast --with-egl-platforms=x11 --enable-gles2 --enable-gles1 --enable-debug --enable-gallium-egl --disable-gallium-llvm --enable-xa |
My only significant deviation from those flags is I don't think I enabled xa. It probably won't make much of a difference. Real occlusion queries would be nice, but I figured there's no harm in sharing dumb little hacks like this. I sorted out the segfault, and I thought I'd mention es2gears runs at 60fps in near-fullscreen on my TouchPad. I'm crossing my fingers for the rumored Snapdragon Nexus 7... |
Yeah, unless you want my stub occlusion query hack (unlikely since you could probably implement it for real in less time than it took me), and/or have another use for my limited skills, you may close this ticket, being that it was more of a feature request than a bug report, |
fwiw, the a3xx should support occlusion query, and I suspect the a2xx does as well. But, as far as I can tell the gles2 blob drivers do not implement any occlusion related extension, and I think we only get gles3 on a3xx. So at least for the newer devices, I think we should eventually be able to do this in hw. But I don't have gles3 blob drivers on my nexus4, and I cannot download the latest drivers from qcom without a EULA, so I haven't been able to look into how this works yet. Anyways, if you have a patch, feel free to post it, or even just send to mesa-devel (whichever you prefer) and I will review it. |
The previous code for sRGB overrides assumes that the source and destination formats are equal, other than the color space. This won't be feasible when we add support for format conversions. Here are a few cases, and how the old code handled them: 1. RGB8 -> SRGB8, MSAA ==> SRGB8 -> SRGB8 2. RGB8 -> SRGB8, single ==> RGB8 -> RGB8 3. SRGB8 -> RGB8, MSAA ==> RGB8 -> RGB8 4. SRGB8 -> RGB8, single ==> SRGB8 -> SRGB8 Apparently, preserving the behavior of #1 is important. When doing a multisample to single-sample resolve, blending the samples together in an sRGB correct fashion results in a noticably higher quality image. It also is necessary to pass Piglit's EXT_framebuffer_multisample accuracy color tests. Paul, Eric, Anuj, and I talked about this, and aren't sure that it matters in the other cases. This patch preserves the behavior of #1, but otherwise reverts to doing everything in linear space, changing the behavior of case #4. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Daniel Vetter <[email protected]>
With the recent SRGB changes all my automated OpenGL llvmpipe tests (piglit, conform, glretrace) start asserting with the backtrace below. I'm hoping this change will fix it. I'm not entirely sure, as this doesn't happen in my development machine (the bug probably depends on the exact X visual). Anyway, it seems the sensible thing to do here. Program terminated with signal 5, Trace/breakpoint trap. #0 _debug_assert_fail (expr=expr@entry=0x7fa324df2ed7 "0", file=file@entry=0x7fa324e3fc30 "src/mesa/state_tracker/st_format.c", line=line@entry=758, function=function@entry=0x7fa324e40160 <__func__.34798> "st_pipe_format_to_mesa_format") at src/gallium/auxiliary/util/u_debug.c:281 #0 _debug_assert_fail (expr=expr@entry=0x7fa324df2ed7 "0", file=file@entry=0x7fa324e3fc30 "src/mesa/state_tracker/st_format.c", line=line@entry=758, function=function@entry=0x7fa324e40160 <__func__.34798> "st_pipe_format_to_mesa_format") at src/gallium/auxiliary/util/u_debug.c:281 No locals. #1 0x00007fa3241d22b3 in st_pipe_format_to_mesa_format (format=format@entry=PIPE_FORMAT_R8G8B8A8_SRGB) at src/mesa/state_tracker/st_format.c:758 __func__ = "st_pipe_format_to_mesa_format" #2 0x00007fa3241c8ec5 in st_new_renderbuffer_fb (format=format@entry=PIPE_FORMAT_R8G8B8A8_SRGB, samples=0, sw=<optimised out>) at src/mesa/state_tracker/st_cb_fbo.c:295 strb = 0x19e8420 #3 0x00007fa32409d355 in st_framebuffer_add_renderbuffer (stfb=stfb@entry=0x19e7fa0, idx=<optimised out>) at src/mesa/state_tracker/st_manager.c:314 rb = <optimised out> format = PIPE_FORMAT_R8G8B8A8_SRGB sw = <optimised out> #4 0x00007fa32409e635 in st_framebuffer_create (st=0x19e7fa0, st=0x19e7fa0, stfbi=0x19e7a30) at src/mesa/state_tracker/st_manager.c:458 stfb = 0x19e7fa0 mode = {rgbMode = 1 '\001', floatMode = 0 '\000', colorIndexMode = 0 '\000', doubleBufferMode = 0, stereoMode = 0, haveAccumBuffer = 0 '\000', haveDepthBuffer = 1 '\001', haveStencilBuffer = 1 '\001', redBits = 8, greenBits = 8, blueBits = 8, alphaBits = 8, redMask = 0, greenMask = 0, blueMask = 0, alphaMask = 0, rgbBits = 32, indexBits = 0, accumRedBits = 0, accumGreenBits = 0, accumBlueBits = 0, accumAlphaBits = 0, depthBits = 24, stencilBits = 8, numAuxBuffers = 0, level = 0, visualRating = 0, transparentPixel = 0, transparentRed = 0, transparentGreen = 0, transparentBlue = 0, transparentAlpha = 0, transparentIndex = 0, sampleBuffers = 0, samples = 0, maxPbufferWidth = 0, maxPbufferHeight = 0, maxPbufferPixels = 0, optimalPbufferWidth = 0, optimalPbufferHeight = 0, swapMethod = 0, bindToTextureRgb = 0, bindToTextureRgba = 0, bindToMipmapTexture = 0, bindToTextureTargets = 0, yInverted = 0, sRGBCapable = 1} idx = <optimised out> #5 st_framebuffer_reuse_or_create (st=st@entry=0x19dfce0, fb=<optimised out>, stfbi=stfbi@entry=0x19e7a30) at src/mesa/state_tracker/st_manager.c:728 No locals. #6 0x00007fa32409e8cc in st_api_make_current (stapi=<optimised out>, stctxi=0x19dfce0, stdrawi=0x19e7a30, streadi=0x19e7a30) at src/mesa/state_tracker/st_manager.c:747 st = 0x19dfce0 stdraw = 0x640064 stread = 0x1300000006 ret = <optimised out> #7 0x00007fa324074a20 in XMesaMakeCurrent2 (c=c@entry=0x195bb00, drawBuffer=0x19e7e90, readBuffer=0x19e7e90) at src/gallium/state_trackers/glx/xlib/xm_api.c:1194 No locals. #8 0x00007fa3240783c8 in glXMakeContextCurrent (dpy=0x194e900, draw=8388610, read=8388610, ctx=0x195bac0) at src/gallium/state_trackers/glx/xlib/glx_api.c:1177 drawBuffer = <optimised out> readBuffer = <optimised out> xmctx = 0x195bb00 glxCtx = 0x195bac0 firsttime = 0 '\000' no_rast = 0 '\000' #9 0x00007fa32407852f in glXMakeCurrent (dpy=<optimised out>, drawable=<optimised out>, ctx=<optimised out>) at src/gallium/state_trackers/glx/xlib/glx_api.c:1211 No locals. Acked-by: Brian Paul <[email protected]>
Fixes: Program received signal SIGSEGV, Segmentation fault. bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 445 mask_pic->srf->tex->format); (gdb) bt #0 bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 #1 xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488 #2 0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>, pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0, pDst=0x5923b8) at msm-exa-xa.c:533 We can't yet handle solid fill mask, so explicitly reject that, rather than segfaulting. Otherwise DDX would need to check XA version to see if solid fill mask were supported. Signed-off-by: Rob Clark <[email protected]>
Fixes: Program received signal SIGSEGV, Segmentation fault. bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 445 mask_pic->srf->tex->format); (gdb) bt #0 bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 #1 xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488 #2 0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>, pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0, pDst=0x5923b8) at msm-exa-xa.c:533 We can't yet handle solid fill mask, so explicitly reject that, rather than segfaulting. Otherwise DDX would need to check XA version to see if solid fill mask were supported. Signed-off-by: Rob Clark <[email protected]>
Fixes: Program received signal SIGSEGV, Segmentation fault. bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 445 mask_pic->srf->tex->format); (gdb) bt #0 bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 #1 xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488 #2 0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>, pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0, pDst=0x5923b8) at msm-exa-xa.c:533 We can't yet handle solid fill mask, so explicitly reject that, rather than segfaulting. Otherwise DDX would need to check XA version to see if solid fill mask were supported. Signed-off-by: Rob Clark <[email protected]>
So when checking/building sse code we have three possibilities: 1 Old compiler, throws an error when using -msse* 2 New compiler, user disables sse* (-mno-sse*) 3 New compiler, user doesn't disable sse The original code, added code for #1 but not #2. Later on we patched around the lack of handling #2 by wrapping the code in __SSE4_1__. Yet it lead to a missing/undefined symbol in case of #1 or #2, which might cause an issue for #2 when using the i965 driver. A bit later we "fixed" the undefined symbol by using #1, rather than updating it to handle #2. With this commit we set things straight :) To top it all up, conventions state that in case of conflicting (-enable-foo -disable-foo) options, the latter one takes precedence. Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test. v2: Clean the #includes. Suggested by Ilia, Matt & Siavash. Cc: "10.3 10.4" <[email protected]> Tested-by: David Heidelberg <[email protected]> Tested-by: Siavash Eliasi <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
This is a partial revert of c893069. It split the {start,base}_vertex_location handling into several steps: 1. Set brw->draw.start_vertex_location = prim[i].start and brw->draw.base_vertex_location = prim[i].basevertex. (This happened once per _mesa_prim, in the main drawing loop.) 2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset appropriately. (This happened in brw_prepare_shader_draw_parameters, which was called just after brw_prepare_vertices, as part of state upload, and only happened when BRW_NEW_VERTICES was flagged.) 3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim). If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on the second (or later) primitives, we would do step #1, but not #2. The first _mesa_prim would get correct values, but subsequent ones would only get the first half of the summation. The reason I originally did this was because I needed the value of gl_BaseVertexARB to exist in a buffer object prior to uploading 3DSTATE_VERTEX_BUFFERS. I believed I wanted to upload the value of 3DPRIMITIVE's "Base Vertex Location" field, which was computed as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) + brw->vb.start_vertex_bias. The latter value wasn't available until after brw_prepare_vertices, and the former weren't available in the state upload code at all. Hence the awkward split. However, I believe that including brw->vb.start_vertex_bias was a mistake. It's an extra bias we apply when uploading vertex data into VBOs, to move [min_index, max_index] to [0, max_index - min_index]. >From the GL_ARB_shader_draw_parameters specification: "<gl_BaseVertexARB> holds the integer value passed to the <baseVertex> parameter to the command that resulted in the current shader invocation. In the case where the command has no <baseVertex> parameter, the value of <gl_BaseVertexARB> is zero." I conclude that gl_BaseVertexARB should only include the baseVertex parameter from glDraw*Elements*, not any internal biases we add for optimization purposes. With that in mind, gl_BaseVertexARB only needs prim[i].start or prim[i].basevertex. We can simply store that, and go back to computing start_vertex_location and base_vertex_location in brw_emit_prim(), like we used to. This is much simpler, and should actually fix two bugs. Fixes missing geometry in Unvanquished. Cc: "10.4 10.3" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529 Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
This implements a workaround (exact excerpt as a comment in the code). The docs specify [clearly, after you struggle for a while] that the offset isn't relative to state base. This actually makes sense. This fixes hangs on SKL. Buffer #0 is meant to be used for normal uniforms. Buffer #1 is typically used for gather constants when using RS. Buffer #1-#3 could be used to push a bunch of UBO data which would just be somewhere in memory, and not relative to the dynamic state. NOTE: I've moved away from the ternary operator for the new gen9 conditions. Admittedly it's probably not great to do this, but I really want to fix this all up in the subsequent patch and doing it here makes that diff a lot nicer. I want to split out the gen8/9 code to make the function a bit more readable, but to keep this easily cherry-pickable I am doing this fix first. If we decide not to merge the cleanup patch then I can revisit this. Cc: "10.5 10.6" <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Tested-by: Valtteri Rantala <[email protected]>
Equivalent to commit 8ac3b52 but with sel operations. In this case we select the PredCtrl based on the writemask. This patch helps on cases like this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD In this case, cmod propagation can't optimize instruction #2, because instructions #1 and #2 have different writemasks, and we can't update directly instruction #2 writemask because our code thinks that sel at instruction #3 reads all four channels of the flag, when it actually only reads .x. So, with this patch, the previous case becames this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Now only the x channel of the flag is used, allowing dead code eliminate to update the writemask at the second instruction: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD So now cmod propagation can simplify out #2: 1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F 2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Shader-db numbers: total instructions in shared programs: 6235835 -> 6228008 (-0.13%) instructions in affected programs: 219850 -> 212023 (-3.56%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 1192 HURT: 0
Not 100% sure, but I think being an unsigned literal will help: CID 1358505 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension: load1->def.num_components with type unsigned char (8 bits, unsigned) is promoted in load1->def.num_components * (load1->def.bit_size / 8) to type int (32 bits, signed), then sign-extended to type unsigned long (64 bits, unsigned). If load1->def.num_components * (load1->def.bit_size / 8) is greater than 0x7FFFFFFF, the upper bits of the result will all be 1. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Matt Turner <[email protected]>
At this point, it would require a logic error in nir_validate to not have already populated this hashtable entry, but coverity doesn't realize that: CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Matt Turner <[email protected]>
For example, at nir_lower_locals_to_regs.c:365: CID 1358911 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer deref_ptr. I'm pretty sure that coverity doesn't understand the relationship between arr->data and arr->size. I think throwing in a !null check should do the trick.
For example, in nir_opt_dead_cf.c:140: CID 1358914 (#1 of 1): Dereference null return value (NULL_RETURNS)13. dereference: Dereferencing a null pointer block.
CID 1265536 (#1 of 2): Explicit null dereferenced (FORWARD_NULL)6. var_deref_op: Dereferencing null pointer parent.
CID 1265536 (#1 of 2): Explicit null dereferenced (FORWARD_NULL)6. var_deref_op: Dereferencing null pointer parent. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
For example, at nir_lower_locals_to_regs.c:365: CID 1358911 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer deref_ptr. I'm pretty sure that coverity doesn't understand the relationship between arr->data and arr->size. I think throwing in a !null check should do the trick.
For example, in nir_opt_dead_cf.c:140: CID 1358914 (#1 of 1): Dereference null return value (NULL_RETURNS)13. dereference: Dereferencing a null pointer block.
Not sure why coverity calls this an out-of-bounds read vs out-of-bounds write. CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local: Overrunning array r of 3 16-byte elements at element index 3 (byte offset 48) using index chan (which evaluates to 3). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Brian Paul <[email protected]>
CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local: Overrunning array of 2 16-byte elements at element index 2 (byte offset 32) by dereferencing pointer &inst.Dst[i]. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Brian Paul <[email protected]>
ptr can actually never be null so just drop the check. CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking ptr suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Signed-off-by: Rob Clark <[email protected]>
For example, at nir_lower_locals_to_regs.c:365: CID 1358911 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer deref_ptr. I'm pretty sure that coverity doesn't understand the relationship between arr->data and arr->size. I think throwing in a !null check should do the trick.
For example, in nir_opt_dead_cf.c:140: CID 1358914 (#1 of 1): Dereference null return value (NULL_RETURNS)13. dereference: Dereferencing a null pointer block.
For example, at nir_lower_locals_to_regs.c:365: CID 1358911 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer deref_ptr. I'm pretty sure that coverity doesn't understand the relationship between arr->data and arr->size. I think throwing in a !null check should do the trick.
For example, in nir_opt_dead_cf.c:140: CID 1358914 (#1 of 1): Dereference null return value (NULL_RETURNS)13. dereference: Dereferencing a null pointer block.
For example, at nir_lower_locals_to_regs.c:365: CID 1358911 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer deref_ptr. I'm pretty sure that coverity doesn't understand the relationship between arr->data and arr->size. I think throwing in a !null check should do the trick.
For example, in nir_opt_dead_cf.c:140: CID 1358914 (#1 of 1): Dereference null return value (NULL_RETURNS)13. dereference: Dereferencing a null pointer block.
For example, at nir_lower_locals_to_regs.c:365: CID 1358911 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer deref_ptr. I'm pretty sure that coverity doesn't understand the relationship between arr->data and arr->size. I think throwing in a !null check should do the trick.
For example, in nir_opt_dead_cf.c:140: CID 1358914 (#1 of 1): Dereference null return value (NULL_RETURNS)13. dereference: Dereferencing a null pointer block.
There is currently no protection against walking a hash (using _mesa_HashWalk()) and modifying it at the same time, for instance by inserting or deleting elements. This leads to segfaults in multithreaded code if e.g. someone calls glTexImage2D (which may have to walk the list of FBOs) while another thread is calling glDeleteFramebuffers on another thread with the two contexts sharing lists. The reason for this is that _mesa_HashWalk() doesn't actually take the mutex that normally protects the hash; it takes an entirely different mutex. Thus, walks are only protected against other walks, and there is also no outer lock taking this. There is an old comment saying that this is to fix problems with deadlock if the callback needs to take a mutex; we solve this by changing the mutex to be recursive. A demonstration Helgrind hit from a real application: ==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1 ==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400 ==13412== at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395) ==13412== by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350) ==13412== by 0x1EE98174: _mesa_HashRemove (hash.c:365) ==13412== by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669) ==13412== by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473) ==13412== by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442) [...] ==13412== This conflicts with a previous read of size 8 by thread #20 ==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318 ==13412== at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415) ==13412== by 0x1EE982A8: _mesa_HashWalk (hash.c:426) ==13412== by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683) ==13412== by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043) ==13412== by 0x1EED9410: teximage (teximage.c:3073) ==13412== by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105) ==13412== by 0x166A68: operator() (mixer.cpp:454) There are many more interactions than just these two possible. Cc: 11.2 12.0 13.0 <[email protected]> Signed-off-by: Steinar H. Gunderson <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
Found by address sanitizer: ==22621==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61400000cbd8 at pc 0x7f561610a4ff bp 0x7ffca85f9d50 sp 0x7ffca85f94f8 READ of size 344 at 0x61400000cbd8 thread T0 #0 0x7f561610a4fe (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x5f4fe) freedreno-zz#1 0x7f560bb305a5 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53 freedreno-zz#2 0x7f560bb305a5 in blob_write_bytes ../../../mesa-src/src/compiler/glsl/blob.c:136 freedreno-zz#3 0x7f560be7d7ff in encode_type_to_blob ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:153 freedreno-zz#4 0x7f560be81222 in write_program_resource_data ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:950 #5 0x7f560be81222 in write_program_resource_list ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1118 #6 0x7f560be81222 in shader_cache_write_program_metadata(gl_context*, gl_shader_program*) ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1407 #7 0x7f560b825fdb in link_program ../../../mesa-src/src/mesa/main/shaderapi.c:1163 Fixes: 073a84f ("glsl: stop adding pointers from glsl_struct_field to the cache") Reviewed-by: Timothy Arceri <[email protected]>
Fixes an assert we start hitting with kms/gbm: #0 0x0000007fbf3d6e3c in raise () from /lib64/libc.so.6 freedreno-zz#1 0x0000007fbf3c4a68 in abort () from /lib64/libc.so.6 freedreno-zz#2 0x0000007fbf3d04e8 in __assert_fail_base () from /lib64/libc.so.6 freedreno-zz#3 0x0000007fbf3d0550 in __assert_fail () from /lib64/libc.so.6 freedreno-zz#4 0x0000007fbf5a73c4 in gbm_dri_bo_create (gbm=0x5820f0, width=2160, height=1440, format=875713112, usage=0, modifiers=0x695e00, count=1) at ../src/gbm/backends/dri/gbm_dri.c:1150 #5 0x0000007fbf5a49c4 in gbm_bo_create_with_modifiers (gbm=0x5820f0, width=2160, height=1440, format=875713112, modifiers=0x695e00, count=1) at ../src/gbm/main/gbm.c:491 #6 0x0000007fbbac3d64 in get_back_bo (dri2_surf=0x6f4cc0) at ../src/egl/drivers/dri2/platform_drm.c:258 #7 0x0000007fbbac4318 in dri2_drm_image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x6f4cc0, buffer_mask=1, buffers=0x7fffffe210) at ../src/egl/drivers/dri2/platform_drm.c:409 #8 0x0000007fbf5a5318 in image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x70e150, buffer_mask=1, buffers=0x7fffffe210) at ../src/gbm/backends/dri/gbm_dri.c:135 #9 0x0000007fbe4308c4 in dri_image_drawable_get_buffers (drawable=0x6fc730, images=0x7fffffe210, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:339 #10 0x0000007fbe430c44 in dri2_allocate_textures (ctx=0x614b30, drawable=0x6fc730, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:466 #11 0x0000007fbe435580 in dri_st_framebuffer_validate (stctx=0x714160, stfbi=0x6fc730, statts=0x6f2660, count=1, out=0x7fffffe3b8) at ../src/gallium/state_trackers/dri/dri_drawable.c:85 #12 0x0000007fbe7b2c84 in st_framebuffer_validate (stfb=0x6f2190, st=0x714160) at ../src/mesa/state_tracker/st_manager.c:222 #13 0x0000007fbe7b4884 in st_api_make_current (stapi=0x7fbf0430d8 <st_gl_api>, stctxi=0x714160, stdrawi=0x6fc730, streadi=0x6fc730) at ../src/mesa/state_tracker/st_manager.c:1074 #14 0x0000007fbe434f44 in dri_make_current (cPriv=0x703c20, driDrawPriv=0x704490, driReadPriv=0x704490) at ../src/gallium/state_trackers/dri/dri_context.c:301 #15 0x0000007fbe42c910 in driBindContext (pcp=0x703c20, pdp=0x704490, prp=0x704490) at ../src/mesa/drivers/dri/common/dri_util.c:579 #16 0x0000007fbbabab40 in dri2_make_current (drv=0x69d170, disp=0x69c6e0, dsurf=0x6f4cc0, rsurf=0x6f4cc0, ctx=0x70cb40) at ../src/egl/drivers/dri2/egl_dri2.c:1456 #17 0x0000007fbbaa8ef4 in eglMakeCurrent (dpy=0x69c6e0, draw=0x6f4cc0, read=0x6f4cc0, ctx=0x70cb40) at ../src/egl/main/eglapi.c:862 #18 0x0000007fbf5736ac in InternalMakeCurrentVendor (dpy=dpy@entry=0x614fb0, draw=draw@entry=0x6f4cc0, read=read@entry=0x6f4cc0, context=context@entry=0x70cb40, apiState=apiState@entry=0x6fc940, vendor=0x6975f0) at libegl.c:861 #19 0x0000007fbf573764 in InternalMakeCurrentDispatch (dpy=0x614fb0, draw=0x6f4cc0, read=0x6f4cc0, context=0x70cb40, vendor=0x6975f0) at libegl.c:630 #20 0x0000000000403640 in init_egl (egl=0x5805a8 <gl>, gbm=0x580528 <gbm>, samples=0) at ../common.c:263 #21 0x0000000000403c1c in init_cube_smooth (gbm=0x580528 <gbm>, samples=0) at ../cube-smooth.c:225 #22 0x0000000000408618 in main (argc=1, argv=0x7fffffe8d8) at ../kmscube.c:145 Fixes: 1ce5d75 freedreno: core buffer modifier support Signed-off-by: Rob Clark <[email protected]>
This might be a long shot - more a question than a bug - but how much work would it take to be able to advertise GL 2.x support? I have no idea how many of the extensions missing from OpenGL ES 2 are needed for GL 2, let alone what it'll take to implement them, but from what little I've read - specifically about the Intel GMA950 drivers going from only supporting GL 1.5 to 2.1 - coding support for occlusion queries might make this possible.
Just a hunch from a quick review of the code. I wish I understood half of it...
The text was updated successfully, but these errors were encountered: