Attendance:
- Cary Phillips
- Christina Tempelaar-Lietz
- John Mertic
- Joseph Goldstone
- Kimball Thurston
- Larry Gritz
- Nick Porcino
- Peter Hillman
- Rod Bogart
Guests:
Li Ji, ILM
Alex Wells, Intel
Pierre-Anthony Lemieux, Sandflow Consulting
Michael Smith, Intel
Discussion:
3.3.2 release
Kimball: no outstanding issues
Cary will stage a release this weekend
Build test suite, compile against new version, running dynamically linking against old version, if test suite passes, then say it’s ABI compatible
Security policy
What is our stated policy about how many back versions we are willing to patch?
Kimball: 3 vfx platforms worth, if platform skips release, would have to support the in-between releases as well.
Website incorrect, change wording to patch “only upon request” for other versions
RH8 still alive, not sure what version is on their system
May need to help the long-lived RHEL releases too.
Save more conversation on this topic to include Larry.
Imath: Vector index operator, cast union trick
https://github.com/AcademySoftwareFoundation/Imath/issues/446
Kimball: want to understand support of nameless union construct but it’s not part of C++. Not til C++23, we won’t be able to support for a while.
Alex W: Dynamic indexing: what is happening today - implementation takes address of a float, then uses array index. Only legal to dereference within size of object for which you took an address. Anything else is UB.
Alex: Examples using Compiler Explorer to look at generated code:
Example: Take 2 vec3s shove into box, asking for center of box, invokes operator, called Vec3Fix, derived from Vec3. Supposed to be taking 2 vec3s adding together multiply by 5 but only see 1 multiply rather than 2.
Change implementation slightly : take the ‘this’ pointer instead of address of vec3. reinterpret cast to larger type (missed this… 12 bytes instead of 3) now it isn’t UB. reinterpret_cast brings up aliasing issues.
Dynamic array address must be in memory or on stack, greatly affects code gen.
Extra code being generated to store it, then have dynamic access.
Move this code to a GPU or in a vectorized loop SIMD4 wide, setup to avoid SOA: when it’s loading box doing nice wide moves, 4 items at a time, multiplies 4 at a time. But the array access needs a float pointer … has to be pushed back to the stack so we can use the accessor on it. Then has to do array indexing 4 separate times . All because it needed to be stored back on stack.
Impact on GPU - it requires private memory to store it. It won’t be stored in the register file - BIG HIT. Want to see all my loads and stores on global memory pointers, in register file.
“Select implementation” is another option: uses select_when template to only use the vec3 directly. 6 floats read in, writing out 1 float. Select_when gets compiled away. Supporting a dynamic index at all, leaves open to some horrible code gen when you go to SIMD. On GPU requires private memory for dynamic array access - chunks of your variables not stored in register file as you hoped.
With select_when approach, have masking support.
Even if you can make dynamics well-defined behavior, might want to support select instead and not even support dynamic access.
Kimball: is there a way of saying a parameter argument must be a literal constant, e.g. a 0, 1, or 2.
Unfortunately language doesn’t give that to you.
Can declare a const but it makes the compiler do work to optimize away. Relying on the optimizer.
Kimball: big performance degradation in debug mode then.
More examples: intersection box and vector. Box has generic implementation using dynamic array access. Generates a lot of code, not optimized well. Could change these to constant indices and that will give a better debug experience. Could keep generic algorithm if you got it to statically unroll.
Kimball: challenge because we don’t know what people are doing with it in the real world
Kimball in chat: My real fear is that someone has something like
Vec3<float> mat3x3[3];
Vec3<float> &first = mat3x3[0];first[7];
Li Ji in chat: So basically in a very crude way:
#Marco: [0] := this.x; [1] := this.y; [2] := this.z
So nothing is dynamic and the index has to be resolved at compile timeKimball: wondering if we should have a .@ semantics instead.
Cary: in terms of where we are now, we have problematic code that’s been out there 20 years, but recently ran into real world case where it fails. Presumably luck has run out and we need to do something about this. There’s a compiler that gives bad results with this. Doing nothing not a viable option.
Cary: are we committed to coming up with solution that requires no end user changes.
Kimball: union approach
Alex: might depend on rules around nameless unions. Previously what last data member you wrote to was the last data member.
Kimball: issue is requires C++23 to be a viable solution
Alex: way better code gen when dealing with direct data members. Refactored code to get rid of array subscript as much as possible. Select_when approach will get rid of other code paths. Adding support for const indices - const_eval might be able to use with array subscripts so you know at compile time. const_eval is front end of the compiler. Code elimination will get rid of the code on the front end. Will also happen in debug mode, done by compiler. Great way to allow array subscripts access data member directly. Non-const eval, can use selection instead.
Alex: might consider short-term correction, access the ‘this’ pointer instead of the ‘float’ pointer to get rid of one undefined access.
Cary: that will give us correctness at the cost of performance.
Kimball: aliasing issue… because it’s in the lifetime, can we not use ‘restrict’?
Alex: aliasing situation: restrict tells compiler this won’t interact with anybody else. This is the opposite problem - going to access some element of it, that’s array-based. Compiler may think distinct ways of access y element, e.g. tv.y and tv[axis] , are 2 different accesses so it can’t optimize. Compiler, based on types, could think these 2 parts of memory are unrelated. You want the opposite of restrict.
Alex: tv is non-const so it’s going to return a reference. It’s an aliasing issue you might run into, possibly only under SIMD gen. Creating a situation where things can go awry. More correct use of unions you’re not using it to switch types. If you are switching types, putting it through memory. With “select” you can get the best of both worlds.
Alex happy to review a PR, too busy to create the PR.
Kimball: Will check with Larry, need to be sure select works well with gcc.
Cary: continue this discussion next week.
Do reinterpret cast, commit that, correctness fix. Then do select fix… or do we go straight to select fix?
Alex: could have a build option.
Kimball: could do reinterpret cast for this release, then ‘select’ for next release.
Peter: could wrap it in an ifdef for a patch as well.
Cary: make reinterpret cast in release, add select change to main.
Peter: option to go back to ‘this’ option if there is a performance, all inline code, don’t need to reinstall any libraries with the change
Cary: should eradicate all dynamic indexing.
Peter: put comments around the operator with warning
Pierre-Anthony Lemieux
Will send the JPEG 2000 slides to the Slack in preparation for the next call
Security issue - null dereference write, deep scanline
Peter will send to Kimball
hold off on patch release until we get to this issue
Kimball fixed in scanline probably but not deep scanline, will try to fix tomorrow.