Branch: rust_reboot/main/triggerparentdelayedexit_optim
63 Commits over 31 Days - 0.08cph!
Update: set TriggerParent.TickMode to 1 by default
- codegen
Tests: booted up craggy in editor, checked that it's set to 1 and called at boot
Merge: from main
Tests: compiles
Merge: from main
Tests: compiles
Clean: remove TriggerParentDelayedExit.allow_tick_skipping support
- codegen
Originally this was an optim, but I discovered a bug that kept it permanently disabled. If we need it, I'll reimplement it for TickMode 1
Tests: compiles
Update: Tickmode 1 - add allowtriggersleeping optim killswitch
- codegen
Tests: built a boat, set tickmode 1, checked in profiler what runs. set AllowTriggerSleeping 0, checked profiler - it started running queries
Optim: TickMode 1 - allow triggers to sleep if they don't move on specific axis and internal entities also don't move on specific axis
- took out internals of BaseEntity.HasMovedInLS to BaseEntity.ComparePos(Vec3 from, to)
- Player Boats now set interest in XZ when alive, XYZ when sinking
- Buildfix for SERVER only code (whoops)
This brings the 100-boat test to 0.25ms (down from 1.6ms, -85%; or down from 3.1ms in TickMode 0, -92%)
Tests: built a boat, set tickmode 1, jumped on and off - got unparented. turned on the engine, jumped off - got unparented once in the water. noclipped above with engine on - got unparented after a delay. Spawned 100 boats.
Update: BaseEntity now remembers its last LS position and can report on which axis it has moved
- added unit test for it
Needed for upcoming parent trigger optim
Tests: ran unit test
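Note: a minimal sketch of the per-axis movement report and how the trigger-sleeping optim could consume it, written in Python rather than the game's C#. The names `Axis`, `compare_pos`, `can_sleep` and the epsilon value are assumptions made for illustration, not the actual BaseEntity API.

```python
from enum import IntFlag


class Axis(IntFlag):
    NONE = 0
    X = 1
    Y = 2
    Z = 4


def compare_pos(prev, curr, eps=1e-4):
    """Return a mask of the axes on which the position changed by more than eps."""
    moved = Axis.NONE
    for flag, a, b in zip((Axis.X, Axis.Y, Axis.Z), prev, curr):
        if abs(b - a) > eps:
            moved |= flag
    return moved


def can_sleep(moved, interest):
    """A trigger may sleep when nothing moved on the axes it is interested in."""
    return (moved & interest) == Axis.NONE
```

With an XZ interest, as a live boat would set, a purely vertical bob does not wake the trigger, while an XYZ interest (sinking boat) does.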
Optim: reduce branching in BaseEntity.HasEntityInParents
Saves a tiny amount (60micros for 600 checks), but the change is relatively trivial and is used in a bunch of places
Tests: spawned 100 boats with 3 sleepers
Optim: TriggerParent.TickMode 1 - batch swimming checks
100 boats now take 1.6ms max (down from previous 2.1ms, -24%) vs 3.1ms (-49%) of TickMode 0.
Tests: built a boat, sank it by shooting, got unparented
Update: expose StableObjectCache internal T[]
- had to introduce approx equals to WaterInfo as unit tests were failing (why now?)
- consolidated common logic to fix missing shoreline check in batched version (we need unit tests as part of builds)
Need it to simplify code around water factor batching
Tests: unit tests
Update: adding a TODO to enable minor logic caching
Tests: none, trivial change
Update: TriggerParentDelayedExit cleanup logic only runs when trigger is disabled
We always ran cleanup when OnEmpty got invoked, even if we decided to delay the exit, which disabled the delay functionality
Tests: built a wonky ship and jumped around gaps - parenting was in effect long enough
Bugfix: TickMode 1 - add missing overrideOtherTriggers check
Tests: none, trivial change
Update: remove dead vars and cache == null check
Tests: none, trivial change
Update: add paste_line <name> [count = 1] [offset = 1] - will spawn the clipboard in a line using player's facing with offset-sized spacing
- also if pasting players, make sure they run their ForceTriggerUpdate
- codegen
Tests: spawned 100 boats in a line
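Note: a rough sketch of how paste_line placement could be computed, in Python rather than the game's C#. The function name, the yaw convention (yaw 0 faces +Z) and the choice to put the first paste at the origin are assumptions; the commit does not spell these out.

```python
import math


def line_positions(origin, yaw_degrees, count=1, offset=1.0):
    """Return `count` positions spaced `offset` apart along the facing direction."""
    yaw = math.radians(yaw_degrees)
    # Assumed convention: yaw 0 looks down +Z, yaw 90 looks down +X.
    forward = (math.sin(yaw), 0.0, math.cos(yaw))
    return [
        tuple(o + f * offset * i for o, f in zip(origin, forward))
        for i in range(count)
    ]
```

For example, `line_positions((0, 0, 0), 0, count=100, offset=10)` would give 100 spawn points along +Z, 10 units apart.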
Bugfix: TriggerParentDelayedExit::SupportsTickSkipping was always evaluating to false
- switched it to be off by default, as it was likely off during my earlier testing
Tests: none, trivial change
Bugfix: VerifyRays - avoid trying to read hit array outside of working area
I assumed we always call it with tightly-sized arrays, but that's not the case on client
Tests: shot in the sky with a couple bursts - no NREs
Optim: TraceRaysUnordered - optimize collider validation using burst jobs
- added UtilityJobs.FlipBoolJob and ScatterToJob<T>
This brings costs down across the board:
* 128 rays - 0.37ms -> 0.25ms, 33% improvement (serial was 0.49ms, 49%)
* 1k rays - 2.59ms -> 1ms, 61% (serial was 4.33ms, 77%)
* 8k rays - 18.9ms -> 6.26ms, 67% (serial was 34.79ms, 82%)
Tests: unit tests
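Note: the percentages above follow the usual (before - after) / before formula. A small Python check, using only the rounded timings quoted in this commit (so the last digit can differ by one from figures computed on unrounded measurements):

```python
def improvement(before_ms, after_ms):
    """Relative time saved, as a percentage of the earlier timing."""
    return (before_ms - after_ms) / before_ms * 100.0


# ray count: (serial, previous batched, new batched), all in ms
timings = {
    "128": (0.49, 0.37, 0.25),
    "1k": (4.33, 2.59, 1.0),
    "8k": (34.79, 18.9, 6.26),
}

for rays, (serial, prev, new) in timings.items():
    print(
        f"{rays} rays: {improvement(prev, new):.0f}% vs previous, "
        f"{improvement(serial, new):.0f}% vs serial"
    )
```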
Optim: TraceRaysUnordered - run water traces in parallel to raycasts
Not the best impl, but shows improvement for smaller ray counts (where we're not dominated by Verify):
* TraceRaysUnordered - 128rays: 0.44ms -> 0.37ms, 1k+ rays same
* TraceRays - 128rays: 0.56ms -> 0.50ms, 1k+ rays same
Can apply the same to sphere casts as well.
Tests: unit tests
Optim: TriggerParent.TickMode 1 - use cached water and ladder results in UsePlayerUpdateJobs 2 env
- codegen
This should allow us to skip ~30% of the overall runtime. Can be disabled via TriggerParent.UsePlayerV2Shortcuts 0 (defaults to 1)
Tests: none, will check tomorrow once brain is fresh
Optim: TriggerParent.TickMode 1 - avoid unnecessary entity lookups in RunClippingChecks
Microoptim, but why not
Tests: none, trivial change
Optim: TriggerParent.TickMode 1 - skip RunClippingChecks for entities that failed RunCheckForObjUnderFeet
Tests: none, trivial change
Clean: remove a couple TODOs
Was worried about a bug, but after deeper scrutiny I think the code is correct
Tests: none, trivial change
Clean: refactor TriggerParent.ShouldParentEntitiesJobs
- Added profiling scopes
No functional changes. This makes data flow easier to track and manage, and helps visualizing stages in perf snapshots.
Tests: built a long-boat in tickmode 1, spawned 6 players on it, went to edit mode and back, jumped on-and-off as it was moving
Bugfix: Add missing Native collection disposal in GamePhysics
Audited all of GamePhysics; these look to be the only cases that slipped in
Tests: unit tests + built a boat with tickmode 1
Optim: use persistent buffers for TriggerParent::RunCustomJobsQueue
- codegen
Tests: with tickmode 1 - built a boat, jumped on-off
Clean: rip out tickmode 1 and 2, rename Jobs mode as tickmode 1
- codegen
Out of all versions it's fastest, so only going to continue with it vs baseline 0
Tests: compiles
Bugfix: properly clean-up TriggerParent and TriggerParentDelayedExit when it gets disabled
Reimplements intents of `139965`.
Tests: built & finished a boat, jumped around while moving, put it into edit, finished, jumped around - no errors, no extra invokes
Bugfix: TriggerParentDelayedExit - skip entity-delay logic when trigger gets disabled
This left invalid invokes running when editing existing player boat, inflating the perf cost
Tests: edited existing boat, checked profiler - saw no perf samples for persistent queue
Bugfix: early out when double-remove happens
Not sure why yet, but saw the same with double-add before, so going to replicate. Also noticed that we can be tracked, but have null-entitycontents, which leads to wasted Invokes, adding overhead - checking
Tests: built boat, spawned npc, went to edit mode - no more errors
Update: Codegen
Tests: ran "check compile errors"
Optim: in tickmode 3, run OverlapOBB and TraceRealm queries in parallel
Should help in many-boats-players scenarios, in theory
Tests: built a boat and jumped around as it was moving
Bugfix: SelectNearestNHitsJob - ensure we emit an end if we couldn't select requested number of hits
- Added a couple of early returns to avoid div by 0 issues
Tests: rode a boat with tickmode 3
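Note: a minimal sketch of the "emit an end" behaviour from the SelectNearestNHitsJob fix, in Python rather than a Burst job. The `END` sentinel, the `(distance, name)` hit tuples and the function name are assumptions for illustration; the real job's data layout is not shown in this log.

```python
import heapq

END = None  # assumed sentinel meaning "no more valid hits"


def select_nearest(hits, n):
    """Return up to n nearest hits, followed by END when fewer than n were found."""
    if n <= 0:
        return []  # early return, avoids any per-ray division by a zero count
    out = heapq.nsmallest(n, hits, key=lambda h: h[0])  # hits are (distance, name)
    if len(out) < n:
        out.append(END)  # readers stop at the first END instead of reading stale slots
    return out
```

Without the sentinel, a consumer that scans a fixed-size result slice has no way to tell fresh hits from leftovers of a previous query.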
Update: initial full version of TriggerParent.ShouldParentEntitiesJobs
inlines ToClipping so that we can batch OverlapOBB, but doesn't batch expensive-but-rare checks - need to see if we need this during playtest
Tests: built a boat, finished it, enabled tickmode 3, tried jumping on/off the boat
Update: Added GamePhysics.OverlapOBBs
- added basic consistency unit test: TestOverlapOBBsConsistency
Tests: ran unit test
Bugfix: GamePhysics.TraceRealmRays now considers non-entity hits as valid
- fix build failures in CLIENT only mode
- also added option to specify whether to run water query or not
Tests: ran unit tests (C+S and C separately) - they pass
Tests: rework GamePhysicsTests.TraceRealms to work in C+S and C modes
No longer permitting running S realm tests in C editor (as that created impossible spawn scenarios)
Tests: ran unit test in both modes
Tests: add TestTraceRealmRays unit test
Shows that there's a bug with TraceRealmRays - will submit next
Tests: ran unit tests
Optim: Use burst to generate sort jobs
Despite replacing the previous optim, this saves us another 0.1ms on top of it (4.25ms -> 4.15ms avg for 2k rays). Left a comment that we need alloc-free tasks to optim further (scheduling sort jobs is expensive, but thread safe).
Tests: ran unit tests
Optim: manually kick off sort jobs to worker threads when enough is accumulated
On a 2k TestTraceRays this saves us 0.15ms (4.4ms -> 4.25ms). Building jobs using burst might be faster though, gonna try that next.
Tests: ran unit tests
Bugfix: ensure we free counts array only after all sorting jobs are done
Tests: ran unit tests (though they didn't catch it before)
Tests: reduce ray count for Perf.TestTraceRay/-s to 2k
8k is too much, hangs serial rays. 2k is already very bad (3.4s avg per run for serial, 7.1ms for batched)
Tests: reran tests