850 Commits over 243 Days - 0.15cph!
Optim: GetWaterFactors is converted to indirect style throughout
- Added an edge case to all tests to validate correct bounds operation (spotted a bug I recently added)
This is the last optim I'll be pursuing - the 10k points test now runs at ~6ms compared to 28ms in serial mode. Next up is to add a couple more benchmarks and go through todo cleanup.
Tests: ran unit tests and played back staging demo twice
Update: Propagating indices to GetWaterLevels
- Updated TestWaterLevelsConsistency to validate various permutations
Tests: unit tests
Update: converting GetWaterLevels to indirect form
- Like previous CL, operates on an internal forward dummy range
- sprinkled ReadOnly accessors in prep for clean up
Time to start bridging the indices across the calls.
Tests: unit tests
Update: converting GetWaterInfos to indirect form
- Not propagating indirection params yet - running on dummy forward range
- Sprinkled ReadOnly usage in prep for cleanup
Tests: ran unit tests
Update: make GetWaterFactors internal NativeArrays persistent
Tests: unit tests
Optim: prepare params for WaterLevel.GetWaterInfos using a burst job
Helps shrink the managed loop, and gives us minor savings - about 0.5ms for the 10k case. Also prepares us for the last indirect conversion.
Tests: ran unit tests
Optim: GetWaterFactors now maintains a stable transform cache
- We now use a burst job to gather transform info for all players (ends up being fast enough to not care)
- Updated perf test to account for a "warmup" run, since we incrementally build the transform cache
Previous updates slowed down code from 9ms to 12ms, this optim brings us back to ~7ms for 10k cases and enables more optims
Tests: unit tests
Update: rewrite FinalizeTickParallel in indirect form
- GetWaterFactors interface adapted to take indices, but internals are still in gather->process->scatter
- Expanded TestWaterFactorsConsistency to cover for various indirection scenarios
This simplifies the internals of FinalizeTickParallel and allows for more Burst jobs, and it also opens the way to converting GetWaterFactors to indirect mode and to state caching.
Tests: unit tests + multiple server demo playback (numbers within range)
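For anyone unfamiliar with the terms, here is a rough sketch of gather->process->scatter versus indirect form. It is a Python illustration only, not the project's C# code; `water_factor`, the sample data and both function names are hypothetical stand-ins. A "dummy forward range" in this picture would simply be the index list 0, 1, 2, ... covering every element.

```python
def water_factor(depth: float) -> float:
    """Hypothetical per-element work: clamp a depth value to [0, 1]."""
    return max(0.0, min(1.0, depth))


def process_gather_scatter(depths, indices, out):
    # Gather: copy the selected elements into a dense temporary buffer.
    temp = [depths[i] for i in indices]
    # Process: run over the dense buffer.
    temp = [water_factor(d) for d in temp]
    # Scatter: write the results back to their original slots.
    for i, value in zip(indices, temp):
        out[i] = value


def process_indirect(depths, indices, out):
    # Indirect: read and write through the index list, no temporary copies.
    for i in indices:
        out[i] = water_factor(depths[i])


depths = [0.2, 1.7, -0.4, 0.9]
indices = [3, 1]  # only these elements need an update
out_a = [None] * len(depths)
out_b = [None] * len(depths)
process_gather_scatter(depths, indices, out_a)
process_indirect(depths, indices, out_b)
```

Both variants leave untouched slots alone and produce the same output; the indirect one just skips the repacking step.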
Update: use PlayerCache for ServerUpdateParallel
Should enable us to optimize for data persistence - will attempt next CL.
Tests: played back staging server demo twice, got same analytics numbers
Add: PlayerCache utility
- Comes with unit tests
Intrusive cache that tracks if elements inside moved. Building block to allow persisting state between frames.
Tests: ran unit tests
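One way to picture "an intrusive cache that tracks if elements inside moved" is sketched below. This is a guess at the general idea in Python, not the actual PlayerCache: the class names, the `cache_index` field, the swap-remove policy and `take_moved` are all assumptions made for illustration.

```python
class Player:
    def __init__(self, name):
        self.name = name
        self.cache_index = -1  # intrusive: the element stores its own slot


class PlayerCacheSketch:
    """Hypothetical dense cache that records which slots changed occupant."""

    def __init__(self):
        self.items = []
        self.moved = set()  # slots whose occupant changed since the last read

    def add(self, player):
        player.cache_index = len(self.items)
        self.items.append(player)
        self.moved.add(player.cache_index)

    def remove(self, player):
        slot = player.cache_index
        last = self.items.pop()
        if last is not player:
            # Swap-remove: the last element moves into the freed slot.
            self.items[slot] = last
            last.cache_index = slot
            self.moved.add(slot)
        # The old tail slot no longer exists, so it cannot be "moved".
        self.moved.discard(len(self.items))
        player.cache_index = -1

    def take_moved(self):
        # Return the changed slots and reset tracking for the next frame.
        moved, self.moved = sorted(self.moved), set()
        return moved
```

Per-slot state kept alongside such a cache would only need refreshing for the slots reported by `take_moved`, which is the sense in which it could act as a building block for persisting state between frames.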
Optim: Calculate water factors via a Burst job
- Job generates factors off by less than 1mm, so I had to switch asserts to approx equal
Tests: ran unit tests
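A minimal sketch of what an approx-equal assert of this kind can look like, in Python rather than the project's test framework. It assumes the 1mm figure is applied directly as an absolute tolerance; `TOLERANCE_M` and `assert_factors_close` are hypothetical names.

```python
import math

TOLERANCE_M = 1e-3  # assumption: 1 mm used as an absolute tolerance


def assert_factors_close(expected, actual, tol=TOLERANCE_M):
    """Compare two result lists element by element within an absolute tolerance."""
    assert len(expected) == len(actual), "length mismatch"
    for i, (e, a) in enumerate(zip(expected, actual)):
        # rel_tol=0.0 so only the absolute difference is considered.
        assert math.isclose(e, a, rel_tol=0.0, abs_tol=tol), f"index {i}: {e} vs {a}"
```

An absolute tolerance is used here because the values sit near zero as well as near one, where a relative tolerance would behave unevenly.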
Update: Pass NativeArrays to GetWaterFactors instead of Spans
Tests: ran unit tests
Update: replacing BasePlayer.FinalizeTickParallel internal buffers with NativeArrays
- Also renamed WaterFactors to GetWaterFactors (to free up a name for a static array)
This should enable conversion of internals of GetWaterFactors to Burst jobs
Tests: ran the staging demo playback - same InWater and OutOfWater counts
Tests: new PerfSerialWaterFactor and PerfBatchWaterFactors perf tests
Also just realised that all my perf tests are not doing 1k point tests, but 10k. Whoops. I'll keep it.
Tests: ran new perf tests
Tests: add TestWaterFactorsConsistency test
- Converted other tests to use BasePlayer instead of BaseEntities
- Also exposed WaterFactor methods from BasePlayer (other overloads were already public)
Doesn't stress all paths, only the main path that has updated logic
Tests: ran all unit tests
Clean: adding ext to safely clean-up NativeReference
Tests: ran unit tests
Bugfix: don't leak WaterLevel's persistent allocations
Currently only allocating on Server, since no other code actively relies on this.
Tests: ran unit tests. Started Craggy in editor with leak detection - no leaks pointing to WaterLevel statics
Clean: remove dead batch method
Tests: compiles in SERVER+CLIENT
Clean: simplify GetWaterLevels code by using NativeArray.Expand
Tests: ran unit tests
Optim: use persistent buffers inside GetWaterInfos
- Still have a TODO on managing their lifecycle
- Improved NativeArray.Expand to allow skipping of copying and using uninitialized allocs (opt-in)
Very minor effect on timings, but allows us to avoid sync points in managed runtime, should we go towards these optims.
Tests: ran unit tests
Optim: replace WaterInfo resolving managed loop with a Burst job
1k sample point GetWaterInfos perf test runs in 1.9ms (previous was 2.45ms)
Tests: ran unit tests
Optim: use Burst jobs to process results from GetIgnore entity heads
Only one managed loop left
Tests: ran unit tests (though I don't have a test case for entities partially within WaterVolume)
Clean: replace a magic number with a named constant
Tests: unit tests
Optim: replace secondary query setup managed loop with a Burst job
Tests: unit tests
Optim: recache water heights from WaterVolumes using a Burst job
- also fixes a bug with invalid indexing that I introduced earlier today
Tests: unit tests
Bugfix: fix TestWaterInfosConsistency test using 0 sized bounds for WaterVolumes
Noticed a bug while converting internal logic to burst job that wasn't picked up by test
Tests: ran unit tests, now correctly detects an issue - will fix next.
Optim: replace couple internal managed loops with Burst jobs
1k perf test for GetWaterInfos - 2.45ms (vs previous 3.9ms)
Tests: ran unit tests
Update: use GetIgnoreIndirect(vec3, float, ...)
Further simplifies code and reduces the number of temp NativeArrays we need
Tests: unit tests
Update: Use GetIgnoreIndirect(vec3, vec3, ...) to further simplify code
As a bonus this avoids a bit of data repacking and NativeArray creation - there's more to come
Tests: ran unit tests
Update: use WaterTestFromVolumesIndirect to simplify code
This could be a small optimization, since we have to move less data around, but not focusing on that - just trying to reduce complexity.
Tests: ran unit tests
Tests: TestWaterInfosConsistency now generates fake WaterVolumes
Need this since I'm starting to modify the "volumes" path as well.
Tests: ran the unit tests
Update: replacing a bunch of GetWaterInfos internal BufferLists with NativeArrays
Preparing to replace more of managed logic with burst jobs
Tests: Ran unit tests
Optim: replace height fetching with a burst job
Tests: unit tests
Optim: replace initialization with a burst job
Tests: ran unit tests
Update: replace Spans with NativeArray in WaterLevel.GetWaterInfos
Will allow us to pursue using burst jobs internally.
Tests: tests passed
Update: leaving an optim todo idea comment
Tests: not applicable
Update: sprinkle some profiling scopes
Tests: ran unit tests
Optim: use persistent allocs in GetWaterLevels
- I need to properly clean those up at server shutdown, but I'll solve that later
1k waves perf test shows ~100 micros savings and no allocs - 1.45ms vs old 1.55ms. This is the final optim in the area for now, making us ~80% faster than vanilla managed code (8.6ms).
Tests: ran unit tests
Optim: remove last managed loop that picks between dynamic waves and static water
1k test with waves shaves off ~0.25ms - 1.55ms vs previous 1.8ms
Tests: ran unit tests
Optim: gather coarse distances to shore via indirect batch
1k waves perf test shows another ~0.5ms shaved - from 2.3ms to 1.8ms
Tests: ran unit tests
Optim: grab TerrainHeights via indirect batch
1k waves perf test used to take 3ms, now 2.3ms
Tests: ran unit tests
Update: gathering OceanSim's water heights in indirect way
- Also fixed the OceanSim's GetHeightsJobIndirect job as it works on world positions, not uvs
This allows me to convert the rest of the logic to Burst jobs.
Tests: ran unit tests
Update: Bunch of utility Burst jobs for WaterLevel.GetWaterLevels outstanding jobification
Tests: none, they are not plugged in
Update: Use NativeArray for TerrainTexturing.ShoreVector storage of distances and vectors
- Had to add an editor-only safety check for WaterCamera for a super rare exception
Tests: Played procgen and Craggy in editor. Forced a bunch of domain reloads to validate WaterCamera doesn't break anymore
Update: perf tests for WaterInfo/-s
For 1k sample points - batch version is 4x (no waves) / 3x (with waves) faster than serial
Tests: not applicable
Update: adding perf tests for GetWaterLevel/-s
- Doesn't have cases that have water volumes, so slowest path not stressed in both cases
On 1k locations batch version is 10x faster than serial (130 micros vs 1.18ms) with no waves, with waves - 3x faster (3ms vs 9ms)
Tests: not applicable
Optim: replacing GetWaterLevels hot path with burst jobs
- Converted some internal NativeArrays to persistent, lazy growing ones to reduce allocation overhead
On a 25 player case this shows a 25% improvement (5 micros), though the sample size is too small (not enough players to check). I need a good way to test this at larger scales, or these 5min waiting times will murder me.
Tests: ran unit tests, ran staging demo multiple times, water checks counters in the expected range.
Bugfix: properly access count from CoarseQueryGridBoundsJobIndirect
Tests: ran unit tests