973 Commits over 274 Days - 0.15cph!
Bugfix: fix TestWaterInfosConsistency test using 0 sized bounds for WaterVolumes
Noticed a bug while converting internal logic to Burst jobs that wasn't picked up by the test
Tests: ran unit tests, they now correctly detect an issue - will fix next.
Optim: replace a couple of internal managed loops with Burst jobs
1k perf test for GetWaterInfos - 2.45ms (vs previous 3.9ms)
Tests: ran unit tests
Update: use GetIgnoreIndirect(vec3, float, ...)
Further simplifies code and reduces the number of temp NativeArrays we need
Tests: unit tests
Update: Use GetIgnoreIndirect(vec3, vec3, ...) to further simplify code
As a bonus this avoids a bit of data repacking and NativeArray creation - there's more to come
Tests: ran unit tests
Update: use WaterTestFromVolumesIndirect to simplify code
This could be a small optimization, since we have to move less data around, but not focusing on that - just trying to reduce complexity.
Tests: ran unit tests
Tests: TestWaterInfosConsistency now generates fake WaterVolumes
Need this since I'm starting to modify the "volumes" path as well.
Tests: ran the unit tests
Update: replacing a bunch of GetWaterInfos internal BufferLists with NativeArrays
Preparing to replace more of the managed logic with Burst jobs
Tests: Ran unit tests
Optim: replace height fetching with a burst job
Tests: unit tests
Optim: replace initialization with a burst job
Tests: ran unit tests
Update: replace Spans with NativeArray in WaterLevel.GetWaterInfos
This allows me to pursue using Burst jobs internally.
Tests: tests passed
Update: leaving an optim todo idea comment
Tests: not applicable
Update: sprinkle some profiling scopes
Tests: ran unit tests
Optim: use persistent allocs in GetWaterLevels
- I need to properly clean those up at server shutdown, but I'll solve that later
1k waves perf test shows ~100micros savings and no allocs - 1.45ms vs old 1.55ms. This is the final optim in the area for now, making us ~80% faster than vanilla managed code (8.6ms).
Tests: ran unit tests
Optim: remove last managed loop that picks between dynamic waves and static water
1k test with waves shaves off ~0.25ms - 1.55ms vs previous 1.8ms
Tests: ran unit tests
Optim: gather coarse distances to shore via indirect batch
1k waves perf test shows another ~0.5ms shaved - from 2.3ms to 1.8ms
Tests: ran unit tests
Optim: grab TerrainHeights via indirect batch
1k waves perf test used to take 3ms, now 2.3ms
Tests: ran unit tests
Update: gathering OceanSim's water heights in an indirect way
- Also fixed the OceanSim's GetHeightsJobIndirect job as it works on world positions, not uvs
This allows me to convert the rest of the logic to Burst jobs.
Tests: ran unit tests
Update: A bunch of utility Burst jobs for the outstanding jobification of WaterLevel.GetWaterLevels
Tests: none, they are not plugged in
Update: Use NativeArray for TerrainTexturing.ShoreVector storage of distances and vectors
- Had to add an editor-only safety check for WaterCamera for a super rare exception
Tests: Played procgen and Craggy in editor. Forced a bunch of domain reloads to validate WaterCamera doesn't break anymore
Update: perf tests for WaterInfo/-s
For 1k sample points, the batch version is 4x (no waves) / 3x (with waves) faster than serial
Tests: not applicable
Update: adding perf tests to GetWaterLevel/-s
- Doesn't have cases with water volumes, so the slowest path isn't stressed in either case
On 1k locations the batch version is 10x faster than serial with no waves (130micros vs 1.18ms) and 3x faster with waves (3ms vs 9ms)
Tests: not applicable
Optim: replacing GetWaterLevels hot path with burst jobs
- Converted some internal NativeArrays to persistent, lazy growing ones to reduce allocation overhead
The 25-player case shows a 25% improvement (5micros), though the sample size is too small (not enough players to check). I need a good way to test this at larger scales, or these 5min waiting times will murder me.
Tests: ran unit tests, ran staging demo multiple times, water checks counters in the expected range.
Bugfix: properly access count from CoarseQueryGridBoundsJobIndirect
Tests: ran unit tests
Merge: from erosion
Bringing across TerrainMap NativeArray conversion that I need for my batch checks
Tests: reran TerrainMap tests
Merge: from main
Aligning with erosion branch parent CL
Tests: none, no conflicts
Update: added GetTopologies jobs for TopologyMap
- Also covered with tests
- Sprinkled [WriteOnly]/[ReadOnly] attributes into jobs
- Fixed constant string allocation in tests
Tests: ran unit tests
Update: adding WaterMap height sampling burst jobs
- Also covered them with tests to validate output against non-job calls
Tests: ran unit tests
Merge: from terrainmap_nativearray2
- Fixed an invalid reinterpret bug
Bringing across my tests since we're both working on the same thing.
Tests: ran the new tests
Add: covering TerrainMap public api in tests
Prep for switching over to NativeArray
Tests: ran the new tests
Optim: skip issuing 0-length WaterCollision.GetIgnore jobs
Tests: ran unit tests
Optim: make TerrainCollision.GetIgnore and WaterCollision.GetIgnore use indirect Burst jobs
- Also added a bunch of optim TODOs
Starting to build up an indirect collection of methods. Next up, I'll convert the related GamePhysics calls
Tests: ran unit tests - they passed. Played back staging demo multiple times with analyzedemo - got comparable in-water counts
New: add WaterStateProcessor for full server demo analysis
- Also redid ViolationProcessor to avoid leaking internal implementation to other files
Tracks how many players across all server frames were in water - using it to track consistency of water checks while modifying internals
Tests: played back staging demo - got consistent-enough results
Merge: from texttable_allocs
Previously merged into Aux, but the Aux2 one seems to be fresher
Tests: none, no conflicts
Merge: from main
Tests: none, no conflicts
Merge: from concurrentquueue_leak
- Fixes an edge-case on high-pop servers that can cause a 10MB/s garbage allocation rate
Tests: validated fix works via synthetic test, then had a 2-player session on craggy to validate network traffic works as intended
Clean: remove false-sharing todo
Don't have proof of how impactful it is now in this area, so not going to jump the gun for now
Tests: none, trivial change
Undo: pick the right version of ProjectSettings from history
Tests: none, trivial change
Undo: revert ProjectSettings
Tests: none, trivial change
Bugfix/Optim: propagate fix to other ConcurrentQueues in the file
Tests: local 2-player session on Craggy in editors
Clean: remove the hack test
Now that the bug was validated this doesn't serve any purpose
Tests: compiled in editor
Bugfix/Optim: Don't force ConcurrentQueue to allocate new segments on every push
My hack/forced test no longer allocates - now I just need to cover left-over cases of this problem
Tests: on Craggy in editor took a snapshot - no more allocs in the forced test area.
Hack: improved the runaway test
Now I can see it via server profiler - ~90KB across 192 allocs for 192 packets
Tests: Craggy in editor, took a snapshot
Hack: synthetic test to proc ConcurrentQueue memory runaway
Managed to reproduce the high memory allocation edge case of ConcurrentQueue. Need to rip it out after applying the fix.
Tests: ran the code and checked state of ConcurrentQueue with a debugger
Merge: from texttable_allocs
- Replacing old TextTable with the new one that allows deferred formatting and avoids allocs
Tests: new unit tests and manual invoke of server.playerlistpos and status commands on Craggy
Optim: replacing old TextTable with the new one
- Updated Server.GetPlayerListPosTable to new APIs
Synthetic test of `playerlistpos` for 200 players on Craggy runs in 0.5ms (instead of the previous 5ms) with 99% fewer allocs.
Tests: Started Craggy in editor with a synthetic test. Also used a couple TextTable rcon commands
Update: make logic match `shouldPadColumns` meaning
It was doing the inverted logic before, but that didn't affect tests since they used the old values from before the rename.
Tests: ran unit tests.
Update: adding extra perf test to track shouldPadColumns influence
- Also renamed isForJson to shouldPadColumns
Shows 6x perf impact between no pad and pad. Makes sense, since for some types we need to do string formatting and that's heavy.
Tests: ran the tests.
Update: Add deferred formatting for more types (uint, long, ulong, double, vec3)
- Extended tests to cover these cases
Tests: ran unit tests
Optim: don't allocate when writing values via JsonTextWriter
- also exposed a `stringify` param that can avoid string conversion
Tests: ran unit tests
Update: revert back to Newtonsoft.Json
- Temporarily reverted the TextTable in use to original version to validate via tests
We have too many json serialization impls, so going to avoid trying to add a new one. I'll have to check if I can rip out the stale dll or not.
Tests: ran the unit tests