973 Commits over 274 Days - 0.15cph!
Update: Replace old TextTable with new
- Also did a light convert of `GetPlayerListPosTable` - it was able to handle 300 players instead of 200 in 4.7ms
- Updated ServerProfiler to exclude new System.Text.Json assemblies (and a couple extra bits) - built from f5b849e4
Tests: synthetic test in editor by repeatedly running the getplayerlistpos command for 300 players, built server and client standalones (win64), booted server standalone
Clean: remove extra comments and finish TODOs
Just gotta swap out the implementations and test it all builds/runs
Tests: ran tests
Optim: store all row values in a contiguous array
- Inserts are now 50% faster than the old table, 99% less allocs
- ToText is 10% faster than old table, 99% less allocs
- ToJson is 47% slower than old table, 99% less allocs
Json serialization is a head scratcher - need to look through Utf8JsonWriter.
Tests: unit tests
Optim: don't serialize bool in memory - we can rely on const strings instead
Brings us a smidge closer to the original api, but still slower (-12%).
Tests: unit tests
Optim: allow pre-allocating of columns and rows
- Rows were supported previously, but I didn't realize the call was going to the params overload, breaking the optim
Inserts are 22% faster now, with 6 allocs per test run (except the first run, for whatever reason - need to explore) instead of the previous 12k
Tests: unit tests
Update: update perf tests to use new apis
Rip#2 - was testing old compat apis. And surprisingly, they were faster than the new APIs. Well, there are still options that I haven't tapped into.
Tests: unit tests
Optim: allow user to provide an optim hint if we'll be serializing to json
This allows skipping the table alignment logic. Still no effect on json perf tests(wat), I'm suspicious.
Tests: ran the unit tests.
Optim: disable json validation when streaming it with the new table
Surprisingly has 0 effect on 4k perf - the bottleneck must be somewhere else.
Tests: ran unit tests
Bugfix: Fixing json serialization
- Switching to System.Text.Json as it allows streamed serialization
- Updated binaries, System.Text.Json deps got stale
Text serialization is 2x faster in 4k perf test (62.5% less allocs), Json serialization is 12% slower (20% less allocs)
Tests: ran the unit tests
Bugfix: Fix ToString not aligning correctly for new table
- Also increased correctness test data set to 128 rows
Float size calculation could be improved, but I'll leave that till later as it's not trivial. Now to fix json serialization
Tests: ran the unit tests
Update: add deferred formatting overloads to TextTable
- Also expanded tests to cover backwards compatibility checks as well as new API correctness checks
- Refactored tests as they were becoming unwieldy
New API correctness checks are failing - will fix next.
Tests: ran new unit tests
Optim: Pool internal lists
- This is a breaking change - can no longer double call `ToJson` or `ToString` as we clean up resources during it
New version is 12k allocs at 4k records test vs 16k allocs. Still losing on time, but hoping future serialization streaming changes will catch us up.
Tests: ran the unit tests
Update: store type unions instead of strings as row cell
Right now this is a pessimization (-25% perf on 4k "Record" test) - we do more work to manage it, but it should open up more optim opportunities.
Tests: ran the now correct unit tests
Bugfix: fixing unit tests
Rip. Missed the fact that unit tests were running on empty data. At least there weren't any issues with previous optims.
Tests: ran the unit tests
Optim: avoid row allocations
Tests: ran unit tests
Optim: get rid of small Column allocs
- also removed all private qualifiers
Tests: ran unit tests
Update: set out optim plans
- reuse string builder via pool
- Renamed TextTableOriginal to TextTableNew (otherwise I'll accidentally break the game before I mean it)
Tests: none, trivial changes
Update: initial test setup to validate optimizations for TextTable
Tests: ran the new unit and perf tests
Merge: from eventrecord_allocs
- Reduces the number of allocations caused by our server-side analytics
- New "analytics.small_buffer_send_limit" persistent ServerVar to reduce task scheduling overhead. Set -1 to return original behavior.
Tests: ran existing analytics unit tests, booted server in editor.
Merge: from main
Tests: none, no conflicts
Clean: remove my testing setup
Tests: none, trivial change
Optim: send small server-side analytics events using the same task thread
- Controlled via analytics.small_buffer_send_limit - to disable, set to -1; to enable for everything, set to 999999
- Default to 16KB
- Preserved between server restarts
This avoids ~1KB of allocations just to schedule another async task per upload. On busy servers (100pop) this can save 0.8MB per frame.
Tests: booted in editor to check the command presence
Bugfix: EventRecord.AddField(bool) now respects its param
Lucky for us, wasn't used anywhere outside of tests.
Tests: none, trivial change
Update: adding a couple perf tests for EventRecord
- also removed one of the profiling scopes since I don't need it anymore
Used them to check if packing EventRecordField would give any perf benefit, and it's a no - indistinguishable from noise.
Tests: ran the new tests
LOD0 and prefabs setup condenser tanks smalls
Optim: Allocate scratch buffer on the stack instead of thread local mem
Perf tests showed same performance, so we can save on the global allocation
Tests: ran editor on craggy
Optim: avoid scratch buffer round trip
After checking internals, GUID serialization is also alloc-free, so routing through that.
Tests: profiled in editor
Optim: eliminate float/double related allocs in EventRecordField.Serialize
Need to run a couple additional experiments (stackalloc, tagged union), but this part is basically done.
Tests: ran in editor, observed in profiler that no more allocs are happening in Serialize for small records
Optim: EventRecordField - use thread local scratch to avoid GUID serialization allocs
Tests: validated value via debugging, unity profiler showed no allocs during Serialize(CSV) call
Update: hacky EventRecord profiling setup to track allocations
Will need to discard this before merge
Tests: ran in editor
Bugfix: Use valid index in WaterTestFromVolumes
Tests: detected during staging demo playback with useparallelupdatejobs - reran the demo, no more NREs
Merge: from parallel_validatemove
- Fixes a couple rare bugs leading to missing data from FullServerDemo recordings
- More work on BasePlayer.ServerUpdateParallel, still disabled
- Editor-only: Added a couple unit tests
- Editor-only: ServerDemoPlayer - disable error spam during demo playback, improve log format
- Editor-only: ServerDemoPlayer - automatically authenticate connections during demo playback
Tests: played back demo from staging server, recorded a couple new demos in local editor
Merge: from main
Tests: none, no conflicts
Bugfix: Initialize WaterSystem coarse grid on clients
- Also make it safe for scenes that don't have a water setup (like playground)
Tests: tested with a separate client connecting to server - saw that Client was initializing everything correctly in the right order
Bugfix: ensure WaterCollision grid is setup with right terrain dimensions
Tests: In editor, started on craggy - size matched. Started on procgen - matched. Played demo - matched. In all cases validated initialization order to confirm no colliders/volumes were added too early. Exported PNG of the grid.
Bugfix: FullServerDemos - transient entity recording fixes
- Use DemoCount instead of ChunkIndex when determining whether to send snapshots - ChunkIndices reset, so in rare cases they can overlap for an entity and cause it to not send a snapshot
- Reset counter on prefab pooling - previously it could cause skipping of transient entity snapshotting
Tests: recorded a bunch of demos in editor
Bugfix: ServerDemoPlayer - skip the auth flow when playing the demo
- also index packet logging messages - makes it easier to sort out order of things in editor logs
Seems the simpler thing to do for now.
Tests: played back staging demo - much less error spam
Update: ServerDemoPlayer - add Ready message logging
Tests: none, trivial change
Update: ServerDemoPlayer - log more message types
Tests: ran demo from staging server
Update: ServerDemoPlayer - switch to control error reporting
Tried the demo from staging, the error spam is too much and not super helpful, so adding a toggle to disable it.
Tests: Played back the staging demo
Optim: Avoid handling null cases in a batch with only non-null values
- Also updated the test to spawn entities, since now it's a requirement for the func.
This removes the need to juggle data to setup batch operations. Should save a bit of time.
Tests: Ran the updated unit tests
Merge: from main
Tests: none, no conflicts
Merge: from profiling_improvements
- makes the Linux binaries compatible with more distros (Ubuntu 20.04, Debian 12)
Tests: ran in Ubuntu 20.04 and 24.04 via WSL. Took snapshots and opened them in perfetto.
Update: ServerProfiler now usable on older Linux distros
- Binaries built using revision 45d79338
Should work on Debian 12.
Tests: Ran on WSL Ubuntu 20.04 and Ubuntu 24.04 and took snapshots - all worked.
Merge: from main
Tests: none, no conflicts
Merge: from profiling_improvements
- Fixes ProfilerBinViewer to display all available threads and fix invalid callstack depth calculation
- Fixes a bug that would prevent json from being generated on busy servers
- Fixes a bug with timelines being very-slightly out of sync
- Optim/Bugfix to filter out all constructors from being profiled
Tests: a lot of exports in the editor and a bit of forced "bad" cases
Bugfix: ServerProfiler - properly filter out constructors
- Due to a typo (missing .) it would never match constructors and never filter them out
Using release libs built from ec8c5522
Tests: snapshot in editor on Craggy - confirmed no constructors were recorded.
Bugfix: ServerProfiler - sync up non-main-thread timelines to main thread
Previously it was possible to have a small gap at the very start of the snapshot on non-main thread views.
Tests: exported a snapshot in the editor.