616 Commits over 184 Days - 0.14cph!
Clean: DemoServer - removing custom tick visualization logic
It should be via IDemoAnalyzers, rather than some randomly sprinkled seasoning
Tests: editor compiled in Server mode
Update: ServerDemoPlayer - players now move based on tick history
- Added recording of userIds to full server demos (need to match against players in saves)
- Added a callback for when world loading is done to do any additional logic (in demo case, matching players)
Tests: played back a 30s demo from Craggy
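A minimal sketch of the player matching step described above, assuming the demo stores a list of userIds and the loaded save exposes a userId-to-entity mapping. The function name and the data shapes are hypothetical, not the actual ServerDemoPlayer API.

```python
from typing import Dict, Iterable, List, Tuple


def match_players(recorded_user_ids: Iterable[int],
                  saved_players: Dict[int, int]) -> Tuple[Dict[int, int], List[int]]:
    """Pair recorded userIds with player entities restored from the save.

    saved_players maps userId -> entity id (hypothetical shape).
    Returns (matched, missing); missing holds ids with no entity in the save.
    """
    matched: Dict[int, int] = {}
    missing: List[int] = []
    for user_id in recorded_user_ids:
        if user_id in saved_players:
            matched[user_id] = saved_players[user_id]
        else:
            missing.append(user_id)
    return matched, missing


# Would run from the "world loading is done" callback, once the save is in memory.
matched, missing = match_players([101, 102, 103], {101: 5001, 103: 5003})
```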
Update: ServerDemoPlayer - now loads a save associated with the chunk
Doesn't actually move the player entity, but it's there. Will investigate what I'm missing next.
Tests: played back demo, saw logs confirming spawning of 330 entities from save, and saw player entity in the world.
Bugfix: FullDemoServer - properly drop packets that were recorded before initial world save
Still not sure if it's the best option, as I worry we might miss a couple packets. Alternatively could just clamp timestamp to 0.
Tests: Played back the server demo - managed a full playback (previously it was stuck at start)
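The two options weighed above (dropping early packets vs clamping their timestamp to 0) can be sketched side by side. This is an illustration only: the `Packet` type and both function names are made up for the example, and timestamps are assumed to be seconds on the recorder's clock.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    timestamp: float  # seconds on the recorder's clock (assumed unit)
    payload: bytes


def drop_before_save(packets: List[Packet], save_time: float) -> List[Packet]:
    """Discard packets recorded before the initial save, rebase the rest."""
    return [Packet(p.timestamp - save_time, p.payload)
            for p in packets if p.timestamp >= save_time]


def clamp_before_save(packets: List[Packet], save_time: float) -> List[Packet]:
    """Keep every packet; anything from before the save lands at timestamp 0."""
    return [Packet(max(0.0, p.timestamp - save_time), p.payload)
            for p in packets]


recorded = [Packet(0.5, b"a"), Packet(1.0, b"b"), Packet(2.5, b"c")]
dropped = drop_before_save(recorded, save_time=1.0)
clamped = clamp_before_save(recorded, save_time=1.0)
```

Dropping loses packet `a`. Clamping keeps it, but replays it at the very start of playback.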
Update: SaveRestore - separate save-to-stream processing from main loop
- Done every end of frame
This made it problematic in editor, as our regular saving loop is disabled by default in editor.
Tests: recorded full server demo with 1 min chunks - saw the saves being generated for each chunk
Clean: FullServerDemo - separate benchmarking code to a partial class
Tests: editor compile in Client, Client+Server and Server modes
Buildfix: don't leak server-only callbacks for IDemoProcessor
Tests: compiled in editor in Client mode
Bugfix: FullServerDemo - record server info in editor
- Removed previous static globals
Reimplemented via IServerCallbacks so that we have 1 point of modification instead of multiple
Tests: Recorded a session on Craggy in editor - generated json had the data
Update: FullServerDemo - create a save when starting a new chunk
- Moved chunk logic outside of packet serialization
- Using double buffering of packets when performing a save, to sort which packets go along which save/chunk
- Packet timestamps are now resolved at recording time, not during the demo thread's serialization stage (as that can be deferred and out of sync)
- Moved all demo-related logic from BaseNetwork to BaseNetwork.Demos
Mostly done with the recorder side.
Tests: Ran around on craggy with editor's autosave enabled and at max frequency (1 second), and demo recording at min chunk length(1 min).
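A rough sketch of the double buffering idea from this change, assuming packets are stamped when they are recorded and that a buffer swap marks the chunk boundary. The class and method names are hypothetical; the real recorder lives in BaseNetwork.Demos and will differ.

```python
import threading
from typing import List, Tuple

Packet = Tuple[float, bytes]  # (timestamp resolved at record time, payload)


class DoubleBufferedRecorder:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._front: List[Packet] = []  # receives newly recorded packets
        self._back: List[Packet] = []   # empty spare, swapped in at a chunk boundary

    def record(self, timestamp: float, payload: bytes) -> None:
        # The timestamp is supplied by the caller at record time,
        # not computed later by a (possibly deferred) serialization step.
        with self._lock:
            self._front.append((timestamp, payload))

    def start_new_chunk(self) -> List[Packet]:
        """Swap buffers when a save starts; return the finished chunk's packets."""
        with self._lock:
            self._front, self._back = self._back, self._front
            finished = list(self._back)
            self._back.clear()
        return finished


recorder = DoubleBufferedRecorder()
recorder.record(0.1, b"a")
recorder.record(0.2, b"b")
first_chunk = recorder.start_new_chunk()   # save for chunk 2 begins here
recorder.record(60.5, b"c")
second_chunk = recorder.start_new_chunk()
```

Packets recorded after the swap can only end up in the next chunk, which is what keeps each packet tied to the right save.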
Update: Expand IServerCallback to include on-demand-save functionality
- Changed relevant calls to propagate the implementation
FullServerDemos will need to invoke these for relevant chunks
Tests: compile only (though this is a torn submit - FullServerDemos will come in next cl)
New: SaveRestore can write to streams on demand
- Refactored common functionality between save-to-file and save-to-stream
- Added thread-safe stream queue for stream requests
- Added callbacks for when stream writing is complete
Working on server demos having a save for each of its chunks - this is a part of a larger submit.
Tests: using a save of procgen map from staging(140k entities), tested manual full save via rcon, default automated save(every min) + frequent automated save (every second). Loaded all saves - no errors
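For illustration, a thread-safe request queue with completion callbacks could look roughly like this. It is a sketch only: `SaveStreamQueue`, `request` and `process_pending` are invented names, and the actual save writer is stood in for by a plain callable.

```python
import io
import queue
from typing import BinaryIO, Callable


class SaveStreamQueue:
    """Any thread may request a save-to-stream; one thread drains the queue."""

    def __init__(self) -> None:
        self._requests: "queue.Queue" = queue.Queue()  # queue.Queue is thread-safe

    def request(self, stream: BinaryIO, on_done: Callable[[BinaryIO], None]) -> None:
        self._requests.put((stream, on_done))

    def process_pending(self, write_save: Callable[[BinaryIO], None]) -> int:
        """Write the save into every queued stream, then fire its callback."""
        handled = 0
        while True:
            try:
                stream, on_done = self._requests.get_nowait()
            except queue.Empty:
                return handled
            write_save(stream)
            on_done(stream)
            handled += 1


pending = SaveStreamQueue()
completed = []
pending.request(io.BytesIO(), lambda s: completed.append(s.getvalue()))
handled = pending.process_pending(lambda s: s.write(b"save-bytes"))
```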
Update: DemoServer - reconstruct messages from demo stream and pass to server
No warnings or errors during playback, but it doesn't do much, since main player entity is not present.
Tests: played back the recorded demo from craggy - no new warnings or errors
Update: DemoServer - able to process server demo stream
- Added another utility(write u64 to buffer & get length) to ProtocolParser (need to push it upstream)
- Fixed a bug with incorrect header read
- Fixed multiple bugs related to incorrectly caching the first chunk/entire packet from the stream
- Fixed invalid handling of connection counts
- Fixed invalid tracking of progress
- Fixed invalid recording of packet's timestamp
Next up need to hook up data stream parsing into known messages to see how it behaves.
Tests: recorded a demo on Craggy while riding the zipline - was able to stream the demo back
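As an example of what a "write u64 to buffer and get length" helper can look like, here is a sketch that assumes protobuf-style base-128 varint encoding. Whether the actual ProtocolParser utility uses this encoding is an assumption, and the function name is hypothetical.

```python
def write_u64_varint(buf: bytearray, offset: int, value: int) -> int:
    """Write value as a base-128 varint into buf at offset; return bytes written.

    A u64 needs at most 10 bytes, so buf must have that much room after offset.
    """
    if not 0 <= value < 1 << 64:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    start = offset
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            buf[offset] = low_bits | 0x80  # continuation bit: more bytes follow
            offset += 1
        else:
            buf[offset] = low_bits
            offset += 1
            return offset - start


scratch = bytearray(10)
length = write_u64_varint(scratch, 0, 300)
```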
Update: DemoServer - adding full server demo reader
- Also renamed some files, as DemoServer.ServerDemo is a bit silly
- Also add timestamp to server demo packets (need to update other deps)
- Added a utility method to ProtocolParser (need to update the standalone lib)
Tests: untested, will check in next update
Update: DemoServer - Isolated all client demo logic into its own player
This is prep for full server demo support.
Tests: played back a short craggy demo - it went through the entire thing without issues. Tried without demo - it started as expected
Buildfix: removing unused variable
Tests: editor compile
Merge: from profiling_improvements
Further exclude small methods/utility classes that are fast 95% of the time.
Tests: Took a snapshot on a default ProcGen map in Editor(Client+Server). ~13% uncompressed json reduction.
Update: more profiling exclusions
- Don't track NetRead and NetWrite
- Don't track Facepunch.System's containers (including pooling), StringPool and ArrayPool
- Don't track EntityRef
- Don't track all Enumerators (previously only Facepunch's was excluded)
- Don't track all GetHashCode
- Don't track TimeWarning (debug-only calls, but can be frequent)
Tests: Took a snapshot of default procgen map in Editor(Client+Server), confirmed about 13% reduction in uncompressed json size.
Merge: from main
Tests: none
Update: DemoServer - ripping out fixed timestep logic
After experimenting with slowing down playback to below play speed, it did reduce the number of violations, but timing inconsistency between demo playback and server simulation leads to more issues.
Tests: none, simple change
Update: DemoServer - implement fixed step playback
- Should keep demo stream consumption stable
Right now we're streaming too much data (at 30hz with ~200hz editor sim), which trips up a number of violation checks. Going to try tweaking the number to see if it helps with reproducible results.
Tests: played the demo twice - the step count was the same, but the result numbers were different.
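The fixed step idea can be sketched with a standard time accumulator: each frame adds its delta, and the demo stream is consumed in whole fixed-size steps regardless of frame rate. This is a generic pattern with made-up names, not the DemoServer code.

```python
from typing import Callable


class FixedStepPlayback:
    def __init__(self, step_hz: float) -> None:
        self.step = 1.0 / step_hz  # demo seconds consumed per step
        self._accumulated = 0.0
        self.steps_taken = 0

    def advance(self, frame_dt: float,
                consume: Callable[[float, float], None]) -> int:
        """Add this frame's time, then consume the demo in whole fixed steps."""
        self._accumulated += frame_dt
        steps = 0
        while self._accumulated >= self.step:
            self._accumulated -= self.step
            window_start = self.steps_taken * self.step
            consume(window_start, window_start + self.step)
            self.steps_taken += 1
            steps += 1
        return steps


playback = FixedStepPlayback(step_hz=4)  # 0.25 s of demo time per step
windows = []
record_window = lambda start, end: windows.append((start, end))
steps_per_frame = [playback.advance(dt, record_window)
                   for dt in (0.125, 0.125, 0.5)]
```

The step count depends only on total elapsed time, which matches the observation above that the step count was identical across runs even though the results were not.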
Update: DemoServer - bypass failed validation
- We can't always reconstruct correct tick history in some situations, so instead we'll use them as data to compare against
Tests: played the original long demo - 1711 total violations across ~18 players.
Update: DemoServer - rudimentary tick visualization using gizmos
- Temporary while working on server demo reconstruction - will rip out once done with feature
Tests: used on a demo recording on Craggy + the original demo that started it all.
Update: DemoServer - spawn entities with the right initial flags
Turns out I had doors in a base to be closed on spawn, leading to tick violations - this fixes it. There's more violations to go.
Tests: Played the demo, checked that the relevant door is now open.
Update: DemoServer - hook up metabolism and make every player invincible
I thought metabolism would fix the drowning of main player, but the recording info contains empty oxygen. Instead, we treat every player as invincible unless there's a replication message to destroy them.
Tests: Played the demo till the end - no more logs on main player drowning
Update: DemoServer handles a number of RPC messages
- Only properly implementing model flags for now
- Adding a bunch of RPCs to ignore to avoid heavy spam during playback
- Also renaming player game objects during playback to make it easier to track and inspect their state
This reveals that during playback we're triggering a bunch of tick violations, which prevents position updates. Need to figure out how to deal with them.
Tests: ran the same demo, this time with warnings not filtered out - once map loaded, the rate of warnings was decreased substantially.
Clean: removing no longer relevant comment
Update: DemoServer improvements and fixes
- All ticks are now accepted
- exposed an editor only API to inject ticks (avoid serialization roundtrip)
- cleaned away tick logging - it generated too many logs
Ticks are now caught, which is nice, but it looks like it's not validating them all outside of demo playback (saw only 2 players doing it on a perf capture). That'll be next.
Tests: added temp debug assertions that would catch any discarded tick - played the new demo, and there were no more assertions.
Update: DemoServer - split ticks by distance instead of time
- Splitting by time didn't guarantee that they were in valid distance ranges
- Also handle case where we get positional data while the player is still initializing
Getting closer, according to logs most ticks get accepted, but there are still a bunch that get filtered out - investigating.
Tests: played the same demo, observed the logs.
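To illustrate splitting by distance: given two recorded positions, the synthesized ticks are spaced so that no single hop exceeds a maximum distance. This is a sketch under assumptions. The function name and the `max_step` limit are hypothetical, and the real validation ranges are not stated in this log.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def split_by_distance(start: Vec3, end: Vec3, max_step: float) -> List[Vec3]:
    """Return positions after start, ending at end, each hop <= max_step."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    target = tuple(float(c) for c in end)
    distance = math.dist(start, target)
    if distance <= max_step:
        return [target]
    count = math.ceil(distance / max_step)
    points = [tuple(s + (e - s) * i / count for s, e in zip(start, target))
              for i in range(1, count)]
    points.append(target)  # append the exact end point to avoid float drift
    return points


ticks = split_by_distance((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), max_step=4.0)
```

Splitting by time gives no such bound, since a fast-moving player can cover more than the allowed distance within one time slice.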
Update: Server-Editor tries to synthesize position ticks for other players in client-demos
- Also supports movement of other, non-player entities
- Only handling positions for now
Doing this to allow for more thorough testing. Some ticks get rejected despite being in the same position - need to investigate why.
Tests: played the same demo as before - checked logs to see the injection and acceptance of ticks.
Bugfix: more NRE reductions in server-demo
- Skip VoiceData and other messages that we can't support in editor environment (or don't want to)
- Properly "disconnect" player when entity is being destroyed
- "shutdown" the demo server when at the end of the demo to avoid unnecessary replication attempts/NREs
This brings down NRE count during playback and shutdown from 40+ down to 4. Next up need to figure out if Tick processing works correctly (it ticks, but main player doesn't move).
Tests: played back the same client demo, saw the reduction in errors
Bugfix: no more duplicate players when playing a client demo on server-editor
Now there's an issue with disconnecting/destroyed players - around 8 NREs from accessing something dead during BasePlayer.ServerCycle
Tests: played the same demo - max players was 20 instead of 1k
Update: properly initialize players when playing a client-demo in server-editor
- Also log when creating a main player
- Report kick reasons as errors
No more unexpected kicks for players. But, looks like we're duplicating players - by the end of playback we had 1k players, which is much more than I expected
Tests: played back the same demo to completion, accumulated errors are only related to some invalid packets that we don't care about (like Voice)
Update: Server-editor is able to see ticks from the player when playing a client-demo
- now also handling flag messages
- skip server demos and warn user that it's not supported for now
There are a couple things left to investigate and validate - why the kicks are happening for being under terrain, and whether I can restore the full initialization flow
Tests: ran the same demo, was able to verify that main player is identified and its tick history is being stepped through
Update: Server demo playback now creates entities for players as it first encounters them
- Added demo progress logging
- Avoided a number of reasons for kicking (as we don't fully setup entity simulation)
I can see more activity now - next up is making sure the important history is also replicated/present.
Tests: played the same demo from before - logs confirmed players were present.
New: Editor-Server can playback a server demo
Mimics how client demos are played back - streams commands to the server for execution. Currently doesn't spawn players/has some entities missing - that's next to investigate
Tests: Took an old 5 min demo and played it back until it stopped the editor play session.
Merge: from profiling_improvements
Avoids recording methods that are tiny/fast - helps with overhead.
Tests: in editor on Craggy generated a new snapshot and opened in Perfetto, couldn't find my methods.
Merge: from main
Tests: none
Update: Further reduce what methods we annotate
- Removes get_* property accessors, as they are frequent but usually quick
- Removes various storage classes and math utilities (ByteExtensions, BitUtility, Facepunch.System.Enumerator, all of Unity.Mathematics, etc)
- Removes operator invocation (any op_* method)
- Removes comparison method calls (as they are usually quick)
This should reduce performance degradation in tight loops that frequently invoke these methods and produce smaller snapshot(6.7mb -> 6.1mb).
Tests: in editor on Craggy generated a new snapshot and opened in Perfetto, couldn't find my methods.
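The exclusion rules listed in this change amount to a predicate over type and method names. A sketch of such a filter follows. The prefix list is abbreviated from the bullets above, and the set of comparison method names is a guess, since the log does not say which ones are matched.

```python
# Type prefixes taken from the list above (abbreviated).
EXCLUDED_TYPE_PREFIXES = (
    "ByteExtensions",
    "BitUtility",
    "Facepunch.System.Enumerator",
    "Unity.Mathematics.",
)

# Assumption: the log only says "comparison method calls"; these names are guesses.
COMPARISON_METHODS = frozenset({"Equals", "CompareTo", "Compare"})


def should_annotate(type_name: str, method_name: str) -> bool:
    """Return False for methods that are frequent but usually quick."""
    if method_name.startswith("get_"):   # property accessors
        return False
    if method_name.startswith("op_"):    # operator invocations
        return False
    if method_name in COMPARISON_METHODS:
        return False
    if type_name.startswith(EXCLUDED_TYPE_PREFIXES):
        return False
    return True


annotated = should_annotate("BasePlayer", "ServerCycle")
```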
Merge: from buildingprivilegeretrotool_recycling
Fixes invalid pooling of protobuf type when replicating data.
Tests: On Craggy setup a tiny box base and placed retro cupboard - before fix it immediately reported negatives via pool.print_memory, after fix - stayed >= 0
Bugfix: don't flood pool with ProtoBuf.BuildingPrivelegeRetroTool
Tests: On Craggy setup a tiny box base and placed retro cupboard - before fix it immediately reported negatives via pool.print_memory, after fix - stayed >= 0
Buildfix: define symbol on Mac Server
Tests: compiled editor, then compiled linux DGS
Merge: from profiler_improvements
- Adds linux support (tested on Ubuntu 22404 via WSL)
- Optimizations for JSON export
- Added debug utility to export binary snapshot - run `perfsnapshot <delay> <name> <frames> <shouldBinExport>`
- Added Tools/Profiler Bin Viewer, an editor only tool to inspect binary snapshots
- Reduced default frames captured to 4 from 10
- Profiler now skips annotating UnityEngine.CoreModule methods (reduces capture overhead)
- Works around Perfetto visualization issue with Complete events (https://github.com/google/perfetto/issues/970)
Tests:
- Exported a number of editor snapshots with binary snapshots to test bin viewer
- Using WSL, tested exporting a snapshot on Ubuntu - 3k procgen world
Merge: from main
Tests: editor compiles
Bugfix: Workaround Perfetto's "Complete" event hierarchy bug
- Reported issue on their repo: https://github.com/google/perfetto/issues/970
Tests: exported snapshot from a linux server (running on WSL Ubuntu), 3k procgen world. Exported from editor as well.
Update: Binary export no longer pre-processes the stream
- Saves time on the export
- Also added if-deffed out extra checks, disabled by default
My previous checks were wrong and produced false positives. Also, I think I have an idea of what jumbles the json visualization - will fix in next CL.
Tests: used the extra-debug version to export linux snapshot - it succeeded
Update: ProfileBinViewer - report found exceptions in thread stream
Still looking for why things are wrong with linux snapshot
Tests: opened a borked linux snapshot
Buildfix: Disable ProfileBinViewer if we're not in Server mode
Tests: switched to Client in editor
Update: ProfileBinViewer now shows thread summary
Tests: opened a snapshot from editor