850 Commits over 243 Days - 0.15cph!
Update: FullServerDemo - try to handle transient (not in the savefile) Entities
- Record an entity snapshot once per demo chunk if sending specific messages - Positions, Flags, RPC (both server-sending and receiving)
- DemoTransientEntities are not sent if we've sent an entity snapshot via Entities message during active chunk
- Fixed not handling demo messages that can have 0 connections associated with it
- Fixed incorrectly aborting on "unknown" message type when types are related to demo-specific messages.
In theory this should allow us to avoid missing-entity scenarios when playing back, which helps with determinism. But the server-receiving-RPC scenario needs more work - we record the transient snapshot after we've recorded an RPC from the client, so I need to implement look-ahead in demo playback (or do what client demo playback does and index the demo).
Tests: on Craggy in editor recorded a full server demo where I'm looting a wood-collectible. During playback it spawns, but doesn't get removed
Buildfix: ifdef out editor-only variable
Tests: build standalone server locally
Update: adding missing comment about layerMask in batched CheckCapsule
Tests: none, trivial change
Bugfix: found another invalid scatter
Tests: none, will come next
Bugfix: Fixing incorrect scatter logic in batched GetIgnore
Really need to cover this path with tests
Tests: none, will come next
Clean: removing -Batch suffix from APIs
Doesn't really add any additional clarity, and makes the code a smidge shorter
Tests: editor compile
Merge: from main
Tests: none
Merge: from profiling_improvements
Further reducing overhead by recording 25% less data overall (based on 350p release server snapshot)
Tests: Took a snapshot on Craggy in Editor
Update: ServerProfiler - Filter out ~25% of profiling scope by further removing tiny/cheap methods
- Using binaries built from f27f0281
There are some controversial changes:
* Filter out Newtonsoft.Json - we can't modify its internals anyway
* Filter out setters (set_*) - the overwhelming majority are cheap, but this hides the expensive ones. We'll still see the nested calls, if there are any.
* Filter out IPooled callbacks - half of them are not implemented (usually LeavePool), and most of them are cheap
There are a bunch more, but they're not worth detailing.
Tests: on Craggy in editor
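The filtering rules above can be pictured as a name-based exclusion list. This is only a sketch under assumptions: the pattern syntax, the `Type::Method` name format, the `should_profile` helper and the `EnterPool` callback name are hypothetical and not taken from the profiler itself.

```python
import fnmatch

# Hypothetical exclusion patterns modelled on the rules listed in the commit.
# The real profiler's filter format is not shown in this log.
EXCLUDE_PATTERNS = [
    "Newtonsoft.Json.*",  # third-party code whose internals we cannot modify
    "*::set_*",           # property setters, mostly cheap
    "*::EnterPool",       # IPooled callbacks (EnterPool is an assumed name)
    "*::LeavePool",
]

def should_profile(method_name: str) -> bool:
    """Return True if the method should keep its profiling scope."""
    return not any(
        fnmatch.fnmatchcase(method_name, pattern) for pattern in EXCLUDE_PATTERNS
    )
```

Nested calls made from a filtered setter would still show up under the caller, as long as those callees are not filtered themselves.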
Merge: from main
Tests: none
Update: WaterLevel - rewrite WaterSystem GetIgnore serial chunk in batch form
This eliminates the other "jobify/batch" todo. Think I'll do a bit of cleanup, add more tests and get ready to merge to staging for stewing.
Tests: ran unit tests. Ran around on Craggy and checked a couple of water check results.
Update: WaterCollision - add batch version for GetIgnore that does an overlap capsule test
Needed to replace the last serial chunk in batched WaterLevel.GetWaterInfos
Tests: none, compiled in editor
Update: GamePhysics - added OverlapCapsule and CheckCapsule that use Unity jobs
Turns out I'll need it to support WaterCollision.GetIgnore(start, end...)
Tests: none, just compile
Bugfix: When spreading WaterInfo results, prevent trying to do a swap between same indices
- This leads to invalidating WaterInfo, which returns invalid info
I need to extend tests to do checks with Entities, as this case/branch isn't covered currently
Tests: enabled parallel player updates and logged cached water checks - now it returns valid water where expected.
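To illustrate why a same-index swap can invalidate a result, here is a minimal sketch of an in-place scatter. It is not the actual WaterLevel code: the function name, the "compact results at the front" layout, and the use of `None` as the invalid value are all assumptions made for the example.

```python
def scatter_in_place(results, target_indices, invalid=None):
    """Spread compact results (stored at the front) to their final slots.

    Assumes target_indices is strictly ascending and target_indices[k] >= k,
    so walking from the back never overwrites an unprocessed entry.
    """
    for k in reversed(range(len(target_indices))):
        dst = target_indices[k]
        if dst == k:
            # Already in place. Without this guard the "move, then invalidate
            # the source" step below would wipe out a valid result.
            continue
        results[dst] = results[k]
        results[k] = invalid
    return results
```

For example, `scatter_in_place(["a", "b", "c", None, None], [0, 2, 4])` leaves `"a"` at index 0 untouched and moves the other two results to indices 2 and 4.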
Bugfix: replace Temp allocators with TempJob allocators for physics jobs
- Added missing resource cleanup
This partially fixes test failures.
Tests: ran the WaterLevel test checks
Update: WaterLevel.GetWaterInfos - replaced a serial GetWaterInfoFromVolumes call with a batched one
- should save us time by parallelizing physics checks, if they do get used
This removes one of 2 "batch" TODOs in the method.
Tests: ran unit tests, some are failing - will fix next.
Update: added batched WaterLevel.GetWaterInfoFromVolumes
- there's a number of inefficiencies - left TODOs to address later
Continuing to propagate batching support upwards, only got 1 level left to swap out a serial query with batched.
Tests: none, editor compile
Update: Added batched version of Vis.Components
Still building up to engage it as part of main parallel logic. Also still need to cover with tests.
Tests: compiles in editor
Clean: removing hanging meta files that I've committed earlier
Tests: open editor
Merge: from main
Tests: none
Merge: from hackweek_serverprofiler_memory
- Records allocations from all threads
- Displays allocations on separate thread tracks + a graph of total allocations per thread
- Allocations now have last/current method, allocated type and size in the mark's "arguments" - see "current selection" in perfetto
- Allocations are now also duplicated on the executing thread to make it easier to spot where exactly in the method the allocation happened.
- Graphs of working set and virtual set memory for the entire process
Tests: multiple snapshots in editor on Craggy, single in standalone debug linux server via WSL on 3k procgen map, single in standalone release windows server on 3k procgen map with a harmony mod
Clean: removing unnecessary using
Tests: compiled in editor
Clean: deleting empty folder
Left-over from the old merge that I didn't clean up properly
Tests: none
Update: ServerProfiler - switching to release libs (10b88e05)
Tests: tested standalone linux server in debug on a 3k procgen map, windows standalone server in release on a 3k procgen map with a harmony mod
Merge: from main
Tests: none
Clean: remove a couple unnecessary TODOs
Tests: none, trivial change
Clean: ServerProfiler - less IntPtr, more Native.MonoMethod
Think I didn't have enough sleep when I was writing it at first.
Tests: exported snapshot in editor on Craggy
Update: ServerProfiler - emit the current/last managed method when recording an allocation
- using debug binaries (10b88e5)
Should reduce/eliminate the need to manually hunt for the allocation in the methods view.
Tests: snapshot in editor on Craggy
Update: ServerProfiler - emit a graph of per-thread total allocation size
- Main thread resets the counter every frame; worker threads accumulate over their entire lifetime.
This makes it easier to see at a glance how much we allocate per frame/thread.
Tests: exported snapshot in editor on Craggy
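The counter behaviour described in this commit can be sketched as follows. The `AllocCounters` class and its method names are hypothetical and only illustrate the reset policy, not the profiler's native implementation.

```python
class AllocCounters:
    """Per-thread running totals of allocated bytes (illustrative only)."""

    def __init__(self, main_thread_id):
        self.main_thread_id = main_thread_id
        self.totals = {}

    def on_alloc(self, thread_id, size):
        # Every thread accumulates into its own running total.
        self.totals[thread_id] = self.totals.get(thread_id, 0) + size

    def on_frame_start(self):
        # Only the main thread's total is reset each frame;
        # worker totals keep growing over the thread's lifetime.
        self.totals[self.main_thread_id] = 0
```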
Update: ServerProfiler - update binary exporter & viewer to handle new Allocs format
- Made ProfileBinViewer's search work with alloc's names
Tests: made a debug export of Craggy in editor, then opened it in the ProfileBinViewer
Clean: ServerProfiler - remove unsupported thread sorting keys
Tests: none, trivial change
Update: ServerProfiler - remove the class name and duplicate size into args of a mark
- They are part of the "arguments" view when you click on the allocation mark
- Also assigned all allocations the "A" category so that they can be easily filtered out via queries
Turns out cname (controlling colors) is also unsupported via legacy json import (see issue https://github.com/google/perfetto/issues/208 and linked ones), so if we modify the name, we lose the uniform yellow color for allocations that makes them easier to spot. Keeping these values in args also makes it easier to run queries on them.
Tests: exported snapshot from Craggy in editor
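For reference, an allocation mark in the legacy JSON trace format could look roughly like the event built below. This is a sketch under assumptions: the use of an instant event (`"ph": "i"`), the constant `"Alloc"` name, and the `args` key names are hypothetical. Only the "A" category and the idea of carrying type, size and method in the arguments come from the commits here.

```python
import json

def allocation_mark(ts_us, pid, tid, type_name, size_bytes, method):
    """Build one Trace Event Format dict for an allocation (illustrative)."""
    return {
        "name": "Alloc",   # constant name keeps all allocation marks uniform
        "cat": "A",        # category used to filter allocations in queries
        "ph": "i",         # instant event (assumed phase type)
        "s": "t",          # thread-scoped instant
        "ts": ts_us,       # timestamp in microseconds
        "pid": pid,
        "tid": tid,
        # Shown in the UI when the mark is selected; key names are hypothetical.
        "args": {"type": type_name, "size": size_bytes, "method": method},
    }

def to_json_line(event):
    return json.dumps(event, separators=(",", ":"))
```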
Update: ServerProfiler - track and emit allocation class and array size
- using debug binaries (9f4a07f8)
Will need to update binary export and fix the bin viewer tool
Tests: snapshot on craggy in editor
Bugfix: ServerProfiler - properly name worker thread's allocation track
- Another one of the "did it right first time, simplified, now it's borked" cases
Tests: snapshot in editor Craggy
Update: ProfilerExporter.JSON - emit virtual memory graph
Tests: exported on craggy
Optim: ServerProfiler - properly avoid false-sharing when recording memory state when taking a snapshot
- using debug binaries (b445081f)
Should be a smidge faster in multithreaded, allocation-busy scenarios.
Tests: took a snapshot on craggy
Update: ServerProfiler - emit working set as KB instead of Bytes
- Also pre-allocate extra size in string builder to account for the memory counters.
Changes in KB are easier to spot on the graph (not really in perfetto's Values view, better in the delta view)
Tests: exported in editor on Craggy
Update: ServerProfiler - visualize process working set memory
- using debug binaries for now (bc3e74cd)
Need to add virtual set as well and test standalone servers (Win and Linux)
Tests: in editor on craggy.
Bugfix: ProfilerExporter - filter out worker thread marks that are before the frame start
It was originally correct, but in the previous commit I simplified the code and broke it. This restores it.
Tests: none, as currently too many changes present in workspace - will test later
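The filtering rule itself is simple and might be sketched like this. The `Mark` type and `marks_in_frame` helper are hypothetical, and keeping marks that fall exactly on the frame start is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Mark:
    name: str
    timestamp: int  # same clock and units as the frame start

def marks_in_frame(marks, frame_start):
    """Keep only worker-thread marks recorded at or after the frame start."""
    return [m for m in marks if m.timestamp >= frame_start]
```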
Update: ServerProfiler - emit fake "Allocation" threads and duplicate allocation marks there
- Also updated the whole buffer estimation to take into account these metadata marks
- Added support for naming and sorting thread tracks
Well, they do appear, but the thread_sort_index is ignored by perfetto (see https://github.com/google/perfetto/issues/555). Might finally bite the bullet and write a protobuf exporter, but I'm afraid that it'll be more expensive to run and won't compress as well.
Tests: exported a snapshot from editor
Update: ServerProfiler - enable safety checks by default
- Also prefixed error logs to make it easy to identify where it came from
Tests: did 6 exports in editor, no false-positives
Bugfix: ServerProfiler - don't emit thread tracks with only allocations in them
- Handle "legal" case where we don't have any method marks on worker threads after frame start timestamp
- Handle "legal" case where we get a thread profile for a thread that was stopped before the frame start
- Handle "legal" case where we get empty thread profiles due to method filtering
- Dead func removal
This can be a controversial choice, as allocations do happen there, but it's not something we can act on because there's not enough helpful information about them (for example, what if we filter out all of a thread's methods?).
Tests: did 4 exports, wasn't able to find weird allocation records on different threads.
Update: ServerProfiler - Track allocations on all threads
The display of this information is still abysmal - need to figure out how to make it better.
Tests: took snapshots both in editor and in standalone server builds (win + linux). Hacked together a version that used to crash; with the current changes it no longer does.
Update: Implement GamePhysics.HandleIgnoreCollisions as a batch
- extended TerrainCollision to support batching
- extended WaterCollision to support batching
There are still more improvements that can be made (translating into burst jobs and better job-graph building), but currently the goal is to move more of the code from singular into batch form.
Tests: none, it's not hooked up yet - will explore writing unit tests next-ish
Bugfix: read back from the right array of heights after sampling ocean
- Reported by new unit tests
Tests: ran water-related unit tests - now they pass
Bugfix: Make sure WaterLevel tests run full simulation path during unit testing
Previously it was earlying out during GetWaterLevel
Tests: ran the new tests, they report an issue that I need to fix
Update: missing meta files from merge
Submitting for now, but should nuke once back on main
Tests: none, trivial change
Merge: from main
- Skipping a couple meta files for empty folders, will submit them separately
Tests: booted editor for import
Clean: removing unused WaterFactorForPlayer that didn't return WaterInfo
Tests: compiles in editor in SERVER+CLIENT mode
Bugfix: ensure we cache WaterInfo for players that use various vehicles
Some vehicles have custom logic to check how submerged the player is, and we would miss the water info that was used to calculate it.
Tests: tested via DPV - submerged factor was 1 and waterinfo was valid (before it would report factor of 1, but invalid waterinfo).