branch rust_reboot/main/baseplayer_serverupdateparallelcancel
82 Commits over 31 Days - 0.11cph!
Bugfix: avoid potential NRE after player kick for terrain violations
- Rewrote logic to avoid extra IsRealNull checks - we now filter indices instead
Tests: clipped into geometry with noclip
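A minimal sketch of the index-filtering approach above, assuming a flat player cache (all names are illustrative, not the real API):

    // Illustrative sketch: filter valid indices once up front, so the hot loop
    // never needs per-player IsRealNull-style checks and a kicked player can't NRE it.
    using System.Collections.Generic;

    class PlayerUpdateSketch
    {
        object[] playerCache = new object[256];           // stand-in for the real cache
        readonly List<int> validIndices = new List<int>();

        void BuildValidIndices()
        {
            validIndices.Clear();
            for (int i = 0; i < playerCache.Length; i++)
                if (playerCache[i] != null)               // single up-front check
                    validIndices.Add(i);
        }

        void UpdatePlayers()
        {
            foreach (int i in validIndices)               // only live entries reach here
                Process(playerCache[i]);
        }

        void Process(object player) { }
    }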
Bugfix: player-like entities that don't support occlusion replicate correctly with UsePlayerUpdateJobs 2
The old way of filtering them allowed them to use the cache, which never contained them, so they were culled.
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2 - spawned an NPC and a bot, and they didn't despawn
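Roughly the shape of the fix, as a hedged sketch (types and names invented): entities that don't support occlusion must bypass the cache path entirely, rather than be looked up in a cache that will never contain them.

    // Hypothetical sketch: non-occludable entities must bypass the frame cache,
    // which never contains them; the old filter let them fall through and be culled.
    class OcclusionFilterSketch
    {
        class Entity { public bool SupportsOcclusion; }
        class Viewer { }

        readonly System.Collections.Generic.HashSet<(Viewer, Entity)> frameCache = new();

        bool ShouldReplicate(Entity entity, Viewer viewer)
        {
            if (!entity.SupportsOcclusion)
                return true;                      // always replicate, skip the cache
            return frameCache.Contains((viewer, entity));
        }
    }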
Merge: from main
Tests: none
Bugfix: IndexOutOfRange exception when updating player state cache
Affected both UsePlayerUpdateJobs 1 and 2
Tests: none, simple change
Buildfix: unused variable warning-as-error
Weird that it didn't get detected outside of my branch, but ah well
Tests: built all modes in editor
Bugfix: reset OcclusionCanUseFrameCache on emergency disable of UsePlayerUpdateJobs
Tests: none, trivial change
Clean: extract WaitForTasks
Tests: editor compiles
Update: review fixes #1
- fix formatting in BasePlayer-Mission
- Apply same mission fix from vanilla server flow
- Comments around EventRecord.New and AzureAnalytics.OnPlayerTick warning to avoid the scripting API, to keep it thread safe
Got 2 more changes to make, then I think it's ready
Tests: compiles in editor
Update: make batching constants servervars
- New vars are Server.SnapshotTaskBatchCount and Server.DestroyTaskBatchCount
- factored out SendEntityDestroyMessages
Tests: none, trivial changes
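A hedged sketch of how a batch-count servervar like Server.SnapshotTaskBatchCount is typically consumed (the dispatch code below is invented for illustration):

    // Illustrative sketch: split N entities into tasks of SnapshotTaskBatchCount
    // each, so the batch size can be tuned live via the servervar.
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    static class SnapshotBatchingSketch
    {
        public static int SnapshotTaskBatchCount = 64;    // backed by the servervar

        public static List<Task> Dispatch(int entityCount, Action<int, int> sendRange)
        {
            var tasks = new List<Task>();
            for (int start = 0; start < entityCount; start += SnapshotTaskBatchCount)
            {
                int s = start;                            // copy: for-loop variables are
                int c = Math.Min(SnapshotTaskBatchCount, entityCount - s); // shared across closures
                tasks.Add(Task.Run(() => sendRange(s, c)));
            }
            return tasks;
        }
    }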
Bugfix: recycle filter buffers when there are no results
UsePlayerUpdateJobs 2 only issue
Tests: none, trivial change
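The gist of the fix, sketched with ArrayPool standing in for the game's buffer pool (the real pooling API may differ):

    // Sketch: the early-out for "no results" must still return the rented
    // buffer to the pool, otherwise every empty filter pass leaks one buffer.
    using System;
    using System.Buffers;

    static class FilterSketch
    {
        public static int[] FilterPositive(int[] input, out int count)
        {
            int[] buffer = ArrayPool<int>.Shared.Rent(input.Length);
            count = 0;
            foreach (int v in input)
                if (v > 0) buffer[count++] = v;

            if (count == 0)
            {
                ArrayPool<int>.Shared.Return(buffer);     // this recycle was missing
                return Array.Empty<int>();
            }
            return buffer;                                // caller returns it later
        }
    }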
Update: ran Generate Code
Tests: none, trivial change
Update: Network++
UsePlayerUpdateJobs 2 can send packets out of order, which needs client-side change (see CL 128169)
Tests: none, trivial change
Update: amending a comment for SendDemoTransientEntity
Finished investigating, server-demos shouldn't be a problem for multithreaded networking
Tests: none, trivial change
Bugfix: players go back to sleep on disconnect on a server with UsePlayerUpdateJobs 2
This looked scarier than it ended up being. One concern left to investigate - looks like I'll be able to merge it in tomorrow
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
Optim: NetworkPositionTick - skip transform access
Tests: none, trivial change
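Roughly the shape of the optimization, assuming Unity's usual cost model where every Transform access crosses into native code (names invented):

    // Illustrative sketch: read a position cached in managed memory instead of
    // hitting transform.position (a native-interop call) once per entity per tick.
    struct NetPosSketch
    {
        public float X, Y, Z;        // refreshed only when the entity actually moves

        // Before: var p = transform.position;  // native call in the hot loop
        // After:  plain field reads, no interop.
        public (float x, float y, float z) Read() => (X, Y, Z);
    }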
Optim: inline virtual calls in SendNetworkPositions and skip transform access
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
Update: simplify SendNetworkPositions
Decided against parallelizing with tasks as it looks to be taking only ~0.3ms on a 350pop server
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
Clean: fix formatting
Tests: none, trivial change
Bugfix: fix failing WaterLevel tests
My ReadOnlySpan change caused it, as we had hardcoded access to the global PlayerCache (which we don't use in tests).
Tests: unit tests pass
Update: rename BasePlayer.playerCache member var -> BasePlayer.PlayerCache
It was hiding the source of a bug, as function params have lower-case names
Tests: compiles in editor
Clean: prefer ReadOnlySpan<BasePlayer> instead of playerCache where possible
Makes it easier to guess things at a glance.
Tests: ran all relevant unit tests (discovered WaterLevelTests are failing, will investigate next) + 2p session on Craggy with UsePlayerUpdateJobs 2
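The shape of the change as a hedged sketch (signatures invented): read-only consumers now take ReadOnlySpan<BasePlayer>, which also lets tests pass their own arrays instead of depending on the global PlayerCache.

    // Hypothetical sketch of a span-based read-only consumer.
    using System;

    static class PlayerQueriesSketch
    {
        class BasePlayer { public bool IsSleeping; }

        // Works for the global cache and for test-local arrays alike,
        // and guarantees the callee can't mutate the backing store.
        static int CountAwake(ReadOnlySpan<BasePlayer> players)
        {
            int awake = 0;
            foreach (var p in players)
                if (!p.IsSleeping) awake++;
            return awake;
        }
    }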
Update: demote PlayersToFinalize, PlayersToValidate, PlayersToRecache from being global caches
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
Update: demote PositionChanges from being a global cache
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
Merge: from occlusion_rework
- Fix for occlusion queries not being commutative
- Re-enable occlusion frame cache and expand its use to the full frame
- Server occlusion deduplicates queries
- minor API restructure (might affect mods)
Update: disable bidirectional occlusion caching for now
- Left a couple optim/safety todos
This fixes the batch vs serial BasePlayer occlusion consistency test, but one CantSee test fails as expected. Going to fix it in a child branch, as that currently blocks more optims.
Tests: ran relevant unit tests
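Since the queries turned out not to be commutative, a result cached for A→B can't be reused for B→A. A sketch of the now directional cache keying (types invented):

    // Sketch: keep occlusion cache keys directional. The disabled "bidirectional"
    // variant normalized (from, to) so one query served both directions, which
    // silently assumed CanSee(a, b) == CanSee(b, a).
    using System.Collections.Generic;

    class OcclusionFrameCacheSketch
    {
        readonly Dictionary<(uint from, uint to), bool> results = new();

        public bool TryGet(uint from, uint to, out bool visible)
            => results.TryGetValue((from, to), out visible);  // direction preserved

        public void Store(uint from, uint to, bool visible)
            => results[(from, to)] = visible;
    }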
Tests: TestLineOfSight - properly enforce non-job mode in relevant cases
Tests: ran test sets
Tests: TestLineOfSight - validate reverse paths
Looks like some occlusion checks might fail reverse paths. That's an uh-oh.
Tests: ran above set, got a failure
Tests: ServerOcclusionTests - generated data set now contains similar pairs
Interestingly, this trips up the consistency test - will investigate
Tests: ran TestOcclusionLineOfSight_Consistency
Tests: PerfOcclusionLineOfSight - reset internal recent visibility cache between runs
Tests: ran the set
Tests: ServerOcclusionTests - standardize test naming
Tests: ran test set
Tests: ServerOcclusionTests.GeneratePairs - generate correct number of pairs
This further shrinks runtimes, as previously we generated waaaay too many
Tests: ran ServerOcclusionTests set
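For n players the unique unordered pair count is n·(n−1)/2; a sketch of generating exactly that many:

    // Sketch: start the inner index at i + 1 so each unordered pair is produced
    // exactly once - n * (n - 1) / 2 pairs total, not the inflated count from before.
    using System.Collections.Generic;

    static class PairGenSketch
    {
        public static IEnumerable<(int a, int b)> GeneratePairs(int n)
        {
            for (int i = 0; i < n; i++)
                for (int j = i + 1; j < n; j++)
                    yield return (i, j);
        }
    }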
Tests: optim TestOcclusionLineOfSight_PerfSerial/-Parallel
By properly constructing and caching base players - tests now take less than a second.
Tests: ran test set
Tests: add TestOcclusionLineOfSight_PerfSerial/-Parallel
Both run for too long - likely due to how I create players for the tests. Will fix next
Tests: ran the new set
Tests: TestOcclusionLineOfSight_Consistency - reset occlusion cache between serial and parallel runs
Tests: ran the test set
Tests: add ServerOcclusionTests.TestOcclusionLineOfSight_Consistency
Compares results between serial and batched versions.
Tests: ran the new test set
Update: prep BasePlayer.OcclusionLineOfSight (both serial and batched) for use in tests
Tests: none, trivial change
Tests: rewrite ServerOcclusionTests to make ServerOcclusion usage clear
Is it ugly on some lines? Yes. But is it explicit and beautiful? Also yes.
Tests: ran the unit tests
Clean: propagate network time from FinalizeTickParallel
Tests: none, trivial change
Optim: NetworkPositionTick - remove extra InvalidateNetworkCache
The cache has already been invalidated in FinalizeTickParallel, so there's no need to discard it again
Tests: none, trivial change
Optim: OcclusionSendUpdates can now reuse occlusion results
- Got rid of old OcclusionLineOfSight that used to send updates internally, as there's no need for it now
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
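A sketch of the reuse, with invented names: occlusion results are computed once per tick and handed to the send path, instead of OcclusionSendUpdates re-running (or internally hiding) the queries.

    // Hypothetical sketch: query occlusion once, then feed the same results to
    // every consumer in the tick instead of re-querying per consumer.
    using System.Collections.Generic;

    class TickSketch
    {
        HashSet<(uint viewer, uint entity)> RunOcclusionQueries() => new();

        void Tick()
        {
            var visible = RunOcclusionQueries();   // computed once
            OcclusionSendUpdates(visible);         // reuses results
            SendNetworkPositions(visible);         // reused again in a later commit
        }

        void OcclusionSendUpdates(HashSet<(uint, uint)> visible) { }
        void SendNetworkPositions(HashSet<(uint, uint)> visible) { }
    }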
Clean: remove couple TODOs
- one was just completed
- another was overzealous
Tests: none, trivial changes
Optim: SendNetworkPositions - reuse occlusion query results
Tests: 2p on Craggy with UsePlayerUpdateJobs 2
Update: move ServerUpdateOcclusionParallel inside FinalizeTickParallel
- FinalizeTickParallel invalidates the players' network cache - with UsePlayerUpdateJobs 2 we can skip it later, but 1 still has a double invalidate
This increases the coverage of OcclusionFrameCache, allowing us to simplify a bunch of code.
Tests: 2p on Craggy with UsePlayerUpdateJobs 0, 1, 2 and disconnects. 0p server with UsePlayerUpdateJobs 1, 2
Merge: from connectedplayer_rewrite
Got far enough along in this direction and things seem to work
Bugfix: Parallel ServerOcclusion - get rid of extra ShouldNetworkTo checks
- Added a note explaining why we're not offloading network-cached children to worker threads
Technically the previous version added an extra check due to the APIs being used - this brings it back in line with the serial count.
Tests: 2p session on Craggy with UsePlayerUpdateJobs 2
Clean: simplify SendAsSnapshotWithChildren
- remove includeChildrensChildren - it was always set to true
- remove the children null check - it's always created/set
Tests: compiles in editor
Bugfix: Parallel ServerOcclusion - serialization now happens only on main thread
- The main thread no longer processes lost-pair and already-serialized found-pair tasks - serialization work takes the longest
Previously serialization was offloaded to worker threads, but after investigating I discovered that it would kaboom due to accessing the scripting API. So instead, we offload to worker threads only the players + entities that we can guarantee won't trigger serialization.
Tests: 2p session on Craggy with UsePlayerUpdateJobs 0, 1, 2 - ran around, switched weapons, disconnected
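A sketch of the resulting split, with invented names: anything that might touch the scripting API (and thus serialize) stays on the main thread, and workers only get items guaranteed to be serialization-free.

    // Hypothetical sketch: partition work by whether it can trigger serialization,
    // since Unity's scripting API must only be touched from the main thread.
    using System.Collections.Generic;
    using System.Threading.Tasks;

    static class OcclusionDispatchSketch
    {
        class WorkItem { public bool MayTriggerSerialization; }

        static void Dispatch(List<WorkItem> items)
        {
            var mainThread = new List<WorkItem>();
            var offloaded = new List<WorkItem>();
            foreach (var item in items)
                (item.MayTriggerSerialization ? mainThread : offloaded).Add(item);

            var workerTask = Task.Run(() =>
            {
                foreach (var w in offloaded) Process(w);  // guaranteed no serialization
            });

            foreach (var m in mainThread) Process(m);     // serializing work stays here
            workerTask.Wait();
        }

        static void Process(WorkItem item) { }
    }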
Bugfix: Parallel Occlusion marks entity snapshots as out of order
Since we can send them from multiple threads, they are unordered - previously this would cause a client disconnect. Not anymore.
Tests: none yet, need to try validating with server demos
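The gist as a hedged sketch (flag and field names invented; the real wire format differs): snapshots produced on worker threads are tagged so the client stops treating a stale sequence number as a protocol error, which pairs with the client-side change in CL 128169.

    // Hypothetical sketch: snapshots sent from multiple threads can't promise
    // ordering, so they carry a flag that relaxes the client's sequence check.
    struct SnapshotHeaderSketch
    {
        public uint EntityId;
        public uint Sequence;
        public bool OutOfOrder;      // set when the snapshot came from a worker thread
    }

    static class SnapshotCheckSketch
    {
        // Previously a stale Sequence meant "disconnect"; flagged snapshots
        // are instead reconciled by entity state.
        public static bool ShouldAccept(in SnapshotHeaderSketch h, uint lastSeq)
            => h.OutOfOrder || h.Sequence > lastSeq;
    }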