Daniel P

1,688 Commits over 427 Days - 0.16cph!

3 Months Ago
Merge: from main
3 Months Ago
Backout of 122696 - was meant to go into texttable_allocs branch
3 Months Ago
Update: bring back last column padding to TextTable.ToString Tests: unit tests
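A minimal sketch of what last-column padding in a ToString-style row formatter can look like; the cell/width handling here is an assumption, not the actual TextTable code:

```csharp
using System.Text;

// Hypothetical sketch: pad every cell to its column width, including the
// last column, so all rows render at a consistent width.
static class TextTableSketch
{
    public static string FormatRow(string[] cells, int[] columnWidths)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < cells.Length; i++)
        {
            sb.Append(cells[i].PadRight(columnWidths[i]));
            if (i < cells.Length - 1)
                sb.Append(' ');
        }
        return sb.ToString();
    }
}
```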
3 Months Ago
Merge: from main
3 Months Ago
Merge: from profiling_improvements - New server profiler allocation tracking mode, start with "watchallocs", stop with "stopwatchingallocs", control export via various NotifyOn... server vars - Json Snapshot compression is now streamed, saving 95% of memory in the process and reducing GC events Tests: unit tests in editor, all forms of profiling in editor on Craggy in Server+Client mode, all forms of profiling in standalone server on Linux WSL
3 Months Ago
Merge: from main Tests: none, no conflicts
3 Months Ago
Bugfix: update unit allocation tracking tests to work with new notification params Tests: ran the "TestContinuousRecording" test
3 Months Ago
Update: all server profiler commands now respond to indicate whether the action started Tests: ran all commands on Craggy in editor
3 Months Ago
Merge: from main Tests: none, no conflicts
3 Months Ago
Update: additional memory metrics for memory profiling - Using release binaries based on 77ac1774 - Renamed NotifyOnAllocCount to NotifyOnTotalAllocCount - Added NotifyOnMainAllocCount, NotifyOnMainMemKB, NotifyOnWorkerAllocCount, NotifyOnWorkerMemKB (default 0 - disabled) - Set NotifyOnTotalAllocCount to 16k and NotifyOnTotalMemKB to 12MB Makes it easier to focus the investigation on particular areas. Tests: continuous profiling on Craggy, enabling individual metrics and verifying that it generated snapshots with the expected "violations"
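A hedged sketch of how per-area thresholds like these can gate a snapshot. The field names mirror the server vars above, but the surrounding types and the profiler plumbing are assumptions:

```csharp
// Illustrative only: the real profiler integration is not shown here.
static class AllocNotifySketch
{
    // Assumed server vars (defaults taken from the commit message; 0 = disabled).
    public static long NotifyOnTotalAllocCount = 16_000;
    public static long NotifyOnTotalMemKB = 12_000;
    public static long NotifyOnMainAllocCount, NotifyOnMainMemKB;
    public static long NotifyOnWorkerAllocCount, NotifyOnWorkerMemKB;

    public struct Metrics
    {
        public long TotalAllocCount, TotalMemKB;
        public long MainAllocCount, MainMemKB;
        public long WorkerAllocCount, WorkerMemKB;
    }

    // A snapshot is exported as soon as any enabled metric crosses its threshold.
    public static bool ShouldSnapshot(in Metrics m) =>
        Exceeds(m.TotalAllocCount, NotifyOnTotalAllocCount) ||
        Exceeds(m.TotalMemKB, NotifyOnTotalMemKB) ||
        Exceeds(m.MainAllocCount, NotifyOnMainAllocCount) ||
        Exceeds(m.MainMemKB, NotifyOnMainMemKB) ||
        Exceeds(m.WorkerAllocCount, NotifyOnWorkerAllocCount) ||
        Exceeds(m.WorkerMemKB, NotifyOnWorkerMemKB);

    static bool Exceeds(long value, long threshold) => threshold > 0 && value >= threshold;
}
```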
3 Months Ago
Update: ContinuousProfiler now has TotalAlloc and AllocCount metrics for allocation snapshotting - Release binary built with 27d643a3 - exposed via NotifyOnTotalMemKB and NotifyOnAllocCount (set to 0 to disable) Tests: tested both on Craggy in editor. Helped spot a potential small leak in PerformanceLogging
3 Months Ago
Bugfix: snapshot json export no longer emits an extra comma that made the json invalid - also sneaking in AllocWithStack to be on the execution thread only, not on the alloc thread Super rare, but would have been hard to find in the code when it eventually happened. Tests: perfsnapshot and watchallocs in editor
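The usual shape of this bug in a hand-rolled JSON writer, and its fix, as a minimal sketch (the writer and element types are illustrative, not the exporter's real API):

```csharp
using System.Collections.Generic;
using System.IO;

static class JsonArraySketch
{
    // Emit the separator before each element after the first, so no dangling
    // comma is ever written before the closing bracket.
    public static void WriteArray(TextWriter writer, IReadOnlyList<long> values)
    {
        writer.Write('[');
        for (int i = 0; i < values.Count; i++)
        {
            if (i > 0)
                writer.Write(',');
            writer.Write(values[i]);
        }
        writer.Write(']');
    }
}
```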
3 Months Ago
Bugfix: Prevent data races leading to torn continuous profiler snapshots - Binary built using Release conf based on 237b5df3 commit - Both Resume and Stop happen on profiler frame end (previously Resume was instant) - Stop gets deferred to after the snapshot is exported if requested during processing - Profiler, if initialized, always gets called on a new frame (since the internal state machine demands more steps than the user code can know about) - Updated continuous profiling unit test to account for the extra OnFrameEnd required It was possible for a stop to be requested during the export process, leading to use-after-free exceptions on a managed thread and a torn snapshot. Tests: unit tests + 20 manual watchallocs->stopwatchingallocs calls
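A rough sketch of the deferral idea: stop/resume requests are only acted on at the profiler's frame boundary, never while an export is in flight. Names and structure are assumptions, not the ServerProfiler.Core implementation:

```csharp
// Illustrative state handling only.
class ContinuousProfilerSketch
{
    bool _stopRequested;
    bool _exporting;

    // "stopwatchingallocs" just records the request; nothing is torn down here.
    public void RequestStop() => _stopRequested = true;

    public void OnFrameEnd()
    {
        if (_exporting)
            return; // finish the snapshot first; the stop stays pending

        if (_stopRequested)
        {
            _stopRequested = false;
            StopNow();
        }
        else
        {
            Resume(); // resume also waits for the frame boundary
        }
    }

    void Resume() { /* re-enable mark recording */ }
    void StopNow() { /* release buffers only when no export is running */ }
}
```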
4 Months Ago
Update: Allow the user to control how big of a callstack to record when tracking allocations - Defaults to 16, should be enough to track where in code it originates - Updated description - windows binary built with 1a176138 commit Tests: used it on Craggy. Discovered an issue with the preceding commit, but this change works as expected
4 Months Ago
Optim: ProfilerExporter.Json - export now uses streaming compression Avoids the need to allocate a massive StringBuilder. Running watchallocs for 2 mins caused 3-4 GC collection events in total, instead of 1 during each export. Tests: did a perfsnapshot and ran watchallocs for a couple of minutes
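A minimal sketch of streaming the JSON straight through a compressor instead of building it in a StringBuilder first; the file handling and delegate shape are assumptions, not the exporter's actual API:

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class SnapshotExportSketch
{
    // The JSON is compressed as it is written, so the full document is never
    // held in memory and the GC churn from large intermediate buffers goes away.
    public static void Export(string path, Action<TextWriter> writeJson)
    {
        using var file = File.Create(path);
        using var gzip = new GZipStream(file, CompressionLevel.Fastest);
        using var writer = new StreamWriter(gzip);
        writeJson(writer);
    }
}
```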
4 Months Ago
Bugfix: ContinuousProfiler - don't record Sync marks when paused for export - Based off 23b9590b commit This is the last known bug - the lib was still writing Sync marks for the new frame, eventually leading to main thread buffer growth, which invalidated pointers during export. Tests: soaked for almost 1 hour with watchallocs - no more unrecognized reads on main thread
4 Months Ago
Merge: from main
4 Months Ago
Debug: add logging of prefab path to track down server bundle NRE Tests: update game manifest (optimized)
4 Months Ago
Backout of CL 121065 - should reintroduce merge from prefab_process_optim Tests: update game manifest (optimized)
4 Months Ago
Merge: from main Catching up assets to the point of failure (I hope?) - still trying to reproduce NRE during server bundle generation
4 Months Ago
Backout CL 121063 due to failing server bundle generation
4 Months Ago
Merge: from main
4 Months Ago
Merge: from prefab_process_optim - Optimizes component checks during Prefab Processing (speeds up Asset Warmup and monument spawning) Tests: with temp old code that throws exceptions on result mismatch, ran Asset Warmup and ran Scene2Prefab on all large and xlarge monuments
4 Months Ago
Tests: perf test for FileSystem Warmup Recent optims show prefab processing cost for the entire server warmup goes from 39s down to 4.5s (averages across 5 runs) Tests: ran the perf test
4 Months Ago
Cherrypick(hackweek_procgen_async) Optim: PrefabPreProcess.FindComponents is now using GetComponentsInChildren With the profiler, this ended up 2x faster than the old way (lighthouse monument goes from 96ms to 46ms) Tests: used old code inline to validate outputs of new code
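A minimal sketch of the pattern: one GetComponentsInChildren call over the prefab root into a reused list, rather than visiting every child and calling GetComponent on each. The method shape is an assumption, not the actual PrefabPreProcess code:

```csharp
using System.Collections.Generic;
using UnityEngine;

static class FindComponentsSketch
{
    // Fills 'results' with every T in the hierarchy in a single engine call;
    // includeInactive is true so disabled children are still preprocessed.
    public static void FindComponents<T>(GameObject prefabRoot, List<T> results) where T : Component
    {
        results.Clear();
        prefabRoot.GetComponentsInChildren(true, results);
    }
}
```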
4 Months Ago
Cherrypick(hackweek_procgen_async) Optim: PrefabPreProcess - replace GetComponent with TryGetComponent These are cheaper since they do fewer allocations and less text formatting. Saves ~35s (but the new flow is still slower). Tests: ran procgen with early out
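A small before/after sketch of the swap; the component type is just an example:

```csharp
using UnityEngine;

static class TryGetComponentSketch
{
    public static void Process(GameObject go)
    {
        // Before: GetComponent + null check, which does extra work
        // (allocation and message formatting) in the editor when the
        // component is missing.
        // var body = go.GetComponent<Rigidbody>();
        // if (body != null) { ... }

        // After: TryGetComponent skips that work entirely on a miss.
        if (go.TryGetComponent<Rigidbody>(out var body))
        {
            body.isKinematic = true; // placeholder work
        }
    }
}
```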
4 Months Ago
Optim: PrefabPreProcess.FindComponents is now using GetComponentsInChildren With the profiler, this ended up 2x faster than the old way (lighthouse monument goes from 96ms to 46ms) Tests: used old code inline to validate outputs of new code
4 Months Ago
Optim: PrefabPreProcess - replace GetComponent with TryGetComponent These are cheaper since they do fewer allocations and less text formatting. Saves ~35s (but the new flow is still slower). Tests: ran procgen with early out
4 Months Ago
Update: Merge prefab loading and preprocessing to run concurrently Surprisingly this leads to worse timings than keeping them separate (120s in the previous CL vs 144s now). Might be overhead from doing a single prefab per frame Tests: ran procgen with early out
4 Months Ago
Update: move prefab processing to WorldSetup Tests: ran procgen with early out
4 Months Ago
Update: Prefab<T> gains a convenience (Prefab, T) constructor Tests: compiles
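A hypothetical shape of that convenience constructor; the real Prefab/Prefab&lt;T&gt; fields are assumptions used only to illustrate the idea:

```csharp
using UnityEngine;

// Illustrative stand-ins for the actual Prefab types.
public class PrefabSketch
{
    public string Name;
    public GameObject Object;
}

public class PrefabSketch<T> : PrefabSketch where T : Component
{
    public T Component;

    // (Prefab, T) convenience constructor: wrap an already-resolved prefab
    // together with the component of interest instead of re-resolving both.
    public PrefabSketch(PrefabSketch prefab, T component)
    {
        Name = prefab.Name;
        Object = prefab.Object;
        Component = component;
    }
}
```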
4 Months Ago
Update: Merging asset loading flows together - still editor only + debug code to early out - still slow (there are a number of issues left to resolve) Discovered that mixing Sync + Async loads causes an integration queue flush (a big stall for us). This'll be a tricky problem to address, since SoundDefinition (and I presume others) load assets as part of OnValidate Tests: procgen in editor
4 Months Ago
Update: hooking up gameobject spawning to async load logic - Contains a bunch of testing code used for profiling, will clean up in the next update Needs a bit of rework to ensure both the original flow and the new flow can work together. Tests: ran procgen
4 Months Ago
Bugfix: fix out of bounds access during prefab shuffling Tests: ran procgen, no exceptions
4 Months Ago
Update: implement missing logic for both GatherAssets and Process - GatherAssets now respects all relevant settings and sorts paths - implemented Process that works on a batch of objects Tests: only GatherAssets has been checked (confirmed reduction of assets due to config use)
4 Months Ago
Update: exposing prefab preprocessing from GameManager Tests: none, simple change
4 Months Ago
Update: List and Array Shuffle range overloads Tests: none, trivial code
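A sketch of what a range-restricted Fisher-Yates shuffle overload can look like; the signature and RNG choice are assumptions. Keeping every index inside [startIndex, startIndex + count) is also what guards against the out-of-bounds access fixed a couple of commits above:

```csharp
using System;
using System.Collections.Generic;

static class ShuffleSketch
{
    // Shuffles only the elements in [startIndex, startIndex + count),
    // leaving the rest of the list untouched.
    public static void Shuffle<T>(this IList<T> list, Random random, int startIndex, int count)
    {
        for (int i = count - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            (list[startIndex + i], list[startIndex + j]) = (list[startIndex + j], list[startIndex + i]);
        }
    }
}
```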
4 Months Ago
Optim: replace prefab search logic with editor manifest lookups - commented out a bunch of code for quicker iteration, will revert later - doesn't account for monument duplication/probability Significantly faster because we don't load any assets in the process - goes from 30s+ down to 15ms Tests: tried to procgen the default editor map
4 Months Ago
Update: Sort editor manifest by path Allows faster lookups Tests: ran in the editor
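A sketch of the payoff: with the manifest sorted by path, a lookup becomes a binary search over the path list rather than a linear scan. The comparer and entry shape are assumptions:

```csharp
using System;
using System.Collections.Generic;

static class ManifestLookupSketch
{
    // Assumes 'sortedPaths' was sorted with the same comparer used here.
    // Returns the index of the entry, or a negative value (the bitwise
    // complement of the insertion point) when the path is not present.
    public static int FindPathIndex(List<string> sortedPaths, string path)
    {
        return sortedPaths.BinarySearch(path, StringComparer.OrdinalIgnoreCase);
    }
}
```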
4 Months Ago
Update: initial work on parallelizing prefab loading during editor procgen Loads too quickly (0.2s instead of 90s) - I feel like it only loads the root-level gameobject, instead of the entire hierarchy. Will continue later. Tests: ran it once, got some telemetry, but already certain it's wrong.
4 Months Ago
Merge: from parallel_validatemove - Extra validation checks exposed via server.EmergencyDisablePlayerJobs (defaults to true). In case of error, it shuts down UsePlayerUpdateJobs and goes back to the vanilla flow These are cheap to run and should help us track down any problems in the future. Tests: compilation tests, unit tests and played back server demo
4 Months Ago
Update: Another validity check for UsePlayerUpdateJobs - validates player counts between PlayerCache and activePlayerList Tests: played back server demo
4 Months Ago
Update: promote some UsePlayerUpdateJobs validation logic from DEBUG only to release - Hidden behind the EmergencyDisablePlayerJobs switch (on by default) and UsePlayerUpdateJobs (off by default) - ValidatePlayerCache checks the whole range instead of just up to the player count (in case we got more than expected) Tests: played back server demo
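A hedged sketch of the kind of cross-check described here: verify the cache against the authoritative active player list over the whole cache range, and fall back to the vanilla path on any mismatch. All names are illustrative, not the game's actual types:

```csharp
using System.Collections.Generic;

static class PlayerCacheValidationSketch
{
    public static bool Validate<T>(T[] cache, int playerCount, ICollection<T> activePlayerList) where T : class
    {
        if (playerCount != activePlayerList.Count)
            return false;

        // Scan the whole cache, not just [0, playerCount), so stale entries
        // past the expected count are also caught.
        for (int i = 0; i < cache.Length; i++)
        {
            bool shouldBeFilled = i < playerCount;
            if (shouldBeFilled && (cache[i] == null || !activePlayerList.Contains(cache[i])))
                return false;
            if (!shouldBeFilled && cache[i] != null)
                return false;
        }
        return true;
    }
}
```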
4 Months Ago
Clean: fix formatting
4 Months Ago
Update: turn server.EmergencyDisablePlayerJobs const into a servervar Allows running some extra validation Tests: editor compiles
4 Months Ago
Test: test case for missing player removal from PlayerCache Tests: ran the new unit test
4 Months Ago
Merge: from main
4 Months Ago
Bugfix: ContinuousProfiler now atomically updates its write indices - internal fix in ServerProfiler.Core, based on e39afb43 - Removed the now-unhelpful check for the right mark type at the start of the main thread perf stream (all cases confirmed legal now) After soaking it for 15 minutes total, only the main thread export gets lost in the binary sauce. Hopefully the last bug. Tests: soaked 3 times on Craggy, only hit unexpected mark type on main thread
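A minimal single-writer/single-reader sketch of publishing a write index atomically so the export thread never sees a torn record; this mirrors the idea only, not the native ServerProfiler.Core code:

```csharp
using System;
using System.Threading;

class ProfilerBufferSketch
{
    readonly byte[] _buffer = new byte[1 << 20];
    long _writeIndex; // read by the export thread via Interlocked.Read

    // Called from the single writer thread.
    public void Write(byte[] data)
    {
        long start = _writeIndex;
        if (start + data.Length > _buffer.Length)
            return; // sketch only: real code would grow or rotate the buffer

        Array.Copy(data, 0, _buffer, (int)start, data.Length);
        // Publish the new index only after the bytes are in place, so the
        // exporter never trusts a partially written record.
        Interlocked.Exchange(ref _writeIndex, start + data.Length);
    }

    // Called from the export thread: only bytes below this length are read.
    public long PublishedLength() => Interlocked.Read(ref _writeIndex);
}
```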
5 Months Ago
Update: ContinuousProfiling will emergency stop if it fails to export Yet to pin down the worker thread telemetry stream seeing stale/garbage data. Tests: soaked on Craggy in editor
5 Months Ago
Bugfix: fix the main source of invalid profiling stream from ServerProfiler - Drop dead threads on every successful frame - Reset all writing indices on new frame and on resuming continuous profiling post-export - release binary built using 019295b4 There's still another issue hiding somewhere, but it's much more stable now. Tests: on Craggy, exporting every 3rd frame for 5 minutes straight. Previously it would trip after 20 seconds.