Branch: rust_reboot/main/profiling_improvements

64 Commits over 61 Days - 0.04cph!

3 Hours Ago
Update: borked linux ServerProfiler.Core. It gets loaded, but fails to find an entry point, so it just falls over. I'll continue tomorrow. Tests: built a standalone linux server and tried launching it via WSL - my profiler script throws an exception
Today
Update: ServerProfiler.Core separates platform specific code. Required for Linux support - still figuring out how to organize msbuild projects. Tests: took snapshot in Editor's Craggy, and did a couple start-stops of playing to ensure no errors
Yesterday
Clean: fixing typo in "ServerProfiler::AppendNameTo" Noticed when working on now-discarded assembly name skipping Tests: scripts build in editor
2 Days Ago
Clean: remove unused "Profile.cs" script Tests: made sure editor compiles
2 Days Ago
Update: don't annotate UnityEngine.CoreModule funcs It has a lot of small functions, so it inflates the profile by quite a bit as well as adds overhead to small loops. Tests: tested in editor on Craggy
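A minimal sketch of the assembly filter described above, in Python for illustration only (the real instrumentation is not shown in this log). The excluded names come from these commit messages; the function name `should_annotate` and the set-based lookup are assumptions.

```python
# Assemblies the log says are skipped during annotation:
# UnityEngine.CoreModule (this commit) and System / System.Core (first export commit).
EXCLUDED_ASSEMBLIES = {"System", "System.Core", "UnityEngine.CoreModule"}


def should_annotate(assembly_name: str) -> bool:
    """Return True if methods from this assembly should get profiling marks.

    Hypothetical helper: skipping assemblies full of tiny functions keeps the
    snapshot smaller and avoids adding overhead to small loops.
    """
    return assembly_name not in EXCLUDED_ASSEMBLIES


candidates = ["Assembly-CSharp", "UnityEngine.CoreModule", "System.Core"]
annotated = [name for name in candidates if should_annotate(name)]
```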
2 Days Ago
Update: Reduce default frame count for perf snapshot to 4. Going to max(10) by default is too much - on a 150 pop release server that tried to produce a 3.5GB json snapshot. I'm expecting 4 to give us a 1 GB profile. Tests: took a snapshot on Craggy in editor - it took 4 frames
2 Days Ago
Update: Reduce main buffer relocations when generating json snapshot - Also report total marks instead of max to allow predicting how big the snapshot is in case of failure. Will help with reducing GC activity on the server, hopefully also reducing the generation time. Tests: locally on Craggy in editor took a couple snapshots - no issues
2 Days Ago
Merge: from main Tests: none
23 Days Ago
Update: Avoid shipping ServerProfiler to clients - Done by deleting the dll after it's built Tests: Build Win64 Client Debug and Release - no more dll
23 Days Ago
Update: added Assembly name to the snapshot marks Makes it a bit clearer where calls are coming from at the expense of larger snapshot(compressed: 2mb -> 7mb, json: 50mb->140mb). But, since the profiler is much faster now, the snapshots are smaller on 6k servers. Tests: exported 2 snapshots from Editor's Craggy, and exported 5 from standalone 6k server
23 Days Ago
Bugfix: Prohibit constexpr initialization of mutexes in ServerProfiler.Core. Turns out the toolset I'm using produces non-binary compatible assembly - using an escape hatch to avoid this issue. This only manifested in standalone builds. Tests: exported 5 snapshots from procgen 6k world in a row over 2 minutes. Used both borked and good Harmony mods as well.
23 Days Ago
Bugfix: fixed incorrect time scale in reports - Also replaced the debug dll with a release one (interestingly, this also removed quite a lot of scopes in the report) Tests: exported snapshot from Craggy in editor
23 Days Ago
Bugfix: various issues that snuck in during rewrite - Turns out unmanaged threads get destroyed by engine, so needed to add deferred clean up of collected telemetry - this fixed editor crashes and crashes in export thread - I accidentally recorded allocations for worker threads only - flipped to record main thread only - I was emitting methods for MethodExit and MethodException marks, which is unnecessary - cleaned up Still need to fix timestamp conversion on export, that'll be next Tests: exported snapshot from Editor's Craggy
24 Days Ago
Update: moved profiler code to unmanaged assembly - submitting debug DLL for now, will ship release once ready. Can take snapshot, but export doesn't finish - likely an exception on export thread. Tests: took a snapshot in editor on Craggy - saw logs of data accumulated, but no final "compressed" message
28 Days Ago
Bugfix: avoid leaking repeating invoke when taking snapshots - Also avoids ambiguity of taking multiple snapshots Tests: generated 2 snapshots in editor back to back with a delay
28 Days Ago
Update: notify that a snapshot was taken when no delay was requested Tests: none, trivial change
28 Days Ago
Merge: from main Tests: built all modes in editor, exported snapshot from editor's Craggy 5 times in a row, built standalone release server and exported snapshot 3 times
28 Days Ago
Update: Add chat feedback when perf snapshot is being taken It'll warn users if they're in the middle of something important Tests: exported in editor with no delay and default standalone delay
28 Days Ago
Clean: remove unnecessary params in ProfilerExporter Tests: none, trivial change
28 Days Ago
Update: Add commandline argument support to explicitly turn on profiler instrumentation - Added log to explicitly confirm if it's enabled or disabled Tests: ran in editor and server standalone with and without it being enabled
28 Days Ago
Bugfix: avoid a rare case of deallocating main thread's Allocs storage - Code is written with the assumption that it's always there, but if 1 frame didn't record any allocs, it would nuke the storage, tripping up the profiler. Discovered when doing additional testing in standalone (somehow editor was unaffected). Tests: did 6 snapshots of standalone server with 6k map - no crashes
29 Days Ago
Update: Don't allocate storage for alloc marks on worker threads Tests: exported a couple snapshots in the editor
29 Days Ago
Update: truncate snapshot names to 32 chars Tests: none, trivial change
29 Days Ago
Update: export worker threads in the json snapshot - Also fixed a bug I introduced in previous submit that led to sporadic exceptions Tests: exported 5 profiles in a row from Craggy in Editor, exported 2 in standalone, checked in perfetto
29 Days Ago
Update: record marks from worker threads - Had to leave allocation tracking enabled for main thread only - there's a comment explaining why Need to implement export for worker threads - that's next Tests: exported snapshot from Craggy in editor and opened in Perfetto
29 Days Ago
Update: generate snapshots under server root Tests: exported in editor, found in the right location
30 Days Ago
Undo: unintentional change to ProjectSettings, reverted by hand Tests: none, trivial change
30 Days Ago
Update: rewrote ServerProfiler TLS storage - Instead of having per-frame storage, we now have one big buffer - Rewrote ProfilerExporter to support changes - Removes a weird stall on EndOfFrame invoke in standalone I couldn't find a way to implement lock-free perf mark recording with previous approach, but now I should have a way - will attempt next. Tests: exported profile from Craggy in editor and standalone 6k server - both open in Perfetto and look coherent
31 Days Ago
Update: export snapshots into separate folder Makes it easier to build tooling for it Tests: generated an editor snapshot
31 Days Ago
Update: allow specifying the name and how many frames to collect for perf snapshot - We support max 10 frames of recording, so frames input gets clamped - Also left a note for future maintenance Tests: exported multiple snapshots in editor with 1 and 11 frames
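The clamping mentioned above can be sketched as follows. Python is used for illustration only; the maximum of 10 comes from the commit message, while the lower bound of 1 and the name `clamp_frame_count` are assumptions.

```python
MAX_FRAMES = 10  # upper limit stated in the commit message
MIN_FRAMES = 1   # assumption: a snapshot records at least one frame


def clamp_frame_count(requested: int) -> int:
    """Clamp a user-supplied frame count into the supported range."""
    return max(MIN_FRAMES, min(requested, MAX_FRAMES))


# Mirrors the two inputs mentioned in the test note: 1 and 11 frames.
one_frame = clamp_frame_count(1)
eleven_frames = clamp_frame_count(11)
```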
34 Days Ago
Update: adding extra logging and sanity checking when exporting a profile snapshot. For some reason one of the frames from standalone gets borked - hoping this'll help track it down. Tests: exported craggy in editor
34 Days Ago
Update: Gather GC.Collect activity into perf snapshot - Avoid thread id checks reaching to mono runtime to resolve it, as it's unsafe during GC.Collect Tests: did a 6k standalone perf snapshot, but didn't catch a collection event. Forced one in editor on 3rd frame, confirmed visible in perfetto.
34 Days Ago
Update: export Allocs as process-wide events This puts them on a separate track, making it easier to spot them. Tests: checked craggy snapshot in perfetto
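For illustration, here is how a process-wide instant event looks in the Chrome Trace Event JSON format that Perfetto can load. This is a sketch, not the exporter's actual code: the event name "Alloc", the `bytes` argument, and the helper name are assumptions. The relevant part is `"ph": "i"` (instant event) combined with `"s": "p"` (process scope).

```python
import json


def make_alloc_event(size_bytes: int, ts_us: float, pid: int = 1, tid: int = 1) -> dict:
    """Build an instant event scoped to the whole process.

    "s": "p" asks the viewer to treat the event as process-wide
    rather than attaching it to a single thread.
    """
    return {
        "name": "Alloc",                # hypothetical event name
        "ph": "i",                      # instant event
        "s": "p",                       # scope: process
        "ts": ts_us,                    # timestamp in microseconds
        "pid": pid,
        "tid": tid,
        "args": {"bytes": size_bytes},  # hypothetical payload
    }


alloc_event = make_alloc_event(size_bytes=256, ts_us=1500.0)
alloc_json = json.dumps(alloc_event)
```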
34 Days Ago
Clean: removing some unnecessary sanitization logic. One of the recent updates now guarantees storage is enough to house all snapshot data. Tests: loaded craggy snapshot in perfetto - no missing names/allocs, no asserts
34 Days Ago
Update: export Alloc events to json snapshot They get a bit lost in the sea of all other instantaneous events, so will need to somehow improve this Tests: loaded craggy snapshot in perfetto
34 Days Ago
Update: record GC alloc events in the snapshot Currently don't emit them in the json, but that'll be the next thing Tests: took a perf snapshot on craggy
35 Days Ago
Optim: reduce profiling capacity, instead lazy-grow it - When frame didn't fit the capacity, drop it and rerecord it This should help avoid large stutters in editor (and hopefully on the server as well). Tests: tested craggy in editor - spikes gone. Tested on 6k world in standalone server - spikes still present. Also noticed one export failed, but think it's unrelated to current changes
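A rough sketch of the "drop the frame and re-record it" idea, in Python for illustration only. The class name, the doubling growth factor, and the list-based storage are assumptions; the commit only states that capacity starts small, grows lazily, and that a frame which did not fit is dropped and recorded again.

```python
class FrameRecorder:
    """Records marks for one frame at a time into bounded storage."""

    def __init__(self, initial_capacity: int = 4):
        self.capacity = initial_capacity
        self.marks = []
        self.overflowed = False

    def record(self, mark) -> None:
        if len(self.marks) >= self.capacity:
            # Out of room: remember the overflow, but do not grow mid-frame.
            self.overflowed = True
            return
        self.marks.append(mark)

    def end_frame(self):
        """Return the frame's marks, or None if the frame must be re-recorded."""
        if self.overflowed:
            self.capacity *= 2  # assumption: grow by doubling
            self.marks = []
            self.overflowed = False
            return None
        done, self.marks = self.marks, []
        return done


recorder = FrameRecorder(initial_capacity=4)
for mark in range(6):
    recorder.record(mark)
first_attempt = recorder.end_frame()   # frame did not fit, so it is dropped

for mark in range(6):
    recorder.record(mark)
second_attempt = recorder.end_frame()  # fits after the capacity grew
```

The trade-off in this sketch is that an oversized frame costs one extra frame of recording instead of a large up-front allocation.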
35 Days Ago
Optim: Avoid allocations when generating method names - Also cleaned up a couple TODOs - Added extra logging to track stages of export progress Getting very close to completing all outstanding TODOs Tests: exported craggy and checked scope names in perfetto
35 Days Ago
Update: add cross-frame stitching of torn scopes Tests: snapshotted craggy, loaded in perfetto
35 Days Ago
Update: offloading snapshot export to a task thread - Stopped exporting binary snapshot, and left a comment explaining why it's not in use (but not deleted) - Only export compressed json (saves a bit of time on iteration). Hope is to reduce stutter, but my tests in editor show that it's inconsistent (it's less, but for some reason main thread has frames that sometimes seem to wait for worker thread to finish - need to investigate). Tests: exported snapshot from craggy, unzipped and loaded in perfetto
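As an illustration of moving export off the main thread, the sketch below serializes and compresses a snapshot on a worker. It is written in Python and is not the project's exporter: the function names, the use of gzip, and the single-worker pool are assumptions.

```python
import gzip
import json
from concurrent.futures import Future, ThreadPoolExecutor


def compress_snapshot(events: list) -> bytes:
    """Serialize events as trace JSON and gzip the result."""
    raw = json.dumps({"traceEvents": events}).encode("utf-8")
    return gzip.compress(raw)


def export_in_background(executor: ThreadPoolExecutor, events: list) -> Future:
    # Copy the list so the recording side can keep reusing its own buffer.
    return executor.submit(compress_snapshot, list(events))


sample_events = [
    {"name": "Update", "ph": "X", "ts": 0, "dur": 5, "pid": 1, "tid": 1},
]

with ThreadPoolExecutor(max_workers=1) as executor:
    future = export_in_background(executor, sample_events)
    # The caller is free to do other work here and collect the result later.
    blob = future.result()
```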
35 Days Ago
Bugfix: exclude 'length' bytes from content end offset Tests: none, trivial change
35 Days Ago
Update: Add support to export snapshot as a binary blob - Also emitting compressed version of json and bin snapshots - Minor code reorganization + TODOs. Surprisingly, despite tighter binary format, it compressed to a larger size than compressed json. Tests: loaded compressed snapshot to perfetto. Didn't test the binary, as it doesn't have any readers yet
36 Days Ago
Bugfix: patch sheared frames when exporting json snapshot - In Editor EndOfFrame is called as part of nested GUI, which our profiler shears apart. For now we inject additional marks to maintain callstack structure - left a TODO to properly reconstruct a sheared frame - For now expanding frame scope to cover sheared period Tests: exported craggy snapshot and opened in perfetto - no more randomly trashed frames
36 Days Ago
Update: Factor out profiler exporting logic to a separate script - Also cleaned up a couple log outputs, as the collection seems sensible - Cleaned a couple already-done TODOs Tests: Did an export from editor and standalone server (6k size, 0 pop), loaded in perfetto
36 Days Ago
Undo: removing ignore.conf accidental submit Tests: none
36 Days Ago
Update: ServerProfiler snapshot now contains full names for scopes Tests: opened the new profile in perfetto
37 Days Ago
Update: Emit all 10 frames of snapshot - Also emitting UnityFrame as a CompleteEvent, so it's always visible in the profiler - UnityFrames are now numbered This produces a 160mb snapshot on Craggy (taken immediately after spawning), will need testing on larger worlds (and this is before proper labels) Tests: opened snapshot in perfetto - saw all 10 frames
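For reference, a complete event in the Chrome Trace Event JSON format carries both a start time and a duration in one record (`"ph": "X"` with `ts` and `dur`). The sketch below is illustrative Python; the exact label format "UnityFrame N" and the helper name are assumptions.

```python
def make_frame_event(frame_index: int, start_us: float, end_us: float,
                     pid: int = 1, tid: int = 1) -> dict:
    """Build one numbered frame as a single complete ("X") event."""
    return {
        "name": f"UnityFrame {frame_index}",  # hypothetical label format
        "ph": "X",                            # complete event: start + duration
        "ts": start_us,                       # start, in microseconds
        "dur": end_us - start_us,             # duration, in microseconds
        "pid": pid,
        "tid": tid,
    }


# Ten back-to-back frames of 16 ms each, numbered from 0.
frame_events = [
    make_frame_event(i, start_us=i * 16_000, end_us=(i + 1) * 16_000)
    for i in range(10)
]
```

Because a complete event needs no matching end record, it cannot be left half-open by a missing mark.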
37 Days Ago
Update: Synthesize missing OnEnter profiling marks. This restores the profile's structure. Tests: opened snapshot in perfetto
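One way such synthesis could work is sketched below in Python. This is an assumption about the approach, not the profiler's actual algorithm: any exit mark that has no matching enter within the recorded window gets an enter mark inserted at the start of the window. The tuple layout and function name are hypothetical.

```python
def synthesize_missing_enters(marks, frame_start_ts):
    """Prepend enter marks for exits whose enter fell outside the window.

    marks: list of (kind, name, timestamp) tuples, kind is "enter" or "exit".
    """
    depth = 0
    orphans = []
    for kind, name, _ts in marks:
        if kind == "enter":
            depth += 1
        elif depth > 0:
            depth -= 1            # exit matched an enter seen in this window
        else:
            orphans.append(name)  # exit with no enter: scope began earlier

    # The first orphan exit closes the innermost scope, so the synthesized
    # enters must be emitted outermost first, i.e. in reverse order.
    synthesized = [("enter", name, frame_start_ts) for name in reversed(orphans)]
    return synthesized + list(marks)


recorded = [
    ("exit", "Inner", 5),
    ("enter", "Foo", 6),
    ("exit", "Foo", 7),
    ("exit", "Outer", 9),
]
patched = synthesize_missing_enters(recorded, frame_start_ts=0)
```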
37 Days Ago
Update: first export of profiling snapshot - Also filtering out System and System.Core from annotation (otherwise we produce too much data, even for 1 frame) - Also added timestamp utility functions (as trace format requires micros) Not super reliable - need to post-process gathered data to get rid of torn marks Tests: opened trace in perfetto - although wonky, it's there
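The Chrome Trace Event JSON format expects timestamps in microseconds, which is why a conversion helper is needed. Below is a minimal Python sketch of that conversion plus a begin/end event pair; the tick frequency, function names, and the "Update" scope are assumptions for illustration.

```python
import json


def ticks_to_micros(ticks: int, ticks_per_second: int) -> float:
    """Convert a raw timer tick count to microseconds."""
    return ticks * 1_000_000 / ticks_per_second


def make_duration_events(name, start_ticks, end_ticks, ticks_per_second,
                         pid=1, tid=1):
    """Build a matching begin ("B") and end ("E") event pair for one scope."""
    common = {"name": name, "pid": pid, "tid": tid}
    return [
        {**common, "ph": "B", "ts": ticks_to_micros(start_ticks, ticks_per_second)},
        {**common, "ph": "E", "ts": ticks_to_micros(end_ticks, ticks_per_second)},
    ]


# Assumed 10 MHz timer: 10_000_000 ticks per second.
scope_events = make_duration_events("Update", 10_000_000, 10_250_000, 10_000_000)
trace_json = json.dumps({"traceEvents": scope_events})
```

If either half of a begin/end pair is missing, the viewer cannot close the scope, which matches the "torn marks" problem the commit mentions.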
37 Days Ago
Update: minor improvements to ServerProfiler - Added a smidge more debug to track the growth of per-frame lazy storage - Added output of how long each frame took Tests: ran the perfsnapshot command