Optim: TraceRaysUnordered - optimize collider validation using Burst jobs
- added UtilityJobs.FlipBoolJob and ScatterToJob<T>
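This brings costs down at every ray count measured (before -> after, with the serial baseline in parentheses):
* 128 rays - 0.37ms -> 0.25ms, 33% improvement (serial was 0.49ms, 49% improvement)
* 1k rays - 2.59ms -> 1ms, 61% improvement (serial was 4.33ms, 77% improvement)
* 8k rays - 18.9ms -> 6.26ms, 67% improvement (serial was 34.79ms, 82% improvement)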
Tests: unit tests