We had a problem on 9/8/2017 that has me stumped at the moment...
An operator canceled a single job at 23:45:26. I can see this in the Audit log, complete with his User Name and with the Source listed as the Job Manager. The Master processed this request successfully at 23:45:31, again with the correct User Name. The Job Manager changed the status of this job to Aborted under the User Name of the job's Runtime User.
Great. That's what was supposed to happen. However...
Also at 23:45:26, the same time as the deliberate cancellation, the Job Manager processed a cancellation request on three completely unrelated job groups. The User Name listed in the Audit logs for these unexpected cancellations is blank. All indications are that these three job groups were the only items left for the day not in a Launched state or otherwise processed... although that's difficult to prove given the complexity of our nightly processing.
The items hit by the errant cancellations were all in either a Waiting On Dependencies or a Waiting On Group state.
This led to dozens of cancellation requests being processed in the span of two seconds. They were completed by 23:45:27, four seconds before the initial, deliberate cancellation request completed.
Not a single errant cancellation request was processed with a User Name. Those fields are all blank in the logs. Runtime Users and Agents were credited with actually marking jobs and job groups as Cancelled, which is to be expected.