Dialyzer pointed this one out.
The WorkerHelper removal in !4166 was missing this Oban.insert() and tests were not noticing any problems because we mocked the Push.send function instead of executing it and checking for the Oban job.
Gun's connection pool also returns an error if duplicate workers are launched simultaneously. Snooze on this error as well, and lower the snooze to 3 seconds with the optimism that the connection will still be open by then and the delivery can be completed quickly.
The original setting of 30 seconds is pretty high and means there's an unnatural lag between deliveries of activities destined to the same server that were created at nearly the same time. This configuration should be more efficient.
Our test environment cheats by constructing a conn with a custom oauth_access/2 function. This assigns a :token to the conn but due to the way it is constructed it has the :user preloaded. When the OAuth Plug fetches a token it does not preload the user, so the check for user.disclose_client was always nil and assumed to be false.
Preloading the :user ensures the test environment matches reality.
This logic only exists in the Plug, so attempting to validate the signature by calling the library function HTTPSignature.validate_conn/2 directly will never work because we do not attempt to construct the (request-target) and @request-target headers with both the commonly misinterpreted and correct implementation of this field. Therefore all attempts to validate a signature from an Oban Job will fail.
When signatures fail on incoming activities we put the job into Oban to be processed later instead of doing the user fetching and validation inline which is expensive and increases latency on the incoming POST request. Unfortunately we did not retain the :method, :request_path, and :query_string parameters from the conn so the signature validation and Oban Job would always fail.
This was most obvious when Mastodon sends Deletes for users your server has never seen before.
This is for a normal HTTP error response or timeout while receiving the data. A hard error from a process crash, DNS lookup failure, etc should produce a different response than {:ok, %Tesla.Env{}} and the request/job will be retryable.