I know how it works; the question is whether you get better results using Whisper or DeepL.
The good things about Whisper's end-to-end translation are:
(a) It uses context for translation: it builds up context to, for example, guess gender (he/she) and punctuation for the translation task.
(b) It makes the entire Whisper run faster. The translate task is faster than the transcribe task. It is funny, but their main software engineer was saying that, the way the algorithm is written, the end-to-end translation task actually runs faster than plain transcription (a rough invocation of both tasks is sketched below).
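For anyone who hasn't tried both modes, this is a minimal sketch of how the two tasks are invoked with the openai-whisper Python package (the file name and model size here are just placeholders):

```python
import whisper

# Model size is an arbitrary choice for this sketch; any checkpoint is invoked the same way.
model = whisper.load_model("medium")

# End-to-end translation: non-English speech comes out as English text.
translated = model.transcribe("episode.mp3", task="translate")

# Plain transcription in the source language, for comparison.
transcribed = model.transcribe("episode.mp3", task="transcribe")

print(translated["text"])
print(transcribed["text"])
```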
The good thing with DeepL is that it is simply a better translator. Full stop. One bad thing with DeepL is that it often mixes up he/she, it/they, sir/ma'am. For me, I decided to just stick with DeepL. I did some comparisons during the early days of Whisper (v1). I haven't done any comparison with v2, but I understand the translation capability did not change from v1 to v2. To me, DeepL's translations came out better. But then again, I don't speak Japanese, so my read might be quite wrong.
In terms of being able to compare the outputs as @SamKook suggested, one can make Whisper more deterministic by setting both the temperature and the beam size to zero. That makes the output close to deterministic, but the pitfall is that it produces more hallucinations and repeated lines in the output.
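If you want to reproduce that kind of comparison, here is a rough sketch with the openai-whisper Python package. Note that the Python API doesn't take a beam of literally zero; a single temperature of 0.0 (which disables the sampling fallback) plus beam_size=1 is the closest near-deterministic setting I know of:

```python
import whisper

model = whisper.load_model("medium")  # model size is an arbitrary choice

# A single temperature of 0.0 disables the temperature fallback, and beam_size=1
# keeps the search effectively greedy, so repeated runs give (near-)identical output.
# The trade-off mentioned above: these settings hallucinate and repeat lines more often.
result = model.transcribe(
    "episode.mp3",      # placeholder input file
    task="translate",   # end-to-end speech-to-English translation
    temperature=0.0,
    beam_size=1,
)
print(result["text"])
```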