The issue is that the transients in the waveform don't match what is playing. They seem to be a bit off, and if I'm seeing it correctly, the gap widens the further into the file you get.
I cannot reproduce it, but I have a suspicion about what caused it in the last update and will try reverting that part to the previous version.