OK, I see. It's set to 16-bit, 44100 Hz in my Windows settings, but how does that matter? I also tried a 24-bit setting on another audio device, but the bug still exists.
By the way, I found that "Tone variation" is the key factor. You can reproduce the bug simply by setting it to 1.00.