OK, it sounds like this isn't really an intended use case. 99% of the time I can monitor using whatever is in the mains just fine, even with the Mac's built-in audio, which makes for a super simple setup. But on those occasions where I do need a click reference or count-in... it sucks to then have to add an external interface, headphone amp, earbuds, another wire to trip over (if not an expensive wireless setup), an entire rack case that I didn't need before, a new monitor mix, etc., just for those rare occasions. I was hoping the visual cue/metronome would be the cure for that. I thought being able to adjust the song's start time was specifically for this purpose: to get the metronome in sync with the audio files. What a great solution that is, too, because there is often a random amount of silence at the head of the stem files. This actually worked pretty well (for a song that is locked to a grid, obviously) as long as you don't stop in the middle. Maybe in the future, if the metronome gets a grid.
 
Anyway... The tone was not what I would call muted, although it could be silenced (temporarily) by muting the click bus. It was loud, maybe even 0 dBFS. So yeah, I will try to re-create the issue. I don't recall if I had switched apps... it's possible. I will test and get back to you.