
wongarsu on July 14, 2022 \| on: 30% of Google's Emotions Dataset Is Mislabeled

The most interesting thing, imo, is that people often say one thing but put a similar-sounding word or a homophone in the subtitles, and the filter seems to trust the user-supplied subtitles. I hope nobody trains speech-to-text systems on a TikTok dataset.