True, we’re trying to produce bots that reliably do things (make people laugh) that humans can’t even do reliably. People who can feel out a room and use the right joke, or the right reassurance or whatever, are not even very common.
I dunno if the intent of this dataset is to produce bots that can make people laugh. I think (intentional) comedy is the ultimate Turing test. I say intentional because there are things like https://inspirobot.me/ which are essentially glorified Markov models, and the stuff it comes up with is downright hilarious, but I think that's primarily due to absurdist humor and subversion of expectation (and unintentional ironic pairing with the picture). That's very different from communicating something intended as a joke, having it land, and having it be funny deliberately, not just because it was silly or a non sequitur.
I think it's still valuable to be able to detect when something is a joke/satire/sarcasm/irony/slang, especially in the context of content moderation, because quite often it totally flips the sentiment valence. A perfect example is "I'm literally dying": read literally, "literally" means in the exact or truest sense and "dying" means shuffling off this mortal coil (very bad); read colloquially, "literally" means "figuratively, but to an extreme" and you're "dying" of laughter (very good).
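A toy sketch of why this matters for moderation pipelines: a naive word-level sentiment scorer (the lexicon and function here are made up purely for illustration, not from any real library) has no way to see that hyperbole flips the valence.

```python
# Hypothetical mini-lexicon for illustration only; real sentiment
# lexicons (VADER, SentiWordNet, etc.) are far larger but share the
# same weakness: per-word scores ignore sarcasm and hyperbole.
LEXICON = {"dying": -3, "laughing": 2, "love": 3}

def naive_sentiment(text: str) -> int:
    """Sum per-word lexicon scores; context-blind by design."""
    words = (w.strip(".,!?'\"").lower() for w in text.split())
    return sum(LEXICON.get(w, 0) for w in words)

# Literal reading dominates: "dying" scores strongly negative,
# even though the utterance is positive hyperbole.
print(naive_sentiment("I'm literally dying"))           # -3
print(naive_sentiment("I'm literally dying laughing"))  # -1, still net negative
```

The point of the sketch is that no amount of lexicon tuning fixes this; the model needs to recognize the figurative reading as a whole, which is exactly the joke/sarcasm detection problem described above.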