That sets the theoretical limit for an algorithm trained on a dataset labeled by outsiders. Most people should be able to label the sentiment of their own statements with much higher accuracy.
Making such a dataset is much harder than letting Mechanical Turk workers label reddit comments, and you somehow have to set up a situation where people are honest about their labels, but the rewards might be worth it.
That's the personalization angle. A lot of effort is going into transfer learning + training at the edge, where you start with a general model trained on humanity and it then gradually learns about the specific human(s) it's interacting with.
My idea is more along the lines of asking each redditor to label the sentiment of 20 of their own (recent) comments, building a dataset and model from that, instead of having unrelated people guess what they meant.
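To make the comparison concrete, here's a rough sketch of how you could measure the gap once you had self-labels. Everything in it is hypothetical: the toy `records`, the label names, and the helper names `majority_vote` and `outsider_ceiling` are made up for illustration, not taken from any real dataset or library. It treats the author's own label as ground truth and checks how often the majority of outside annotators agrees with it.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only:
# (comment_id, author's own label, labels guessed by outside annotators)
records = [
    ("c1", "positive", ["positive", "positive", "neutral"]),
    ("c2", "negative", ["positive", "neutral", "positive"]),  # e.g. sarcasm read literally
    ("c3", "neutral", ["neutral", "neutral", "negative"]),
    ("c4", "negative", ["negative", "negative", "negative"]),
]


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def outsider_ceiling(records):
    """Fraction of comments where the outsider majority matches the author's own label."""
    hits = sum(1 for _, own, guesses in records if majority_vote(guesses) == own)
    return hits / len(records)


if __name__ == "__main__":
    print(f"outsider agreement with self-labels: {outsider_ceiling(records):.2f}")
```

A model trained only on the outsider labels is, roughly, learning to reproduce the majority vote, so this agreement rate is about the best it could hope to score against what the authors actually meant.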
Personalizing to the actual subculture the interaction takes place in would be another step. You probably need both.