
The plots showing the full dataset as a scatter plot on the court are great, and the most useful plots in this post. I also like the outlines of the court (though if I were making these charts I’d draw the court lines in gray or pale orange or something so the data would stand out more).

The heat maps are much less useful IMO, because the colors are poorly chosen and the data generalization/binning methods seem kind of arbitrary. Until the shot count gets up into the tens of thousands or more, just show all the data. (If e.g. aggregating all the shots from the whole league, then some kind of binning would become necessary.)

The marginal histograms showing density by x/y coordinates on the court are essentially useless in my opinion. Marginal histograms of angle and distance, in polar coordinates centered at the basket, would be dramatically more interesting (it might be necessary to ignore the angles for positions very close to the basket, where angle is kind of irrelevant). To make them even more informative, since the 3-point line isn’t a perfect semicircle, make separate marginal distributions (in terms of angle and distance) for 2-point and 3-point shots, and stack them. Or even use three categories: dunks/layups, 2-point jump shots, and 3-point jump shots.
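A rough sketch of that polar binning, using only the Python standard library. Everything here is an assumption for illustration: the basket sits at (0, 0), +y points toward midcourt, coordinates are in feet, and the category labels, bin sizes, and the 4-foot angle cutoff are placeholders rather than anything from the post.

```python
import math
from collections import Counter


def to_polar(x, y):
    """Return (distance, angle in degrees) of a shot from the basket at (0, 0).

    The angle is measured from the +y axis (straight out from the basket),
    so 0 is straight on and +/-90 lies along the baseline.
    """
    distance = math.hypot(x, y)
    angle = math.degrees(math.atan2(x, y))
    return distance, angle


def polar_histograms(shots, dist_bin=2.0, angle_bin=15.0, min_dist=4.0):
    """Count shots per distance bin and per angle bin, keyed by category.

    `shots` is an iterable of (x, y, category) tuples. Shots closer than
    `min_dist` are left out of the angle counts, since angle means little
    right at the basket. Bin k covers [k * bin_size, (k + 1) * bin_size).
    """
    dist_counts = {}
    angle_counts = {}
    for x, y, category in shots:
        distance, angle = to_polar(x, y)
        dist_counts.setdefault(category, Counter())[int(distance // dist_bin)] += 1
        if distance >= min_dist:
            angle_counts.setdefault(category, Counter())[math.floor(angle / angle_bin)] += 1
    return dist_counts, angle_counts
```

The per-category counters can then be drawn as stacked bars, one stack per bin, with one marginal for distance and one for angle.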



I'm the author. Thanks for the suggestions on improving the charts, especially regarding the marginal histograms.

Just a couple of questions: What color maps would you use for those KDE plots?

I know seaborn by default uses the Freedman-Diaconis rule to create bins for the hexbin plot. But, what suggestions do you have for binning?
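For reference, the Freedman-Diaconis rule sets the bin width to 2 * IQR / n^(1/3). Below is a minimal standard-library sketch of the rule itself, not of seaborn's internals, which may differ in detail. The function names, the choice of the "inclusive" quartile method, and the fallback to a single bin when the IQR is zero are my own assumptions.

```python
import math
import statistics


def fd_bin_width(values):
    """Freedman-Diaconis bin width: 2 * IQR / n ** (1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


def fd_bin_count(values):
    """Number of bins needed to cover the data range at the FD bin width.

    Falls back to a single bin when there are too few points or the IQR
    is zero (an assumption; other fallbacks are equally reasonable).
    """
    if len(values) < 2:
        return 1
    width = fd_bin_width(values)
    if width == 0:
        return 1
    return max(1, math.ceil((max(values) - min(values)) / width))
```

Because the width shrinks only with the cube root of n, the bin count grows slowly as more shots are added.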



