
Wouldn't this require larger datasets? That isn't always an option. I'm imagining that a smaller, more computationally efficient network could learn nearly as well with fewer data points given these heavily engineered features. Is that off base?


Basically, no. See http://karpathy.github.io/2015/05/21/rnn-effectiveness/

He gets pretty amazing results with a corpus size around 10M.
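For reference, here's roughly the kind of model that post describes: a character-level RNN trained on raw text with no hand-engineered features. This is just a minimal sketch in PyTorch (my own toy corpus and hyperparameters, not Karpathy's actual code):

```python
import torch
import torch.nn as nn

# Stand-in corpus; the linked post trains on files in the megabyte range.
text = "hello world, hello hacker news "
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, state=None):
        h, state = self.rnn(self.embed(x), state)
        return self.head(h), state

data = torch.tensor([stoi[c] for c in text]).unsqueeze(0)  # shape (1, T)
model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    # Predict the next character at every position from the raw characters alone.
    logits, _ = model(data[:, :-1])
    loss = loss_fn(logits.reshape(-1, len(chars)), data[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point is that the model's only input is the character stream itself; everything else is learned.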


But that takes ages to train!


For perspective: something like Jason Weston's state-of-the-art attention-based neural sentence summarizer took ~4 days to train.

You'd easily spend that time doing manual feature engineering just to build a baseline system.



