The blog post highlights this specific point: "US English" and "Indian English" really aren't the same English (in fact, I'd go even further and say that "Reddit English" and "US English" probably aren't the same English either).
Likewise, the Common Voice English dataset isn't great for ASR training outside India. A huge proportion of its speakers are Indian, and their data doesn't really help train ASR systems for non-Indian accents.