
The thing I really don't understand is that this seems like such a simple thing to test! Search for X in our data set vs. the competitor's data set; flag for review if we are not within a 1 km radius. Don't release until the number and scope of flags has been reduced to an acceptable margin of error (you don't even have to test the full data set; a good enough statistician can tell you how bad your overall data is from the sample you did test).
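A minimal sketch of what that test could look like, assuming you already have matched result pairs from both data sets (function names, the 1 km threshold, and the sampling setup are all illustrative, not anyone's actual pipeline):

```python
import math
import random

def haversine_km(p1, p2):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def estimate_flag_rate(pairs, sample_size, threshold_km=1.0, seed=0):
    """Sample matched (ours, competitor) coordinate pairs, flag any pair
    more than threshold_km apart, and return the flag rate plus a
    binomial standard error -- the 'good enough statistician' step that
    extrapolates from the sample to the whole data set."""
    sample = random.Random(seed).sample(pairs, sample_size)
    flags = sum(1 for ours, theirs in sample
                if haversine_km(ours, theirs) > threshold_km)
    p = flags / sample_size
    se = math.sqrt(p * (1 - p) / sample_size)
    return p, se
```

The standard error tells you how far the sampled flag rate may be from the true rate across the full data set, so you can decide whether the sample is large enough to trust.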



Two problems here. First, can you use a competitor's data for comparison and for improving your own data? Second, calculating the differences is very difficult, and the number of differences could be overwhelming. But the first issue is the main problem.


1. Yes.

2. Difficult how? You don't have to perform it on items that can't be automatically searched for. Then you subtract, square, and check whether you're over a kilometer.
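For latitude/longitude pairs a kilometer apart, the subtract-and-square approach does work if you first scale the longitude difference by the cosine of the latitude (the equirectangular approximation). A rough sketch, with illustrative names:

```python
import math

EARTH_RADIUS_KM = 6371.0

def approx_distance_km(lat1, lon1, lat2, lon2):
    """Equirectangular approximation: subtract, scale, square, root.
    Plenty accurate at the ~1 km scale being tested here."""
    dlat = math.radians(lat2 - lat1)
    dlon = (math.radians(lon2 - lon1)
            * math.cos(math.radians((lat1 + lat2) / 2)))
    return EARTH_RADIUS_KM * math.sqrt(dlat * dlat + dlon * dlon)

def flag_for_review(lat1, lon1, lat2, lon2, threshold_km=1.0):
    """True if the two results are farther apart than the threshold."""
    return approx_distance_km(lat1, lon1, lat2, lon2) > threshold_km
```

The cosine scaling matters because a degree of longitude shrinks toward the poles; plain subtraction of raw degrees would over-flag results at high latitudes.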


"Apple steals Google Maps data then bans them"...news at eleven.


He's not advocating outright plagiarism. Checking your own results against someone else's is perfectly acceptable and recommended.


Not if the terms of use of the "someone else" explicitly prohibit such activity. This is particularly common with local information providers (e.g. business/event listings).


Depends on the definition of "someone else". Check the terms and conditions of all big/high quality map data providers.


Not for the media. This would be a media gold mine.



