
But you can't access Classification Web without a $$$ subscription plan! (from $375 for Single User up to $1900 for 26+ Concurrent Users).

https://www.loc.gov/cds/

https://www.loc.gov/cds/classweb/

(Aaron Swartz would object. You can access US patent data for free, but not LoC Classification Web)



Pretty much my point.

I should look into the terms/conditions for that.


People won't care why it isn't freely accessible. Without free access it's not going to displace ISBN (or other non-US classifications), not even inside the US, and certainly not outside it.

I'm actually genuinely surprised it isn't freely accessible; Aaron Swartz (RIP) might have gone to war over that, and might have won that war in the court of public opinion.

Hey, has anyone in the govt trained an LLM on it? Given title, author, keywords, abstract, etc., it would predict which LoC triple classification (or Dewey Decimal Classification, or Harvard–Yenching Classification, or Chinese Library Classification, or New Classification Scheme for Chinese Libraries (NCL in Taiwan)) a book would have. That would be neat, and a good way to proliferate its use instead of ISBN. (But the US govt would still assert copyright over the LoC classification.)


ISBN and LoC Classification serve totally different purposes.

ISBN identifies a specific publication, which may or may not be a distinct work. In practice, a given published work (say, identified by an author, title, publication date, and language) might have several ISBNs associated with it, for trade hardcover, trade paperback, library edition, large print, Braille, audio book, etc. On account of how ISBNs are issued, the principal organisation is by country and publisher. This also means that the same author/title/pubdate tuple may well have widely varying ISBNs for, say, US, Canadian, UK, Australian, NZ, and other countries' versions of the same English-language text.
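
A minimal sketch of that one-work-to-many-ISBNs relationship, assuming a work is keyed by the author/title/pubdate/language tuple described above. The `Work` type, the `isbns_by_work` helper, and every record in `EDITIONS` are hypothetical; the identifier strings are made-up labels standing in for ISBNs, not real ones.

```python
from collections import defaultdict
from typing import NamedTuple


class Work(NamedTuple):
    """Hypothetical key for a 'work': the tuple described in the comment."""
    author: str
    title: str
    pub_year: int
    language: str


# Placeholder data: the first field is a made-up label, NOT a real ISBN.
EXAMPLE = Work("A. Author", "Example Title", 2001, "en")
EDITIONS = [
    ("isbn-us-hardcover", EXAMPLE, "US trade hardcover"),
    ("isbn-us-paperback", EXAMPLE, "US trade paperback"),
    ("isbn-uk-paperback", EXAMPLE, "UK trade paperback"),
    ("isbn-other", Work("B. Writer", "Another Title", 1999, "en"), "US audio book"),
]


def isbns_by_work(editions):
    """Group identifiers by the work they belong to."""
    index = defaultdict(list)
    for isbn, work, _format in editions:
        index[work].append(isbn)
    return dict(index)


if __name__ == "__main__":
    for work, isbns in isbns_by_work(EDITIONS).items():
        print(work.title, "->", isbns)
```

Going from an ISBN to its work needs an external table like this one; nothing in the identifier itself says which other ISBNs share the same text.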

There are other similar identifiers such as the LoC's publication number (issued sequentially by year), the OCLC's identifier, or (for journal publications) DOI. Each of these simply identifies a distinct publication without providing any significant classification function.[1]

The LoC Classification, as the name suggests, organises a book within a subject-based ontology. Whilst different editions, formats, and/or national versions of a book might have distinct LoC Classifications, those will be tightly coupled and most of the sequence will be shared amongst those books. The LoC Classification can be used to identify substantively related material, e.g., books on economics, history, military science, religion, or whatever, in ways which ISBN simply cannot.
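
To illustrate how a shared leading sequence groups related material, here is a rough sketch that pulls the class letters and number off the front of an LoC call number and groups by the letters. The regular expression is a simplification I'm assuming for illustration (one to three letters followed by a number), not the full call-number grammar, and the sample call numbers are invented to show the shape, not looked up for real books.

```python
import re
from collections import defaultdict

# Simplified assumption: 1-3 class letters, then an integer or decimal number.
CLASS_RE = re.compile(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def lcc_class(call_number):
    """Return (class_letters, class_number) from the start of a call number."""
    m = CLASS_RE.match(call_number.strip().upper())
    if m is None:
        raise ValueError(f"unrecognised call number: {call_number!r}")
    return m.group(1), float(m.group(2))


def group_by_class(call_numbers):
    """Group call numbers by their leading class letters."""
    groups = defaultdict(list)
    for cn in call_numbers:
        letters, _number = lcc_class(cn)
        groups[letters].append(cn)
    return dict(groups)


if __name__ == "__main__":
    # Illustrative call numbers only (HB: economic theory, U: military
    # science, BL: religions); the numbers and cutters are invented.
    samples = ["HB171 .A1", "HB171.5 .B2", "U102 .C3", "BL48 .D4"]
    for letters, items in group_by_class(samples).items():
        print(letters, items)
```

No comparable prefix trick works on an ISBN, whose leading digits encode registration group and publisher rather than subject.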

As I've noted, the Classification is freely available, as PDFs, WordPerfect, and MS Word files, at the URLs I'd given previously. Those aren't particularly useful as machine-readable structured formats, however.

________________________________

Notes:

1. Weasel-word "significant" included as those identifiers provide some classification, but generally by year, publisher, publication, etc., and not specifically classifying the work itself.



