instant-segment/README.md at 3c52201fa011c91dfa9b3a75d03c0428b0ac8248


The data files in this directory are derived from the Google Web Trillion Word Corpus, as described by Thorsten Brants and Alex Franz, and distributed by the Linguistic Data Consortium. Note that this data "may only be used for linguistic education and research", so for any other usage you should acquire a different data set.
