21 Commits

Author SHA1 Message Date
Beau Hartshorne f32bd7bbcd Update README.md (#65) 2024-08-04 12:57:00 -07:00
Beau Hartshorne d9cf323683 Update README.md (#63) 2024-07-24 11:56:14 -07:00
Dirkjan Ochtman 7bae0ee4cc Update more links 2024-05-29 11:15:37 +02:00
Beau Hartshorne ceb949340b Update README.md 2024-05-29 11:15:37 +02:00
Michael Partheil ae858ead5c Update Readme to reflect recent perf improvements 2023-11-16 01:23:26 +01:00
jinglybits 3a61fd2c55 replace cover image (#49) 2023-11-15 11:37:13 -08:00
jinglybits 2b47ca2ad4 replace cover image (#48)
* replace cover image

* replaced cover image with optimized svg
2023-11-15 10:13:42 -08:00
Michael Partheil 3b3627422b Use nested `HashMap` for storing both unigram and bigram scores 2023-10-17 14:42:50 +02:00
Dirkjan Ochtman f32b42537a Update links to point to new GitHub org 2021-08-31 14:10:13 +02:00
Beau Hartshorne f16306499c Update README.md 2021-08-18 13:23:38 -07:00
Beau Hartshorne 8230ac6ed5 Update README.md 2021-08-18 13:18:56 -07:00
Beau Hartshorne e2f6f5c4a5 Update README.md 2021-06-05 14:30:22 -07:00
Dirkjan Ochtman 7214ffc126 Remove note about planned further optimizations 2021-05-28 14:44:44 +02:00
Dirkjan Ochtman 85f4f94b53 Use more efficient segmentation strategy
Based on the triangular matrix approach as explained here:

https://towardsdatascience.com/fast-word-segmentation-for-noisy-text-2c2c41f9e8da

Use iteration rather than recursion, segmenting the input forwards
rather than backwards, and use a `Vec`-based memoization strategy
instead of relying on a `HashMap` of words. This version is about
4.8x faster, has 100 fewer lines of code, and should use much less memory.
2021-05-28 14:30:27 +02:00
Nick Rempel 9bbb633f1d Flesh out README (#14) 2021-04-29 11:12:42 +02:00
Dirkjan Ochtman 41fb2075a6 Tighten the language a little bit 2020-12-16 10:48:31 +01:00
Dirkjan Ochtman 27d20f07e5 Add crate badges to README 2020-12-16 10:44:56 +01:00
Dirkjan Ochtman a8d93efbb6 Add cover to README 2020-12-16 10:42:35 +01:00
Dirkjan Ochtman 3a37893e74 Update README with new name 2020-12-15 21:02:22 +01:00
Dirkjan Ochtman c11da266aa Update performance claim in README 2020-11-26 11:39:57 +01:00
Dirkjan Ochtman 93bbff91ca Create initial README (fixes #1) 2020-06-19 13:14:27 +02:00
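
Commit 85f4f94b53 above describes segmenting the input forwards with iteration and a `Vec`-based memo rather than recursion with a `HashMap` of words. The following is a minimal sketch of that general idea, not the code from that commit: the `score` and `segment` functions, the unknown-word penalty, and the `max_len` parameter are all hypothetical, only unigram counts are used (no bigrams), and the input is assumed to be ASCII.

```rust
use std::collections::HashMap;

/// Hypothetical scoring helper: log10 of the word's relative frequency.
/// Unknown words get a penalty that grows with their length (an assumed
/// heuristic, chosen only for illustration).
fn score(word: &str, unigrams: &HashMap<&str, f64>, total: f64) -> f64 {
    match unigrams.get(word) {
        Some(count) => (*count / total).log10(),
        None => (10.0 / (total * 10f64.powi(word.len() as i32))).log10(),
    }
}

/// Segments `input` by filling a `Vec` memo from left to right.
/// `best[i]` holds the best score for `input[..i]` together with the
/// start index of the last word in that best split.
fn segment(input: &str, unigrams: &HashMap<&str, f64>, max_len: usize) -> Vec<String> {
    // Byte slicing below is only safe for ASCII input (assumption).
    assert!(input.is_ascii());
    let n = input.len();
    let total: f64 = unigrams.values().sum();

    let mut best: Vec<(f64, usize)> = vec![(f64::NEG_INFINITY, 0); n + 1];
    best[0] = (0.0, 0);

    // Forward pass: for every end position, try each candidate last word
    // of at most `max_len` bytes and keep the highest-scoring split.
    for end in 1..=n {
        let lo = end.saturating_sub(max_len);
        for start in lo..end {
            let candidate = best[start].0 + score(&input[start..end], unigrams, total);
            if candidate > best[end].0 {
                best[end] = (candidate, start);
            }
        }
    }

    // Walk the stored start indices backwards to recover the words.
    let mut words = Vec::new();
    let mut end = n;
    while end > 0 {
        let start = best[end].1;
        words.push(input[start..end].to_string());
        end = start;
    }
    words.reverse();
    words
}
```

Because the memo is indexed by position rather than keyed by substring, each prefix is scored once and no substrings need to be hashed or stored along the way. That is one plausible reason a `Vec`-based memo can use less memory than a `HashMap` of words.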