Dirkjan Ochtman | 96187965b6 | Extract public assert_segments() function | 2021-02-04 10:40:04 +01:00
Dirkjan Ochtman | 45e569379c | Default to calculating total from unigram map | 2021-02-04 10:36:30 +01:00
Dirkjan Ochtman | 0d2930c408 | Add API to create segmenter from hashmaps directly | 2021-02-04 10:36:30 +01:00
Dirkjan Ochtman | b85fc6adc2 | Rename testcases to test_cases | 2021-02-04 10:36:30 +01:00
Dirkjan Ochtman | 55cc7c54a3 | Use powi() instead of powf() for performance | 2021-02-04 10:17:11 +01:00
Dirkjan Ochtman | 970caeba44 | Use std HashMap to simplify API | 2021-02-04 10:16:38 +01:00
Dirkjan Ochtman | 29d2d94a8d | Reorganize tests and test data to expose test cases | 2021-02-01 17:25:32 +01:00
Dirkjan Ochtman | cb3c9707ef | Add docstring for Segmenter type | 2020-12-07 14:51:10 +01:00
Dirkjan Ochtman | adf7995adb | Remove now unused error type | 2020-12-07 14:51:10 +01:00
Dirkjan Ochtman | 2ab57ca0b1 | Fix typo | 2020-12-07 14:36:59 +01:00
Dirkjan Ochtman | c571996925 | Simplify bigram scoring algorithm | 2020-12-07 14:24:33 +01:00
Dirkjan Ochtman | 912e6477e3 | Fix clippy problems in test data setup | 2020-12-07 11:46:42 +01:00
Dirkjan Ochtman | eeb9c77bc7 | Simplify Segmenter setup API | 2020-12-07 11:39:49 +01:00
Dirkjan Ochtman | d554825594 | Name complex type as suggested by clippy | 2020-11-26 11:33:36 +01:00
Dirkjan Ochtman | 691ecbc3c6 | Simplify handling of empty tails | 2020-11-26 11:20:06 +01:00
Dirkjan Ochtman | ae3896b47b | Use range for previous argument as well | 2020-11-26 11:15:27 +01:00
Dirkjan Ochtman | bc20e39c1e | Make slicing cheaper by adding a little unsafe code | 2020-11-26 11:14:53 +01:00
Dirkjan Ochtman | bb1b1db9c5 | Pass Range instead of str to search() | 2020-11-26 11:13:35 +01:00
Dirkjan Ochtman | 4be435e0fb | Make split values absolute instead of relative | 2020-11-26 11:12:52 +01:00
Dirkjan Ochtman | b7daaff47a | Simplify top-level loop | 2020-11-26 10:46:27 +01:00
Dirkjan Ochtman | 2f9cb95b5c | Avoid allocations for split vectors | 2020-11-26 10:46:23 +01:00
Dirkjan Ochtman | a1f03e32fe | Remove unused lifetime | 2020-11-25 17:33:50 +01:00
Dirkjan Ochtman | 47271ff81e | Allocate a single Vec to back cached splits | 2020-11-25 17:29:13 +01:00
Dirkjan Ochtman | 947e003a48 | Store splits instead of string slices | 2020-11-25 17:29:13 +01:00
Dirkjan Ochtman | 1df3c4397e | Inline TextDivider iterator | 2020-11-25 17:29:13 +01:00
Dirkjan Ochtman | ead9a3064b | Better typed handling of previous word | 2020-11-25 17:29:13 +01:00
Dirkjan Ochtman | ea4438f2e8 | Make Segmenter::score() slightly more efficient | 2020-11-25 17:29:13 +01:00
Dirkjan Ochtman | 540348f703 | Abstract over test data format code and API | 2020-11-25 17:29:13 +01:00
Dirkjan Ochtman | 0d7fbd53e7 | Prevent allocations where possible | 2020-11-25 17:29:11 +01:00
Dirkjan Ochtman | 1b4377715f | Move from err-derive to thiserror | 2020-11-23 13:23:16 +01:00
Dirkjan Ochtman | 76bdcf1ca5 | Separate state from Segmenter | 2020-05-28 19:56:13 +02:00
Dirkjan Ochtman | 98a8368be6 | Avoid string allocations for search | 2020-05-28 19:56:13 +02:00
Dirkjan Ochtman | b9c8402b0c | Prevent allocations for memo keys | 2020-05-28 19:56:13 +02:00
Dirkjan Ochtman | 0f69f267d8 | Use ahash for hashing | 2020-05-28 19:56:13 +02:00
Dirkjan Ochtman | 38f9747c92 | Initial version | 2020-05-26 20:07:00 +02:00