Dirkjan Ochtman
2d942bbfc9
Box up the BitVec array
The `Search::best` field takes about 8000 bytes. In some of our usage
with rayon, this appeared to cause stack overflows. Boxing it up makes the
code about 1-2% slower, but should avoid the stack overflows.
2021-04-01 10:03:48 +02:00
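The trade-off described in the commit above can be illustrated with a minimal sketch. The type names, the 32-byte `BitVec` stand-in, and the length of 250 are assumptions chosen only so that the inline array comes to 8000 bytes; they are not taken from the crate's source.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a bit vector: 4 * 8 = 32 bytes.
#[derive(Clone, Copy, Default)]
struct BitVec([u64; 4]);

// Hypothetical length, picked so that 250 * 32 = 8000 bytes.
const LEN: usize = 250;

// The array lives inline, so every `InlineSearch` value placed on the
// stack costs the full 8000 bytes in that stack frame.
struct InlineSearch {
    best: [BitVec; LEN],
}

// The array lives on the heap; the struct only holds a pointer to it.
struct BoxedSearch {
    best: Box<[BitVec; LEN]>,
}

fn main() {
    let inline = InlineSearch { best: [BitVec::default(); LEN] };
    // Note: `Box::new` may still build the array on the stack before
    // moving it to the heap, but only briefly and only once.
    let boxed = BoxedSearch { best: Box::new([BitVec::default(); LEN]) };

    println!("inline: {} bytes", size_of::<InlineSearch>());
    println!("boxed:  {} bytes", size_of::<BoxedSearch>());
    println!("elements: {} and {}", inline.best.len(), boxed.best.len());
}
```

Each access through the boxed field goes through one extra pointer indirection, which is one plausible source of the small slowdown the commit message mentions.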
Dirkjan Ochtman
a4fe0e4039
Bump version to 0.7.2
Now that the core crate is in a directory, we no longer needlessly publish
data files on crates.io.
2021-03-24 13:16:49 +01:00
Dirkjan Ochtman
55fb3c664f
Tweak CI to avoid testing bindings for now
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
987220c586
py: add some comments
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
5d8f1b2fb0
py: add load() and dump() methods
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
8fe1b2ab46
Optimize test data reader
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
fd774ad465
py: initial version of Python bindings
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
f6061044fc
Add helper method for Python bindings
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
0ce148db1e
Consistent ordering of impl blocks
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
11a7e88b95
Start a workspace
2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
a146790e17
Bump version number to 0.7.1
2021-03-24 11:53:20 +01:00
Dirkjan Ochtman
8f7959eeed
Use separate total value for bigrams
2021-02-11 12:08:19 +01:00
Dirkjan Ochtman
9addd3810b
Use more compact cache key
2021-02-11 12:05:39 +01:00
Dirkjan Ochtman
83aa46593a
Use bit vectors to improve performance
2021-02-11 11:56:33 +01:00
Dirkjan Ochtman
9dd1cf089d
Simplify the API some more
2021-02-10 13:23:37 +01:00
Dirkjan Ochtman
4338ff2c0c
Fix formatting
2021-02-10 13:10:38 +01:00
Dirkjan Ochtman
cd06fbecc8
Guarantee known size of the output iterator
2021-02-10 13:07:05 +01:00
Dirkjan Ochtman
d190aa5240
Simplify API by moving result data into Search
2021-02-10 13:03:06 +01:00
Dirkjan Ochtman
9735e64ee4
Bump version to 0.5.1
2021-02-10 12:51:36 +01:00
Dirkjan Ochtman
95804e9672
Derive Clone for Search
2021-02-10 12:51:21 +01:00
Dirkjan Ochtman
da26dedfc8
Apply clippy suggestion
2021-02-10 11:53:15 +01:00
Dirkjan Ochtman
a862ec97a5
Version bump to 0.5.0
2021-02-10 11:49:09 +01:00
Dirkjan Ochtman
13b29d183e
Take an explicit search parameter
2021-02-10 11:48:24 +01:00
Dirkjan Ochtman
be0f8c0ed7
Don't normalize input strings implicitly
2021-02-08 15:53:24 +01:00
Dirkjan Ochtman
8c08bb9e14
Add check_segments function
2021-02-04 11:20:49 +01:00
Dirkjan Ochtman
f3aaaa656d
Bump version to 0.4.0
2021-02-04 11:20:49 +01:00
Dirkjan Ochtman
5127aac1ec
Add optional support for serde
2021-02-04 11:20:49 +01:00
Dirkjan Ochtman
bacf82c8cc
Separate incorrect segmentation out of TEST_CASES
2021-02-04 10:40:45 +01:00
Dirkjan Ochtman
96187965b6
Extract public assert_segments() function
2021-02-04 10:40:04 +01:00
Dirkjan Ochtman
45e569379c
Default to calculating total from unigram map
2021-02-04 10:36:30 +01:00
Dirkjan Ochtman
0d2930c408
Add API to create segmenter from hashmaps directly
2021-02-04 10:36:30 +01:00
Dirkjan Ochtman
b85fc6adc2
Rename testcases to test_cases
2021-02-04 10:36:30 +01:00
Dirkjan Ochtman
55cc7c54a3
Use powi() instead of powf() for performance
2021-02-04 10:17:11 +01:00
Dirkjan Ochtman
970caeba44
Use std HashMap to simplify API
2021-02-04 10:16:38 +01:00
Dirkjan Ochtman
c1068c2e53
Bump version number to 0.3.2
2021-02-01 17:25:55 +01:00
Dirkjan Ochtman
29d2d94a8d
Reorganize tests and test data to expose test cases
2021-02-01 17:25:32 +01:00
dependabot-preview[bot]
d4df4ce29a
Update ahash requirement from 0.6.1 to 0.7.0
Updates the requirements on [ahash](https://github.com/tkaitchuck/ahash) to permit the latest version.
- [Release notes](https://github.com/tkaitchuck/ahash/releases)
- [Commits](https://github.com/tkaitchuck/ahash/commits)
Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2021-01-27 15:30:34 +01:00
Dirkjan Ochtman
41fb2075a6
Tighten the language a little bit
2020-12-16 10:48:31 +01:00
Dirkjan Ochtman
27d20f07e5
Add crate badges to README
2020-12-16 10:44:56 +01:00
Dirkjan Ochtman
a8d93efbb6
Add cover to README
2020-12-16 10:42:35 +01:00
Dirkjan Ochtman
f51a6e6cd5
Merge branch 'local' into main
2020-12-16 10:41:59 +01:00
Dirkjan Ochtman
275c3c63cb
Rename crate to instant-segment
2020-12-16 10:36:38 +01:00
Dirkjan Ochtman
3a37893e74
Update README with new name
2020-12-15 21:02:22 +01:00
Dirkjan Ochtman
dcc1c5edc1
Bump version to 0.3.1
2020-12-07 16:30:23 +01:00
Dirkjan Ochtman
cb3c9707ef
Add docstring for Segmenter type
2020-12-07 14:51:10 +01:00
Dirkjan Ochtman
adf7995adb
Remove now unused error type
2020-12-07 14:51:10 +01:00
Dirkjan Ochtman
2ab57ca0b1
Fix typo
2020-12-07 14:36:59 +01:00
Dirkjan Ochtman
c571996925
Simplify bigram scoring algorithm
2020-12-07 14:24:33 +01:00
Dirkjan Ochtman
f26793379b
No longer need a macro for testing
2020-12-07 11:55:27 +01:00
Dirkjan Ochtman
912e6477e3
Fix clippy problems in test data setup
2020-12-07 11:46:42 +01:00