Commit Graph

98 Commits

Author SHA1 Message Date
Nick Rempel
9bbb633f1d Flesh out README (#14) 2021-04-29 11:12:42 +02:00
Dirkjan Ochtman
eca12c572f Bump version number to 0.8.1 2021-04-22 15:08:23 +02:00
Dirkjan Ochtman
bba1de7543 Simplify loop 2021-04-22 15:07:54 +02:00
Dirkjan Ochtman
c21b66ab83 Rename sentence_score() to score_sentence() 2021-04-22 15:04:48 +02:00
Dirkjan Ochtman
62f5b79d6d py: add Segmenter::sentence_score() method 2021-04-22 15:04:06 +02:00
Dirkjan Ochtman
85035a9b34 Add Segmenter::sentence_score() method 2021-04-22 14:58:06 +02:00
Dirkjan Ochtman
bd014dcc5c Move logarithm conversion into score() 2021-04-22 14:54:54 +02:00
Beau Hartshorne
85038d1f6f Add files via upload 2021-04-20 11:06:47 -07:00
Dirkjan Ochtman
507e8da5ef Bump version to 0.8.0 2021-04-01 11:04:42 +02:00
Dirkjan Ochtman
754b0d5692 Revert version number for testing 2021-04-01 10:07:22 +02:00
Dirkjan Ochtman
2d942bbfc9 Box up the BitVec array
The `Search::best` field will take about 8000 bytes. In some of our usage
with rayon, this appeared to cause stack overflows. Boxing it up makes the
code slower by about 1-2%, but should hopefully avoid stack overflows.
2021-04-01 10:03:48 +02:00
Dirkjan Ochtman
a4fe0e4039 Bump version to 0.7.2
Now that the core crate is in a directory, we no longer needlessly publish
data files on crates.io.
2021-03-24 13:16:49 +01:00
Dirkjan Ochtman
55fb3c664f Tweak CI to avoid testing bindings for now 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
987220c586 py: add some comments 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
5d8f1b2fb0 py: add load() and dump() methods 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
8fe1b2ab46 Optimize test data reader 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
fd774ad465 py: initial version of Python bindings 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
f6061044fc Add helper method for Python bindings 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
0ce148db1e Consistent ordering of impl blocks 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
11a7e88b95 Start a workspace 2021-03-24 11:57:29 +01:00
Dirkjan Ochtman
a146790e17 Bump version number to 0.7.1 2021-03-24 11:53:20 +01:00
Dirkjan Ochtman
8f7959eeed Use separate total value for bigrams 2021-02-11 12:08:19 +01:00
Dirkjan Ochtman
9addd3810b Use more compact cache key 2021-02-11 12:05:39 +01:00
Dirkjan Ochtman
83aa46593a Use bit vectors to improve performance 2021-02-11 11:56:33 +01:00
Dirkjan Ochtman
9dd1cf089d Simplify the API some more 2021-02-10 13:23:37 +01:00
Dirkjan Ochtman
4338ff2c0c Fix formatting 2021-02-10 13:10:38 +01:00
Dirkjan Ochtman
cd06fbecc8 Guarantee known size of the output iterator 2021-02-10 13:07:05 +01:00
Dirkjan Ochtman
d190aa5240 Simplify API by moving result data into Search 2021-02-10 13:03:06 +01:00
Dirkjan Ochtman
9735e64ee4 Bump version to 0.5.1 2021-02-10 12:51:36 +01:00
Dirkjan Ochtman
95804e9672 Derive Clone for Search 2021-02-10 12:51:21 +01:00
Dirkjan Ochtman
da26dedfc8 Apply clippy suggestion 2021-02-10 11:53:15 +01:00
Dirkjan Ochtman
a862ec97a5 Version bump to 0.5.0 2021-02-10 11:49:09 +01:00
Dirkjan Ochtman
13b29d183e Take an explicit search parameter 2021-02-10 11:48:24 +01:00
Dirkjan Ochtman
be0f8c0ed7 Don't normalize input strings implicitly 2021-02-08 15:53:24 +01:00
Dirkjan Ochtman
8c08bb9e14 Add check_segments function 2021-02-04 11:20:49 +01:00
Dirkjan Ochtman
f3aaaa656d Bump version to 0.4.0 2021-02-04 11:20:49 +01:00
Dirkjan Ochtman
5127aac1ec Add optional support for serde 2021-02-04 11:20:49 +01:00
Dirkjan Ochtman
bacf82c8cc Separate incorrect segmentation out of TEST_CASES 2021-02-04 10:40:45 +01:00
Dirkjan Ochtman
96187965b6 Extract public assert_segments() function 2021-02-04 10:40:04 +01:00
Dirkjan Ochtman
45e569379c Default to calculating total from unigram map 2021-02-04 10:36:30 +01:00
Dirkjan Ochtman
0d2930c408 Add API to create segmenter from hashmaps directly 2021-02-04 10:36:30 +01:00
Dirkjan Ochtman
b85fc6adc2 Rename testcases to test_cases 2021-02-04 10:36:30 +01:00
Dirkjan Ochtman
55cc7c54a3 Use powi() instead of powf() for performance 2021-02-04 10:17:11 +01:00
Dirkjan Ochtman
970caeba44 Use std HashMap to simplify API 2021-02-04 10:16:38 +01:00
Dirkjan Ochtman
c1068c2e53 Bump version number to 0.3.2 2021-02-01 17:25:55 +01:00
Dirkjan Ochtman
29d2d94a8d Reorganize tests and test data to expose test cases 2021-02-01 17:25:32 +01:00
dependabot-preview[bot]
d4df4ce29a Update ahash requirement from 0.6.1 to 0.7.0
Updates the requirements on [ahash](https://github.com/tkaitchuck/ahash) to permit the latest version.
- [Release notes](https://github.com/tkaitchuck/ahash/releases)
- [Commits](https://github.com/tkaitchuck/ahash/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2021-01-27 15:30:34 +01:00
Dirkjan Ochtman
41fb2075a6 Tighten the language a little bit 2020-12-16 10:48:31 +01:00
Dirkjan Ochtman
27d20f07e5 Add crate badges to README 2020-12-16 10:44:56 +01:00
Dirkjan Ochtman
a8d93efbb6 Add cover to README 2020-12-16 10:42:35 +01:00