# snowballstem

This repository contains the Go stemmers generated by the [Snowball](https://github.com/snowballstem/snowball) project. They are maintained outside of the core bleve package so that they can be more easily reused in other contexts.

## Usage

Each stemmer package exports a single `Stem()` function, which operates on a snowball `Env` structure. The `Env` structure maintains all state for the stemmer. A new `Env` is created with `NewEnv()`, pointing at an initial string. After stemming, the result of the `Stem()` operation can be retrieved using the `Current()` method. The same `Env` can be reused for subsequent words by calling the `SetCurrent()` method before each `Stem()` call.

## Example

```
package main

import (
	"fmt"

	"github.com/blevesearch/snowballstem"
	"github.com/blevesearch/snowballstem/english"
)

func main() {
	// words to stem
	words := []string{
		"running",
		"jumping",
	}

	// build new environment
	env := snowballstem.NewEnv("")

	for _, word := range words {
		// set up environment for word
		env.SetCurrent(word)
		// invoke stemmer
		english.Stem(env)
		// print results
		fmt.Printf("%s stemmed to %s\n", word, env.Current())
	}
}
```

Produces output:

```
$ ./snowtest
running stemmed to run
jumping stemmed to jump
```

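The reuse pattern above (one `Env`, then `SetCurrent()`, `Stem()`, `Current()` per word) can be wrapped in a small helper. The following is only a sketch, not part of this package: `stemEnv`, `stemAll`, `toyEnv` and `toyStem` are hypothetical names, and the toy types exist solely so the example compiles without the snowballstem packages. It assumes Go 1.18 or later for generics.

```go
package main

import (
	"fmt"
	"strings"
)

// stemEnv lists the two Env methods described in the Usage section.
// Assumption: *snowballstem.Env satisfies this interface.
type stemEnv interface {
	SetCurrent(string)
	Current() string
}

// stemAll reuses a single environment for every word: it loads the word with
// SetCurrent, runs the stemmer, then reads the result back with Current.
func stemAll[E stemEnv](env E, stem func(E) bool, words []string) []string {
	out := make([]string, 0, len(words))
	for _, w := range words {
		env.SetCurrent(w)
		stem(env)
		out = append(out, env.Current())
	}
	return out
}

// toyEnv and toyStem are hypothetical stand-ins, not a real stemming
// algorithm. toyStem only strips a trailing "ing".
type toyEnv struct{ cur string }

func (e *toyEnv) SetCurrent(s string) { e.cur = s }
func (e *toyEnv) Current() string     { return e.cur }

func toyStem(e *toyEnv) bool {
	trimmed := strings.TrimSuffix(e.cur, "ing")
	changed := trimmed != e.cur
	e.cur = trimmed
	return changed
}

func main() {
	fmt.Println(stemAll(&toyEnv{}, toyStem, []string{"jumping", "walking", "cat"}))
}
```

With the real packages imported, the call would presumably look like `stemAll(snowballstem.NewEnv(""), english.Stem, words)`, assuming `english.Stem` has the signature `func(*snowballstem.Env) bool`.
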
## Testing

The test harness for these stemmers is hosted in the main [Snowball](https://github.com/snowballstem/snowball) repository. There are functional tests built around the separate [snowball-data](https://github.com/snowballstem/snowball-data) repository, and there is support for fuzz-testing the stemmers there as well.

## Generating the Stemmers

```
$ export SNOWBALL=/path/to/github.com/snowballstem/snowball/after/snowball/built
$ go generate
```

## Updating the Go Generate Commands

A simple tool is provided to automate updating these commands from the snowball algorithms directory:

```
$ go run gengen.go /path/to/github.com/snowballstem/snowball/algorithms
```