Path Hierarchy Tokenizer

The path_hierarchy tokenizer takes a path-like input such as:

    /something/something/else

and produces one token for each level of the hierarchy:

    /something
    /something/something
    /something/something/else
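The splitting behavior can be approximated in a few lines of Python. This is only an illustrative sketch, not the tokenizer's actual implementation: the function name path_hierarchy_tokens is hypothetical, and it covers only the delimiter handling shown in the example above.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Approximate the path_hierarchy tokenizer (illustrative sketch only).

    Emits one token per hierarchy level: each prefix of `path` that ends
    just before a delimiter, followed by the full path itself.
    """
    tokens = []
    for i, ch in enumerate(path):
        # A delimiter at position 0 starts the first level, so it does not
        # close a token; every later delimiter closes the prefix before it.
        if ch == delimiter and i > 0:
            tokens.append(path[:i])
    if path:
        # The complete input is always the last, deepest token.
        tokens.append(path)
    return tokens


if __name__ == "__main__":
    for token in path_hierarchy_tokens("/something/something/else"):
        print(token)
```

Run as a script, this prints the three tokens listed above, one per line.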

Setting       Description
delimiter     The character delimiter to use. Defaults to /.
replacement   An optional replacement character to use. Defaults to the delimiter.
buffer_size   The buffer size to use. Defaults to 1024.
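As a rough sketch of how these settings might be supplied, the Python snippet below builds a settings body for a custom tokenizer and serializes it to JSON. The helper name path_hierarchy_settings and the tokenizer name my_path_tokenizer are hypothetical, and the surrounding "analysis" / "tokenizer" nesting is an assumption based on the usual Elasticsearch index-settings layout; check it against the version you are running.

```python
import json


def path_hierarchy_settings(name="my_path_tokenizer", delimiter="/",
                            replacement=None, buffer_size=1024):
    """Build a settings body for a path_hierarchy tokenizer (sketch only).

    `name` is a hypothetical tokenizer name. The defaults mirror the table
    above: delimiter "/", replacement equal to the delimiter, buffer 1024.
    """
    tokenizer = {
        "type": "path_hierarchy",
        "delimiter": delimiter,
        # When no replacement is given, fall back to the delimiter itself.
        "replacement": delimiter if replacement is None else replacement,
        "buffer_size": buffer_size,
    }
    # Assumed nesting: analysis -> tokenizer -> <tokenizer name>.
    return {"analysis": {"tokenizer": {name: tokenizer}}}


if __name__ == "__main__":
    # Example: split on "-" but emit tokens joined with "/".
    body = path_hierarchy_settings(delimiter="-", replacement="/")
    print(json.dumps(body, indent=2))
```

The resulting JSON would then be sent as part of the index settings when the index is created.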
