Class PatternTokenizer
- java.lang.Object
  - com.azure.search.documents.indexes.models.LexicalTokenizer
    - com.azure.search.documents.indexes.models.PatternTokenizer
public final class PatternTokenizer extends LexicalTokenizer
Tokenizer that uses regex pattern matching to construct distinct tokens. This tokenizer is implemented using Apache Lucene.
-
-
Constructor Summary
- PatternTokenizer(String name)
  Constructor of PatternTokenizer.
-
Method Summary
- List<RegexFlags> getFlags()
  Get the flags property: Regular expression flags.
- Integer getGroup()
  Get the group property: The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens.
- String getPattern()
  Get the pattern property: A regular expression pattern to match token separators.
- PatternTokenizer setFlags(RegexFlags... flags)
  Set the flags property: Regular expression flags.
- PatternTokenizer setFlags(List<RegexFlags> flags)
  Set the flags property: Regular expression flags.
- PatternTokenizer setGroup(Integer group)
  Set the group property: The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens.
- PatternTokenizer setPattern(String pattern)
  Set the pattern property: A regular expression pattern to match token separators.
-
Methods inherited from class com.azure.search.documents.indexes.models.LexicalTokenizer
getName
-
Constructor Detail
-
PatternTokenizer
public PatternTokenizer(String name)
Constructor of PatternTokenizer.
Parameters:
- name - The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
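The naming rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: the class `TokenizerNameCheck` and the method `isValidName` are hypothetical, and it assumes "letters" and "digits" mean ASCII characters. The service remains the authority on which names it accepts.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Azure SDK.
final class TokenizerNameCheck {
    // Assumption: letters and digits are ASCII only.
    // The first and last characters must be alphanumeric. Characters in between
    // may also be spaces, dashes, or underscores. Total length is 1 to 128.
    private static final Pattern VALID_NAME =
        Pattern.compile("[A-Za-z0-9](?:[A-Za-z0-9 _-]{0,126}[A-Za-z0-9])?");

    static boolean isValidName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("my-pattern_tokenizer")); // true
        System.out.println(isValidName("-leading-dash"));        // false
    }
}
```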
-
-
Method Detail
-
getPattern
public String getPattern()
Get the pattern property: A regular expression pattern to match token separators. Default is an expression that matches one or more non-word characters.
Returns:
- the pattern value.
-
setPattern
public PatternTokenizer setPattern(String pattern)
Set the pattern property: A regular expression pattern to match token separators. Default is an expression that matches one or more non-word characters.
Parameters:
- pattern - the pattern value to set.
Returns:
- the PatternTokenizer object itself.
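To illustrate what "a pattern to match token separators" means, the sketch below splits text with `java.util.regex` alone. It is an approximation of the idea, not the service's Lucene-based implementation. The class `SeparatorSplitSketch` is hypothetical, the default is assumed to be written as `\W+`, and dropping empty pieces is a choice made in this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the Azure SDK.
final class SeparatorSplitSketch {
    // Splits text wherever separatorPattern matches and keeps the non-empty pieces.
    static List<String> splitOnSeparators(String text, String separatorPattern) {
        List<String> tokens = new ArrayList<>();
        for (String piece : Pattern.compile(separatorPattern).split(text)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed default: one or more non-word characters.
        System.out.println(splitOnSeparators("Hello, world! foo_bar", "\\W+"));
        // prints [Hello, world, foo_bar]
    }
}
```

The underscore is a word character in Java regular expressions, so `foo_bar` stays a single token under `\W+`.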
-
getFlags
public List<RegexFlags> getFlags()
Get the flags property: Regular expression flags.
Returns:
- the flags value.
-
setFlags
public PatternTokenizer setFlags(RegexFlags... flags)
Set the flags property: Regular expression flags.
Parameters:
- flags - the flags value to set.
Returns:
- the PatternTokenizer object itself.
-
setFlags
public PatternTokenizer setFlags(List<RegexFlags> flags)
Set the flags property: Regular expression flags.
Parameters:
- flags - the flags value to set.
Returns:
- the PatternTokenizer object itself.
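Regular expression flags change how the pattern matches, and therefore where the text is split. The sketch below shows the effect with the flag constants of `java.util.regex.Pattern`. It assumes that the RegexFlags values behave like their `java.util.regex` counterparts, which this page does not state. The class `RegexFlagSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the Azure SDK.
final class RegexFlagSketch {
    // Splits text on separatorPattern, compiled with the given java.util.regex flags.
    static List<String> split(String text, String separatorPattern, int flags) {
        return Arrays.asList(Pattern.compile(separatorPattern, flags).split(text));
    }

    public static void main(String[] args) {
        // Without flags, only the lowercase "x" is a separator.
        System.out.println(split("1x2X3", "x", 0));                        // [1, 2X3]
        // With case-insensitive matching, "X" separates tokens as well.
        System.out.println(split("1x2X3", "x", Pattern.CASE_INSENSITIVE)); // [1, 2, 3]
    }
}
```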
-
getGroup
public Integer getGroup()
Get the group property: The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens. Use -1 if you want to use the entire pattern to split the input into tokens, irrespective of matching groups. Default is -1.
Returns:
- the group value.
-
setGroup
public PatternTokenizer setGroup(Integer group)
Set the group property: The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens. Use -1 if you want to use the entire pattern to split the input into tokens, irrespective of matching groups. Default is -1.
Parameters:
- group - the group value to set.
Returns:
- the PatternTokenizer object itself.
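The difference between -1 and a non-negative group can be sketched with `java.util.regex` as follows. This is an approximation of the described behavior, not the service's implementation. The class `GroupExtractSketch` is hypothetical, and it assumes group numbering follows `java.util.regex`, where group 0 is the whole match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the Azure SDK.
final class GroupExtractSketch {
    static List<String> tokenize(String text, String pattern, int group) {
        Pattern compiled = Pattern.compile(pattern);
        List<String> tokens = new ArrayList<>();
        if (group < 0) {
            // -1: the whole pattern is treated as a separator.
            for (String piece : compiled.split(text)) {
                if (!piece.isEmpty()) {
                    tokens.add(piece);
                }
            }
            return tokens;
        }
        // Otherwise, each match contributes the text captured by the chosen group.
        Matcher matcher = compiled.matcher(text);
        while (matcher.find()) {
            String value = matcher.group(group);
            if (value != null && !value.isEmpty()) {
                tokens.add(value);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a,b,c", ",", -1));                // [a, b, c]
        System.out.println(tokenize("'aaa' 'bbb'", "'([^']+)'", 1));   // [aaa, bbb]
    }
}
```

In this sketch, a group index larger than the number of groups in the pattern makes `Matcher.group` throw an `IndexOutOfBoundsException`.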
-
-