Class NGramTokenizer

java.lang.Object
com.azure.search.documents.indexes.models.LexicalTokenizer
com.azure.search.documents.indexes.models.NGramTokenizer

public final class NGramTokenizer extends LexicalTokenizer
Tokenizes the input into n-grams of the given size(s). This tokenizer is implemented using Apache Lucene.
  • Constructor Details

    • NGramTokenizer

      public NGramTokenizer(String name)
      Creates an NGramTokenizer with the given name.
      Parameters:
      name - The name of the tokenizer. It may contain only letters, digits, spaces, dashes, or underscores; it must start and end with an alphanumeric character; and it is limited to 128 characters.
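
The naming rule above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: the class `TokenizerNameCheck` and the method `isValidTokenizerName` are hypothetical, and it assumes that "letters" and "alphanumeric" mean ASCII characters. The service performs its own validation and remains authoritative.

```java
import java.util.regex.Pattern;

final class TokenizerNameCheck {
    // Assumption: ASCII letters and digits only. The first and last characters
    // must be alphanumeric; characters in between may also be a space, dash,
    // or underscore.
    private static final Pattern NAME =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9 _-]*[A-Za-z0-9])?");

    // Hypothetical helper: returns true if the name satisfies the documented rule.
    static boolean isValidTokenizerName(String name) {
        return name != null
            && name.length() <= 128
            && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidTokenizerName("my-ngram_tokenizer")); // true
        System.out.println(isValidTokenizerName("-leading-dash"));      // false
    }
}
```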
  • Method Details

    • getMinGram

      public Integer getMinGram()
      Get the minGram property: The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.
      Returns:
      the minGram value.
    • setMinGram

      public NGramTokenizer setMinGram(Integer minGram)
      Set the minGram property: The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.
      Parameters:
      minGram - the minGram value to set.
      Returns:
      the NGramTokenizer object itself.
    • getMaxGram

      public Integer getMaxGram()
      Get the maxGram property: The maximum n-gram length. Default is 2. Maximum is 300.
      Returns:
      the maxGram value.
    • setMaxGram

      public NGramTokenizer setMaxGram(Integer maxGram)
      Set the maxGram property: The maximum n-gram length. Default is 2. Maximum is 300.
      Parameters:
      maxGram - the maxGram value to set.
      Returns:
      the NGramTokenizer object itself.
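
To illustrate how minGram and maxGram bound the tokens that are produced, here is a minimal, dependency-free sketch of n-gram generation. It is not the Lucene implementation: the class `NGramSketch` and the method `ngrams` are hypothetical, the emission order (by start position, then by length) is an assumption, and the lower bound of 1 for minGram is assumed rather than stated above. The sketch works on Java `char` units, so supplementary Unicode characters are not handled specially.

```java
import java.util.ArrayList;
import java.util.List;

final class NGramSketch {
    // Hypothetical helper: returns every substring of text whose length lies
    // in [minGram, maxGram], ordered by start position and then by length.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        // Mirrors the documented limits: maximum of 300, minGram less than maxGram.
        if (minGram < 1 || maxGram > 300 || minGram >= maxGram) {
            throw new IllegalArgumentException("expected 1 <= minGram < maxGram <= 300");
        }
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                grams.add(text.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // With the documented defaults (minGram = 1, maxGram = 2):
        System.out.println(ngrams("abc", 1, 2)); // [a, ab, b, bc, c]
    }
}
```

The number of tokens grows with both the input length and the width of the range, so a narrow range usually keeps the index smaller.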
    • getTokenChars

      public List<TokenCharacterKind> getTokenChars()
      Get the tokenChars property: Character classes to keep in the tokens.
      Returns:
      the tokenChars value.
    • setTokenChars

      public NGramTokenizer setTokenChars(TokenCharacterKind... tokenChars)
      Set the tokenChars property: Character classes to keep in the tokens.
      Parameters:
      tokenChars - the tokenChars value to set.
      Returns:
      the NGramTokenizer object itself.
    • setTokenChars

      public NGramTokenizer setTokenChars(List<TokenCharacterKind> tokenChars)
      Set the tokenChars property: Character classes to keep in the tokens.
      Parameters:
      tokenChars - the tokenChars value to set.
      Returns:
      the NGramTokenizer object itself.
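
The tokenChars property can be pictured as a filter that decides which characters may appear in a token. The sketch below rests on one assumption: characters outside the selected classes act as separators, so n-grams would be taken from each remaining run. The enum `CharClass` is a hypothetical stand-in for `TokenCharacterKind` that models only three classes, and `splitOnTokenChars` is not an SDK method.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class TokenCharsSketch {
    // Hypothetical stand-in for TokenCharacterKind; only three classes are modeled.
    enum CharClass { LETTER, DIGIT, WHITESPACE }

    // Returns true if c belongs to at least one of the selected classes.
    static boolean keep(char c, Set<CharClass> classes) {
        return (classes.contains(CharClass.LETTER) && Character.isLetter(c))
            || (classes.contains(CharClass.DIGIT) && Character.isDigit(c))
            || (classes.contains(CharClass.WHITESPACE) && Character.isWhitespace(c));
    }

    // Splits text into runs of kept characters; every other character is
    // treated as a separator and dropped.
    static List<String> splitOnTokenChars(String text, Set<CharClass> classes) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (keep(c, classes)) {
                current.append(c);
            } else if (current.length() > 0) {
                runs.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        Set<CharClass> classes = EnumSet.of(CharClass.LETTER, CharClass.DIGIT);
        System.out.println(splitOnTokenChars("ab-12 c", classes)); // [ab, 12, c]
    }
}
```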