Class WordDelimiterTokenFilter

java.lang.Object
com.azure.search.documents.indexes.models.TokenFilter
com.azure.search.documents.indexes.models.WordDelimiterTokenFilter

public final class WordDelimiterTokenFilter extends TokenFilter
Splits words into subwords and performs optional transformations on subword groups. This token filter is implemented using Apache Lucene.
  • Constructor Details

    • WordDelimiterTokenFilter

      public WordDelimiterTokenFilter(String name)
      Creates an instance of WordDelimiterTokenFilter.
      Parameters:
      name - The name of the token filter. It may contain only letters, digits, spaces, dashes, or underscores; it must start and end with an alphanumeric character; and it is limited to 128 characters.
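      The naming rule above can be checked on the client before a filter is created. The following is a minimal sketch, not part of the SDK: the class TokenFilterNameCheck and the method isValidName are hypothetical, and the pattern assumes that "letters" and "alphanumeric" mean ASCII characters, which the service may interpret more broadly.

```java
import java.util.regex.Pattern;

public class TokenFilterNameCheck {
    // Assumption: ASCII-only reading of the documented rule.
    // One alphanumeric character, or 2 to 128 characters that start and end
    // with an alphanumeric and contain only letters, digits, spaces,
    // dashes, or underscores in between.
    private static final Pattern NAME =
        Pattern.compile("^[A-Za-z0-9](?:[A-Za-z0-9 _-]{0,126}[A-Za-z0-9])?$");

    // Hypothetical helper: returns true if the name satisfies the rule above.
    public static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("my_word-delimiter")); // prints "true"
        System.out.println(isValidName("-bad-start"));        // prints "false"
    }
}
```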
  • Method Details

    • generateWordParts

      public Boolean generateWordParts()
      Get the generateWordParts property: A value indicating whether to generate word parts. If set, causes parts of words to be generated; for example, "AzureSearch" becomes "Azure" "Search". Default is true.
      Returns:
      the generateWordParts value.
    • setGenerateWordParts

      public WordDelimiterTokenFilter setGenerateWordParts(Boolean generateWordParts)
      Set the generateWordParts property: A value indicating whether to generate word parts. If set, causes parts of words to be generated; for example, "AzureSearch" becomes "Azure" "Search". Default is true.
      Parameters:
      generateWordParts - the generateWordParts value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
    • generateNumberParts

      public Boolean generateNumberParts()
      Get the generateNumberParts property: A value indicating whether to generate number subwords. Default is true.
      Returns:
      the generateNumberParts value.
    • setGenerateNumberParts

      public WordDelimiterTokenFilter setGenerateNumberParts(Boolean generateNumberParts)
      Set the generateNumberParts property: A value indicating whether to generate number subwords. Default is true.
      Parameters:
      generateNumberParts - the generateNumberParts value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
    • areWordsCatenated

      public Boolean areWordsCatenated()
      Get the catenateWords property: A value indicating whether maximum runs of word parts will be catenated. For example, if this is set to true, "Azure-Search" becomes "AzureSearch". Default is false.
      Returns:
      the catenateWords value.
    • setWordsCatenated

      public WordDelimiterTokenFilter setWordsCatenated(Boolean wordsCatenated)
      Set the catenateWords property: A value indicating whether maximum runs of word parts will be catenated. For example, if this is set to true, "Azure-Search" becomes "AzureSearch". Default is false.
      Parameters:
      wordsCatenated - the catenateWords value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
    • areNumbersCatenated

      public Boolean areNumbersCatenated()
      Get the catenateNumbers property: A value indicating whether maximum runs of number parts will be catenated. For example, if this is set to true, "1-2" becomes "12". Default is false.
      Returns:
      the catenateNumbers value.
    • setNumbersCatenated

      public WordDelimiterTokenFilter setNumbersCatenated(Boolean numbersCatenated)
      Set the catenateNumbers property: A value indicating whether maximum runs of number parts will be catenated. For example, if this is set to true, "1-2" becomes "12". Default is false.
      Parameters:
      numbersCatenated - the catenateNumbers value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
    • catenateAll

      public Boolean catenateAll()
      Get the catenateAll property: A value indicating whether all subword parts will be catenated. For example, if this is set to true, "Azure-Search-1" becomes "AzureSearch1". Default is false.
      Returns:
      the catenateAll value.
    • setCatenateAll

      public WordDelimiterTokenFilter setCatenateAll(Boolean catenateAll)
      Set the catenateAll property: A value indicating whether all subword parts will be catenated. For example, if this is set to true, "Azure-Search-1" becomes "AzureSearch1". Default is false.
      Parameters:
      catenateAll - the catenateAll value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
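      The effect of catenateAll on the documented example can be illustrated in plain Java. This is a rough sketch of the described behavior only, not the Lucene implementation: the class CatenateAllSketch and its method are hypothetical, and the sketch assumes that any non-alphanumeric ASCII character acts as a delimiter.

```java
public class CatenateAllSketch {
    // Hypothetical helper: splits the token on runs of non-alphanumeric
    // characters and joins every resulting subword back together.
    public static String catenateAll(String token) {
        String[] parts = token.split("[^A-Za-z0-9]+");
        return String.join("", parts);
    }

    public static void main(String[] args) {
        System.out.println(catenateAll("Azure-Search-1")); // prints "AzureSearch1"
    }
}
```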
    • splitOnCaseChange

      public Boolean splitOnCaseChange()
      Get the splitOnCaseChange property: A value indicating whether to split words on case change. For example, if this is set to true, "AzureSearch" becomes "Azure" "Search". Default is true.
      Returns:
      the splitOnCaseChange value.
    • setSplitOnCaseChange

      public WordDelimiterTokenFilter setSplitOnCaseChange(Boolean splitOnCaseChange)
      Set the splitOnCaseChange property: A value indicating whether to split words on case change. For example, if this is set to true, "AzureSearch" becomes "Azure" "Search". Default is true.
      Parameters:
      splitOnCaseChange - the splitOnCaseChange value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
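      To make the case-change rule concrete, the sketch below splits a word at each lowercase-to-uppercase transition. It only mirrors the documented example and is not how the underlying Lucene filter is implemented; the class CaseChangeSplitSketch and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class CaseChangeSplitSketch {
    // Hypothetical helper: starts a new part wherever a lowercase letter
    // is immediately followed by an uppercase letter.
    public static List<String> splitOnCaseChange(String word) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length(); i++) {
            boolean lowerToUpper = Character.isLowerCase(word.charAt(i - 1))
                && Character.isUpperCase(word.charAt(i));
            if (lowerToUpper) {
                parts.add(word.substring(start, i));
                start = i;
            }
        }
        // Add the final part, if any characters remain.
        if (start < word.length()) {
            parts.add(word.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOnCaseChange("AzureSearch")); // prints "[Azure, Search]"
    }
}
```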
    • isPreserveOriginal

      public Boolean isPreserveOriginal()
      Get the preserveOriginal property: A value indicating whether original words will be preserved and added to the subword list. Default is false.
      Returns:
      the preserveOriginal value.
    • setPreserveOriginal

      public WordDelimiterTokenFilter setPreserveOriginal(Boolean preserveOriginal)
      Set the preserveOriginal property: A value indicating whether original words will be preserved and added to the subword list. Default is false.
      Parameters:
      preserveOriginal - the preserveOriginal value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
    • splitOnNumerics

      public Boolean splitOnNumerics()
      Get the splitOnNumerics property: A value indicating whether to split on numbers. For example, if this is set to true, "Azure1Search" becomes "Azure" "1" "Search". Default is true.
      Returns:
      the splitOnNumerics value.
    • setSplitOnNumerics

      public WordDelimiterTokenFilter setSplitOnNumerics(Boolean splitOnNumerics)
      Set the splitOnNumerics property: A value indicating whether to split on numbers. For example, if this is set to true, "Azure1Search" becomes "Azure" "1" "Search". Default is true.
      Parameters:
      splitOnNumerics - the splitOnNumerics value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
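      The numeric split can be pictured the same way: a new part begins wherever a letter meets a digit. The following sketch reproduces the documented example under that assumption; NumericSplitSketch and its method are hypothetical and do not reflect the actual Lucene code.

```java
import java.util.ArrayList;
import java.util.List;

public class NumericSplitSketch {
    // Hypothetical helper: starts a new part at every letter-to-digit
    // or digit-to-letter transition.
    public static List<String> splitOnNumerics(String word) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length(); i++) {
            char prev = word.charAt(i - 1);
            char curr = word.charAt(i);
            boolean letterToDigit = Character.isLetter(prev) && Character.isDigit(curr);
            boolean digitToLetter = Character.isDigit(prev) && Character.isLetter(curr);
            if (letterToDigit || digitToLetter) {
                parts.add(word.substring(start, i));
                start = i;
            }
        }
        // Add the final part, if any characters remain.
        if (start < word.length()) {
            parts.add(word.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOnNumerics("Azure1Search")); // prints "[Azure, 1, Search]"
    }
}
```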
    • isStemEnglishPossessive

      public Boolean isStemEnglishPossessive()
      Get the stemEnglishPossessive property: A value indicating whether to remove trailing "'s" for each subword. Default is true.
      Returns:
      the stemEnglishPossessive value.
    • setStemEnglishPossessive

      public WordDelimiterTokenFilter setStemEnglishPossessive(Boolean stemEnglishPossessive)
      Set the stemEnglishPossessive property: A value indicating whether to remove trailing "'s" for each subword. Default is true.
      Parameters:
      stemEnglishPossessive - the stemEnglishPossessive value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
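      As an illustration of possessive stemming, the sketch below removes a trailing "'s" from a single subword. It is a simplified stand-in, assuming only the lowercase ASCII form "'s" is stripped; PossessiveStemSketch and its method are hypothetical names, not SDK or Lucene APIs.

```java
public class PossessiveStemSketch {
    // Hypothetical helper: drops a trailing "'s" if present,
    // otherwise returns the subword unchanged.
    public static String stemEnglishPossessive(String subword) {
        if (subword.endsWith("'s")) {
            return subword.substring(0, subword.length() - 2);
        }
        return subword;
    }

    public static void main(String[] args) {
        System.out.println(stemEnglishPossessive("Azure's")); // prints "Azure"
    }
}
```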
    • getProtectedWords

      public List<String> getProtectedWords()
      Get the protectedWords property: A list of tokens to protect from being delimited.
      Returns:
      the protectedWords value.
    • setProtectedWords

      public WordDelimiterTokenFilter setProtectedWords(String... protectedWords)
      Set the protectedWords property: A list of tokens to protect from being delimited.
      Parameters:
      protectedWords - the protectedWords value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.
    • setProtectedWords

      public WordDelimiterTokenFilter setProtectedWords(List<String> protectedWords)
      Set the protectedWords property: A list of tokens to protect from being delimited.
      Parameters:
      protectedWords - the protectedWords value to set.
      Returns:
      the WordDelimiterTokenFilter object itself.