Class UaxUrlEmailTokenizer


  • public final class UaxUrlEmailTokenizer
    extends LexicalTokenizer
    Tokenizes URLs and email addresses as single tokens. This tokenizer is implemented using Apache Lucene.
    • Constructor Detail

      • UaxUrlEmailTokenizer

        public UaxUrlEmailTokenizer(String name)
        Creates a UaxUrlEmailTokenizer with the given name.
        Parameters:
        name - The name of the tokenizer. It must contain only letters, digits, spaces, dashes, or underscores; it must start and end with an alphanumeric character; and it is limited to 128 characters.
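
The naming rule above can be checked on the client before a tokenizer is constructed. The following is a minimal sketch, not part of the SDK: `TokenizerNameCheck` and `isValidName` are hypothetical names, and the pattern assumes that "letters" and "digits" mean ASCII characters and that a one-character name is acceptable. The service remains the authority on which names it accepts.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of the SDK): client-side pre-check of the naming rule.
final class TokenizerNameCheck {
    // First and last characters must be alphanumeric; characters in between may
    // also be spaces, dashes, or underscores. Total length: 1 to 128 characters.
    // Assumption: ASCII letters and digits only; a single-character name is allowed.
    private static final Pattern NAME =
        Pattern.compile("[A-Za-z0-9](?:[A-Za-z0-9 _-]{0,126}[A-Za-z0-9])?");

    static boolean isValidName(String name) {
        // matches() requires the whole string to fit the pattern.
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("url-email_tokenizer 1")); // true
        System.out.println(isValidName("-leading-dash"));         // false
        System.out.println(isValidName("a".repeat(129)));         // false (too long)
    }
}
```

A name that passes this check can then be handed to the constructor, for example `new UaxUrlEmailTokenizer("url-email_tokenizer 1")`.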
    • Method Detail

      • getMaxTokenLength

        public Integer getMaxTokenLength()
        Get the maxTokenLength property: The maximum token length, in characters. The default is 255. Tokens longer than the maximum length are split. The largest value that can be used is 300.
        Returns:
        the maxTokenLength value.
      • setMaxTokenLength

        public UaxUrlEmailTokenizer setMaxTokenLength(Integer maxTokenLength)
        Set the maxTokenLength property: The maximum token length, in characters. The default is 255. Tokens longer than the maximum length are split. The largest value that can be used is 300.
        Parameters:
        maxTokenLength - the maxTokenLength value to set.
        Returns:
        the UaxUrlEmailTokenizer object itself.
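
To illustrate what "tokens longer than the maximum length are split" can mean in practice, here is a rough sketch in plain Java. It does not call the SDK or Lucene: `MaxTokenLengthSketch` and `splitToken` are hypothetical names, the fixed-size chunking is an assumption about how the split happens, and the lower bound of 1 is likewise assumed. Only the default of 255 and the upper limit of 300 come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration (not the SDK or Lucene implementation).
final class MaxTokenLengthSketch {
    static final int DEFAULT_MAX_TOKEN_LENGTH = 255; // documented default
    static final int UPPER_LIMIT = 300;              // documented maximum

    // Splits an over-long token into consecutive chunks of at most maxTokenLength
    // characters. Assumption: chunks are cut at fixed offsets from the start.
    static List<String> splitToken(String token, int maxTokenLength) {
        if (maxTokenLength < 1 || maxTokenLength > UPPER_LIMIT) {
            throw new IllegalArgumentException(
                "maxTokenLength must be between 1 and " + UPPER_LIMIT);
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < token.length(); start += maxTokenLength) {
            int end = Math.min(token.length(), start + maxTokenLength);
            parts.add(token.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        // A 600-character token with the default limit yields chunks of 255, 255, and 90.
        String longToken = "x".repeat(600);
        for (String part : splitToken(longToken, DEFAULT_MAX_TOKEN_LENGTH)) {
            System.out.println(part.length());
        }
    }
}
```

With the real class, the limit is set through the fluent setter, which returns the tokenizer itself, for example `new UaxUrlEmailTokenizer("my-tokenizer").setMaxTokenLength(300)`.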