Interface IndexingParametersConfiguration

A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.

Hierarchy

  • IndexingParametersConfiguration

Indexable

[property: string]: any

Describes unknown properties. The value of an unknown property can be of "any" type.

Properties

Optional allowSkillsetToReadFileData

allowSkillsetToReadFileData: undefined | false | true

If true, creates a path /document/file_data that holds an object representing the original file data downloaded from your blob data source. This lets you pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.

Optional dataToExtract

Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application, or image files such as .jpg and .png, in Azure blobs.

Optional delimitedTextDelimiter

delimitedTextDelimiter: undefined | string

For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").

Optional delimitedTextHeaders

delimitedTextHeaders: undefined | string

For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.
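The CSV-related properties can be combined as in the minimal sketch below, which builds a configuration for blobs that have no header row. The `CsvIndexingConfiguration` interface and the `headerlessCsvConfiguration` helper are hypothetical local stand-ins, not part of the package, and the `"delimitedText"` parsing mode value is an assumption because this page does not list the allowed values.

```typescript
// Hypothetical local stand-in for the CSV-related properties on this page.
// Real code would use the interface exported by the SDK package instead.
interface CsvIndexingConfiguration {
  parsingMode?: string;
  delimitedTextDelimiter?: string;
  delimitedTextHeaders?: string;
  firstLineContainsHeaders?: boolean;
}

// Builds a configuration for CSV blobs that carry no header row,
// so the column names are supplied through delimitedTextHeaders.
function headerlessCsvConfiguration(
  columns: string[],
  delimiter: string
): CsvIndexingConfiguration {
  if (delimiter.length !== 1) {
    throw new Error("delimitedTextDelimiter must be a single character");
  }
  return {
    parsingMode: "delimitedText", // assumed value, not listed on this page
    delimitedTextDelimiter: delimiter,
    delimitedTextHeaders: columns.join(","), // comma-delimited header list
    firstLineContainsHeaders: false,
  };
}

const csvConfig = headerlessCsvConfiguration(["id", "title", "price"], "|");
console.log(csvConfig);
```

Keeping the column names in an array and joining them at the last moment avoids hand-editing the comma-delimited string.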

Optional documentRoot

documentRoot: undefined | string

For JSON arrays inside a structured or semi-structured document, specifies the path to the array.
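To make the idea of a path to the array concrete, here is a small sketch. The slash-separated path format (`/level1/level2`), the `"jsonArray"` parsing mode value, and the `resolveDocumentRoot` helper are assumptions for illustration; the helper only mimics locally what the path points at and is not how the service is implemented.

```typescript
// Hypothetical local stand-in for the JSON-related properties on this page.
interface JsonArrayIndexingConfiguration {
  parsingMode?: string;
  documentRoot?: string;
}

const jsonArrayConfig: JsonArrayIndexingConfiguration = {
  parsingMode: "jsonArray", // assumed value, not listed on this page
  documentRoot: "/level1/level2", // assumed slash-separated path format
};

// Illustration only: walks the path through a parsed blob and returns
// the array it points at, or throws if the path does not lead to one.
function resolveDocumentRoot(blob: unknown, documentRoot: string): unknown[] {
  let node: unknown = blob;
  for (const segment of documentRoot.split("/").filter((s) => s.length > 0)) {
    if (typeof node !== "object" || node === null || !(segment in node)) {
      throw new Error(`Path segment "${segment}" not found`);
    }
    node = (node as Record<string, unknown>)[segment];
  }
  if (!Array.isArray(node)) {
    throw new Error("documentRoot does not point at an array");
  }
  return node;
}

const sampleBlob = { level1: { level2: [{ id: "1" }, { id: "2" }] } };
const documents = resolveDocumentRoot(
  sampleBlob,
  jsonArrayConfig.documentRoot ?? ""
);
console.log(documents.length);
```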

Optional excludedFileNameExtensions

excludedFileNameExtensions: undefined | string

Comma-delimited list of filename extensions to ignore when processing from Azure blob storage. For example, you could exclude ".png, .mp4" to skip over those files during indexing.

Optional executionEnvironment

executionEnvironment: IndexerExecutionEnvironment

Specifies the environment in which the indexer should execute.

Optional failOnUnprocessableDocument

failOnUnprocessableDocument: undefined | false | true

For Azure blobs, set to false to continue indexing when a document fails to be indexed.

Optional failOnUnsupportedContentType

failOnUnsupportedContentType: undefined | false | true

For Azure blobs, set to false if you want to continue indexing when an unsupported content type is encountered, and you don't know all the content types (file extensions) in advance.

Optional firstLineContainsHeaders

firstLineContainsHeaders: undefined | false | true

For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.

Optional imageAction

Determines how to process embedded images and image files in Azure blob storage. Setting the "imageAction" configuration to any value other than "none" requires that a skillset also be attached to that indexer.
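The skillset requirement can be checked before an indexer is created. The sketch below is one way to do that, under assumptions: `ImageIndexingConfiguration` and `requiresSkillset` are hypothetical names, `"generateNormalizedImages"` is an assumed example of a value other than `"none"`, and an unset `imageAction` is treated as if it were `"none"`.

```typescript
// Hypothetical local stand-in for the image-related properties on this page.
interface ImageIndexingConfiguration {
  imageAction?: string;
  dataToExtract?: string;
}

// Returns true when the configuration asks for image processing and the
// indexer therefore needs a skillset attached. An unset imageAction is
// treated like "none" here, which is an assumption.
function requiresSkillset(config: ImageIndexingConfiguration): boolean {
  return config.imageAction !== undefined && config.imageAction !== "none";
}

const withImages: ImageIndexingConfiguration = {
  imageAction: "generateNormalizedImages", // assumed value
};
const withoutImages: ImageIndexingConfiguration = { imageAction: "none" };

console.log(requiresSkillset(withImages), requiresSkillset(withoutImages));
```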

Optional indexStorageMetadataOnlyForOversizedDocuments

indexStorageMetadataOnlyForOversizedDocuments: undefined | false | true

For Azure blobs, set this property to true to still index storage metadata for blob content that is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://docs.microsoft.com/azure/search/search-limits-quotas-capacity.
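The three failure-handling flags for Azure blobs are often set together. The following sketch shows one possible "keep going" preset with caller overrides; `BlobFailureConfiguration` and `lenientBlobConfiguration` are hypothetical names introduced for illustration, not part of the package.

```typescript
// Hypothetical local stand-in for the failure-handling properties on this page.
interface BlobFailureConfiguration {
  failOnUnprocessableDocument?: boolean;
  failOnUnsupportedContentType?: boolean;
  indexStorageMetadataOnlyForOversizedDocuments?: boolean;
}

// Returns a preset that keeps indexing past problem blobs.
// Any flag passed in overrides replaces the preset value.
function lenientBlobConfiguration(
  overrides: BlobFailureConfiguration = {}
): BlobFailureConfiguration {
  return {
    failOnUnprocessableDocument: false,
    failOnUnsupportedContentType: false,
    indexStorageMetadataOnlyForOversizedDocuments: true,
    ...overrides,
  };
}

// Example: stay strict about unsupported content types only.
const lenient = lenientBlobConfiguration({ failOnUnsupportedContentType: true });
console.log(lenient);
```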

Optional indexedFileNameExtensions

indexedFileNameExtensions: undefined | string

Comma-delimited list of filename extensions to select when processing from Azure blob storage. For example, you could set ".docx, .pptx, .msg" to index only those file types.
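Both extension properties take a single comma-delimited string, which is easy to get wrong by hand. A minimal sketch of a helper that builds that string from an array follows; `toExtensionList` is a hypothetical name, and the ", " separator simply mirrors the examples on this page.

```typescript
// Builds a comma-delimited extension list from an array:
// trims entries, adds a missing leading dot, and drops duplicates.
function toExtensionList(extensions: string[]): string {
  const normalized = extensions
    .map((ext) => ext.trim())
    .filter((ext) => ext.length > 0)
    .map((ext) => (ext.startsWith(".") ? ext : `.${ext}`));
  return Array.from(new Set(normalized)).join(", ");
}

const fileFilter = {
  indexedFileNameExtensions: toExtensionList(["docx", ".pptx", "msg"]),
  excludedFileNameExtensions: toExtensionList([".png", "mp4", ".png"]),
};

console.log(fileFilter);
```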

Optional parsingMode

Represents the parsing mode for indexing from an Azure blob data source.

Optional pdfTextRotationAlgorithm

pdfTextRotationAlgorithm: BlobIndexerPDFTextRotationAlgorithm

Determines the algorithm for text extraction from PDF files in Azure blob storage.

Optional queryTimeout

queryTimeout: undefined | string

Increases the timeout beyond the 5-minute default for Azure SQL database data sources, specified in the format "hh:mm:ss".
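Because the value is a string in "hh:mm:ss" form, it can help to derive it from a number of seconds. The helper below is a sketch; `formatQueryTimeout` is a hypothetical name and not part of the package.

```typescript
// Formats a whole number of seconds as "hh:mm:ss" for queryTimeout.
function formatQueryTimeout(totalSeconds: number): string {
  if (!Number.isInteger(totalSeconds) || totalSeconds < 0) {
    throw new RangeError("totalSeconds must be a non-negative integer");
  }
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  // Left-pad each part to two digits.
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}`;
}

// Example: raise the timeout from the 5-minute default to 10 minutes.
const sqlConfig = { queryTimeout: formatQueryTimeout(10 * 60) };
console.log(sqlConfig.queryTimeout);
```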
