Class DocumentAnalysisAsyncClient
This class provides an asynchronous client to connect to the Form Recognizer Azure Cognitive Service.
This client provides asynchronous methods to perform:
- Custom Document Analysis: Classification, extraction, and analysis of data from forms and documents specific to distinct business data and use cases. Use a custom trained model by passing its modelId into the beginAnalyzeDocument(String, BinaryData) method.
- General Document Analysis: Extract text, tables, structure, and key-value pairs. Use the general document model provided by the Form Recognizer service by passing modelId="prebuilt-document" into the beginAnalyzeDocument(String, BinaryData) method.
- Prebuilt Model Analysis: Analyze receipts, business cards, invoices, IDs, W-2s, and other documents with supported prebuilt models. For example, use the prebuilt receipt model by passing modelId="prebuilt-receipt" into the beginAnalyzeDocument(String, BinaryData) method.
- Layout Analysis: Extract text, selection marks, and table structures, along with their bounding box coordinates, from forms and documents. Use the layout analysis model provided by the service by passing modelId="prebuilt-layout" into the beginAnalyzeDocument(String, BinaryData) method.
- Polling and Callbacks: The client includes mechanisms for polling the service to check the status of an analysis operation, or for registering callbacks to receive a notification when the analysis is complete.
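As a rough illustration of the list above, the model IDs it names can be gathered in one place. The enum below is a hypothetical helper and not part of the SDK; only the three prebuilt ID strings come from the text above, and a custom model ID is whatever was assigned to your trained model.

```java
import java.util.Objects;

/** Hypothetical helper (not an SDK type) mapping an analysis kind to a modelId string. */
enum AnalysisKind {
    GENERAL_DOCUMENT("prebuilt-document"),
    RECEIPT("prebuilt-receipt"),
    LAYOUT("prebuilt-layout"),
    CUSTOM(null); // custom models have a caller-supplied ID

    private final String prebuiltId;

    AnalysisKind(String prebuiltId) {
        this.prebuiltId = prebuiltId;
    }

    /** Returns the value to pass as the modelId argument of beginAnalyzeDocument. */
    String modelId(String customModelId) {
        if (this == CUSTOM) {
            return Objects.requireNonNull(customModelId, "a custom model needs an explicit model ID");
        }
        return prebuiltId;
    }
}

final class ModelIdSketch {
    public static void main(String[] args) {
        System.out.println(AnalysisKind.RECEIPT.modelId(null));        // prebuilt-receipt
        System.out.println(AnalysisKind.CUSTOM.modelId("{model_id}")); // {model_id}
    }
}
```

Keeping the IDs in a single type like this is only one possible design; passing the string literals directly, as the samples in this document do, works equally well.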
This client also provides different methods based on inputs from a URL and inputs from a stream.
Note: This client only supports DocumentAnalysisServiceVersion.V2022_08_31 and newer. To use an older service version, use FormRecognizerClient and .formrecognizer.training.FormTrainingClient.
Service clients are the point of interaction for developers to use Azure Form Recognizer. DocumentAnalysisClient is the synchronous service client and DocumentAnalysisAsyncClient is the asynchronous service client.
The examples shown in this document use a credential object named DefaultAzureCredential for authentication, which is appropriate for most scenarios, including local development and production environments. Additionally, we recommend using managed identity for authentication in production environments. You can find more information on the different ways of authenticating and their corresponding credential types in the Azure Identity documentation.
Sample: Construct a DocumentAnalysisAsyncClient with DefaultAzureCredential
The following code sample demonstrates the creation of a DocumentAnalysisAsyncClient, using the DefaultAzureCredentialBuilder to configure it.
    DocumentAnalysisAsyncClient documentAnalysisAsyncClient = new DocumentAnalysisClientBuilder()
        .endpoint("{endpoint}")
        .credential(new DefaultAzureCredentialBuilder().build())
        .buildAsyncClient();
Further, see the code sample below to use AzureKeyCredential for client creation.
    DocumentAnalysisAsyncClient documentAnalysisAsyncClient = new DocumentAnalysisClientBuilder()
        .credential(new AzureKeyCredential("{key}"))
        .endpoint("{endpoint}")
        .buildAsyncClient();
Method Summary
Modifier and Type | Method | Description
com.azure.core.util.polling.PollerFlux<OperationResult, AnalyzeResult> | beginAnalyzeDocument(String modelId, com.azure.core.util.BinaryData document) | Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
com.azure.core.util.polling.PollerFlux<OperationResult, AnalyzeResult> | beginAnalyzeDocument(String modelId, com.azure.core.util.BinaryData document, AnalyzeDocumentOptions analyzeDocumentOptions) | Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
com.azure.core.util.polling.PollerFlux<OperationResult, AnalyzeResult> | beginAnalyzeDocumentFromUrl(String modelId, String documentUrl) | Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
com.azure.core.util.polling.PollerFlux<OperationResult, AnalyzeResult> | beginAnalyzeDocumentFromUrl(String modelId, String documentUrl, AnalyzeDocumentOptions analyzeDocumentOptions) | Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
com.azure.core.util.polling.PollerFlux<OperationResult, AnalyzeResult> | beginClassifyDocument(String classifierId, com.azure.core.util.BinaryData document) | Classify a given document using a document classifier.
com.azure.core.util.polling.PollerFlux<OperationResult, AnalyzeResult> | beginClassifyDocumentFromUrl(String classifierId, String documentUrl) | Classify a given document using a document classifier.
Method Details
beginAnalyzeDocumentFromUrl
public com.azure.core.util.polling.PollerFlux<OperationResult,AnalyzeResult> beginAnalyzeDocumentFromUrl(String modelId, String documentUrl)
Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
The service does not support cancellation of the long-running operation and returns an error message indicating the absence of cancellation support.
Code sample
Analyze a document using the URL of the document.
    String documentUrl = "{document_url}";
    String modelId = "{model_id}";
    documentAnalysisAsyncClient.beginAnalyzeDocumentFromUrl(modelId, documentUrl)
        // if polling operation completed, retrieve the final result.
        .flatMap(AsyncPollResponse::getFinalResult)
        .subscribe(analyzeResult ->
            analyzeResult.getDocuments()
                .forEach(document ->
                    document.getFields()
                        .forEach((key, documentField) -> {
                            System.out.printf("Field text: %s%n", key);
                            System.out.printf("Field value data content: %s%n", documentField.getContent());
                            System.out.printf("Confidence score: %.2f%n", documentField.getConfidence());
                        })));
Parameters:
modelId - The unique model ID to be used. Use this to specify the custom model ID or prebuilt model ID. Supported prebuilt model IDs are listed in the service documentation.
documentUrl - The URL of the document to analyze.
Returns:
A PollerFlux that polls the progress of the analyze document operation until it has completed, has failed, or has been cancelled. The completed operation returns an AnalyzeResult.
Throws:
com.azure.core.exception.HttpResponseException - If the analyze operation fails and the AnalyzeResultOperation returns with an OperationStatus.FAILED.
IllegalArgumentException - If documentUrl or modelId is null.
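The polling behavior described under Returns can be pictured with a small, SDK-free simulation. Everything in the sketch below (SimulatedOperation, its Status enum, and pollUntilDone) is hypothetical and only models the idea of repeatedly checking a long-running operation until it reaches a terminal state; it is not how PollerFlux is implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a long-running service operation; not an SDK type. */
final class SimulatedOperation {
    enum Status { IN_PROGRESS, SUCCEEDED, FAILED }

    private int remainingPolls;
    private final Status terminal;

    SimulatedOperation(int pollsUntilDone, Status terminal) {
        this.remainingPolls = pollsUntilDone;
        this.terminal = terminal;
    }

    /** Each call models one status request sent to the service. */
    Status poll() {
        if (remainingPolls > 0) {
            remainingPolls--;
            return Status.IN_PROGRESS;
        }
        return terminal;
    }
}

final class PollingSketch {
    /** Polls until a terminal status is seen and returns every status observed, in order. */
    static List<SimulatedOperation.Status> pollUntilDone(SimulatedOperation operation) {
        List<SimulatedOperation.Status> seen = new ArrayList<>();
        SimulatedOperation.Status status;
        do {
            status = operation.poll();
            seen.add(status);
        } while (status == SimulatedOperation.Status.IN_PROGRESS);
        return seen;
    }

    public static void main(String[] args) {
        // Prints [IN_PROGRESS, IN_PROGRESS, SUCCEEDED]
        System.out.println(pollUntilDone(new SimulatedOperation(2, SimulatedOperation.Status.SUCCEEDED)));
    }
}
```

A real poller waits between requests and delivers each intermediate response to subscribers; the loop here omits both to keep the control flow visible.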
beginAnalyzeDocumentFromUrl
public com.azure.core.util.polling.PollerFlux<OperationResult,AnalyzeResult> beginAnalyzeDocumentFromUrl(String modelId, String documentUrl, AnalyzeDocumentOptions analyzeDocumentOptions)
Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
The service does not support cancellation of the long-running operation and returns an error message indicating the absence of cancellation support.
Code sample
Analyze a document using the URL of the document with configurable options.
    String documentUrl = "{document_url}";
    // analyze a receipt using prebuilt model
    String modelId = "prebuilt-receipt";
    documentAnalysisAsyncClient.beginAnalyzeDocumentFromUrl(modelId, documentUrl,
            new AnalyzeDocumentOptions().setPages(Arrays.asList("1", "3")))
        // if polling operation completed, retrieve the final result.
        .flatMap(AsyncPollResponse::getFinalResult)
        .subscribe(analyzeResult -> {
            System.out.println(analyzeResult.getModelId());
            analyzeResult.getDocuments()
                .forEach(document ->
                    document.getFields()
                        .forEach((key, documentField) -> {
                            System.out.printf("Field text: %s%n", key);
                            System.out.printf("Field value data content: %s%n", documentField.getContent());
                            System.out.printf("Confidence score: %.2f%n", documentField.getConfidence());
                        }));
        });
Parameters:
modelId - The unique model ID to be used. Use this to specify the custom model ID or prebuilt model ID. Supported prebuilt model IDs are listed in the service documentation.
documentUrl - The source URL to the input form.
analyzeDocumentOptions - The additional configurable options that may be passed when analyzing documents.
Returns:
A PollerFlux that polls the progress of the analyze document operation until it has completed, has failed, or has been cancelled. The completed operation returns an AnalyzeResult.
Throws:
com.azure.core.exception.HttpResponseException - If the analyze operation fails and the AnalyzeResultOperation returns with an OperationStatus.FAILED.
IllegalArgumentException - If documentUrl or modelId is null.
beginAnalyzeDocument
public com.azure.core.util.polling.PollerFlux<OperationResult,AnalyzeResult> beginAnalyzeDocument(String modelId, com.azure.core.util.BinaryData document)
Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
The service does not support cancellation of the long-running operation and returns an error message indicating the absence of cancellation support.
Note that the data passed must be replayable if retries are enabled (the default). In other words, the Flux must produce the same data each time it is subscribed to.
Code sample
Analyze a document.
    File document = new File("{local/file_path/fileName.jpg}");
    String modelId = "{model_id}";
    // Utility method to convert input stream to Binary Data
    BinaryData buffer = BinaryData.fromStream(new ByteArrayInputStream(Files.readAllBytes(document.toPath())));

    documentAnalysisAsyncClient.beginAnalyzeDocument(modelId, buffer)
        // if polling operation completed, retrieve the final result.
        .flatMap(AsyncPollResponse::getFinalResult)
        .subscribe(analyzeResult ->
            analyzeResult.getDocuments()
                .forEach(analyzedDocument ->
                    analyzedDocument.getFields()
                        .forEach((key, documentField) -> {
                            System.out.printf("Field text: %s%n", key);
                            System.out.printf("Field value data content: %s%n", documentField.getContent());
                            System.out.printf("Confidence score: %.2f%n", documentField.getConfidence());
                        })));
Parameters:
modelId - The unique model ID to be used. Use this to specify the custom model ID or prebuilt model ID. Supported prebuilt model IDs are listed in the service documentation.
document - The data of the document to analyze information from.
Returns:
A PollerFlux that polls the progress of the analyze document operation until it has completed, has failed, or has been cancelled. The completed operation returns an AnalyzeResult.
Throws:
com.azure.core.exception.HttpResponseException - If the analyze operation fails and returns with an OperationStatus.FAILED.
IllegalArgumentException - If document or modelId is null.
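The replayability note for beginAnalyzeDocument can be made concrete with plain JDK streams. The sketch below is an illustration only and says nothing about how the SDK buffers data internally; ReplayableDataSketch and drain are hypothetical names. It shows why a stream that has already been consumed cannot serve a retry, while a source that recreates the stream from the same bytes can.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

final class ReplayableDataSketch {
    /** Reads the stream to the end and returns the number of bytes seen. */
    static int drain(InputStream in) {
        try {
            int total = 0;
            byte[] buffer = new byte[8];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "sample document".getBytes(StandardCharsets.UTF_8);

        // Not replayable: a second pass over the same stream sees no data.
        InputStream oneShot = new ByteArrayInputStream(bytes);
        System.out.println(drain(oneShot) + " then " + drain(oneShot)); // 15 then 0

        // Replayable: each attempt gets a fresh stream over the same bytes.
        Supplier<InputStream> replayable = () -> new ByteArrayInputStream(bytes);
        System.out.println(drain(replayable.get()) + " then " + drain(replayable.get())); // 15 then 15
    }
}
```

Reading the file into a byte array first, as the code sample for this method does, is one way to end up with data that can be read again on retry.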
beginAnalyzeDocument
public com.azure.core.util.polling.PollerFlux<OperationResult,AnalyzeResult> beginAnalyzeDocument(String modelId, com.azure.core.util.BinaryData document, AnalyzeDocumentOptions analyzeDocumentOptions)
Analyzes data from documents with optical character recognition (OCR) and semantic values from a given document using any of the prebuilt models or a custom-built analysis model.
The service does not support cancellation of the long-running operation and returns an error message indicating the absence of cancellation support.
Note that the data passed must be replayable if retries are enabled (the default). In other words, the Flux must produce the same data each time it is subscribed to.
Code sample
Analyze a document with configurable options.
    File document = new File("{local/file_path/fileName.jpg}");
    String modelId = "{model_id}";
    final AnalyzeDocumentOptions analyzeDocumentOptions =
        new AnalyzeDocumentOptions().setPages(Arrays.asList("1", "3"))
            .setDocumentAnalysisFeatures(Collections.singletonList(DocumentAnalysisFeature.FORMULAS));

    // Utility method to convert input stream to Binary Data
    BinaryData buffer = BinaryData.fromStream(new ByteArrayInputStream(Files.readAllBytes(document.toPath())));

    documentAnalysisAsyncClient.beginAnalyzeDocument(modelId, buffer, analyzeDocumentOptions)
        // if polling operation completed, retrieve the final result.
        .flatMap(AsyncPollResponse::getFinalResult)
        .subscribe(analyzeResult -> {
            System.out.println(analyzeResult.getModelId());
            analyzeResult.getDocuments()
                .forEach(analyzedDocument ->
                    analyzedDocument.getFields()
                        .forEach((key, documentField) -> {
                            System.out.printf("Field text: %s%n", key);
                            System.out.printf("Field value data content: %s%n", documentField.getContent());
                            System.out.printf("Confidence score: %.2f%n", documentField.getConfidence());
                        }));
        });
Parameters:
modelId - The unique model ID to be used. Use this to specify the custom model ID or prebuilt model ID. Supported prebuilt model IDs are listed in the service documentation.
document - The data of the document to analyze information from.
analyzeDocumentOptions - The additional configurable options that may be passed when analyzing documents.
Returns:
A PollerFlux that polls the progress of the analyze document operation until it has completed, has failed, or has been cancelled. The completed operation returns an AnalyzeResult.
Throws:
com.azure.core.exception.HttpResponseException - If the analyze operation fails and returns with an OperationStatus.FAILED.
IllegalArgumentException - If document or modelId is null.
IllegalArgumentException - If the document length is null or unspecified. Use BinaryData.fromStream(InputStream, Long) to create an instance of the document from the given InputStream with length.
beginClassifyDocumentFromUrl
public com.azure.core.util.polling.PollerFlux<OperationResult,AnalyzeResult> beginClassifyDocumentFromUrl(String classifierId, String documentUrl)
Classify a given document using a document classifier. For more information on how to build a custom classifier model, see the service documentation.
The service does not support cancellation of the long-running operation and returns an error message indicating the absence of cancellation support.
Code sample
Classify a document using the URL of the document.
    String documentUrl = "{document_url}";
    String classifierId = "custom-trained-classifier-id";
    documentAnalysisAsyncClient.beginClassifyDocumentFromUrl(classifierId, documentUrl)
        // if polling operation completed, retrieve the final result.
        .flatMap(AsyncPollResponse::getFinalResult)
        .subscribe(analyzeResult -> {
            System.out.println(analyzeResult.getModelId());
            analyzeResult.getDocuments()
                .forEach(analyzedDocument ->
                    System.out.printf("Doc Type: %s%n", analyzedDocument.getDocType()));
        });
Parameters:
classifierId - The unique classifier ID to be used. Use this to specify the custom classifier ID.
documentUrl - The URL of the document to analyze.
Returns:
A PollerFlux that polls the progress of the analyze document operation until it has completed, has failed, or has been cancelled. The completed operation returns an AnalyzeResult.
Throws:
com.azure.core.exception.HttpResponseException - If the analyze operation fails and the AnalyzeResultOperation returns with an OperationStatus.FAILED.
IllegalArgumentException - If documentUrl or classifierId is null.
beginClassifyDocument
public com.azure.core.util.polling.PollerFlux<OperationResult,AnalyzeResult> beginClassifyDocument(String classifierId, com.azure.core.util.BinaryData document)
Classify a given document using a document classifier. For more information on how to build a custom classifier model, see the service documentation.
The service does not support cancellation of the long-running operation and returns an error message indicating the absence of cancellation support.
Note that the data passed must be replayable if retries are enabled (the default). In other words, the Flux must produce the same data each time it is subscribed to.
Code sample
Classify a document.
    File document = new File("{local/file_path/fileName.jpg}");
    String classifierId = "{model_id}";
    // Utility method to convert input stream to Binary Data
    BinaryData buffer = BinaryData.fromStream(new ByteArrayInputStream(Files.readAllBytes(document.toPath())));

    documentAnalysisAsyncClient.beginClassifyDocument(classifierId, buffer)
        // if polling operation completed, retrieve the final result.
        .flatMap(AsyncPollResponse::getFinalResult)
        .subscribe(analyzeResult -> {
            System.out.println(analyzeResult.getModelId());
            analyzeResult.getDocuments()
                .forEach(analyzedDocument ->
                    System.out.printf("Doc Type: %s%n", analyzedDocument.getDocType()));
        });
Parameters:
classifierId - The unique classifier ID to be used. Use this to specify the custom classifier ID.
document - The data of the document to analyze information from. For the file types supported by the service, see the service documentation.
Returns:
A PollerFlux that polls the progress of the analyze document operation until it has completed, has failed, or has been cancelled. The completed operation returns an AnalyzeResult.
Throws:
com.azure.core.exception.HttpResponseException - If the analyze operation fails and returns with an OperationStatus.FAILED.
IllegalArgumentException - If document or classifierId is null.
IllegalArgumentException - If the document length is null or unspecified. Use BinaryData.fromStream(InputStream, Long) to create an instance of the document from the given InputStream with length.