Azure Cognitive Services Form Recognizer is a cloud-based service that uses machine learning to extract structured data from form documents. Its features include recognizing fields and tables from your own custom forms, extracting text and layout content from documents, extracting fields from common document types (such as receipts, invoices, business cards, and identity documents) using prebuilt models, and training and managing custom models.
Note: This package targets Azure Form Recognizer service API version 2.x.
Source code | Package (NPM) | API reference documentation | Product documentation | Samples
Form Recognizer supports both multi-service and single-service access. Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource.
You can create the resource using either:

Option 1: Azure Portal
Option 2: Azure CLI

Below is an example of how you can create a Form Recognizer resource using the CLI:
# Create a new resource group to hold the Form Recognizer resource -
# if using an existing resource group, skip this step
az group create --name my-resource-group --location westus2
If you use the Azure CLI, replace <your-resource-group-name> and <your-resource-name> with your own unique names (and supply values for <your-sku-name> and <your-location>):
az cognitiveservices account create --kind FormRecognizer --resource-group <your-resource-group-name> --name <your-resource-name> --sku <your-sku-name> --location <your-location>
Install the Azure Form Recognizer client library for JavaScript (@azure/ai-form-recognizer) with npm:
npm install @azure/ai-form-recognizer
In order to interact with the Form Recognizer service, you'll need to select either a FormRecognizerClient or a FormTrainingClient, and create an instance of this type. In the following examples, we will use FormRecognizerClient. To create a client instance to access the Form Recognizer API, you will need the endpoint of your Form Recognizer resource and a credential. The Form Recognizer clients can use either an AzureKeyCredential with an API key of your resource or a TokenCredential that uses Azure Active Directory RBAC to authorize the client.
You can find the endpoint for your Form Recognizer resource either in the Azure Portal or by using the Azure CLI snippet below:
az cognitiveservices account show --name <your-resource-name> --resource-group <your-resource-group-name> --query "endpoint"
Use the Azure Portal to browse to your Form Recognizer resource and retrieve an API key, or use the Azure CLI snippet below:
Note: Sometimes the API key is referred to as a "subscription key" or "subscription API key."
az cognitiveservices account keys list --resource-group <your-resource-group-name> --name <your-resource-name>
Once you have an API key and endpoint, you can use them as follows:
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const client = new FormRecognizerClient("<endpoint>", new AzureKeyCredential("<API key>"));
API key authorization is used in most of the examples, but you can also authenticate the client with Azure Active Directory using the Azure Identity library. To use the DefaultAzureCredential provider shown below or other credential providers provided with the Azure SDK, please install the @azure/identity package:
npm install @azure/identity
To authenticate using a service principal, you will also need to register an AAD application and grant access to Form Recognizer by assigning the "Cognitive Services User" role to your service principal (note: other roles such as "Owner" will not grant the necessary permissions; only "Cognitive Services User" will suffice to run the examples and the sample code).

Set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.
const { FormRecognizerClient } = require("@azure/ai-form-recognizer");
const { DefaultAzureCredential } = require("@azure/identity");
const client = new FormRecognizerClient("<endpoint>", new DefaultAzureCredential());
FormRecognizerClient provides operations for:

- Recognizing fields and content from your own custom forms, using models trained with your own form data; the results are returned as a collection of RecognizedForm objects.
- Recognizing common fields from certain types of forms (such as receipts, invoices, business cards, and identity documents), using prebuilt models; the results are also returned as a collection of RecognizedForm objects.
- Recognizing form content, including tables, lines, words, and selection marks, without the need to train a model; the content is returned as a collection of FormPage objects.

FormTrainingClient provides operations for:

- Training custom models without labels to recognize all fields and values found in your custom forms. The result is a CustomFormModel indicating the form types the model will recognize and the fields it will extract for each form type. See the service's documentation on unlabeled model training for a more detailed explanation of creating a training data set.
- Training custom models with labels to recognize specific fields and values that you label in your custom forms. The result is a CustomFormModel indicating the fields the model will extract, as well as the model's confidence in the accuracy of each field. See the service's documentation on labeled model training for a more detailed explanation of applying labels to a training data set.
- Managing the custom models created in your resource.

Please note that models can also be trained using a graphical user interface such as the Form Recognizer Labeling Tool.
Sample code snippets that illustrate the use of FormTrainingClient can be found below, in the "Train a Model" section.
Long-running operations (LROs) are operations which consist of an initial request sent to the service to start an operation, followed by polling for a result at certain intervals to determine whether the operation has completed and whether it failed or succeeded. Ultimately, the LRO will either fail with an error or produce a result.

In Azure Form Recognizer, operations that create/copy models (including composing models) or that extract values from forms are LROs. The SDK clients provide asynchronous begin<operation-name> methods that return Promise<PollerLike> objects. The PollerLike object represents the LRO, and a program can wait for the operation to complete by awaiting pollUntilDone() on the poller returned from the begin<operation-name> method. Sample code snippets illustrating long-running operations are provided in the next section.
The following section provides several JavaScript code snippets illustrating common patterns used in the Form Recognizer client libraries.
Recognize fields and table data from forms. These models are trained with your own data, so they're tailored to your forms. A custom model should only be used with forms of the same document structure as those used to train the model.
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");
async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const modelId = "<model id>";
  const path = "<path to a form document>";
  const readStream = fs.createReadStream(path);

  const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await client.beginRecognizeCustomForms(modelId, readStream, {
    onProgress: (state) => {
      console.log(`status: ${state.status}`);
    }
  });
  const forms = await poller.pollUntilDone();

  console.log("Forms:");
  for (const form of forms || []) {
    console.log(
      `${form.formType}, page range: ${form.pageRange.firstPageNumber}-${form.pageRange.lastPageNumber}`
    );
    console.log("Pages:");
    for (const page of form.pages || []) {
      console.log(`Page number: ${page.pageNumber}`);
      console.log("Tables:");
      for (const table of page.tables || []) {
        for (const cell of table.cells) {
          console.log(`cell (${cell.rowIndex},${cell.columnIndex}) ${cell.text}`);
        }
      }
    }
    console.log("Fields:");
    for (const fieldName in form.fields) {
      // each field is of type FormField
      const field = form.fields[fieldName];
      console.log(
        `Field ${fieldName} has value '${field.value}' with a confidence score of ${field.confidence}`
      );
    }
  }
}
main().catch((err) => {
console.error("The sample encountered an error:", err);
});
Alternatively, a form URL can be used to recognize custom forms using the beginRecognizeCustomFormsFromUrl method. URL sources must be accessible from the service: private intranet URLs, or URLs that use header- or certificate-based secrets, will not work, as the Form Recognizer service must be able to access the URL. Methods with a FromUrl suffix that accept URLs instead of file streams exist for all of the recognition methods.
Recognize text words/lines, tables, and selection marks along with their bounding boxes in documents:
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");
async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const path = "<path to your form document>"; // pdf/jpeg/png/tiff formats
  const readStream = fs.createReadStream(path);

  const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await client.beginRecognizeContent(readStream);
  const pages = await poller.pollUntilDone();

  if (!pages || pages.length === 0) {
    throw new Error("Expecting non-empty list of pages!");
  }

  for (const page of pages) {
    console.log(
      `Page ${page.pageNumber}: width ${page.width} and height ${page.height} with unit ${page.unit}`
    );
    for (const table of page.tables) {
      for (const cell of table.cells) {
        console.log(`cell [${cell.rowIndex},${cell.columnIndex}] has text ${cell.text}`);
      }
    }
  }
}
main().catch((err) => {
console.error("The sample encountered an error:", err);
});
Extract fields from certain types of common forms such as receipts, invoices, business cards, and identity documents using prebuilt models provided by the Form Recognizer service.
For example, to extract fields from a sales receipt, use the prebuilt receipt model provided by the beginRecognizeReceipts
method:
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");
async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const path = "<path to your receipt document>"; // pdf/jpeg/png/tiff formats
  const readStream = fs.createReadStream(path);

  const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await client.beginRecognizeReceipts(readStream, {
    onProgress: (state) => {
      console.log(`status: ${state.status}`);
    }
  });

  const receipts = await poller.pollUntilDone();
  if (!receipts || receipts.length <= 0) {
    throw new Error("Expecting at least one receipt in analysis result");
  }

  const receipt = receipts[0];
  console.log("First receipt:");
  const receiptTypeField = receipt.fields["ReceiptType"];
  if (receiptTypeField.valueType === "string") {
    console.log(
      `  Receipt Type: '${receiptTypeField.value || "<missing>"}', with confidence of ${
        receiptTypeField.confidence
      }`
    );
  }
  const merchantNameField = receipt.fields["MerchantName"];
  if (merchantNameField.valueType === "string") {
    console.log(
      `  Merchant Name: '${merchantNameField.value || "<missing>"}', with confidence of ${
        merchantNameField.confidence
      }`
    );
  }
  const transactionDate = receipt.fields["TransactionDate"];
  if (transactionDate.valueType === "date") {
    console.log(
      `  Transaction Date: '${transactionDate.value || "<missing>"}', with confidence of ${
        transactionDate.confidence
      }`
    );
  }
  const itemsField = receipt.fields["Items"];
  if (itemsField.valueType === "array") {
    for (const itemField of itemsField.value || []) {
      if (itemField.valueType === "object") {
        const itemNameField = itemField.value["Name"];
        if (itemNameField.valueType === "string") {
          console.log(
            `    Item Name: '${itemNameField.value || "<missing>"}', with confidence of ${
              itemNameField.confidence
            }`
          );
        }
      }
    }
  }
  const totalField = receipt.fields["Total"];
  if (totalField.valueType === "number") {
    console.log(
      `  Total: '${totalField.value || "<missing>"}', with confidence of ${totalField.confidence}`
    );
  }
}
main().catch((err) => {
console.error("The sample encountered an error:", err);
});
You are not limited to receipts! There are a few prebuilt models to choose from, each of which has its own set of supported fields:

- Receipts, using the beginRecognizeReceipts method (see the supported fields of the receipt model).
- Business cards, using beginRecognizeBusinessCards (see the supported fields of the business card model).
- Invoices, using beginRecognizeInvoices (see the supported fields of the invoice model).
- Identity documents, using beginRecognizeIdDocuments (see the supported fields of the identity document model).

Train a machine learning model on your own form data. The resulting model will be able to recognize values from the structures of forms it was trained on. The training operation accepts a SAS-encoded URL to an Azure Storage Blob container that holds the training documents. The training operation will read the files in the container and create a model based on their contents. For more details on how to create and structure a training container, see the service quickstart documentation.
For example, the following program trains a custom model without using labels:
const { FormTrainingClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const containerSasUrl = "<SAS url to the blob container storing training documents>";

  const trainingClient = new FormTrainingClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await trainingClient.beginTraining(containerSasUrl, false, {
    onProgress: (state) => {
      console.log(`training status: ${state.status}`);
    }
  });
  const model = await poller.pollUntilDone();

  if (!model) {
    throw new Error("Expecting valid training result!");
  }

  console.log(`Model ID: ${model.modelId}`);
  console.log(`Status: ${model.status}`);
  console.log(`Training started on: ${model.trainingStartedOn}`);
  console.log(`Training completed on: ${model.trainingCompletedOn}`);

  if (model.submodels) {
    for (const submodel of model.submodels) {
      // since the training data is unlabeled, we are unable to return the accuracy of this model
      console.log("We have recognized the following fields:");
      for (const key in submodel.fields) {
        const field = submodel.fields[key];
        console.log(`The model found field '${field.name}'`);
      }
    }
  }

  // Training document information
  if (model.trainingDocuments) {
    for (const doc of model.trainingDocuments) {
      console.log(`Document name: ${doc.name}`);
      console.log(`Document status: ${doc.status}`);
      console.log(`Document page count: ${doc.pageCount}`);
      console.log(`Document errors: ${doc.errors}`);
    }
  }
}
main().catch((err) => {
console.error("The sample encountered an error:", err);
});
For information on creating a labeled training data set, see the documentation of the sample labeling tool and the labeled model training sample.
FormTrainingClient
also provides some methods for managing the custom models. The following example shows several ways to iterate through the custom models in a Form Recognizer resource.
const { FormTrainingClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";

  const client = new FormTrainingClient(endpoint, new AzureKeyCredential(apiKey));

  // returns an async iterable iterator that supports paging
  const result = client.listCustomModels();
  let i = 0;
  for await (const modelInfo of result) {
    console.log(`model ${i++}:`);
    console.log(modelInfo);
  }

  // using `iter.next()`
  i = 1;
  let iter = client.listCustomModels();
  let modelItem = await iter.next();
  while (!modelItem.done) {
    console.log(`model ${i++}: ${modelItem.value.modelId}`);
    modelItem = await iter.next();
  }

  // using `byPage()`
  i = 1;
  for await (const response of client.listCustomModels().byPage()) {
    for (const modelInfo of response.modelList) {
      console.log(`model ${i++}: ${modelInfo.modelId}`);
    }
  }
}
main().catch((err) => {
console.error("The sample encountered an error:", err);
});
Enabling logging may help uncover useful information about failures. In order to see a log of HTTP requests and responses, set the AZURE_LOG_LEVEL environment variable to info. Alternatively, logging can be enabled at runtime by calling setLogLevel in the @azure/logger package:
import { setLogLevel } from "@azure/logger";
setLogLevel("info");
For more detailed instructions on how to enable logs, you can look at the @azure/logger package docs.
Please take a look at the samples directory for detailed code samples that show how to use this library including several features and methods that are not shown in the "Examples" section above, such as copying and composing models.
If you'd like to contribute to this library, please read the contributing guide to learn more about how to build and test the code.