Azure Form Recognizer client library for JavaScript

Azure Cognitive Services Form Recognizer is a cloud service that uses machine learning to recognize text and table data from form documents. It includes the following main functionalities:

  • Custom models - Recognize field values and table data from forms. These models are trained with your own data, so they're tailored to your forms. You can then take these custom models and recognize forms. You can also manage the custom models you've created and see how close you are to the limit of custom models your account can hold.
  • Content API - Recognize text and table structures, along with their bounding box coordinates, from documents. Corresponds to the REST service's Layout API.
  • Prebuilt receipt model - Recognize data from sales receipts using a prebuilt model.

Source code | Package (NPM) | API reference documentation | Product documentation | Samples

Getting started

Prerequisites

Create a Form Recognizer resource

Form Recognizer supports both multi-service and single-service access. Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource.

You can create the resource using one of the following options:

Option 1: Azure Portal

Option 2: Azure CLI. Below is an example of how you can create a Form Recognizer resource using the CLI:

# Create a new resource group to hold the form recognizer resource -
# if using an existing resource group, skip this step
az group create --name my-resource-group --location westus2

If you use the Azure CLI, replace <your-resource-group-name> and <your-resource-name> with your own unique names:

az cognitiveservices account create --kind FormRecognizer --resource-group <your-resource-group-name> --name <your-resource-name> --sku <your-sku-name> --location <your-location>

Install the @azure/ai-form-recognizer package

Install the Azure Form Recognizer client library for JavaScript with npm:

npm install @azure/ai-form-recognizer

Note: This preview version targets Azure Form Recognizer service API version v2.0-preview.

Create and authenticate a client

In order to interact with the Form Recognizer service, you'll need to select either a FormRecognizerClient or a FormTrainingClient and create an instance of that type. In the following samples, we will use FormRecognizerClient as an example. To create a client instance to access the Form Recognizer API, you will need the endpoint of your Form Recognizer resource and a credential. The Form Recognizer clients can authenticate using an API key credential or an Azure Active Directory credential.

You can find the endpoint for your Form Recognizer resource either in the Azure Portal or by using the Azure CLI snippet below:

az cognitiveservices account show --name <your-resource-name> --resource-group <your-resource-group-name> --query "endpoint"

Using an API Key

Use the Azure Portal to browse to your Form Recognizer resource and retrieve an API key, or use the Azure CLI snippet below:

Note: Sometimes the API key is referred to as a "subscription key" or "subscription API key."

az cognitiveservices account keys list --resource-group <your-resource-group-name> --name <your-resource-name>

Once you have an API key and endpoint, you can use them as follows:

const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");

const client = new FormRecognizerClient("<endpoint>", new AzureKeyCredential("<API key>"));

Using an Azure Active Directory Credential

Client API key authentication is used in most of the examples, but you can also authenticate with Azure Active Directory using the Azure Identity library. To use the DefaultAzureCredential provider shown below, or other credential providers included with the Azure SDK, please install the @azure/identity package:

npm install @azure/identity

You will also need to register a new AAD application and grant access to Form Recognizer by assigning the "Cognitive Services User" role to your service principal (note: other roles such as "Owner" will not grant the necessary permissions; only "Cognitive Services User" will suffice to run the examples and the sample code).

Set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.

const { FormRecognizerClient } = require("@azure/ai-form-recognizer");
const { DefaultAzureCredential } = require("@azure/identity");

const client = new FormRecognizerClient("<endpoint>", new DefaultAzureCredential());

Key concepts

FormRecognizerClient

FormRecognizerClient provides operations for:

  • Recognizing form fields and content using custom models trained to recognize your custom forms. These values are returned in a collection of RecognizedForm objects.
  • Recognizing form content, including tables, lines and words, without the need to train a model. Form content is returned in a collection of FormPage objects.
  • Recognizing common fields from receipts, using a pre-trained receipt model on the Form Recognizer service. These fields and metadata are returned in a collection of RecognizedReceipt objects.

FormTrainingClient

FormTrainingClient provides operations for:

  • Training custom models to recognize all fields and values found in your custom forms. A CustomFormModel is returned indicating the form types the model will recognize, and the fields it will extract for each form type. See the service's documentation for a more detailed explanation.
  • Training custom models to recognize specific fields and values you specify by labeling your custom forms. A CustomFormModel is returned indicating the fields the model will extract, as well as the estimated accuracy for each field. See the service's documentation for a more detailed explanation.
  • Managing models created in your account (a short sketch follows below).
  • Copying a custom model from one Form Recognizer resource to another.

Please note that models can also be trained using a graphical user interface such as the Form Recognizer Labeling Tool.
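
As an illustration of the model management operations, below is a minimal, hedged sketch that checks the account's custom model count and limit and deletes a model that is no longer needed. It assumes the getAccountProperties and deleteModel methods of FormTrainingClient are available in this preview, and uses placeholder values for the endpoint, key, and model ID:

const { FormTrainingClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");

async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const client = new FormTrainingClient(endpoint, new AzureKeyCredential(apiKey));

  // Check how many custom models the account currently holds and what the limit is.
  const properties = await client.getAccountProperties();
  console.log(`Custom model count: ${properties.customModelCount}`);
  console.log(`Custom model limit: ${properties.customModelLimit}`);

  // Delete a custom model that is no longer needed (placeholder model ID).
  await client.deleteModel("<model id>");
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});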

Long-Running Operations

Long-running operations are operations which consist of an initial request sent to the service to start an operation, followed by polling the service at intervals to determine whether the operation has completed or failed and, if it has succeeded, to get the result.

Methods that train models or extract values from forms are modeled as long-running operations. The client exposes a begin<operation-name> method that returns a Promise<PollerLike>. Callers should wait for the operation to complete by calling pollUntilDone() on the poller returned from the begin<operation-name> method. Sample code snippets are provided to illustrate using long-running operations below.
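
For instance, the basic pattern looks like this (a minimal sketch; it assumes client is an already-created FormRecognizerClient and readStream is a readable stream over a document, as in the full examples that follow):

// Start the long-running operation and obtain a poller.
const poller = await client.beginRecognizeContent(readStream);
// Wait for the operation to complete, then read the result from the poller.
await poller.pollUntilDone();
const pages = poller.getResult();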

Examples

The following section provides several JavaScript code snippets illustrating common patterns used in the Form Recognizer client libraries.

Recognize receipts

Recognize data from sales receipts using the pre-built model.

const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");

async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const path = "<path to your receipt document>"; // pdf/jpeg/png/tiff formats

  const readStream = fs.createReadStream(path);

  const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await client.beginRecognizeReceipts(readStream, "image/jpeg", {
    onProgress: (state) => { console.log(`status: ${state.status}`); }
  });

  await poller.pollUntilDone();
  const receipts = poller.getResult();

  if (!receipts || receipts.length <= 0) {
    throw new Error("Expecting at lease one receipt in analysis result");
  }

  const receipt = receipts[0];
  console.log("First receipt:");
  // For supported fields recognized by the service, please refer to https://westus2.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-preview/operations/GetAnalyzeReceiptResult.
  const receiptTypeField = receipt.recognizedForm.fields["ReceiptType"];
  if (receiptTypeField.valueType === "string") {
    console.log(`  Receipt Type: '${receiptTypeField.value || "<missing>"}', with confidence of ${receiptTypeField.confidence}`);
  }
  const merchantNameField = receipt.recognizedForm.fields["MerchantName"];
  if (merchantNameField.valueType === "string") {
    console.log(`  Merchant Name: '${merchantNameField.value || "<missing>"}', with confidence of ${merchantNameField.confidence}`);
  }
  const transactionDate = receipt.recognizedForm.fields["TransactionDate"];
  if (transactionDate.valueType === "date") {
    console.log(`  Transaction Date: '${transactionDate.value || "<missing>"}', with confidence of ${transactionDate.confidence}`);
  }
  const itemsField = receipt.recognizedForm.fields["Items"];
  if (itemsField.valueType === "array") {
    for (const itemField of itemsField.value || []) {
      if (itemField.valueType === "object") {
        const itemNameField = itemField.value["Name"];
        if (itemNameField.valueType === "string") {
          console.log(`    Item Name: '${itemNameField.value || "<missing>"}', with confidence of ${itemNameField.confidence}`);
        }
      }
    }
  }
  const totalField = receipt.recognizedForm.fields["Total"];
  if (totalField.valueType === "number") {
    console.log(`  Total: '${totalField.value || "<missing>"}', with confidence of ${totalField.confidence}`);
  }
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Recognize content

Recognize text and table structures, along with their bounding boxes, from documents.

const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");

async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const path = "<path to your receipt document>"; // pdf/jpeg/png/tiff formats

  const readStream = fs.createReadStream(path);

  const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await client.beginRecognizeContent(readStream);
  await poller.pollUntilDone();
  const pages = poller.getResult();

  if (!pages || pages.length === 0) {
    throw new Error("Expecting non-empty list of pages!");
  }

  for (const page of pages) {
    console.log(
      `Page ${page.pageNumber}: width ${page.width} and height ${page.height} with unit ${page.unit}`
    );
    for (const table of page.tables) {
      for (const row of table.rows) {
        for (const cell of row.cells) {
          console.log(`cell [${cell.rowIndex},${cell.columnIndex}] has text ${cell.text}`);
        }
      }
    }
  }
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Train model

Train a machine-learned model on your own form type. The resulting model will be able to recognize values from the types of forms it was trained on. Provide a container SAS url to your Azure Storage Blob container where you're storing the training documents. See details on setting this up in the service quickstart documentation. This sample creates and trains a custom model without using labels.

const { FormTrainingClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");

async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const containerSasUrl = "<SAS url to the blob container storing training documents>";

  const trainingClient = new FormTrainingClient(endpoint, new AzureKeyCredential(apiKey));

  const poller = await trainingClient.beginTraining(containerSasUrl, false, {
    onProgress: (state) => { console.log(`training status: ${state.status}`); }
  });

  await poller.pollUntilDone();
  const response = poller.getResult();

  if (!response) {
    throw new Error("Expecting valid response!");
  }

  console.log(`Model ID: ${response.modelId}`);
  console.log(`Status: ${response.status}`);
  console.log(`Requested on: ${response.requestedOn}`);
  console.log(`Completed on: ${response.completedOn}`);

  if (response.submodels) {
    for (const submodel of response.submodels) {
      // since the training data is unlabeled, we are unable to return the accuracy of this model
      console.log("We have recognized the following fields");
      for (const key in submodel.fields) {
        const field = submodel.fields[key];
        console.log(`The model found field '${field.name}'`);
      }
    }
  }
  // Training document information
  if (response.trainingDocuments) {
    for (const doc of response.trainingDocuments) {
      console.log(`Document name: ${doc.documentName}`);
      console.log(`Document status: ${doc.status}`);
      console.log(`Document page count: ${doc.pageCount}`);
      console.log(`Document errors: ${doc.errors}`);
    }
  }
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Recognize forms using a custom model

const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");

async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const modelId = "<model id>";
  const path = "<path to a form document>";

  const readStream = fs.createReadStream(path);

  const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(apiKey));
  const poller = await client.beginRecognizeCustomForms(modelId, readStream, "application/pdf", {
    onProgress: (state) => { console.log(`status: ${state.status}`); }
  });
  await poller.pollUntilDone();
  const forms = poller.getResult();

  console.log("Forms:");
  for (const form of forms || []) {
    console.log(`${form.formType}, page range: ${form.pageRange}`);
    console.log("Pages:");
    for (const page of form.pages || []) {
      console.log(`Page number: ${page.pageNumber}`);
      console.log("Tables");
      for (const table of page.tables || []) {
        for (const row of table.rows) {
          for (const cell of row.cells) {
            console.log(`cell (${cell.rowIndex},${cell.columnIndex}) ${cell.text}`);
          }
        }
      }
    }

    console.log("Fields:");
    for (const fieldName in form.fields) {
      // each field is of type FormField
      const field = form.fields[fieldName];
      console.log(
        `Field ${fieldName} has value '${field.value}' with a confidence score of ${field.confidence}`
      );
    }
  }
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Listing all models

List custom models in the current Cognitive Services account. This sample shows several ways to iterate through the results.

const { FormTrainingClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");

async function main() {
  const endpoint = "<cognitive services endpoint>";
  const apiKey = "<api key>";
  const client = new FormTrainingClient(endpoint, new AzureKeyCredential(apiKey));

  const result = await client.listCustomModels();
  let i = 0;
  for await (const modelInfo of result) {
    console.log(`model ${i++}:`);
    console.log(modelInfo);
  }

  // using `iter.next()`
  i = 1;
  let iter = client.listCustomModels();
  let modelItem = await iter.next();
  while (!modelItem.done) {
    console.log(`model ${i++}: ${modelItem.value.modelId}`);
    modelItem = await iter.next();
  }

  // using `byPage()`
  i = 1;
  for await (const response of client.listCustomModels().byPage()) {
    for (const modelInfo of response.modelList) {
      console.log(`model ${i++}: ${modelInfo.modelId}`);
    }
  }
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Troubleshooting

Enable logs

You can set the following environment variable to see debug logs when using this library.

  • Getting debug logs from the Azure Form Recognizer client library
export DEBUG=azure*

For more detailed instructions on how to enable logs, you can look at the @azure/logger package docs.
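
Alternatively, if you prefer to configure logging programmatically rather than through the DEBUG environment variable, a minimal sketch using the setLogLevel function from the @azure/logger package (assuming that package is installed) looks like this:

const { setLogLevel } = require("@azure/logger");

// Emit informational (and more severe) Azure SDK log messages to the console.
setLogLevel("info");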

Next steps

Please take a look at the samples directory for detailed examples on how to use this library.

Contributing

If you'd like to contribute to this library, please read the contributing guide to learn more about how to build and test the code.
