azure.ai.translation.document package

class azure.ai.translation.document.DocumentTranslationClient(endpoint: str, credential: azure.core.credentials.AzureKeyCredential, **kwargs: Any)[source]

DocumentTranslationClient is your interface to the Document Translation service. Use the client to translate whole documents while preserving source document structure and text formatting.

Parameters
  • endpoint (str) – Supported Document Translation endpoint (protocol and hostname, for example: https://<resource-name>.cognitiveservices.azure.com/).

  • credential (AzureKeyCredential) – Credential needed for the client to connect to Azure. Currently only API key authentication is supported.

Keyword Arguments

api_version (str or DocumentTranslationApiVersion) – The API version of the service to use for requests. It defaults to the latest service version. Setting to an older version may result in reduced feature compatibility.

Example:

Creating the DocumentTranslationClient with an endpoint and API key.
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]

document_translation_client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))

cancel_job(job_id: str, **kwargs: Any) → None[source]

Cancel a currently processing or queued job.

A job is not cancelled if it has already completed, failed, or is cancelling. Documents that have already completed translation are not cancelled and are still charged. Pending documents are cancelled where possible.

Parameters

job_id (str) – The translation job ID.

Returns

None

Return type

None

Raises

HttpResponseError or ResourceNotFoundError
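
The rule above can also be checked on the caller's side before sending a cancel request. The following is a minimal sketch, not part of the package: `cancel_if_active` and `NON_CANCELLABLE` are hypothetical names, and `client` is assumed to be any object exposing `get_job_status()` and `cancel_job()` as documented here.

```python
# Statuses for which a cancel request is not sent.
# "Succeeded", "Failed" and "Cancelling" follow the description above;
# including "Cancelled" and "ValidationFailed" is an assumption of this sketch.
NON_CANCELLABLE = {"Succeeded", "Failed", "Cancelling", "Cancelled", "ValidationFailed"}


def cancel_if_active(client, job_id):
    """Hypothetical helper: cancel the job only if it is still queued or running.

    Returns True if a cancel request was sent, False otherwise.
    """
    job = client.get_job_status(job_id)
    if job.status in NON_CANCELLABLE:
        return False
    client.cancel_job(job_id)
    return True
```

The status check and the cancel call are two separate requests, so the job may still finish in between; the service-side rule described above remains authoritative.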

close() → None[source]

Close the DocumentTranslationClient session.

create_translation_job(inputs: List[azure.ai.translation.document._models.DocumentTranslationInput], **kwargs: Any) → azure.ai.translation.document._models.JobStatusResult[source]

Create a document translation job which translates the document(s) in your source container to your TranslationTarget(s) in the given language.

For supported languages and document formats, see the service documentation: https://docs.microsoft.com/azure/cognitive-services/translator/document-translation/overview

Parameters

inputs (List[DocumentTranslationInput]) – A list of translation inputs. Each individual input has a single source URL to documents and can contain multiple TranslationTargets (one for each language) for the destination to write translated documents.

Returns

A JobStatusResult with information on the status of the translation job.

Return type

JobStatusResult

Raises

HttpResponseError

Example:

Create a translation job.
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import (
    DocumentTranslationClient,
    DocumentTranslationInput,
    TranslationTarget
)

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]
source_container_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_container_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))

job_result = client.create_translation_job(inputs=[
        DocumentTranslationInput(
            source_url=source_container_url,
            targets=[
                TranslationTarget(
                    target_url=target_container_url,
                    language_code="es"
                )
            ]
        )
    ]
)  # type: JobStatusResult

get_document_formats(**kwargs: Any) → List[azure.ai.translation.document._models.FileFormat][source]

Get the list of the document formats supported by the Document Translation service.

Returns

A list of supported document formats for translation.

Return type

List[FileFormat]

Raises

HttpResponseError
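
The returned list can be turned into a lookup table for checking files before submitting them. This is a minimal sketch; `extension_index` is a hypothetical helper, and it relies only on the `file_format` and `file_extensions` attributes documented for FileFormat.

```python
def extension_index(formats):
    """Map each lowercase file extension to the name of its format.

    `formats` is any iterable of objects with `file_format` and
    `file_extensions` attributes, such as the list returned by
    get_document_formats().
    """
    index = {}
    for fmt in formats:
        for ext in fmt.file_extensions:
            index[ext.lower()] = fmt.file_format
    return index
```

A possible use is `extension_index(client.get_document_formats())`. Whether the extensions carry a leading dot depends on what the service returns, so inspect the result before relying on a particular key shape.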

get_document_status(job_id: str, document_id: str, **kwargs: Any) → azure.ai.translation.document._models.DocumentStatusResult[source]

Get the status of an individual document within a translation job.

Parameters
  • job_id (str) – The translation job ID.

  • document_id (str) – The ID for the document.

Returns

A DocumentStatusResult with information on the status of the document.

Return type

DocumentStatusResult

Raises

HttpResponseError or ResourceNotFoundError

get_glossary_formats(**kwargs: Any) → List[azure.ai.translation.document._models.FileFormat][source]

Get the list of the glossary formats supported by the Document Translation service.

Returns

A list of supported glossary formats.

Return type

List[FileFormat]

Raises

HttpResponseError

get_job_status(job_id: str, **kwargs: Any) → azure.ai.translation.document._models.JobStatusResult[source]

Gets the status of a translation job.

The status includes the overall job status, as well as a summary of the documents that are being translated as part of that translation job.

Parameters

job_id (str) – The translation job ID.

Returns

A JobStatusResult with information on the status of the translation job.

Return type

JobStatusResult

Raises

HttpResponseError or ResourceNotFoundError
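
When more control over waiting is needed than `wait_until_done()` offers, `get_job_status()` can be polled directly. The sketch below is an illustration under stated assumptions: `poll_job`, `interval_seconds`, and `max_polls` are hypothetical names, and `client` is any object whose `get_job_status()` returns a result with the `has_completed` attribute described for JobStatusResult.

```python
import time


def poll_job(client, job_id, interval_seconds=5.0, max_polls=120):
    """Hypothetical helper: poll get_job_status() until has_completed is true.

    Raises TimeoutError if the job is still active after `max_polls` checks.
    """
    for _ in range(max_polls):
        job = client.get_job_status(job_id)
        if job.has_completed:
            return job
        time.sleep(interval_seconds)
    raise TimeoutError(
        "Job {} did not finish after {} polls".format(job_id, max_polls)
    )
```

Bounding the number of polls keeps a stuck job from blocking the caller indefinitely; the default values here are arbitrary.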

list_all_document_statuses(job_id: str, **kwargs: Any) → ItemPaged[DocumentStatusResult][source]

List all the document statuses under a translation job.

Parameters

job_id (str) – The translation job ID.

Returns

~azure.core.paging.ItemPaged[DocumentStatusResult]

Return type

ItemPaged

Raises

HttpResponseError

Example:

List all the document statuses under the translation job.
doc_results = client.list_all_document_statuses(job_result.id)  # type: ItemPaged[DocumentStatusResult]
for document in doc_results:
    print("Document ID: {}".format(document.id))
    print("Document status: {}".format(document.status))
    if document.status == "Succeeded":
        print("Source document location: {}".format(document.source_document_url))
        print("Translated document location: {}".format(document.translated_document_url))
        print("Translated to language: {}\n".format(document.translate_to))
    elif document.error is not None:
        print("Error Code: {}, Message: {}\n".format(document.error.code, document.error.message))
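
For larger jobs it can help to group failures instead of printing each document. A minimal sketch, assuming only the `id`, `status`, and `error.code` attributes documented for DocumentStatusResult; `failures_by_error_code` is a hypothetical helper and accepts any iterable, including the paged result of `list_all_document_statuses()`.

```python
from collections import defaultdict


def failures_by_error_code(documents):
    """Group the IDs of failed documents by their error code."""
    grouped = defaultdict(list)
    for doc in documents:
        # Only failed documents are expected to carry an error object.
        if doc.status == "Failed" and doc.error is not None:
            grouped[doc.error.code].append(doc.id)
    return dict(grouped)
```
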

list_submitted_jobs(**kwargs: Any) → ItemPaged[JobStatusResult][source]

List all the submitted translation jobs under the Document Translation resource.

Returns

~azure.core.paging.ItemPaged[JobStatusResult]

Return type

ItemPaged

Raises

HttpResponseError

Example:

List all submitted jobs under the resource.
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import (
    DocumentTranslationClient,
)

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
translation_jobs = client.list_submitted_jobs()  # type: ItemPaged[JobStatusResult]

for job in translation_jobs:
    if job.status == "Running":
        job = client.wait_until_done(job.id)

    print("Job ID: {}".format(job.id))
    print("Job status: {}".format(job.status))
    print("Job created on: {}".format(job.created_on))
    print("Job last updated on: {}".format(job.last_updated_on))
    print("Total number of translations on documents: {}".format(job.documents_total_count))
    print("Total number of characters charged: {}".format(job.total_characters_charged))

    print("\nOf total documents...")
    print("{} failed".format(job.documents_failed_count))
    print("{} succeeded".format(job.documents_succeeded_count))
    print("{} cancelled\n".format(job.documents_cancelled_count))

wait_until_done(job_id: str, **kwargs: Any) → azure.ai.translation.document._models.JobStatusResult[source]

Wait until the translation job is done.

A job is considered “done” when it reaches a terminal state such as Succeeded, Failed, or Cancelled.

Parameters

job_id (str) – The translation job ID.

Returns

A JobStatusResult with information on the status of the translation job.

Return type

JobStatusResult

Raises

HttpResponseError or ResourceNotFoundError – Raised if validation of the input fails, for example because of insufficient permissions on the blob containers.

Example:

Create a translation job and wait until it is done.
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import (
    DocumentTranslationClient,
    DocumentTranslationInput,
    TranslationTarget
)

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]
source_container_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_container_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))

job = client.create_translation_job(inputs=[
        DocumentTranslationInput(
            source_url=source_container_url,
            targets=[
                TranslationTarget(
                    target_url=target_container_url,
                    language_code="es"
                )
            ]
        )
    ]
)  # type: JobStatusResult

job_result = client.wait_until_done(job.id)  # type: JobStatusResult

print("Job status: {}".format(job_result.status))
print("Job created on: {}".format(job_result.created_on))
print("Job last updated on: {}".format(job_result.last_updated_on))
print("Total number of translations on documents: {}".format(job_result.documents_total_count))

print("\nOf total documents...")
print("{} failed".format(job_result.documents_failed_count))
print("{} succeeded".format(job_result.documents_succeeded_count))

class azure.ai.translation.document.DocumentTranslationApiVersion[source]

Document Translation API versions supported by this package

V1_0_PREVIEW = '1.0-preview.1'

This is the default version

class azure.ai.translation.document.DocumentTranslationInput(source_url: str, targets: List[azure.ai.translation.document._models.TranslationTarget], **kwargs: Any)[source]

Input for translation. This requires that you have your source document or documents in an Azure Blob Storage container. Provide a SAS URL to the source file or source container containing the documents for translation. The source document(s) are translated and written to the location provided by the TranslationTargets.

Parameters
  • source_url (str) – Required. Location of the folder / container or single file with your documents.

  • targets (list[TranslationTarget]) – Required. Location of the destination for the output. This is a list of TranslationTargets. Note that a TranslationTarget is required for each language code specified.

Keyword Arguments
  • source_language_code (str) – Language code for the source documents. If none is specified, the source language will be auto-detected for each document.

  • prefix (str) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure Storage blob URI, use the prefix to restrict subfolders for translation.

  • suffix (str) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.

  • storage_type (str or StorageInputType) – Storage type of the input documents source string. Possible values include: “Folder”, “File”.

  • storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.

Variables
  • source_url (str) – Required. Location of the folder / container or single file with your documents.

  • targets (list[TranslationTarget]) – Required. Location of the destination for the output. This is a list of TranslationTargets. Note that a TranslationTarget is required for each language code specified.

  • source_language_code (str) – Language code for the source documents. If none is specified, the source language will be auto-detected for each document.

  • prefix (str) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure Storage blob URI, use the prefix to restrict subfolders for translation.

  • suffix (str) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.

  • storage_type (str or StorageInputType) – Storage type of the input documents source string. Possible values include: “Folder”, “File”.

  • storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
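
The effect of the prefix and suffix filters can be previewed locally before a job is submitted. The sketch below only approximates the documented behaviour (case-sensitive matching at the start and end of the name); `select_blobs` is a hypothetical helper, the blob names are made up, and matching against the path relative to the container is an assumption rather than a statement about the service.

```python
def select_blobs(blob_names, prefix=None, suffix=None):
    """Local approximation of the case-sensitive prefix/suffix filters."""
    selected = []
    for name in blob_names:
        if prefix is not None and not name.startswith(prefix):
            continue
        if suffix is not None and not name.endswith(suffix):
            continue
        selected.append(name)
    return selected


# Hypothetical blob names relative to the source container.
names = ["reports/2021/q1.docx", "reports/2021/q1.PDF", "drafts/q1.docx"]
print(select_blobs(names, prefix="reports/", suffix=".docx"))
```

Because matching is case-sensitive, `suffix=".pdf"` would not select `q1.PDF` in this sketch.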

class azure.ai.translation.document.TranslationGlossary(glossary_url: str, file_format: str, **kwargs: Any)[source]

Glossary / translation memory to apply to the translation.

Parameters
  • glossary_url (str) – Required. Location of the glossary file. This should be a SAS URL to the glossary file in the storage blob container. If the translation language pair is not present in the glossary, it will not be applied.

  • file_format (str) – Required. Format of the glossary file. To see supported formats, call the get_glossary_formats() client method.

Keyword Arguments
  • format_version (str) – File format version. If not specified, the service will use the default_format_version for the file format returned from the get_glossary_formats() client method.

  • storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.

Variables
  • glossary_url (str) – Required. Location of the glossary file. This should be a SAS URL to the glossary file in the storage blob container. If the translation language pair is not present in the glossary, it will not be applied.

  • file_format (str) – Required. Format of the glossary file. To see supported formats, call the get_glossary_formats() client method.

  • format_version (str) – File format version. If not specified, the service will use the default_format_version for the file format returned from the get_glossary_formats() client method.

  • storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
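
The version that applies to a glossary can be worked out from the result of `get_glossary_formats()`. A minimal sketch, assuming only the `file_format`, `format_versions`, and `default_format_version` attributes documented for FileFormat; `resolve_glossary_version` is a hypothetical helper and performs its checks locally, independent of any validation the service does.

```python
def resolve_glossary_version(formats, file_format, format_version=None):
    """Hypothetical helper: return the format version a glossary would use.

    Falls back to the format's default version when none is given and
    raises ValueError for an unknown format or version.
    """
    for fmt in formats:
        if fmt.file_format != file_format:
            continue
        if format_version is None:
            return fmt.default_format_version
        if format_version not in fmt.format_versions:
            raise ValueError(
                "Version {!r} is not listed for format {!r}".format(format_version, file_format)
            )
        return format_version
    raise ValueError("Glossary format {!r} is not supported".format(file_format))
```
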

class azure.ai.translation.document.StorageInputType[source]

Storage type of the input documents source string

FILE = 'File'
FOLDER = 'Folder'

class azure.ai.translation.document.FileFormat(**kwargs: Any)[source]

Possible file formats supported by the Document Translation service.

Variables
  • file_format (str) – Name of the format.

  • file_extensions (list[str]) – Supported file extensions for this format.

  • content_types (list[str]) – Supported Content-Types for this format.

  • format_versions (list[str]) – Supported versions of this format.

  • default_format_version (str) – Default format version if none is specified.

class azure.ai.translation.document.TranslationTarget(target_url: str, language_code: str, **kwargs: Any)[source]

Destination for the finished translated documents.

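Parameters
  • target_url (str) – Required. The target location for your translated documents. This should be a container SAS URL to your target container.

  • language_code (str) – Required. Target Language Code. This is the language you want your documents to be translated to. See supported languages here: https://docs.microsoft.com/azure/cognitive-services/translator/language-support#translate

Keyword Arguments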
  • category_id (str) – Category / custom model ID for using custom translation.

  • glossaries (list[TranslationGlossary]) – Glossaries to apply to translation.

  • storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.

Variables
  • target_url (str) – Required. The target location for your translated documents. This should be a container SAS URL to your target container.

  • language_code (str) – Required. Target Language Code. This is the language you want your documents to be translated to. See supported languages here: https://docs.microsoft.com/azure/cognitive-services/translator/language-support#translate

  • category_id (str) – Category / custom model ID for using custom translation.

  • glossaries (list[TranslationGlossary]) – Glossaries to apply to translation.

  • storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
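
Since one TranslationTarget is required per language code, targets for several languages are usually built in a loop. The sketch below prepares the keyword arguments only, so it runs without the package installed; `target_kwargs` is a hypothetical helper, and the keys match the `target_url`, `language_code`, and `category_id` names documented above.

```python
def target_kwargs(urls_by_language, category_id=None):
    """Build one keyword-argument dict per target language.

    `urls_by_language` maps a language code to the SAS URL of the
    container that should receive documents translated to that language.
    """
    targets = []
    for language_code, target_url in sorted(urls_by_language.items()):
        kwargs = {"target_url": target_url, "language_code": language_code}
        if category_id is not None:
            kwargs["category_id"] = category_id
        targets.append(kwargs)
    return targets
```

Each dict could then be expanded into a target, for example `[TranslationTarget(**kw) for kw in target_kwargs(urls)]`, where `urls` is your own mapping.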

class azure.ai.translation.document.JobStatusResult(**kwargs: Any)[source]

Status information about the translation job.

Variables
  • id (str) – Id of the job.

  • created_on (datetime) – The date time when the translation job was created.

  • last_updated_on (datetime) – The date time when the translation job’s status was last updated.

  • status (str) –

    Status for a job.

    • NotStarted - the job has not begun yet.

    • Running - translation is in progress.

    • Succeeded - at least one document translated successfully within the job.

    • Cancelled - the job was cancelled.

    • Cancelling - the job is being cancelled.

    • ValidationFailed - the input failed validation, for example because of insufficient permissions on the blob containers.

    • Failed - all the documents within the job failed. To understand the reason for each document failure, call the list_all_document_statuses() client method and inspect the error.

  • error (DocumentTranslationError) – Returned if there is an error with the translation job. Includes error code, message, target.

  • documents_total_count (int) – Number of translations to be made on documents in the job.

  • documents_failed_count (int) – Number of documents that failed translation. More details can be found by calling the list_all_document_statuses() client method.

  • documents_succeeded_count (int) – Number of successful translations on documents.

  • documents_in_progress_count (int) – Number of translations on documents in progress.

  • documents_not_yet_started_count (int) – Number of documents that have not yet started being translated.

  • documents_cancelled_count (int) – Number of documents that were cancelled for translation.

  • total_characters_charged (int) – Total characters charged across all documents within the job.

  • has_completed (bool) – Whether the translation job has finished. A translation job is considered complete once it has reached a terminal state such as ‘Succeeded’, ‘Cancelled’, or ‘Failed’.
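
The document counters can be combined into a single progress figure. A minimal sketch, assuming the succeeded, failed, and cancelled counts together cover every translation that has reached an end state; `job_progress` is a hypothetical helper and reads only the attributes listed above.

```python
def job_progress(job):
    """Return the fraction (0.0 to 1.0) of translations that have ended."""
    total = job.documents_total_count
    if not total:
        # Nothing to translate (or the count is not yet known).
        return 0.0
    finished = (
        job.documents_succeeded_count
        + job.documents_failed_count
        + job.documents_cancelled_count
    )
    return finished / total
```
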

class azure.ai.translation.document.DocumentStatusResult(**kwargs: Any)[source]

Status information about a particular document within a translation job.

Variables
  • source_document_url (str) – Location of the source document in the source container. Note that any SAS tokens are removed from this path.

  • translated_document_url (str) – Location of the translated document in the target container. Note that any SAS tokens are removed from this path.

  • created_on (datetime) – The date time when the document was created.

  • last_updated_on (datetime) – The date time when the document’s status was last updated.

  • status (str) –

    Status for a document.

    • NotStarted - the document has not been translated yet.

    • Running - translation is in progress for the document.

    • Succeeded - translation succeeded for the document.

    • Failed - the document failed to translate. Check the error property.

    • Cancelled - the job was cancelled, the document was not translated.

    • Cancelling - the job is cancelling, the document will not be translated.

  • translate_to (str) – The language code of the language the document was translated to, if successful.

  • error (DocumentTranslationError) – Returned if there is an error with the particular document. Includes error code, message, target.

  • translation_progress (float) – Progress of the translation if available. Value is between [0.0, 1.0].

  • id (str) – Document Id.

  • characters_charged (int) – Characters charged for the document.

  • has_completed (bool) – Whether the document has finished translation. A document is considered complete once it has reached a terminal state such as ‘Succeeded’, ‘Cancelled’, or ‘Failed’.

class azure.ai.translation.document.DocumentTranslationError(**kwargs: Any)[source]

This contains the error code, message, and target with descriptive details on why a translation job or particular document failed.

Variables
  • code (str) – The error code. Possible high level values include: “InvalidRequest”, “InvalidArgument”, “InternalServerError”, “ServiceUnavailable”, “ResourceNotFound”, “Unauthorized”, “RequestRateTooHigh”.

  • message (str) – The error message associated with the failure.

  • target (str) – The source of the error. For example, “documents” or “document id” in the case of an invalid document.
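
The three fields can be folded into one log line. The sketch below is illustrative only: `describe_error` and `TRANSIENT_CODES` are hypothetical names, and treating some codes as transient is an assumption of this example, not something the service documents here.

```python
# Assumption: these codes usually point at a temporary condition worth retrying.
TRANSIENT_CODES = {"InternalServerError", "ServiceUnavailable", "RequestRateTooHigh"}


def describe_error(error):
    """Format an object with code, message and target as a single log line."""
    kind = "transient" if error.code in TRANSIENT_CODES else "permanent"
    target = error.target or "unknown target"
    return "[{}] {} ({}): {}".format(kind, error.code, target, error.message)
```
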

Subpackages