azure.ai.translation.document package
-
class azure.ai.translation.document.DocumentTranslationClient(endpoint: str, credential: azure.core.credentials.AzureKeyCredential, **kwargs: Any)
DocumentTranslationClient is your interface to the Document Translation service. Use the client to translate whole documents while preserving source document structure and text formatting.
- Parameters
endpoint (str) – Supported Document Translation endpoint (protocol and hostname, for example: https://<resource-name>.cognitiveservices.azure.com/).
credential (AzureKeyCredential) – Credential needed for the client to connect to Azure. Currently only API key authentication is supported.
- Keyword Arguments
api_version (str or DocumentTranslationApiVersion) – The API version of the service to use for requests. It defaults to the latest service version. Setting an older version may result in reduced feature compatibility.
Example:
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient

# Read the resource endpoint and API key from the environment.
endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]

document_translation_client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
-
cancel_job(job_id: str, **kwargs: Any) → None
Cancel a currently processing or queued job.
A job will not be cancelled if it has already completed, failed, or is being cancelled. Documents that have already been translated are not cancelled and are still charged. Where possible, all pending documents are cancelled.
- Parameters
job_id (str) – The translation job ID.
- Returns
None
- Return type
None
- Raises
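The cancellation rules above suggest checking a job's status before asking the service to cancel it. The following is a minimal sketch, not part of the package: `cancel_if_active` and `CANCELLABLE_STATUSES` are hypothetical names, and the choice of "NotStarted" and "Running" as the cancellable statuses is an assumption drawn from the status values listed under JobStatusResult. It only relies on the documented `get_job_status` and `cancel_job` client methods.

```python
# Hypothetical helper: statuses assumed to correspond to a queued or
# currently processing job (see the JobStatusResult status values).
CANCELLABLE_STATUSES = {"NotStarted", "Running"}


def cancel_if_active(client, job_id):
    """Send a cancel request only if the job is still queued or running.

    `client` is expected to be a DocumentTranslationClient (or any object
    exposing get_job_status and cancel_job with the same signatures).
    Returns True if a cancel request was sent, False otherwise.
    """
    job = client.get_job_status(job_id)
    if job.status in CANCELLABLE_STATUSES:
        client.cancel_job(job_id)
        return True
    return False
```

Because the status can change between the check and the cancel call, the return value only reports that a request was sent, not that the job ended up cancelled.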
-
close() → None
Close the DocumentTranslationClient session.
-
create_translation_job(inputs: List[azure.ai.translation.document._models.DocumentTranslationInput], **kwargs: Any) → azure.ai.translation.document._models.JobStatusResult
Create a document translation job which translates the document(s) in your source container to your TranslationTarget(s) in the given language.
For supported languages and document formats, see the service documentation: https://docs.microsoft.com/azure/cognitive-services/translator/document-translation/overview
- Parameters
inputs (List[DocumentTranslationInput]) – A list of translation inputs. Each input has a single source URL for the documents and can contain multiple TranslationTargets (one per language) that specify where to write the translated documents.
- Returns
A JobStatusResult with information on the status of the translation job.
- Return type
JobStatusResult
- Raises
Example:
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import (
    DocumentTranslationClient,
    DocumentTranslationInput,
    TranslationTarget
)

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]
source_container_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_container_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))

# Translate every document in the source container to Spanish.
job_result = client.create_translation_job(inputs=[
        DocumentTranslationInput(
            source_url=source_container_url,
            targets=[
                TranslationTarget(
                    target_url=target_container_url,
                    language_code="es"
                )
            ]
        )
    ]
)  # type: JobStatusResult
-
get_document_formats(**kwargs: Any) → List[azure.ai.translation.document._models.FileFormat]
Get the list of the document formats supported by the Document Translation service.
- Returns
A list of supported document formats for translation.
- Return type
List[FileFormat]
- Raises
-
get_document_status(job_id: str, document_id: str, **kwargs: Any) → azure.ai.translation.document._models.DocumentStatusResult
Get the status of an individual document within a translation job.
- Parameters
job_id (str) – The translation job ID.
document_id (str) – The ID of the document.
- Returns
A DocumentStatusResult with information on the status of the document.
- Return type
DocumentStatusResult
- Raises
-
get_glossary_formats(**kwargs: Any) → List[azure.ai.translation.document._models.FileFormat]
Get the list of the glossary formats supported by the Document Translation service.
- Returns
A list of supported glossary formats.
- Return type
List[FileFormat]
- Raises
-
get_job_status(job_id: str, **kwargs: Any) → azure.ai.translation.document._models.JobStatusResult
Get the status of a translation job.
The status includes the overall job status, as well as a summary of the documents that are being translated as part of that translation job.
- Parameters
job_id (str) – The translation job ID.
- Returns
A JobStatusResult with information on the status of the translation job.
- Return type
JobStatusResult
- Raises
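When finer control over waiting is needed than `wait_until_done` offers, `get_job_status` can be polled directly. The sketch below is one possible approach under stated assumptions: `poll_until_complete`, its `interval` and `timeout` defaults, and the injectable `sleep` argument are all hypothetical and not part of the package. It relies only on `get_job_status` and the `has_completed` attribute described under JobStatusResult.

```python
import time


def poll_until_complete(client, job_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_job_status until the job reports has_completed.

    Hypothetical helper. `interval` and `timeout` are in seconds; `sleep`
    can be replaced (for example in tests) to avoid real waiting.
    Raises TimeoutError if the job has not completed within `timeout`.
    """
    waited = 0.0
    while True:
        job = client.get_job_status(job_id)
        if job.has_completed:
            return job
        if waited >= timeout:
            raise TimeoutError(
                "Job {} did not finish within {} seconds".format(job_id, timeout)
            )
        sleep(interval)
        waited += interval
```

The elapsed time is tracked by summing the sleep intervals rather than reading a clock, which keeps the helper deterministic but ignores the time spent in each service call.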
-
list_all_document_statuses(job_id: str, **kwargs: Any) → ItemPaged[DocumentStatusResult]
List all the document statuses under a translation job.
- Parameters
job_id (str) – The translation job ID.
- Returns
An azure.core.paging.ItemPaged of DocumentStatusResult.
- Return type
ItemPaged[DocumentStatusResult]
- Raises
Example:
# Assumes `client` and `job_result` from the create_translation_job example.
doc_results = client.list_all_document_statuses(job_result.id)  # type: ItemPaged[DocumentStatusResult]

for document in doc_results:
    print("Document ID: {}".format(document.id))
    print("Document status: {}".format(document.status))
    if document.status == "Succeeded":
        print("Source document location: {}".format(document.source_document_url))
        print("Translated document location: {}".format(document.translated_document_url))
        print("Translated to language: {}\n".format(document.translate_to))
    else:
        print("Error Code: {}, Message: {}\n".format(document.error.code, document.error.message))
-
list_submitted_jobs(**kwargs: Any) → ItemPaged[JobStatusResult]
List all the submitted translation jobs under the Document Translation resource.
- Returns
An azure.core.paging.ItemPaged of JobStatusResult.
- Return type
ItemPaged[JobStatusResult]
- Raises
Example:
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import (
    DocumentTranslationClient,
)

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
translation_jobs = client.list_submitted_jobs()  # type: ItemPaged[JobStatusResult]

for job in translation_jobs:
    # Block on jobs that are still running so the counts below are final.
    if job.status == "Running":
        job = client.wait_until_done(job.id)

    print("Job ID: {}".format(job.id))
    print("Job status: {}".format(job.status))
    print("Job created on: {}".format(job.created_on))
    print("Job last updated on: {}".format(job.last_updated_on))
    print("Total number of translations on documents: {}".format(job.documents_total_count))
    print("Total number of characters charged: {}".format(job.total_characters_charged))

    print("\nOf total documents...")
    print("{} failed".format(job.documents_failed_count))
    print("{} succeeded".format(job.documents_succeeded_count))
    print("{} cancelled\n".format(job.documents_cancelled_count))
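The same listing can be reduced to a single figure, for instance the characters charged across all finished jobs. This is a rough sketch under assumptions: `summarize_charges` is a hypothetical helper, and treating a missing `total_characters_charged` as zero is a choice made here, not documented service behavior.

```python
def summarize_charges(client):
    """Return (finished_job_count, total_characters_charged).

    Hypothetical helper that iterates list_submitted_jobs() and only
    counts jobs whose has_completed attribute is true.
    """
    finished = 0
    total = 0
    for job in client.list_submitted_jobs():
        if not job.has_completed:
            continue
        finished += 1
        # Assumption: treat a missing value as zero characters.
        total += job.total_characters_charged or 0
    return finished, total
```

Jobs that are still running are skipped rather than waited on, so calling the helper again later may return a larger total.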
-
wait_until_done(job_id: str, **kwargs: Any) → azure.ai.translation.document._models.JobStatusResult
Wait until the translation job is done.
A job is considered “done” when it reaches a terminal state such as Succeeded, Failed, or Cancelled.
- Parameters
job_id (str) – The translation job ID.
- Returns
A JobStatusResult with information on the status of the translation job.
- Return type
JobStatusResult
- Raises
HttpResponseError or ResourceNotFoundError – Raised if validation fails on the input, for example because of insufficient permissions on the blob containers.
Example:
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import (
    DocumentTranslationClient,
    DocumentTranslationInput,
    TranslationTarget
)

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]
source_container_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_container_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))

job = client.create_translation_job(inputs=[
        DocumentTranslationInput(
            source_url=source_container_url,
            targets=[
                TranslationTarget(
                    target_url=target_container_url,
                    language_code="es"
                )
            ]
        )
    ]
)  # type: JobStatusResult

# Block until the job reaches a terminal state.
job_result = client.wait_until_done(job.id)  # type: JobStatusResult

print("Job status: {}".format(job_result.status))
print("Job created on: {}".format(job_result.created_on))
print("Job last updated on: {}".format(job_result.last_updated_on))
print("Total number of translations on documents: {}".format(job_result.documents_total_count))

print("\nOf total documents...")
print("{} failed".format(job_result.documents_failed_count))
print("{} succeeded".format(job_result.documents_succeeded_count))
-
class azure.ai.translation.document.DocumentTranslationApiVersion
Document Translation API versions supported by this package.
-
V1_0_PREVIEW = '1.0-preview.1'
This is the default version.
-
-
class azure.ai.translation.document.DocumentTranslationInput(source_url: str, targets: List[azure.ai.translation.document._models.TranslationTarget], **kwargs: Any)
Input for translation. This requires that you have your source document or documents in an Azure Blob Storage container. Provide a SAS URL to the source file or source container containing the documents for translation. The source document(s) are translated and written to the location provided by the TranslationTargets.
- Parameters
source_url (str) – Required. Location of the folder / container or single file with your documents.
targets (list[TranslationTarget]) – Required. Location of the destination for the output. This is a list of TranslationTargets. Note that a TranslationTarget is required for each language code specified.
- Keyword Arguments
source_language_code (str) – Language code for the source documents. If none is specified, the source language will be auto-detected for each document.
prefix (str) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure Storage blob URI, use the prefix to restrict subfolders for translation.
suffix (str) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
storage_type (str or StorageInputType) – Storage type of the input documents source string. Possible values include: “Folder”, “File”.
storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
- Variables
source_url (str) – Required. Location of the folder / container or single file with your documents.
targets (list[TranslationTarget]) – Required. Location of the destination for the output. This is a list of TranslationTargets. Note that a TranslationTarget is required for each language code specified.
source_language_code (str) – Language code for the source documents. If none is specified, the source language will be auto-detected for each document.
prefix (str) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure Storage blob URI, use the prefix to restrict subfolders for translation.
suffix (str) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
storage_type (str or StorageInputType) – Storage type of the input documents source string. Possible values include: “Folder”, “File”.
storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
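To get a feel for the case-sensitive prefix and suffix filters, the matching can be approximated locally before submitting a job. This is only an illustrative sketch: `matches_filter` and the sample blob names are hypothetical, and it assumes the filters are compared against the blob path relative to the source container, which this page does not confirm. The service performs the real filtering.

```python
def matches_filter(blob_name, prefix=None, suffix=None):
    """Local approximation of the prefix/suffix filters (hypothetical helper).

    Both comparisons are case-sensitive, mirroring the parameter
    descriptions. A filter left as None is not applied.
    """
    if prefix is not None and not blob_name.startswith(prefix):
        return False
    if suffix is not None and not blob_name.endswith(suffix):
        return False
    return True


# Made-up blob names, used only to show the effect of the filters.
names = ["reports/2021/q1.docx", "reports/2021/q1.PDF", "drafts/notes.docx"]
selected = [n for n in names if matches_filter(n, prefix="reports/", suffix=".docx")]
```

Here `selected` keeps only the first name: the second fails the suffix test and the third fails the prefix test.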
-
class azure.ai.translation.document.TranslationGlossary(glossary_url: str, file_format: str, **kwargs: Any)
Glossary / translation memory to apply to the translation.
- Parameters
glossary_url (str) – Required. Location of the glossary file. This should be a SAS URL to the glossary file in the storage blob container. If the translation language pair is not present in the glossary, it will not be applied.
file_format (str) – Required. Format of the glossary file. To see supported formats, call the get_glossary_formats() client method.
- Keyword Arguments
format_version (str) – File format version. If not specified, the service will use the default_version for the file format returned from the get_glossary_formats() client method.
storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
- Variables
glossary_url (str) – Required. Location of the glossary file. This should be a SAS URL to the glossary file in the storage blob container. If the translation language pair is not present in the glossary, it will not be applied.
file_format (str) – Required. Format of the glossary file. To see supported formats, call the get_glossary_formats() client method.
format_version (str) – File format version. If not specified, the service will use the default_version for the file format returned from the get_glossary_formats() client method.
storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
-
class azure.ai.translation.document.StorageInputType
Storage type of the input documents source string.
-
FILE = 'File'
-
FOLDER = 'Folder'
-
-
class azure.ai.translation.document.FileFormat(**kwargs: Any)
Possible file formats supported by the Document Translation service.
-
class azure.ai.translation.document.TranslationTarget(target_url: str, language_code: str, **kwargs: Any)
Destination for the finished translated documents.
- Parameters
target_url (str) – Required. The target location for your translated documents. This should be a container SAS URL to your target container.
language_code (str) – Required. Target Language Code. This is the language you want your documents to be translated to. See supported languages here: https://docs.microsoft.com/azure/cognitive-services/translator/language-support#translate
- Keyword Arguments
category_id (str) – Category / custom model ID for using custom translation.
glossaries (list[TranslationGlossary]) – Glossaries to apply to translation.
storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
- Variables
target_url (str) – Required. The target location for your translated documents. This should be a container SAS URL to your target container.
language_code (str) – Required. Target Language Code. This is the language you want your documents to be translated to. See supported languages here: https://docs.microsoft.com/azure/cognitive-services/translator/language-support#translate
category_id (str) – Category / custom model ID for using custom translation.
glossaries (list[TranslationGlossary]) – Glossaries to apply to translation.
storage_source (str) – Storage Source. Default value: “AzureBlob”. Currently only “AzureBlob” is supported.
-
class azure.ai.translation.document.JobStatusResult(**kwargs: Any)
Status information about the translation job.
- Variables
created_on (datetime) – The date time when the translation job was created.
last_updated_on (datetime) – The date time when the translation job’s status was last updated.
status (str) – Status for a job.
NotStarted - the job has not begun yet.
Running - translation is in progress.
Succeeded - at least one document translated successfully within the job.
Cancelled - the job was cancelled.
Cancelling - the job is being cancelled.
ValidationFailed - the input failed validation, for example because of insufficient permissions on the blob containers.
Failed - all the documents within the job failed. To understand the reason for each document failure, call the list_all_document_statuses() client method and inspect the error.
error (DocumentTranslationError) – Returned if there is an error with the translation job. Includes error code, message, target.
documents_total_count (int) – Number of translations to be made on documents in the job.
documents_failed_count (int) – Number of documents that failed translation. More details can be found by calling the list_all_document_statuses() client method.
documents_succeeded_count (int) – Number of successful translations on documents.
documents_in_progress_count (int) – Number of translations on documents in progress.
documents_not_yet_started_count (int) – Number of documents that have not yet started being translated.
documents_cancelled_count (int) – Number of documents that were cancelled for translation.
total_characters_charged (int) – Total characters charged across all documents within the job.
has_completed (bool) – Whether the translation job has finished. A translation job is considered complete once it has reached a terminal state such as ‘Succeeded’, ‘Cancelled’, or ‘Failed’.
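The per-status counters can be combined into a simple progress figure. The sketch below is illustrative only: `job_progress` is a hypothetical helper, and it assumes that the succeeded, failed, and cancelled counts together make up the finished share of `documents_total_count`, which this page does not state explicitly.

```python
def job_progress(job):
    """Summarize a JobStatusResult-like object (hypothetical helper).

    Returns a dict with the number of finished and pending translations
    and the finished fraction of documents_total_count.
    """
    finished = (
        job.documents_succeeded_count
        + job.documents_failed_count
        + job.documents_cancelled_count
    )
    pending = job.documents_in_progress_count + job.documents_not_yet_started_count
    total = job.documents_total_count
    # Guard against division by zero for jobs with no documents.
    fraction = finished / total if total else 0.0
    return {"finished": finished, "pending": pending, "fraction_done": fraction}
```

A job with four translations, of which two succeeded, one failed, and one is in progress, would report three finished, one pending, and a fraction of 0.75.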
-
class azure.ai.translation.document.DocumentStatusResult(**kwargs: Any)
Status information about a particular document within a translation job.
- Variables
source_document_url (str) – Location of the source document in the source container. Note that any SAS tokens are removed from this path.
translated_document_url (str) – Location of the translated document in the target container. Note that any SAS tokens are removed from this path.
created_on (datetime) – The date time when the document was created.
last_updated_on (datetime) – The date time when the document’s status was last updated.
status (str) – Status for a document.
NotStarted - the document has not been translated yet.
Running - translation is in progress for the document.
Succeeded - translation succeeded for the document.
Failed - the document failed to translate. Check the error property.
Cancelled - the job was cancelled, so the document was not translated.
Cancelling - the job is being cancelled, so the document will not be translated.
translate_to (str) – The language code of the language the document was translated to, if successful.
error (DocumentTranslationError) – Returned if there is an error with the particular document. Includes error code, message, target.
translation_progress (float) – Progress of the translation, if available. The value is in the range [0.0, 1.0].
characters_charged (int) – Characters charged for the document.
has_completed (bool) – Whether the document has finished translation. A document is considered complete once it has reached a terminal state such as ‘Succeeded’, ‘Cancelled’, or ‘Failed’.
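A common follow-up to a finished job is to collect only the documents that failed, together with their error details. The following is a minimal sketch: `collect_failed_documents` is a hypothetical helper, and it uses the `id` attribute that appears in the list_all_document_statuses example even though it is not listed among the variables above.

```python
def collect_failed_documents(client, job_id):
    """Return (document_id, error_code, error_message) for failed documents.

    Hypothetical helper built on list_all_document_statuses(). Documents
    without an error object are skipped defensively.
    """
    failures = []
    for document in client.list_all_document_statuses(job_id):
        if document.status == "Failed" and document.error is not None:
            failures.append((document.id, document.error.code, document.error.message))
    return failures
```

Only the "Failed" status is inspected here; cancelled documents are deliberately left out, since the status descriptions point to the error property for failed documents specifically.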
-
class azure.ai.translation.document.DocumentTranslationError(**kwargs: Any)
This contains the error code, message, and target with descriptive details on why a translation job or particular document failed.
- Variables
code (str) – The error code. Possible high level values include: “InvalidRequest”, “InvalidArgument”, “InternalServerError”, “ServiceUnavailable”, “ResourceNotFound”, “Unauthorized”, “RequestRateTooHigh”.
message (str) – The error message associated with the failure.
target (str) – The source of the error. For example, it would be “documents” or “document id” in the case of an invalid document.
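The high-level error codes can be used to decide whether resubmitting is worth trying. The sketch below is an assumption-heavy illustration: `is_transient` and `TRANSIENT_CODES` are hypothetical names, and the grouping of "InternalServerError", "ServiceUnavailable", and "RequestRateTooHigh" as retryable is a judgment made here, not something the service documents on this page.

```python
# Assumption: codes that usually indicate a temporary, server-side condition.
TRANSIENT_CODES = {"InternalServerError", "ServiceUnavailable", "RequestRateTooHigh"}


def is_transient(error):
    """Return True if a DocumentTranslationError-like object looks retryable.

    Hypothetical helper. `error` may be None, in which case the result
    is False.
    """
    return error is not None and error.code in TRANSIENT_CODES
```

Codes such as "InvalidRequest" or "Unauthorized" fall outside the set, on the reasoning that they point to a problem with the request itself and would likely fail again unchanged.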