Current version is 1.0.0-beta.15.

Azure Resource Manager DataFactory client library for Java

This package contains the Microsoft Azure SDK for DataFactory management. The Azure Data Factory V2 management API provides a RESTful set of web services that interact with Azure Data Factory V2 services. The package was generated from the package tag package-2018-06. For documentation on how to use this package, see Azure Management Libraries for Java.

We'd love to hear your feedback

We're always working on improving our products and the way we communicate with our users. So we'd love to learn what's working and how we can do better.

If you haven't already, please take a few minutes to complete this short survey we have put together.

Thank you in advance for your collaboration. We really appreciate your time!

Documentation

Various documentation is available to help you get started

Getting started

Prerequisites

Adding the package to your product

<dependency>
    <groupId>com.azure.resourcemanager</groupId>
    <artifactId>azure-resourcemanager-datafactory</artifactId>
    <version>1.0.0-beta.15</version>
</dependency>

Azure Management Libraries require a TokenCredential implementation for authentication and an HttpClient implementation for sending HTTP requests.

The Azure Identity and Azure Core Netty HTTP packages provide the default implementations.

Authentication

By default, Azure Active Directory token authentication depends on the correct configuration of the following environment variables.

  • AZURE_CLIENT_ID for Azure client ID.
  • AZURE_TENANT_ID for Azure tenant ID.
  • AZURE_CLIENT_SECRET or AZURE_CLIENT_CERTIFICATE_PATH for client secret or client certificate.

In addition, the Azure subscription ID can be configured via the environment variable AZURE_SUBSCRIPTION_ID.
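
Before authenticating, it can be useful to check that the variables listed above are actually set. The following is a minimal sketch using only the JDK; the class AzureEnvCheck and its method missingVariables are hypothetical names introduced for illustration and are not part of the Azure SDK. It treats either AZURE_CLIENT_SECRET or AZURE_CLIENT_CERTIFICATE_PATH as satisfying the credential requirement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of the Azure SDK): reports which credential settings are missing. */
final class AzureEnvCheck {

    private AzureEnvCheck() { }

    /**
     * Returns the names of required variables that are absent or blank in env.
     * Either AZURE_CLIENT_SECRET or AZURE_CLIENT_CERTIFICATE_PATH is enough for the credential.
     */
    static List<String> missingVariables(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : new String[] {"AZURE_CLIENT_ID", "AZURE_TENANT_ID"}) {
            if (isBlank(env.get(name))) {
                missing.add(name);
            }
        }
        if (isBlank(env.get("AZURE_CLIENT_SECRET")) && isBlank(env.get("AZURE_CLIENT_CERTIFICATE_PATH"))) {
            missing.add("AZURE_CLIENT_SECRET or AZURE_CLIENT_CERTIFICATE_PATH");
        }
        return missing;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Check the real process environment and report the result.
        List<String> missing = missingVariables(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All required variables are set.");
        } else {
            System.out.println("Missing: " + String.join(", ", missing));
        }
    }
}
```

A missing variable is not necessarily an error, because the default credential may be able to use other sources; the check is only meant to make a misconfiguration easier to spot.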

With the above configuration, the Azure client can be authenticated with the following code:

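import com.azure.core.credential.TokenCredential;
import com.azure.core.management.AzureEnvironment;
import com.azure.core.management.profile.AzureProfile;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.resourcemanager.datafactory.DataFactoryManager;

AzureProfile profile = new AzureProfile(AzureEnvironment.AZURE);
TokenCredential credential = new DefaultAzureCredentialBuilder()
    .authorityHost(profile.getEnvironment().getActiveDirectoryEndpoint())
    .build();
DataFactoryManager manager = DataFactoryManager
    .authenticate(credential, profile);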

The sample code assumes global Azure. If you target a different cloud, replace AzureEnvironment.AZURE with the matching environment.

See Authentication for more options.

Key concepts

See API design for a general introduction to the design and key concepts of the Azure Management Libraries.

Examples

// storage account
StorageAccount storageAccount = storageManager.storageAccounts().define(STORAGE_ACCOUNT)
    .withRegion(REGION)
    .withExistingResourceGroup(resourceGroup)
    .create();
// take the first access key; getStorageConnectionString is a helper that is not shown here
final String storageAccountKey = storageAccount.getKeys().iterator().next().value();
final String connectionString = getStorageConnectionString(STORAGE_ACCOUNT, storageAccountKey, storageManager.environment());

// container
final String containerName = "adf";
storageManager.blobContainers().defineContainer(containerName)
    .withExistingStorageAccount(resourceGroup, STORAGE_ACCOUNT)
    .withPublicAccess(PublicAccess.NONE)
    .create();

// blob as input
BlobClient blobClient = new BlobClientBuilder()
    .connectionString(connectionString)
    .containerName(containerName)
    .blobName("input/data.txt")
    .buildClient();
blobClient.upload(BinaryData.fromString("data"));

// data factory
Factory dataFactory = manager.factories().define(DATA_FACTORY)
    .withRegion(REGION)
    .withExistingResourceGroup(resourceGroup)
    .create();

// linked service
final Map<String, String> connectionStringProperty = new HashMap<>();
connectionStringProperty.put("type", "SecureString");
connectionStringProperty.put("value", connectionString);

final String linkedServiceName = "LinkedService";
manager.linkedServices().define(linkedServiceName)
    .withExistingFactory(resourceGroup, DATA_FACTORY)
    .withProperties(new AzureStorageLinkedService()
        .withConnectionString(connectionStringProperty))
    .create();

// input dataset
final String inputDatasetName = "InputDataset";
manager.datasets().define(inputDatasetName)
    .withExistingFactory(resourceGroup, DATA_FACTORY)
    .withProperties(new AzureBlobDataset()
        .withLinkedServiceName(new LinkedServiceReference().withReferenceName(linkedServiceName))
        .withFolderPath(containerName)
        .withFileName("input/data.txt")
        .withFormat(new TextFormat()))
    .create();

// output dataset
final String outputDatasetName = "OutputDataset";
manager.datasets().define(outputDatasetName)
    .withExistingFactory(resourceGroup, DATA_FACTORY)
    .withProperties(new AzureBlobDataset()
        .withLinkedServiceName(new LinkedServiceReference().withReferenceName(linkedServiceName))
        .withFolderPath(containerName)
        .withFileName("output/data.txt")
        .withFormat(new TextFormat()))
    .create();

// pipeline
PipelineResource pipeline = manager.pipelines().define("CopyBlobPipeline")
    .withExistingFactory(resourceGroup, DATA_FACTORY)
    .withActivities(Collections.singletonList(new CopyActivity()
        .withName("CopyBlob")
        .withSource(new BlobSource())
        .withSink(new BlobSink())
        .withInputs(Collections.singletonList(new DatasetReference().withReferenceName(inputDatasetName)))
        .withOutputs(Collections.singletonList(new DatasetReference().withReferenceName(outputDatasetName)))))
    .create();

// run pipeline
CreateRunResponse createRun = pipeline.createRun();

// wait for completion: keep polling while the run is still queued or in progress
PipelineRun pipelineRun = manager.pipelineRuns().get(resourceGroup, DATA_FACTORY, createRun.runId());
String runStatus = pipelineRun.status();
while ("Queued".equals(runStatus) || "InProgress".equals(runStatus)) {
    // sleepIfRunningAgainstService is a helper that is not shown here
    sleepIfRunningAgainstService(10 * 1000);    // wait 10 seconds
    pipelineRun = manager.pipelineRuns().get(resourceGroup, DATA_FACTORY, createRun.runId());
    runStatus = pipelineRun.status();
}
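
The example above calls two helpers that it does not define: getStorageConnectionString and sleepIfRunningAgainstService. The sketch below shows one way such helpers might look using only the JDK. The class ExampleHelpers and both method names are hypothetical, the connection string layout follows the usual Azure Storage key/value form, and the set of terminal run statuses ("Succeeded", "Failed", "Cancelled") is an assumption rather than something stated in this document.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-ins for the helpers the example calls but does not define. */
final class ExampleHelpers {

    // Statuses after which a run is assumed not to change any more.
    private static final Set<String> TERMINAL =
        new HashSet<>(Arrays.asList("Succeeded", "Failed", "Cancelled"));

    private ExampleHelpers() { }

    /** Builds a storage connection string; endpointSuffix is, for example, "core.windows.net". */
    static String storageConnectionString(String accountName, String accountKey, String endpointSuffix) {
        // Tolerate a suffix written with leading dots, such as ".core.windows.net".
        String suffix = endpointSuffix.replaceAll("^\\.+", "");
        return String.format(
            "DefaultEndpointsProtocol=https;AccountName=%s;AccountKey=%s;EndpointSuffix=%s",
            accountName, accountKey, suffix);
    }

    /**
     * Polls statusSupplier until it reports a terminal status or maxAttempts polls have been made.
     * Returns the last status seen, which may be non-terminal if the attempts ran out.
     */
    static String waitForTerminalStatus(Supplier<String> statusSupplier, Duration interval, int maxAttempts) {
        String status = statusSupplier.get();
        for (int attempt = 1; attempt < maxAttempts && !TERMINAL.contains(status); attempt++) {
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for the run to finish", e);
            }
            status = statusSupplier.get();
        }
        return status;
    }

    public static void main(String[] args) {
        System.out.println(storageConnectionString("myaccount", "<key>", ".core.windows.net"));

        // Simulated run that moves from Queued to InProgress to Succeeded.
        Iterator<String> statuses = Arrays.asList("Queued", "InProgress", "Succeeded").iterator();
        System.out.println(waitForTerminalStatus(statuses::next, Duration.ofMillis(10), 10));
    }
}
```

In the example above, the supplier could be () -> manager.pipelineRuns().get(resourceGroup, DATA_FACTORY, createRun.runId()).status(), with an interval of 10 seconds. Bounding the number of attempts keeps the loop from running forever if a run never reaches a terminal status.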

Code snippets and samples

Troubleshooting

Next steps

Contributing

For details on contributing to this repository, see the contributing guide.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Packages

  • com.azure.resourcemanager.datafactory: Package containing the classes for DataFactoryManagementClient.
  • com.azure.resourcemanager.datafactory.fluent: Package containing the service clients for DataFactoryManagementClient.
  • com.azure.resourcemanager.datafactory.fluent.models: Package containing the inner data models for DataFactoryManagementClient.
  • com.azure.resourcemanager.datafactory.models: Package containing the data models for DataFactoryManagementClient.