How do I connect Azure Data Factory to DocumentCloud.org in 5 minutes?
Execute the following steps to connect Azure Data Factory to DocumentCloud.org. You only need knowledge of Azure Data Factory, your own business and the DocumentCloud.org administration.
DocumentCloud is an online archive of documents collected mostly by journalists.
You need no technical knowledge and no knowledge of the DocumentCloud.org APIs to connect DocumentCloud.org to Azure Data Factory. This step-by-step plan tells you exactly which steps to go through on Invantive Cloud to create your own Azure data warehouse with data from DocumentCloud.org.
The steps to connect Azure Data Factory to DocumentCloud.org are:
- Register an Invantive Cloud account.
- Create a DocumentCloud.org database.
- Make the DocumentCloud.org database available through the Microsoft OData connector.
- Connect Azure Data Factory to DocumentCloud.org through the connector.
- Load data from DocumentCloud.org into your Azure Data Factory data warehouse.
Invantive Cloud offers a DocumentCloud.org connector to download data from DocumentCloud.org into Azure Data Factory. Over 105 other connectors are also available for SQL, Power BI Desktop, Power BI Service, Power Query and/or Azure Data Factory.
With the DocumentCloud.org connector for Azure Data Factory you will by default fetch the data from all connected DocumentCloud.org companies in your dashboard. You can limit the DocumentCloud.org companies that are retrieved in Azure Data Factory through the Database settings, for example to the data from exactly one DocumentCloud.org company. You can also filter the data by a specific DocumentCloud.org company using a filter step in Azure Data Factory. The DocumentCloud.org connector for Azure Data Factory has advanced optimizations for good real-time performance both with a single DocumentCloud.org company and with hundreds of DocumentCloud.org companies.
If you have questions, please check the forums for DocumentCloud.org.
Register Account on Invantive Cloud
Skip this step if you already have an account on Invantive Cloud. Otherwise, execute the following steps once to register an account on Invantive Cloud:
- Go to Invantive Cloud start page. Select the Log on-button.
- Select the Next-button.
- Select the Next-button. Enter your password and repeat the provided password. You will receive a six-digit verification code by email within 2 minutes.
- Select the Sign Up-button.
- Log on now.
- Make sure you have an authentication app installed on your phone. Select the Next-button.
- Add the displayed QR code to the authentication app, enter the current verification code, and choose "Finish".
- The Invantive Cloud dashboard will be shown.
You now have an Invantive Cloud login with which you can set up the connection with DocumentCloud.org and numerous other platforms. You will use the same Invantive login and workflow for all other platforms.
Create DocumentCloud.org database
In this step, we set up a database with data from DocumentCloud.org. The database is "virtual" because it is not a traditional database, but is fed in real time from DocumentCloud.org. Invantive Cloud provides Azure Data Factory with a real-time link to DocumentCloud.org. The database will be used for all your DocumentCloud.org reporting with Azure Data Factory, so you only need to perform these steps once.
- Click the Add Database button.
- Please fill out the form with login information for DocumentCloud.org. Select the OK-button.
Congratulations! You can now process data from DocumentCloud.org within the Invantive Cloud website. You can do this for example with the interactive SQL editor.
Grant Azure Data Factory access to DocumentCloud.org
To retrieve the data in Azure Data Factory we create a link via Invantive Bridge Online. This creates a "bridge" between the cloud of Invantive and the standard OData connector that is available in every version of Azure Data Factory. You do not need to install anything locally: no connector, no ADO.NET provider and no Azure Data Factory add-on either.
Execute the following steps to safely use the data from DocumentCloud.org outside of Invantive Cloud:
Next to the database you will find an orange text requesting you to allow access from your current location (IP address). Select the nearby orange button.
You must completely disable IP address checking by entering an asterisk ("*"), since Microsoft Azure Data Factory runs on hundreds of thousands of servers across the globe and the address a request comes from cannot be predicted.
You are now ready to import the DocumentCloud.org data into Azure Data Factory.
Configure Azure Data Factory connector for DocumentCloud.org
You will now pull data from DocumentCloud.org into Azure Data Factory through the established link.
Please note again that Invantive Cloud handles all complexity under the hood: setting up credentials such as a refresh token, acquiring an access token, optimizing and parallelizing access, and accessing the API within its rate limits, scopes and security restrictions. You need no knowledge of complex technical topics such as OAuth access tokens or APIs. The Microsoft Azure Data Factory steps, activities and pipelines can be constructed using just the OData feed and basic authentication.
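To make "just the OData feed and basic authentication" concrete, the following minimal Python sketch builds the kind of HTTP request an OData client sends. It is an illustration only, not how Azure Data Factory is implemented: the URL `https://example.invalid/odata/`, the table name `Documents` and the credentials are placeholders, and the real Bridge Online URL must be taken from the Database-form of Invantive Cloud. The request is built but not sent.

```python
import base64
import urllib.request
from urllib.parse import quote, urlencode


def build_odata_request(base_url, table, user, password, top=10):
    """Build (but do not send) a GET request for one table of an OData feed."""
    # "$top" is a standard OData query option that limits the number of rows.
    query = urlencode({"$top": top}, quote_via=quote, safe="$")
    url = f"{base_url.rstrip('/')}/{table}?{query}"

    # Basic authentication: base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


# Placeholder values; replace them with your own Bridge Online URL and login.
request = build_odata_request(
    "https://example.invalid/odata/", "Documents", "user", "pass", top=5
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would fetch the rows, which can be a quick way to check the feed and credentials outside Azure Data Factory.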
Execute the following steps:
- In the window, enter the Bridge Online URL of the Azure Data Factory database. This URL containing the data can be found in the Database-form of Invantive Cloud. Choose authentication type 'Basic authentication'. Enter the Invantive Cloud user name and password.
- The list of available DocumentCloud.org tables appears. Select the desired tables and construct your pipeline in Microsoft Azure Data Factory.
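If you prefer to define the connection as JSON instead of filling out the window, the settings above map onto an OData linked service. The sketch below generates such a definition in Python. It assumes the usual Azure Data Factory OData linked service layout (`type`, `url`, `authenticationType`, `userName`, `password`); check the field names against the Azure Data Factory documentation before use. The name `DocumentCloudOData`, the URL and the credentials are placeholders.

```python
import json


def odata_linked_service(name, bridge_url, user, password):
    """Return an OData linked service definition using basic authentication."""
    return {
        "name": name,
        "properties": {
            "type": "OData",
            "typeProperties": {
                # Bridge Online URL from the Database-form of Invantive Cloud.
                "url": bridge_url,
                "authenticationType": "Basic",
                "userName": user,
                # Inline secret for illustration only; a Key Vault reference
                # is usually preferable to a password stored in the definition.
                "password": {"type": "SecureString", "value": password},
            },
        },
    }


# Placeholder values; replace them with your own Bridge Online URL and login.
definition = odata_linked_service(
    "DocumentCloudOData",
    "https://example.invalid/odata/",
    "user@example.com",
    "<password>",
)
print(json.dumps(definition, indent=2))
```

The printed JSON can be compared with the definition Azure Data Factory shows in its code view for a linked service created through the window.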