One of the key questions for every business considering Hadoop on the cloud is “How do I upload data to BLOB storage?” The answer depends on the size of the data and the time you want to invest in the upload process. It may also vary with the activity you perform: for example, whether you want to upload all the data at once or in chunks, and whether the same activity will be repeated multiple times or run in parallel. Whatever the variation, the end result is the same: the data becomes available on the cloud.

Having looked for such a reference across the internet, I decided to write a blog post that lists the possible ways to publish data to BLOB storage.

Upload using Open Source/3rd Party Utilities

Tools such as Cloud Storage Studio, CloudXplorer, Azure Blob Studio 2011, Azure Blob Explorer, CodePlex.SpaceBlock, Gladinet, Azure Storage Utility, Azure Storage Viewer, Windows Azure Web Storage Explorer, CloudBerry Explorer, CyberDuck, and CloudCombine can be used to manage Azure storage.

Upload using API

Microsoft offers APIs (the REST API and Media Services) that allow you to manage Azure Storage programmatically. These services can be accessed through the Azure SDK and can also be used from PowerShell. Languages such as C#, Java, and Python are supported.
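To make the REST route concrete, here is a minimal sketch in Python that uses only the standard library. It assumes you already hold a Shared Access Signature (SAS) URL with write permission for the target blob; the function names build_put_blob_request and upload_blob are my own, chosen for illustration, and are not part of any SDK. It builds a Put Blob request for a block blob and, optionally, sends it.

```python
import urllib.request


def build_put_blob_request(sas_url: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a Put Blob request for a block blob.

    sas_url is assumed to be a full blob URL that already carries a
    SAS token with write permission.
    """
    return urllib.request.Request(
        sas_url,
        data=data,
        method="PUT",
        headers={
            # Required by the Put Blob operation to declare the blob type.
            "x-ms-blob-type": "BlockBlob",
            "Content-Type": "application/octet-stream",
        },
    )


def upload_blob(sas_url: str, data: bytes) -> int:
    """Send the request and return the HTTP status code (201 means created)."""
    with urllib.request.urlopen(build_put_blob_request(sas_url, data)) as resp:
        return resp.status
```

A call would look like upload_blob(sas_url, open("data.csv", "rb").read()), where both the URL and the file name are placeholders. A single request like this suits small files; for larger data, uploading in chunks or using the Azure SDK is likely the better fit.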

Upload using Management Portal

The ability to upload files directly through the Azure Management Portal is not yet provided and is under review. We might have it in an upcoming release.

Upload using Import/Export Service

Microsoft provides an efficient solution for importing large amounts of on-premise data (TBs/PBs) into Windows Azure Blobs. The preparation process involves creating encrypted hard disk drives and shipping them to Microsoft data centers through FedEx. The Microsoft team then uploads the data for you over a high-speed internal network. For uploads in the TB/PB range, this is the recommended way to ease the overall deployment and upload process.

Considerations include:

      1. Each disk should hold at most 4 TB. If you have more than 4 TB to upload, you must provide as many 4 TB disks as the data requires.
      2. Each device must be shipped through FedEx and encrypted with a BitLocker key.
      3. Drives prepared with WAImportExport.exe are considered the best fit for uploading data through the Microsoft Import/Export service.
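
As a rough planning aid for the first point, the number of disks can be estimated with a ceiling division. This is only a sketch: the function name disks_needed is hypothetical, and it assumes the full 4 TB of each disk is usable, ignoring any file system or encryption overhead.

```python
import math


def disks_needed(total_tb: float, disk_capacity_tb: float = 4.0) -> int:
    """Estimate how many disks are required to ship total_tb of data.

    Assumes every disk can be filled to disk_capacity_tb, which is an
    idealisation; real drives lose some space to formatting.
    """
    if total_tb <= 0:
        return 0
    # Round up: a partly filled disk still has to be shipped.
    return math.ceil(total_tb / disk_capacity_tb)
```

For example, 10 TB of data works out to three disks under these assumptions, with the third only half full.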

For details please visit: http://azure.microsoft.com/en-us/documentation/articles/storage-import-export-service/

Recommended References

    1. http://msdn.microsoft.com/en-us/library/dd179376.aspx
    2. http://msdn.microsoft.com/en-us/library/ee691964.aspx
    3. http://www.asp.net/aspnet/overview/developing-apps-with-windows-azure/building-real-world-cloud-apps-with-windows-azure/unstructured-blob-storage
    4. http://gauravmantri.com/2013/02/16/uploading-large-files-in-windows-azure-blob-storage-using-shared-access-signature-html-and-javascript/
    5. http://azure.microsoft.com/en-us/documentation/articles/storage-java-how-to-use-blob-storage/ (JAVA Support)
    6. http://azure.microsoft.com/en-us/documentation/articles/storage-nodejs-how-to-use-blob-storage/ (Node.js)
    7. http://blogs.msdn.com/b/tconte/archive/2013/04/17/how-to-interact-with-windows-azure-blob-storage-from-linux-using-python.aspx (Linux with Python)
