Downloading data from AWS with the CLI and SDK

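Abstract: The AWS CLI provides a basic yet flexible way to fetch data from S3 (Amazon Simple Storage Service), but advanced behavior such as resuming an interrupted download has to be implemented by the user. Basic retrieval can be done with CLI commands, while the advanced cases rely on the language-specific APIs, e.g. Java or C#.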

1 Requesting data with the AWS CLI: s3api get-object

cmd: aws s3api get-object

https://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html

 

The example below demonstrates the use of --range to download a specific byte range from an object. Note that the byte range needs to be prefixed with "bytes=":

aws s3api get-object --bucket text-content --key dir/my_data --range bytes=8888-9999 my_data_range

 

Synopsis

  get-object
--bucket <value>
[--if-match <value>]
[--if-modified-since <value>]
[--if-none-match <value>]
[--if-unmodified-since <value>]
--key <value>
[--range <value>]
[--response-cache-control <value>]
[--response-content-disposition <value>]
[--response-content-encoding <value>]
[--response-content-language <value>]
[--response-content-type <value>]
[--response-expires <value>]
[--version-id <value>]
[--sse-customer-algorithm <value>]
[--sse-customer-key <value>]
[--sse-customer-key-md5 <value>]
[--request-payer <value>]
[--part-number <value>]
outfile <value>

 

Description (dou):

Basic cmd:

aws s3api get-object --bucket text-content --key dir/my_data my_data_range

--bucket (string): the data bucket, i.e. the owner-defined data pool

--key (string): the full path (key) of the requested object within the bucket

outfile: the name of the local output file, chosen by the user

 

Partial download:

Method 1:

aws s3api get-object --bucket text-content --key dir/my_data --range bytes=8888-9999 my_data_range

--range (string): Downloads the specified byte range of an object.

Method 2:

aws s3api get-object --bucket text-content --key dir/my_data --part-number 1 my_data_range

--part-number (integer): Part number of the object being read. This is a positive integer between 1 and 10,000. Effectively performs a 'ranged' GET request for the part specified. Useful for downloading just a part of an object.
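To make Method 1 concrete, here is a minimal shell sketch that splits an object of a known size into consecutive "bytes=START-END" ranges, one per --range request. The function name byte_ranges and its two arguments are illustrative assumptions, not part of the AWS CLI; the sketch only prints the ranges and does not contact S3.

```shell
# Print one "bytes=START-END" range per line, covering an object of
# total_size bytes in chunks of chunk_size bytes (the last chunk may be shorter).
# byte_ranges is a hypothetical helper, not an AWS CLI command.
byte_ranges() {
  local total_size=$1
  local chunk_size=$2
  local start=0
  local end
  while [ "$start" -lt "$total_size" ]; do
    end=$((start + chunk_size - 1))
    # Clamp the last range to the final byte of the object.
    if [ "$end" -ge "$total_size" ]; then
      end=$((total_size - 1))
    fi
    echo "bytes=${start}-${end}"
    start=$((end + 1))
  done
}
```

Each printed line can be passed as the value of --range in the Method 1 command, with a different output file per range. The object size would have to come from elsewhere, for example the ContentLength field returned by aws s3api head-object.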

 

For more AWS CLI command reference:

https://docs.aws.amazon.com/cli/latest/reference/

2 Requesting data with an SDK (e.g. C#)

2.1 Getting Started with the AWS SDK for .NET

https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/net-dg-setup.html

using Amazon;
using Amazon.Runtime.CredentialManagement;

// Placeholder credentials; substitute your own access key and secret key.
var options = new CredentialProfileOptions
{
    AccessKey = "access_key",
    SecretKey = "secret_key"
};
var profile = new Amazon.Runtime.CredentialManagement.CredentialProfile("basic_profile", options);
profile.Region = RegionEndpoint.USWest1;
// Register the profile in the SDK credentials file.
var netSDKFile = new NetSDKCredentialsFile();
netSDKFile.RegisterProfile(profile);

The RegisterProfile method is used to register a new profile. Your application typically calls this method only once for each profile.

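(Pitfall 1: create the project in VS from the AWS template, rather than creating an ordinary project and adding the corresponding DLLs yourself.)

(Pitfall 2: if the new project fails to compile because the namespace Amazon cannot be found, check the project's .NET version and manually add the DLLs of the matching version from the AWS SDK installation directory, which is usually under Program Files (x86).)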

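2.2 Continued request code

The main concern here is a large file on an unstable network: the download breaks off after a while, which is painful. The idea is to fetch the object part by part so that the download can keep going.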
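One way to picture "continuing" a part-by-part download is to look at which part files already exist on disk and start from the first one that is missing. The sketch below assumes the parts are saved as PREFIX.1, PREFIX.2, and so on; the function name next_missing_part is hypothetical.

```shell
# Print the number of the first part file that is missing or empty,
# checking PREFIX.1, PREFIX.2, ... in order.
# next_missing_part is a hypothetical helper used only for illustration.
next_missing_part() {
  local prefix=$1
  local part=1
  # -s is true only for files that exist and are not empty.
  while [ -s "${prefix}.${part}" ]; do
    part=$((part + 1))
  done
  echo "$part"
}
```

This treats any non-empty file as a finished part, so it is only safe if a part file appears under its final name once it is complete (for example by downloading to a temporary name and renaming afterwards).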

(1) Visual Studio 2013 (in my case): create a new project from the AWS S3 sample template.

(2) Configure the profile.

Press Ctrl+K, and then press A.

Choose the New (or Edit) Account Profile icon to the right of the Profile list.

https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/credentials.html

 

(3) Set the region

The region refers to the location of the bucket. The region must be correct; otherwise the request fails with the error "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint." (Pitfall 3: you must specify the correct region, i.e. the endpoint.)

Endpoints currently do not support cross-region requests; ensure that you create your endpoint in the same region as your bucket. You can find the location of your bucket by using the Amazon S3 console, or by using the get-bucket-location command.

Detail: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html

Synopsis

  get-bucket-location
--bucket <value>
[--cli-input-json <value>]
[--generate-cli-skeleton <value>]

https://docs.aws.amazon.com/cli/latest/reference/s3api/get-bucket-location.html

Example:

The following command retrieves the location constraint for a bucket named my-bucket, if a constraint exists:

aws s3api get-bucket-location --bucket my-bucket

Output:

{
    "LocationConstraint": "us-west-2"
}

In my case, I get null. (Pitfall 4: for us-east-1, i.e. US East (N. Virginia), the returned region is null.)

aws s3api get-bucket-location --bucket spacenet-dataset

Output:

{
    "LocationConstraint": "null"
}

According to the service documentation, S3 returns a null location if the bucket is in the US East (N. Virginia) region, so this is expected behavior. To use such a bucket, construct the client with the RegionEndpoint.USEast1 region.
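The null case can be handled in one place when scripting with the CLI. The following sketch maps a reported location constraint to a region name and treats an empty value, "null" or "None" as us-east-1, following the behavior described above. The function name region_from_location is an assumption made for illustration.

```shell
# Map the LocationConstraint value reported for a bucket to a region name.
# An empty value, "null" or "None" is treated as us-east-1 (US East, N. Virginia).
# region_from_location is a hypothetical helper, not an AWS CLI command.
region_from_location() {
  case "$1" in
    ""|null|None) echo "us-east-1" ;;
    *) echo "$1" ;;
  esac
}
```

The input could be taken from aws s3api get-bucket-location --bucket spacenet-dataset --query LocationConstraint --output text, and the result passed to later commands through the global --region option. "None" is included because that is how the CLI's text output is expected to render a null value; treat that detail as an assumption to verify against your CLI version.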

(Pitfall 5: selecting the Region through the VS settings has no effect; change the Region by editing App.config instead.)

App.config

<add key="AWSRegion" value="us-east-1" />
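For comparison, the CLI commands from section 1 can be pointed at the same region without any project file. A minimal sketch, assuming the bucket lives in us-east-1 as in this case:

```shell
# Set the default region for AWS CLI commands run in this shell session.
# us-east-1 matches the bucket used in this article; adjust it for other buckets.
export AWS_DEFAULT_REGION=us-east-1
```

Alternatively, the global option --region us-east-1 can be added to a single aws command.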

Other ways to select the AWS region (endpoint):

https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/net-dg-region-selection.html

Other: the China (Beijing) Region endpoint is cn-north-1.

(4) Modify the code

My cmd: aws s3api get-object --bucket spacenet-dataset --key SpaceNet_Roads_Competition/AOI_2_Vegas_Roads_Train.tar.gz --request-payer requester --part-number 1 AOI_2_Vegas_Roads_Train.tar.gz.1

Code reference:

https://docs.aws.amazon.com/AmazonS3/latest/dev/AuthUsingAcctOrUserCredDotNet.html

 

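// In Main()
bucketName = "spacenet-dataset";
keyName = "SpaceNet_Roads_Competition/AOI_3_Paris_Roads_Test_Public.tar.gz";
outPath = "E:\\data\\";
RP = RequestPayer.Requester;

// Loop over the part numbers, starting at part 17 in this run.
// Note: a failed part is retried without limit, so stop the program
// once the requests go past the last part of the object.
for (PartNum = 17; PartNum < 10001; PartNum++)
{
    bool flag = false;
    // Retry the current part until ReadingAnObject() reports success.
    do
    {
        flag = ReadingAnObject();
    }
    while (flag == false);
}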
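The loop above retries a part indefinitely. A bounded variant of the same retry idea can be sketched in shell, so that the CLI command from step (4) gives up after a fixed number of failures. The function name retry and its max_attempts argument are illustrative assumptions.

```shell
# Run a command repeatedly until it succeeds or max_attempts is reached.
# Returns 0 on the first success, 1 if every attempt failed.
# retry is a hypothetical helper; the command to run follows max_attempts.
retry() {
  local max_attempts=$1
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt failed: $*" >&2
    attempt=$((attempt + 1))
  done
  return 1
}
```

For example, retry 5 followed by the full aws s3api get-object command for one part would try that part at most five times. A limit like this also prevents an endless loop when the requested part number does not exist.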

// update ReadingAnObject
// Requires: using System; using System.IO; using Amazon.S3; using Amazon.S3.Model;
// Downloads part PartNum of the object and returns true on success.
static bool ReadingAnObject()
{
    bool flag = false;
    try
    {
        GetObjectRequest request = new GetObjectRequest()
        {
            BucketName = bucketName,
            Key = keyName,
            RequestPayer = RP,
            PartNumber = PartNum
        };

        using (GetObjectResponse response = client.GetObject(request))
        {
            string title = response.Metadata["x-amz-meta-title"];
            Console.WriteLine("The object's title is {0}", title);
            // string dest = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), keyName);
            // Save each part as <outPath><keyName>.<PartNum>
            string dest = Path.Combine(outPath, keyName) + "." + PartNum.ToString();

            // if (!File.Exists(dest))
            {
                response.WriteResponseStreamToFile(dest);
            }
        }
        flag = true;
    }
    catch (AmazonS3Exception amazonS3Exception)
    {
        if (amazonS3Exception.ErrorCode != null &&
            (amazonS3Exception.ErrorCode.Equals("InvalidAccessKeyId") ||
             amazonS3Exception.ErrorCode.Equals("InvalidSecurity")))
        {
            Console.WriteLine("Please check the provided AWS Credentials.");
            Console.WriteLine("If you haven't signed up for Amazon S3, please visit http://aws.amazon.com/s3");
        }
        else
        {
            Console.WriteLine("An error occurred with the message '{0}' when reading an object", amazonS3Exception.Message);
        }
    }
    return flag;
}
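The code above leaves one file per part on disk, named with the part number as a suffix. A minimal shell sketch for joining such files back into a single archive, in numeric order, could look like this. It assumes the parts are named PREFIX.1, PREFIX.2, ... with no gaps; join_parts and its arguments are hypothetical names.

```shell
# Concatenate PREFIX.1, PREFIX.2, ... (in numeric order) into OUTPUT,
# stopping at the first part file that does not exist.
# Prints the number of parts that were joined.
# join_parts is a hypothetical helper used only for illustration.
join_parts() {
  local prefix=$1
  local output=$2
  local part=1
  # Start from an empty output file.
  : > "$output"
  while [ -f "${prefix}.${part}" ]; do
    cat "${prefix}.${part}" >> "$output"
    part=$((part + 1))
  done
  echo $((part - 1))
}
```

Counting upwards instead of using a shell glob keeps part 10 after part 9 rather than after part 1. Whether the joined file is complete still has to be checked separately, for example by comparing its size with the object's size.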
