AWS S3
Scheme
The scheme you can use to access your files in Amazon Simple Storage is
**s3://**
Configuration
The configuration of the Amazon Web Services Simple Cloud Storage can be done through a variety of ways. Most require you to have an Access Key
and a Secret Key
.
Best practice is to create a specific IAM user for Apache Hop so that if needed you can fine-tune the permissions (set it to read-only for example) or indeed delete the user if it’s no longer needed.
For a complete list see Working with credentials in the AWS documentation.
Below are 2 popular ways of configuring access.
Credentials file
You can create a file in your home folder in the .aws/credentials
file which then contains content like this:
[default]
aws_access_key_id = yourSecretKey
aws_secret_access_key = a-long/series-of-characters-for-your-access-key
Variables
You can set the following system environment variables:
AWS_ACCESS_KEY_ID
: set it to your AWS access keyAWS_SECRET_ACCESS_KEY
: set it to your secret access key
Part size
You can set the default part size for new files on S3 by setting the following variable:
HOP_S3_VFS_PART_SIZE
This needs to be set as a global Hop configuration variable (in hop-config.json). You can use the Tools/Edit config variables menu in Hop GUI or you can use the hop-conf command line tool to do so.
Acceptable are 5MB
as a minimum and 5GB
as a maximum value.
If this variable is not set, 5MB
will be taken as value and a message will be printed on the console while creating files on S3:
Part size null less than minimum of 5MB, set to minimum
.
Usage and testing
To test if the configuration works you can simply upload a small CSV file in an S3 bucket and then use File/Open in Hop GUI. Then you type in s3://
as a file location and hit enter (or click the refresh button). Browse to the CSV file you uploaded and open it. If all is configured correctly you should be able to see the content in the Hop GUI.