S3

How to download Wasabi/S3 object to string/bytes using boto3 in Python

You can use io.BytesIO to store the content of an S3 object in memory and then convert it to bytes which you can then decode to a str. The following example downloads myfile.txt into memory:

# Download to file
buf = io.BytesIO()
my_bucket.download_fileobj("myfile.txt", buf)
# Get file content as bytes
filecontent_bytes = buf.getvalue()
# ... or convert to string
filecontent_str = buf.getvalue().decode("utf-8")

Full example

import boto3
import io

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
my_bucket = s3.Bucket('boto-test')

# Download to file
buf = io.BytesIO()
my_bucket.download_fileobj("myfile.txt", buf)
# Get file content as bytes
filecontent_bytes = buf.getvalue()
# ... or convert to string
filecontent_str = buf.getvalue().decode("utf-8")

print(filecontent_str)

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL instead of https://s3.eu-central-1.wasabisys.com.

Posted by Uli Köhler in Python, S3

How to upload string as Wasabi/S3 object using boto3 in Python

In order to upload a Python string like

my_string = "This shall be the content for a file I want to create on an S3-compatible storage"

to an S3-compatible storage like Wasabi or Amazon S3, you need to encode it using .encode("utf-8") and then wrap it in an io.BytesIO object:

my_bucket.upload_fileobj(io.BytesIO(my_string.encode("utf-8")), "myfile.txt")

Full example:

import boto3
import io

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
my_bucket = s3.Bucket('boto-test')

# Upload string to file
my_string = "This shall be the content for a file I want to create on an S3-compatible storage"

my_bucket.upload_fileobj(io.BytesIO(my_string.encode("utf-8")), "myfile.txt")

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL instead of https://s3.eu-central-1.wasabisys.com.

Posted by Uli Köhler in Python, S3

How to filter for objects in a given S3 directory using boto3

Using boto3, you can filter for objects in a given bucket by directory by applying a prefix filter.

Instead of iterating all objects using

for obj in my_bucket.objects.all():
    pass # ...

(see How to use boto3 to iterate ALL objects in a Wasabi / S3 bucket in Python for a full example)

you can apply a prefix filter using

for obj in my_bucket.objects.filter(Prefix="MyDirectory/"):
    print(obj)

Don’t forget the trailing / for the prefix argument ! Just using filter(Prefix="MyDirectory") without a trailing slash will also match e.g. MyDirectoryFileList.txt.

This complete example prints the object description for every object in the 10k-Test-Objects directory (from our post on How to use boto3 to create a lot of test files in Wasabi / S3 in Python).

import boto3

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
my_bucket = s3.Bucket('boto-test')

# Iterate over objects in bucket
for obj in my_bucket.objects.filter(Prefix="MyDirectory"):
    print(obj)

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL instead of https://s3.eu-central-1.wasabisys.com.

Example output:

s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/10.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/100.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1000.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/10000.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1001.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1002.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1003.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1004.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1005.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1006.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1007.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1008.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1009.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/101.txt')
[...]

 

Posted by Uli Köhler in Python, S3

How to use boto3 to iterate ALL objects in a Wasabi / S3 bucket in Python

This snippet shows you how to iterate over all objects in a bucket:

import boto3

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
my_bucket = s3.Bucket('boto-test')

# Iterate over objects in bucket
for obj in my_bucket.objects.all():
    print(obj)

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL instead of https://s3.eu-central-1.wasabisys.com.

Example output:

s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/1.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/10.txt')
s3.ObjectSummary(bucket_name='boto-test', key='10k-Test-Objects/100.txt')
[...]

 

Posted by Uli Köhler in Python, S3

How to use boto3 to create a lot of test files in Wasabi / S3 in Python

The following example code creates 10000 test files on Wasabi / S3. It is based on How to use concurrent.futures map with a tqdm progress bar:

import boto3
import concurrent.futures
executor = concurrent.futures.ThreadPoolExecutor(64)

from tqdm import tqdm
import concurrent.futures
def tqdm_parallel_map(executor, fn, *iterables, **kwargs):
    """
    Equivalent to executor.map(fn, *iterables),
    but displays a tqdm-based progress bar.
    
    Does not support timeout or chunksize as executor.submit is used internally
    
    **kwargs is passed to tqdm.
    """
    futures_list = []
    for iterable in iterables:
        futures_list += [executor.submit(fn, i) for i in iterable]
    for f in tqdm(concurrent.futures.as_completed(futures_list), total=len(futures_list), **kwargs):
        yield f.result()

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
boto_test_bucket = s3.Bucket('boto-test')

def create_s3_object(i, directory):
    # Create test data
    buf = io.BytesIO()
    buf.write(f"{i}".encode())
    # Reset read pointer. DOT NOT FORGET THIS, else all uploaded files will be empty!
    buf.seek(0)

    # Upload the file
    boto_test_bucket.upload_fileobj(buf, f"{directory}/{i}.txt")

for _ in tqdm_parallel_map(executor, lambda i: create_s3_object(i, directory="10k-Test-Objects"), range(1, 10001)):
    pass

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL at https://s3.eu-central-1.wasabisys.com.

Note that running this script, especially when creating lots of test files, will send a lot of requests to your S3 provider and, depending on what plan you are using, these requests might be expensive. Wasabi, for example, does not charge for requests but charges for storage (with a minimum of 1TB storage per month being charged, at the time of writing this).

Posted by Uli Köhler in Python, S3

How to use boto3 to upload BytesIO to Wasabi / S3 in Python

This snippet provides a concise example on how to upload a io.BytesIO() object to

import boto3

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
boto_test_bucket = s3.Bucket('boto-test')

# Create a test BytesIO we want to upload
import io
buf = io.BytesIO()
buf.write(b"Hello S3 world!")

# Reset read pointer. DOT NOT FORGET THIS, else all uploaded files will be empty!
buf.seek(0)
    
# Upload the file. "MyDirectory/test.txt" is the name of the object to create
boto_test_bucket.upload_fileobj(buf, "MyDirectory/test.txt")

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL at https://s3.eu-central-1.wasabisys.com.

Also don’t forget

buf.seek(0)

or your uploaded files will be empty.

 

Posted by Uli Köhler in Python, S3

How to use boto3 to upload file to Wasabi / S3 in Python

Using boto to upload data to Wasabi is pretty simple, but not well-documented.

import boto3

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
boto_test_bucket = s3.Bucket('boto-test')

# Create a test file we want to upload
with open("upload-test.txt", "w") as outfile:
    outfile.write("Hello S3!")
    
# Upload the file. "MyDirectory/test.txt" is the name of the object to create
boto_test_bucket.upload_file("upload-test.txt", "MyDirectory/test.txt")

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on what region and what S3-compatible service you use, you might need to use another endpoint URL at https://s3.eu-central-1.wasabisys.com.

Posted by Uli Köhler in Python, S3