How to use boto3 to create a lot of test files in Wasabi / S3 in Python

The following example code creates 10000 test files on Wasabi / S3. It is based on How to use concurrent.futures map with a tqdm progress bar:

import boto3
import concurrent.futures
import io
from tqdm import tqdm

# Thread pool that performs the uploads in parallel
executor = concurrent.futures.ThreadPoolExecutor(64)

def tqdm_parallel_map(executor, fn, *iterables, **kwargs):
    """
    Equivalent to executor.map(fn, *iterables),
    but displays a tqdm-based progress bar.
    
    Does not support timeout or chunksize as executor.submit is used internally
    
    **kwargs is passed to tqdm.
    """
    futures_list = []
    for iterable in iterables:
        futures_list += [executor.submit(fn, i) for i in iterable]
    for f in tqdm(concurrent.futures.as_completed(futures_list), total=len(futures_list), **kwargs):
        yield f.result()

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
boto_test_bucket = s3.Bucket('boto-test')

def create_s3_object(i, directory):
    # Create test data
    buf = io.BytesIO()
    buf.write(f"{i}".encode())
    # Reset read pointer. DO NOT FORGET THIS, else all uploaded files will be empty!
    buf.seek(0)

    # Upload the file
    boto_test_bucket.upload_fileobj(buf, f"{directory}/{i}.txt")

for _ in tqdm_parallel_map(executor, lambda i: create_s3_object(i, directory="10k-Test-Objects"), range(1, 10001)):
    pass
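
Once the script has finished, you can check that the objects actually arrived by counting everything under the test prefix. This is a minimal sketch that assumes the same s3 resource and boto_test_bucket from above:

# Count the uploaded test objects under the prefix
count = sum(1 for _ in boto_test_bucket.objects.filter(Prefix="10k-Test-Objects/"))
print(f"Found {count} objects under 10k-Test-Objects/")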

Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on which region and which S3-compatible service you use, you might need a different endpoint URL instead of https://s3.eu-central-1.wasabisys.com.
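
For example, if you want to run the same script against plain AWS S3 instead of Wasabi, you can omit the endpoint_url and pass a region instead. This is just a sketch; the region name is an example and the credentials are placeholders:

# Example: plain AWS S3 instead of Wasabi (region name is just an example)
s3 = boto3.resource('s3',
    region_name = 'eu-central-1',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)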

Note that running this script, especially when creating a large number of test files, sends many requests to your S3 provider, and depending on your plan those requests might be expensive. Wasabi, for example, does not charge for requests but charges for storage (with a minimum of 1 TB of storage per month being charged at the time of writing).
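
Since storage is what you pay for, you might want to delete the test objects again once you are done. The following sketch removes everything under the 10k-Test-Objects/ prefix, using the same boto_test_bucket object as above:

# Delete all test objects under the prefix (boto3 sends batched delete requests)
boto_test_bucket.objects.filter(Prefix="10k-Test-Objects/").delete()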