The following example code creates 10000 test files on Wasabi / S3. It is based on "How to use concurrent.futures map with a tqdm progress bar":
import concurrent.futures
import io

import boto3
from tqdm import tqdm

# Thread pool used to run the uploads in parallel
executor = concurrent.futures.ThreadPoolExecutor(64)

def tqdm_parallel_map(executor, fn, *iterables, **kwargs):
    """
    Similar to executor.map(fn, *iterables),
    but displays a tqdm-based progress bar.
    Results are yielded in completion order, not input order.

    Does not support timeout or chunksize as executor.submit is used internally

    **kwargs is passed to tqdm.
    """
    futures_list = []
    for iterable in iterables:
        futures_list += [executor.submit(fn, i) for i in iterable]
    for f in tqdm(concurrent.futures.as_completed(futures_list), total=len(futures_list), **kwargs):
        yield f.result()

# Create connection to Wasabi / S3
s3 = boto3.resource('s3',
    endpoint_url = 'https://s3.eu-central-1.wasabisys.com',
    aws_access_key_id = 'MY_ACCESS_KEY',
    aws_secret_access_key = 'MY_SECRET_KEY'
)

# Get bucket object
boto_test_bucket = s3.Bucket('boto-test')

def create_s3_object(i, directory):
    # Create test data
    buf = io.BytesIO()
    buf.write(f"{i}".encode())
    # Reset read pointer. DO NOT FORGET THIS, else all uploaded files will be empty!
    buf.seek(0)

    # Upload the file
    boto_test_bucket.upload_fileobj(buf, f"{directory}/{i}.txt")

# Consume the generator so that all uploads run and the progress bar updates
for _ in tqdm_parallel_map(executor, lambda i: create_s3_object(i, directory="10k-Test-Objects"), range(1, 10001)):
    pass
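The buf.seek(0) call in create_s3_object matters because writing to a BytesIO leaves the position at the end of the buffer, and a reader that starts from the current position then finds nothing left to read. The following standalone sketch shows that effect using only the standard library; the variable names are chosen just for this illustration.

```python
import io

# Write without rewinding: the position is at the end, so a read returns nothing.
buf = io.BytesIO()
buf.write(b"42")
unrewound = buf.read()

# Rewind to the start so that the next read returns the full payload.
buf.seek(0)
rewound = buf.read()

# Alternative: passing the initial bytes to the constructor leaves the position at 0.
prefilled = io.BytesIO(b"42").read()

print(unrewound, rewound, prefilled)
```

If the test data is known up front, constructing the buffer directly from the bytes avoids the need to rewind at all.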
Don’t forget to fill in MY_ACCESS_KEY and MY_SECRET_KEY. Depending on which region and which S3-compatible service you use, you might need a different endpoint URL instead of https://s3.eu-central-1.wasabisys.com.
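If you would rather not hardcode the keys in the script, one option is to read them from environment variables. This is a minimal sketch, not part of the original script: the helper name s3_connection_kwargs and the variable name S3_ENDPOINT_URL are assumptions made for this example, while AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the names conventionally used for AWS credentials.

```python
import os

def s3_connection_kwargs(env=os.environ):
    """Build keyword arguments for boto3.resource('s3', ...) from environment variables."""
    return {
        # S3_ENDPOINT_URL is a hypothetical variable name chosen for this sketch;
        # the fallback is the Wasabi endpoint used in the script above.
        "endpoint_url": env.get("S3_ENDPOINT_URL", "https://s3.eu-central-1.wasabisys.com"),
        # A missing key raises KeyError instead of silently using a placeholder.
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }

# Usage (requires boto3 and the variables to be set):
# import boto3
# s3 = boto3.resource("s3", **s3_connection_kwargs())
```

Failing early on a missing variable is a deliberate choice here, since an upload run with placeholder credentials would only fail later, inside the worker threads.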
Note that running this script, especially when creating lots of test files, will send a lot of requests to your S3 provider and, depending on what plan you are using, these requests might be expensive. Wasabi, for example, does not charge for requests but charges for storage (with a minimum of 1 TB of storage per month being charged, at the time of writing).
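Once you are done testing, you may want to remove the test objects again. The sketch below assumes a boto3 Bucket resource like boto_test_bucket from the script above and uses its objects.filter(Prefix=...).delete() collection action; the helper name delete_test_objects is made up for this example. Whether deleting objects actually reduces your bill depends on your provider's billing rules, and the deletion itself also sends requests.

```python
def delete_test_objects(bucket, directory):
    """Delete every object whose key starts with '<directory>/' in the given bucket."""
    # The trailing slash avoids matching keys such as '<directory>-other/...'
    prefix = f"{directory}/"
    # filter(Prefix=...) selects the matching keys, delete() removes them in batches
    return bucket.objects.filter(Prefix=prefix).delete()

# Usage with the bucket object from the script above:
# delete_test_objects(boto_test_bucket, "10k-Test-Objects")
```

Be careful with the directory argument: everything under that prefix is removed, not only the files this script created.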