How to use concurrent.futures map with a tqdm progress bar

Problem:

You have a concurrent.futures executor, e.g.

import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(64)

Using this executor, you want to map a function over an iterable in parallel (e.g. parallel download of HTTP pages).

In order to aid interactive execution, you want to use tqdm to provide a progress bar, showing the fraction of futures

Solution

You can use this function:

from tqdm import tqdm
import concurrent.futures

def tqdm_parallel_map(executor, fn, *iterables, **kwargs):
    """
    Equivalent to executor.map(fn, *iterables),
    but displays a tqdm-based progress bar.
    
    Does not support timeout or chunksize as executor.submit is used internally
    
    **kwargs is passed to tqdm.
    """
    futures_list = []
    for iterable in iterables:
        futures_list += [executor.submit(fn, i) for i in iterable]
    for f in tqdm(concurrent.futures.as_completed(futures_list), total=len(futures_list), **kwargs):
        yield f.result()

Note that internally, executor.submit() is used, not executor.map() because there is no way of calling concurrent.futures.as_completed() on the iterator returned by executor.map().

Usage example if you are not interested in the actual values:

import concurrent.futures
executor = concurrent.futures.ThreadPoolExecutor(64)

# Run my_func with arguments ranging from 1 to 10000, in parallel
for _ in tqdm_parallel_map(executor, lambda i: my_func(i), range(1, 10000)):
    pass

In case you care about the return value of my_func(), use this snippet instead:

import concurrent.futures
executor = concurrent.futures.ThreadPoolExecutor(64)

# Run my_func with arguments ranging from 1 to 10000, in parallel
for result in tqdm_parallel_map(executor, lambda i: my_func(i), range(1, 10000)):
    # result is the return value of my_func().
    # NOTE: result is not neccessarily in the same order as the input Iterable !
    # Whichever parallel execution of my_func() finishes first will be printed first ! 
    print(result)

Note: In constract to executor.map() this function does NOT yield the arguments in the same order as the input.