How to use concurrent.futures map with a tqdm progress bar

Problem:

You have a concurrent.futures executor, e.g.

import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(64)

Using this executor, you want to map a function over an iterable in parallel (e.g. parallel download of HTTP pages).

In order to aid interactive execution, you want to use tqdm to provide a progress bar, showing the fraction of futures

Solution:

You can use this function:

from tqdm import tqdm
import concurrent.futures

def tqdm_parallel_map(executor, fn, *iterables, **kwargs):
    """
    Equivalent to executor.map(fn, *iterables),
    but displays a tqdm-based progress bar.
    
    Does not support timeout or chunksize as executor.submit is used internally
    
    **kwargs is passed to tqdm.
    """
    futures_list = []
    for iterable in iterables:
        futures_list += [executor.submit(fn, i) for i in iterable]
    for f in tqdm(concurrent.futures.as_completed(futures_list), total=len(futures_list), **kwargs):
        yield f.result()

Note that internally, executor.submit() is used, not executor.map() because there is no way of calling concurrent.futures.as_completed() on the iterator returned by executor.map().

requests: Download file if it doesn’t exist

Problem:

You want to download a URL to a file using the requests python library, but you want to skip the download if it doesn’t exist

Solution:

Use the following functions:

import requests
import os.path

def download_file(filename, url):
    """
    Download an URL to a file
    """
    with open(filename, 'wb') as fout:
        response = requests.get(url, stream=True)
        response.raise_for_status()
        # Write response data to file
        for block in response.iter_content(4096):
            fout.write(block)

def download_if_not_exists(filename, url):
    """
    Download a URL to a file if the file
    does not exist already.

    Returns
    -------
    True if the file was downloaded,
    False if it already existed
    """
    if not os.path.exists(filename):
        download_file(filename, url)
        return True
    return False

 

Removing spans/divs with style attributes from HTML

Occasionally I have to clean up some HTML code – mostly because parts of it were pasted into a CMS like WordPress from rich text editor like Word.

I’ve noticed that the formatting I want to remove is mostly based on span and div elements with a style attribute. Therefore, I’ve written a simple Python script based on BeautifulSoup4 which will replace certain tags with their contents if they have a style attribute. While in some cases some other formatting might be destroyed by such a script, it is very useful for some recurring usecases.

Read more

Upload multiple files to the tornado webserver

The following html code can be used to create an html form that allows uploading multiple files at once:

<form enctype="multipart/form-data" method="POST" action="upload.py">
  <table style="width: 100%">
    <tr>
      <td>Choose the files to upload:</td>
      <td style="text-align: right"><input type="file" multiple="" id="files" name="files"></td>
    </tr>
    <tr>
      <td><input id="fileUploadButton" type="submit" value="Upload &gt;&gt;"></td>
      <td></td>
    </tr>
  </table>
</form>

Read more

Normalizing electronics engineering value notations using Python

In electronics engineering there is a wide variety of notations for values that need to be recognized by intuitive user interfaces. Examples include:

  • 1fA
  • 0.1A
  • 0.00001
  • 1e-6
  • 4,5nA
  • 4,500.123 A
  • 4A5
  • 4k0 A

The wide variety of options, including thousands separators, comma-as-decimal-separator and suffix-as-decimal-separator, optional whitespace and scientific notations makes it difficult to normalize values without using specialized libraries. Read more

Engineering for the super-lazy: Solving equations without activating your brain

Preface

In electronics engineering, from time to time you have to use standard formulas to characterize your circuits. To what extent you need to calculate all parameters most often depends on the requirement.

For example, consider the formula for the -3dB cutoff frequency of a 1st order RC lowpass filter:

f_c=\frac{1}{2\pi RC}

Although this equation is fairly simple and most people won’t have any problem solving it for any particular variable in a few seconds, it can serve as a basic example on how to solve an equation symbolically.

One of the easiest ways of performing this task is to use SymPy, a Python library for symbolic mathematics.

Read more

How to resolve ‘Can’t perform this operation for unregistered loader type’

Problem

Recently I started to add unit tests using setuptools to one of my packages.

In order to do this, I added a test directory containing MyUnitTest.py. setup.py was properly setup using the

test_suite="tests"

option.

However, when running

python setup.py test

I got this error message:

running test
running egg_info
writing MyPackage.egg-info/PKG-INFO
writing top-level names to MyPackage.egg-info/top_level.txt
writing dependency_links to MyPackage.egg-info/dependency_links.txt
reading manifest file 'MyPackage.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'MyPackage.egg-info/SOURCES.txt'
running build_ext
Traceback (most recent call last):
  File "setup.py", line 29, in <module>
    'Topic :: Scientific/Engineering :: Information Analysis'
  File "/usr/lib/python3.4/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.4/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.4/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/test.py", line 135, in run
    self.with_project_on_sys_path(self.run_tests)
  File "/usr/lib/python3/dist-packages/setuptools/command/test.py", line 116, in with_project_on_sys_path
    func()
  File "/usr/lib/python3/dist-packages/setuptools/command/test.py", line 160, in run_tests
    testLoader = cks
  File "/usr/lib/python3.4/unittest/main.py", line 92, in __init__
    self.parseArgs(argv)
  File "/usr/lib/python3.4/unittest/main.py", line 139, in parseArgs
    self.createTests()
  File "/usr/lib/python3.4/unittest/main.py", line 146, in createTests
    self.module)
  File "/usr/lib/python3.4/unittest/loader.py", line 146, in loadTestsFromNames
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/usr/lib/python3.4/unittest/loader.py", line 146, in <listcomp>
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/usr/lib/python3.4/unittest/loader.py", line 117, in loadTestsFromName
    return self.loadTestsFromModule(obj)
  File "/usr/lib/python3/dist-packages/setuptools/command/test.py", line 26, in loadTestsFromModule
    for file in resource_listdir(module.__name__, ''):
  File "/usr/lib/python3/dist-packages/pkg_resources.py", line 954, in resource_listdir
    resource_name
  File "/usr/lib/python3/dist-packages/pkg_resources.py", line 1378, in resource_listdir
    return self._listdir(self._fn(self.module_path,resource_name))
  File "/usr/lib/python3/dist-packages/pkg_resources.py", line 1415, in _listdir
    "Can't perform this operation for unregistered loader type"
NotImplementedError: Can't perform this operation for unregistered loader type

Read more