requests: Download file if it doesn’t exist

Problem:

You want to download a URL to a file using the requests python library, but you want to skip the download if it doesn’t exist

Solution:

Use the following functions:

import requests
import os.path

def download_file(filename, url):
    """
    Download an URL to a file
    """
    with open(filename, 'wb') as fout:
        response = requests.get(url, stream=True)
        response.raise_for_status()
        # Write response data to file
        for block in response.iter_content(4096):
            fout.write(block)

def download_if_not_exists(filename, url):
    """
    Download a URL to a file if the file
    does not exist already.

    Returns
    -------
    True if the file was downloaded,
    False if it already existed
    """
    if not os.path.exists(filename):
        download_file(filename, url)
        return True
    return False

 

An introduction to Z-boxes

You most likely found this post for one of two reasons:

  • Either you haven’t heard of Z-Boxes and are interested in if they can somehow help you
  • or you have to learn about Z-Boxes and you have absolutely no idea how to understand the mathematical definitions.

Either way, we’re going to investigate Z-Boxes – not using a box of formulas but using examples and Python code.

Mehr lesen

Removing spans/divs with style attributes from HTML

Occasionally I have to clean up some HTML code – mostly because parts of it were pasted into a CMS like WordPress from rich text editor like Word.

I’ve noticed that the formatting I want to remove is mostly based on span and div elements with a style attribute. Therefore, I’ve written a simple Python script based on BeautifulSoup4 which will replace certain tags with their contents if they have a style attribute. While in some cases some other formatting might be destroyed by such a script, it is very useful for some recurring usecases.

Mehr lesen

Advantages and disadvantages of hugepages

In a previous post, I’ve written about how to check and enable transparent hugepages in Linux globally.

Although this post is important if you actually have a usecase for hugepages, I’ve seen multiple people getting fooled by the prospect that hugepages will magically increase performance. However, hugepaging is a complex topic and, if used in the wrong way, might easily decrease overall performance. Mehr lesen

In-place trimming/stripping in C

For an explanation of in-place algorithms see my previous post on zero-copy in-place splitting

The problem

You have a C string possibly containing whitespace at the beginning and/or the end.

char* s = " abc   \n\r";

Using an in-place algorithm, you want to remove the whitespace from this string.

Doing this is also possible using boost::algorithm::trim, but it has the same caveats as boost::algorithm::split as discussed in my previous post about C splitting Mehr lesen

How to interpret smartctl messages like ‘Error: UNC at LBA’?

When running smartctl on your hard drive, you often get a plethora of information that can be hard to interpret for unexperienced users. This post attempts to provide aid in interpreting what the technical reasons behind the error messages are. If you’re looking for advice on whether to replace your hard drive, the only guidance I can give you is it might fail any time, so better backup your data, but it might also run for many years to come.. Furthermore, this article does not describe basic SMART WHEN_FAILED checking but rather interpretation of more subtle signs of possibly impending HDD failures.

Mehr lesen