requests: Download file if it doesn’t exist

Problem:

You want to download a URL to a file using the requests Python library, but you want to skip the download if the file already exists.

Solution:

Use the following functions:

import requests
import os.path

def download_file(filename, url):
    """
    Download a URL to a file
    """
    # Send the request first, so a failed request
    # does not leave an empty file behind
    response = requests.get(url, stream=True)
    response.raise_for_status()
    with open(filename, 'wb') as fout:
        # Write response data to file in 4096-byte blocks
        for block in response.iter_content(4096):
            fout.write(block)

def download_if_not_exists(filename, url):
    """
    Download a URL to a file if the file
    does not exist already.

    Returns
    -------
    True if the file was downloaded,
    False if it already existed
    """
    if not os.path.exists(filename):
        download_file(filename, url)
        return True
    return False
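
One caveat of the existence check: if a download is interrupted halfway, a partial file can stay on disk, and the next call would treat it as complete. The following is a minimal sketch of one possible way around that, not part of the original solution: it writes to a temporary file in the same directory and renames it only after the download has finished. The names download_if_not_exists_atomic and iter_url_chunks and the fetch parameter are hypothetical helpers introduced for this illustration.

```python
import os
import tempfile


def iter_url_chunks(url, chunk_size=4096):
    """Yield the body of url block by block (hypothetical helper)."""
    import requests  # imported lazily so the rest works without requests
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        for block in response.iter_content(chunk_size):
            yield block


def download_if_not_exists_atomic(filename, url, fetch=iter_url_chunks):
    """
    Download url to filename unless filename already exists.
    The data goes to a temporary file first, so filename only
    appears once the download has completed.

    Returns True if the file was downloaded, False if it already existed.
    """
    if os.path.exists(filename):
        return False
    directory = os.path.dirname(os.path.abspath(filename))
    # Same directory as the target, so the final rename stays on one filesystem
    fd, tmpname = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as fout:
            for block in fetch(url):
                fout.write(block)
        os.replace(tmpname, filename)
    except BaseException:
        # Remove the partial temporary file, then re-raise the error
        os.remove(tmpname)
        raise
    return True
```

Calling download_if_not_exists_atomic("data.bin", url) behaves like download_if_not_exists above, except that a failed download leaves nothing behind under the target name.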

 

Posted by Uli Köhler in Python

An introduction to Z-boxes

You most likely found this post for one of two reasons:

  • Either you haven’t heard of Z-Boxes and are curious whether they can help you,
  • or you have to learn about Z-Boxes and have no idea how to make sense of the mathematical definitions.

Either way, we’re going to investigate Z-Boxes – not using a box of formulas but using examples and Python code.
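
As a first taste, here is a minimal sketch of how Z-values can be computed in linear time. It assumes the common textbook definition (z[i] is the length of the longest substring starting at position i that matches a prefix of the string, and a Z-box is the stretch of the string covered by such a match); the function name z_array and the convention z[0] = len(s) are choices made for this illustration, not necessarily the ones used in the full post.

```python
def z_array(s):
    """Return the list of Z-values of s (with z[0] set to len(s))."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n
    # s[left:right] is the Z-box reaching furthest to the right so far
    left, right = 0, 0
    for i in range(1, n):
        if i < right:
            # i lies inside the current Z-box: reuse the value computed
            # for the corresponding prefix position, capped at the box end
            z[i] = min(right - i, z[i - left])
        # Extend the match by comparing characters explicitly
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        # Remember the new box if it reaches further right
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


print(z_array("aabcaabxaaaz"))  # [12, 1, 0, 0, 3, 1, 0, 0, 2, 2, 1, 0]
```

For example, the value 3 at index 4 says that the three characters "aab" starting there match the first three characters of the string.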

Continue reading →

Posted by Uli Köhler in Algorithms

Removing spans/divs with style attributes from HTML

Occasionally I have to clean up some HTML code – mostly because parts of it were pasted into a CMS like WordPress from a rich text editor like Word.

I’ve noticed that the formatting I want to remove is mostly based on span and div elements with a style attribute. Therefore, I’ve written a simple Python script based on BeautifulSoup4 which replaces certain tags with their contents if they have a style attribute. While such a script might destroy other formatting in some cases, it is very useful for some recurring use cases.

Continue reading →

Posted by Uli Köhler in Python

Advantages and disadvantages of hugepages

In a previous post, I’ve written about how to check and enable transparent hugepages in Linux globally.

Although this post is important if you actually have a use case for hugepages, I’ve seen multiple people getting fooled by the prospect that hugepages will magically increase performance. However, hugepaging is a complex topic and, if used in the wrong way, might easily decrease overall performance.

Continue reading →

Posted by Uli Köhler in C/C++, Performance

In-place trimming/stripping in C

For an explanation of in-place algorithms, see my previous post on zero-copy in-place splitting.

The problem

You have a C string possibly containing whitespace at the beginning and/or the end.

char* s = " abc   \n\r";

Using an in-place algorithm, you want to remove the whitespace from this string.

Doing this is also possible using boost::algorithm::trim, but it has the same caveats as boost::algorithm::split, as discussed in my previous post about C splitting.

Continue reading →

Posted by Uli Köhler in C/C++

Zero-copy in-place string splitting in C

Let’s assume you have a string:

char* s = "1,23,456,7890";

You want to split said string at each comma in order to obtain its parts as C strings (with the number of parts being variable):

char* s1 = "1";
char* s2 = "23";
char* s3 = "456";
char* s4 = "7890";

Continue reading →

Posted by Uli Köhler in C/C++

How to interpret smartctl messages like ‘Error: UNC at LBA’?

When running smartctl on your hard drive, you often get a plethora of information that can be hard to interpret for inexperienced users. This post attempts to explain the technical reasons behind the error messages. If you’re looking for advice on whether to replace your hard drive, the only guidance I can give you is that it might fail at any time, so better back up your data, but it might also run for many years to come. Furthermore, this article does not describe basic SMART WHEN_FAILED checking but rather the interpretation of more subtle signs of possibly impending HDD failures.

Continue reading →

Posted by Uli Köhler in Linux

Accurate calculation of PT100/PT1000 temperature from resistance

TL;DR for impatient readers

PT100/PT1000 temperature calculation suffers from accuracy issues at large sub-zero temperatures. UliEngineering implements a polynomial-fit-based algorithm that provides a peak error of 58.6 µ°C over the full defined temperature range from -200 °C to +850 °C.

Use this code snippet (replace pt1000_ by pt100_ to use PT100 coefficients) to compute an accurate temperature (in degrees Celsius), e.g. for a resistance of 829.91 Ω of a PT1000 sensor.

from UliEngineering.Physics.RTD import pt1000_temperature
# The following calls are equivalent and print -43.2316359463
print(pt1000_temperature("829.91 Ω"))
print(pt1000_temperature(829.91))

You can install the library (compatible with Python 3.2+) using

$ pip3 install -U UliEngineering
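
For readers who want to cross-check the number above without the library, here is a minimal sketch. It is not the polynomial-fit algorithm used by UliEngineering: it evaluates the standard Callendar-Van Dusen equation with the IEC 60751 coefficients and inverts it numerically by bisection. The function names rtd_resistance and rtd_temperature are hypothetical and exist only in this example.

```python
# IEC 60751 coefficients for standard platinum RTDs
A = 3.9083e-3
B = -5.775e-7
C = -4.183e-12  # only used below 0 °C


def rtd_resistance(temp_c, r0=1000.0):
    """Resistance in Ω at temp_c (°C) for an RTD with r0 Ω at 0 °C."""
    if temp_c >= 0:
        return r0 * (1 + A * temp_c + B * temp_c**2)
    return r0 * (1 + A * temp_c + B * temp_c**2
                 + C * (temp_c - 100) * temp_c**3)


def rtd_temperature(resistance, r0=1000.0, lo=-200.0, hi=850.0):
    """Invert rtd_resistance by bisection over the defined range."""
    # The resistance rises monotonically with temperature in this range,
    # so halving the interval always keeps the solution inside it
    for _ in range(60):
        mid = (lo + hi) / 2
        if rtd_resistance(mid, r0) < resistance:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(rtd_temperature(829.91))  # roughly -43.23 °C for a PT1000
```

Pass r0=100.0 to use the same functions for a PT100 sensor. Bisection is slower than a closed-form or polynomial approach, but it is easy to verify and works across the sub-zero range where the simple quadratic inverse does not apply.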

Continue reading →

Posted by Uli Köhler in Electronics, Mathematics

Reusing your calendars, the pythonic way

Yesterday I got a calendar for 2016. An interesting question came to my mind: When can I reuse this calendar, and for which year can I reuse which old calendar?

The 1st of January 2015 was a Thursday. The same day in 2016 is a Friday. Once you follow this pattern, you will quickly recognize that the base period of seven years is disrupted by leap years.

It quickly turns out that for some years it takes decades until you can reuse a calendar: 2016 is a leap year, so you cannot reuse it until 2044.
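
The full-reuse rule can be sketched in a few lines of Python: two years share a calendar if their 1st of January falls on the same weekday and both are (or both are not) leap years. The function names calendar_signature and next_reusable_year are made up for this illustration, and the sketch only covers complete reuse.

```python
import calendar
import datetime


def calendar_signature(year):
    """Weekday of 1 January (Monday == 0) plus the leap-year flag."""
    return (datetime.date(year, 1, 1).weekday(), calendar.isleap(year))


def next_reusable_year(year):
    """Return the first later year whose calendar matches that of year."""
    signature = calendar_signature(year)
    candidate = year + 1
    while calendar_signature(candidate) != signature:
        candidate += 1
    return candidate


print(next_reusable_year(2015))  # 2026
print(next_reusable_year(2016))  # 2044
```

Running the same comparison backwards (decrementing the candidate) answers the second question: which old calendar can be reused for a given year.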

However, there’s a neat quirk that is currently unimplemented in online services like whencanireusethiscalendar.com: You can partially reuse a calendar.

Continue reading →

Posted by Uli Köhler in Algorithms

Automated domain name extraction from Let’s Encrypt certificate transparency logs

A few days ago, Let’s Encrypt went into public beta. At the time of writing this article, almost 120k certificates have been issued, including the certificate for TechOverflow.

I really like the Let’s Encrypt service and I believe it might actually change the way people perceive HTTPS encryption. However, there is one rarely-mentioned side-effect when protecting your domains with their certificates.

Let’s Encrypt publishes certificate transparency logs at crt.sh. This transparency does not come without side-effects, however: crt.sh effectively publishes the domain names of all issued certificates.

In other words, hiding sites from the public by not publishing their (sub-)domain names anywhere will not work when you issue a certificate for the domain on services like Let’s Encrypt.

Continue reading →

Posted by Uli Köhler in Linux

nginx Let’s Encrypt authentication for reverse-proxy sites

Problem:

You have an nginx host that is configured as reverse-proxy-only like this:

server {
    server_name  my.domain;
    [...]
    location / {
        proxy_pass http://localhost:1234;
    }
}

For this host, you want to use Let’s Encrypt to automatically issue a certificate using the webroot method like this:

certbot certonly -a webroot --webroot-path ??? -d my.domain

The reverse-proxied webserver does not provide a webroot to use for the automated authentication process, and you want to keep the flexibility of updating the cert at any time without manually modifying the nginx configuration.

Continue reading →

Posted by Uli Köhler in Linux, nginx

Accurate short & long delays on microcontrollers using ChibiOS

How system ticks work

In order to understand how delays work, we’ll first need to have a look at system ticks. Although ChibiOS 3.x supports a feature called tickless mode, we’ll stick to a simple periodic tick model for simplicity.

A system tick is simply a timer that interrupts the microcontroller periodically and performs some kernel management tasks. For example, with a 1 kHz system tick (systick) frequency, the program flow is interrupted every millisecond. When being interrupted, one of the things the kernel does is to check if a thread that is currently asleep needs to be woken up. In other words, if your thread has some code like this:

// [...]
chThdSleepMilliseconds(5);
// [...]

and the kernel has a 1 kHz systick frequency, the kernel will set your thread to sleep, wait for 5 system ticks (i.e. 5 ms) and then wake up the

Continue reading →

Posted by Uli Köhler in Electronics, Embedded

Using burnout current sources for Wheatstone bridge detection

Many recent high-performance ADCs like the AD7190 include a built-in, so-called burnout current source that can allegedly be used to detect an open circuit in the sensor. However, most vendors don’t provide an easy explanation of how this can be done.

In this blog post I will attempt to explain how those current sources can be useful for practical applications. For this example, we will assume the ADC has one idealized differential channel and is connected to a simple Wheatstone bridge strain gauge:

Continue reading →

Posted by Uli Köhler in Electronics

Computing the LP2980 adjust resistor using Python

The LP2980ADJ is a 50 mA LDO that can be configured for an output voltage from 1.23 V to 15 V using a pair of resistors.

The datasheet lists a formula for the output voltage; however, no easy-to-use, customizable software is provided that can be used to directly compute the correct resistor in a reproducible way.

Continue reading →

Posted by Uli Köhler in Electronics, Python