Programming languages

How to fix Python ‘ValueError: Namespace GnomeDesktop not available’ on Ubuntu

Problem:

On Ubuntu, you are trying to run a Python script that uses the gi package and GnomeDesktop, but you are seeing this stack trace:

Traceback (most recent call last):
  File "myscript.py", line 48, in <module>
    gi.require_version('GnomeDesktop', '3.0')
  File "/usr/lib/python3/dist-packages/gi/__init__.py", line 130, in require_version
    raise ValueError('Namespace %s not available' % namespace)
ValueError: Namespace GnomeDesktop not available

Solution:

Install gir1.2-gnomedesktop-3.0:

sudo apt -y install gir1.2-gnomedesktop-3.0

and retry running your script.
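After installing, you can verify that the namespace is available using this minimal check (mirroring the require_version call from the traceback):

import gi
gi.require_version('GnomeDesktop', '3.0')
# This import should now succeed without a ValueError
from gi.repository import GnomeDesktop
print(GnomeDesktop)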

Posted by Uli Köhler in Linux, Python

Computing the CRC8-ATM CRC in Python

The 8-bit CRC8-ATM polynomial is used in many embedded applications, including Trinamic UART-controlled stepper motor drivers like the TMC2209:

x^8 + x^2 + x^1 + x^0

(Omitting the leading x^8 term, this generator polynomial corresponds to the 0x07 constant used in the code below.)

The following code provides an example of how to compute this type of CRC in Python:

def compute_crc8_atm(datagram, initial_value=0):
    crc = initial_value
    # Iterate bytes in data
    for byte in datagram:
        # Iterate bits in byte
        for _ in range(0, 8):
            if (crc >> 7) ^ (byte & 0x01):
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
            # Shift to next bit
            byte = byte >> 1
    return crc

This code has been field-verified for the TMC2209.
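For example, you can compute the checksum over the first bytes of a TMC2209 datagram like this (the byte values are purely illustrative, not a complete datagram):

# Hypothetical datagram prefix: sync byte, slave address, register address
datagram = [0x05, 0x00, 0x06]
print(hex(compute_crc8_atm(datagram))) # The CRC is sent as the last byte of the datagram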

Posted by Uli Köhler in Algorithms, Embedded, MicroPython, Python

MicroPython ESP32 minimal UART example

This example shows how to use UART on the ESP32 using MicroPython. In this example, we use UART1, which is mapped to GPIO9 (RX) and GPIO10 (TX) by default.

from machine import UART
uart = UART(1, 115200) # First argument: UART number (here: hardware UART #1)

# Write
uart.write("test")

# Read
print(uart.read()) # Read as much data as is currently available
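The ESP32 port of MicroPython also allows you to assign the UART pins explicitly – a minimal sketch (using the default pins from above):

from machine import UART
uart = UART(1, baudrate=115200, tx=10, rx=9) # Explicit TX/RX pin assignment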

Don’t know how to upload the file to MicroPython so it is automatically run on boot? See How to upload files to MicroPython over USB/serial below.

Posted by Uli Köhler in Embedded, MicroPython, Python

How to fix ESP32 MicroPython ‘ValueError: pin can only be input’

Problem:

You are trying to initialize an ESP32 pin in MicroPython using

import machine
machine.Pin(34, machine.Pin.OUT)

but you see an error message like

>>> machine.Pin(34, machine.Pin.OUT)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: pin can only be input

Solution:

On the ESP32, pins with numbers >= 34 are input-only pins!

If you need output capability, you need to use a pin with a number < 34 instead.
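For example (pin numbers chosen for illustration – check which pins are unused on your particular board):

import machine

pin_in = machine.Pin(34, machine.Pin.IN)   # OK: GPIO34 works as an input
pin_out = machine.Pin(33, machine.Pin.OUT) # OK: GPIO33 supports output
pin_out.value(1) # Set the output high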

For reference, see the relevant MicroPython source code section:

// configure mode
if (args[ARG_mode].u_obj != mp_const_none) {
    mp_int_t pin_io_mode = mp_obj_get_int(args[ARG_mode].u_obj);
    if (self->id >= 34 && (pin_io_mode & GPIO_MODE_DEF_OUTPUT)) {
        mp_raise_ValueError("pin can only be input");
    } else {
        gpio_set_direction(self->id, pin_io_mode);
    }
}
Posted by Uli Köhler in Embedded, MicroPython, Python

How to upload files to MicroPython over USB/serial

In this post we will investigate how to upload a Python script to the MicroPython filesystem over USB/serial so that it is automatically run on boot.

First, install ampy – a tool to modify the MicroPython filesystem over a serial connection.

sudo pip3 install adafruit-ampy

Now prepare your script – we’ll use main.py in this example, since MicroPython automatically runs main.py on boot.

Upload the file to the board:

ampy -p /dev/ttyUSB* put main.py

This only takes about 2-4 seconds. In case ampy is still running after 10 seconds, you might need to

  • Stop ampy (Ctrl+C), reset the board using the RESET button and retry the command
  • Stop ampy (Ctrl+C). Detach USB and ensure the board is powered off (and not powered externally). Re-attach USB and retry the command.
  • In case that doesn’t help, try re-flashing your board with the most recent version of MicroPython. See How to flash MicroPython to your ESP32 board in 30 seconds. This will also clear the internal filesystem and hence remove any file that might cause failure to boot properly.
Posted by Uli Köhler in Embedded, MicroPython, Python

MicroPython ESP32 blink example

This MicroPython code blinks GPIO2, which is connected to the onboard LED on most ESP32 boards.

import machine
import time
led = machine.Pin(2, machine.Pin.OUT)
while True:
    led.value(1)
    time.sleep(1)
    led.value(0)
    time.sleep(1)

Don’t know how to upload the file to MicroPython so it is automatically run on boot? See How to upload files to MicroPython over USB/serial above.

Posted by Uli Köhler in Embedded, MicroPython, Python

How to run WebREPL without webrepl_setup in MicroPython

Problem:

You want to enable WebREPL on your MicroPython board using

import webrepl
webrepl.start()

but it is only showing this error message:

WebREPL is not configured, run 'import webrepl_setup'

However, you want to configure WebREPL programmatically instead of manually running webrepl_setup on every single board.

Solution:

Use

import webrepl
webrepl.start(password="Rua8ohje")

This will circumvent webrepl_setup completely and is compatible with an automated setup process.
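For a fully automated setup you can combine this with an automatic wifi connection (see our autoconnect post below) in boot.py – a sketch using placeholder credentials:

# boot.py – replace the credentials with your own
import network
station = network.WLAN(network.STA_IF)
station.active(True)
station.connect("YourWifiName", "EnterYourWifiPasswordHere")

import webrepl
webrepl.start(password="Rua8ohje")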

Note: At the time of writing this you can only use passwords with 8 characters max! (see How to fix MicroPython WebREPL ValueError in File “webrepl.py”, line 72, in start)

Posted by Uli Köhler in Embedded, MicroPython, Python

How to fix MicroPython WebREPL ValueError in File “webrepl.py”, line 72, in start

Problem:

You want to configure your MicroPython WebREPL programmatically using webrepl.start(password="...") but you see a stacktrace like

>>> webrepl.start(password="Rua8ohjedo")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "webrepl.py", line 72, in start
ValueError:

Solution:

Use a shorter password with 8 characters max:

webrepl.start(password="Rua8ohje")

Posted by Uli Köhler in Embedded, MicroPython, Python

How to autoconnect to Wifi using MicroPython on your ESP32 board

In this post we will investigate how to connect to a wireless network on boot

First, install ampy – a tool to modify the MicroPython filesystem over a serial connection.

sudo pip3 install adafruit-ampy

Now create main.py in your current working directory and insert your wifi credentials:

import network
station = network.WLAN(network.STA_IF)
station.active(True)
station.connect("YourWifiName", "EnterYourWifiPasswordHere")
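If you want the script to block until the connection has actually been established, you can append a polling loop to main.py – a minimal sketch:

import time
# Wait until the connection attempt has succeeded
while not station.isconnected():
    time.sleep(0.1)
print("Connected, IP:", station.ifconfig()[0])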

Upload the file to the board:

ampy -p /dev/ttyUSB* put main.py

In case ampy shows no output within 5 seconds, try resetting the board, waiting for 5-10 seconds and then retrying the upload.

Note: You can list the files on the board’s filesystem using

ampy -p /dev/ttyUSB0 ls

You can verify the content of main.py using

ampy -p /dev/ttyUSB0 get main.py
Posted by Uli Köhler in Embedded, MicroPython, Python

How to fix Nextcloud Refused to send form data to /login/v2/grant because it violates the following Content Security Policy directive: form-action ‘self’

Problem:

When trying to connect using the Nextcloud client, during the Flow v2 authorization step where you open a page in the browser to authorize the client, you see an error message in the JS console like

Refused to send form data to 'http://nextcloud.mydomain.com/login/v2/grant' because it violates the following Content Security Policy directive: "form-action 'self'".

Solution:

Add

'overwriteprotocol' => 'https',

after the 'version' line, which looks like this (the exact version string depends on your installation):

'version' => '18.0.0.10',

in your nextcloud/config/config.php.

Posted by Uli Köhler in Nextcloud, PHP

How to read IDF diabetes statistics in Python using Pandas

The International Diabetes Federation provides a data portal with various statistics related to diabetes.

In this post we’ll show how to read the “Diabetes estimates (20-79 y) / People with diabetes, in 1,000s” data export in CSV format using pandas.

First download IDF (people-with-diabetes--in-1-000s).csv from the data page.

Now we can parse the CSV file:

import pandas as pd

# Download at https://www.diabetesatlas.org/data/en/indicators/1/
df = pd.read_csv("IDF (people-with-diabetes--in-1-000s).csv")
# Parse year columns to obtain floats and multiply by the thousands factor.
# (Pandas fails to parse values like "12,345.67" on its own.)
for column in df.columns:
    try:
        int(column) # Raises ValueError if the column is not a year number
        df[column] = df[column].apply(lambda s: None if s == "-" else float(s.replace(",", "")) * 1000)
    except ValueError:
        pass # Not a year column => leave unchanged

As you can see in the postprocessing step, the numbers of diabetes patients are given in thousands in the CSV, so we multiply them by 1000 to obtain the actual counts.

If you want to modify the data columns (i.e. the columns referring to years), you can use this simple template:

for column in df.columns:
    try:
        int(column) # Will raise ValueError() if column is not a year number
        # Whatever you do here will only be applied to year columns
        df[column] = df[column] * 0.75 # Example on how to modify a column
        # But note that if your code raises an Exception, it will be ignored!
    except:
        pass

Let’s plot some data:

regions = df[df["Type"] == "Region"] # Only regions, not individual countries

from matplotlib import pyplot as plt
plt.style.use("ggplot")
plt.gcf().set_size_inches(20,4)
plt.ylabel("Diabetes patients [millions]")
plt.xlabel("Region")
plt.title("Diabetes patients in 2019 by region")
plt.bar(regions["Country/Territory"], regions["2019"] / 1e6)

Note that if you use a more recent dataset than the version I’m using, the 2019 column might not exist in your CSV file. Choose an appropriate column in that case.
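In order to export the plot, e.g. as SVG, you can add a savefig() call after plotting (the filename is arbitrary):

plt.savefig("diabetes-2019-by-region.svg")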

Posted by Uli Köhler in Bioinformatics, Python

Parsing World Population Prospects (WPP) XLSX data in Python

The United Nations provides the World Population Prospects (WPP) dataset on the geographic and age distribution of mankind as downloadable XLSX files.

Reading these files in Python is rather easy. First we have to find out how many rows to skip. For the 2019 WPP dataset this value is 16 since row 17 contains all the column headers. The number of rows to skip might be different depending on the dataset. We’re using WPP2019_POP_F07_1_POPULATION_BY_AGE_BOTH_SEXES.xlsx in this example.

We can use Pandas read_excel() function to import the dataset in Python:

import pandas as pd

df = pd.read_excel("WPP2019_POP_F07_1_POPULATION_BY_AGE_BOTH_SEXES.xlsx", skiprows=16, na_values=["..."])

This will take a few seconds until the large dataset has been processed. Now we can check whether skiprows=16 was the correct value: it is correct if pandas recognized the column names properly:

>>> df.columns
Index(['Index', 'Variant', 'Region, subregion, country or area *', 'Notes',
       'Country code', 'Type', 'Parent code', 'Reference date (as of 1 July)',
       '0-4', '5-9', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39',
       '40-44', '45-49', '50-54', '55-59', '60-64', '65-69', '70-74', '75-79',
       '80-84', '85-89', '90-94', '95-99', '100+'],
      dtype='object')

Now let’s filter for a country:

russia = df[df["Region, subregion, country or area *"] == 'Russian Federation']

This will show us the population data for multiple years in 5-year intervals from 1950 to 2020. Now let’s filter for the most recent year:

russia.loc[russia["Reference date (as of 1 July)"].idxmax()]

This will show us a single dataset:

Index                                                 3255
Variant                                          Estimates
Region, subregion, country or area *    Russian Federation
Notes                                                  NaN
Country code                                           643
Type                                          Country/Area
Parent code                                            923
Reference date (as of 1 July)                         2020
0-4                                                9271.69
5-9                                                9350.92
10-14                                              8174.26
15-19                                              7081.77
20-24                                               6614.7
25-29                                              8993.09
30-34                                              12543.8
35-39                                              11924.7
40-44                                              10604.6
45-49                                              9770.68
50-54                                              8479.65
55-59                                                10418
60-64                                              10073.6
65-69                                              8427.75
70-74                                              5390.38
75-79                                              3159.34
80-84                                              3485.78
85-89                                              1389.64
90-94                                              668.338
95-99                                              102.243
100+                                                 9.407
Name: 3254, dtype: object

How can we plot that data? First, we need to select all the columns that contain age data. We’ll do this by manually inserting the name of the first such column (0-4) into the following code and assuming that there are no columns after the last age column:

>>> df.columns[df.columns.get_loc("0-4"):]
Index(['0-4', '5-9', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39',
       '40-44', '45-49', '50-54', '55-59', '60-64', '65-69', '70-74', '75-79',
       '80-84', '85-89', '90-94', '95-99', '100+'],
      dtype='object')

Now let’s select those columns from the russia dataset:

most_recent_russia = russia.loc[russia["Reference date (as of 1 July)"].idxmax()]
age_columns = df.columns[df.columns.get_loc("0-4"):]

russian_age_data = most_recent_russia[age_columns]

Let’s have a look at the dataset:

>>> russian_age_data
0-4      9271.69
5-9      9350.92
10-14    8174.26
15-19    7081.77
20-24     6614.7
25-29    8993.09
30-34    12543.8
35-39    11924.7
40-44    10604.6
45-49    9770.68
50-54    8479.65
55-59      10418
60-64    10073.6
65-69    8427.75
70-74    5390.38
75-79    3159.34
80-84    3485.78
85-89    1389.64
90-94    668.338
95-99    102.243
100+       9.407
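That looks usable. As a quick sanity check, we can sum up all age groups – since the values are given in thousands, the sum times 1000 should be close to Russia’s total population:

# Sum over all age groups (values are in thousands)
total_population = russian_age_data.sum() * 1000
print(total_population) # Approximately 146 million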

Note that the values are given in thousands, i.e. we have to multiply them by 1000 to obtain the actual population estimates. Let’s plot it:

from matplotlib import pyplot as plt
plt.style.use("ggplot")

plt.title("Age composition of the Russian population (2020)")
plt.ylabel("People in age group [Millions]")
plt.xlabel("Age group")
plt.gcf().set_size_inches(15,5)
# Data is given in thousands => divide by 1000 to obtain millions
plt.plot(russian_age_data.index, russian_age_data.to_numpy() / 1000., lw=3)

Here’s our finished script, which also exports the plot as russian-demographics.svg:

#!/usr/bin/env python3
import pandas as pd
df = pd.read_excel("WPP2019_POP_F07_1_POPULATION_BY_AGE_BOTH_SEXES.xlsx", skiprows=16, na_values=["..."])
# Filter only Russia
russia = df[df["Region, subregion, country or area *"] == 'Russian Federation']

# Filter only most recent estimate (1 row)
most_recent_russia = russia.loc[russia["Reference date (as of 1 July)"].idxmax()]
# Retain only value columns
age_columns = df.columns[df.columns.get_loc("0-4"):]
russian_age_data = most_recent_russia[age_columns]

# Plot!
from matplotlib import pyplot as plt
plt.style.use("ggplot")

plt.title("Age composition of the Russian population (2020)")
plt.ylabel("People in age group [Millions]")
plt.xlabel("Age group")
plt.gcf().set_size_inches(15,5)
# Data is given in thousands => divide by 1000 to obtain millions
plt.plot(russian_age_data.index, russian_age_data.to_numpy() / 1000., lw=3)

# Export as SVG
plt.savefig("russian-demographics.svg")

 

 

Posted by Uli Köhler in Bioinformatics, Data science, Python

How to read TSV (tab-separated values) in C++

This minimal example shows you how to read & parse a tab-separated values (TSV) file in C++. We use boost::algorithm::split to split each line into its tab-separated components.

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>

using namespace std;
using namespace boost::algorithm;

int main(int argc, char** argv) {
    ifstream fin("test.tsv");
    string line;
    while (getline(fin, line)) {
        // Split line into tab-separated parts
        vector<string> parts;
        split(parts, line, boost::is_any_of("\t"));
        // TODO Your code goes here!
        cout << "First of " << parts.size() << " elements: " << parts[0] << endl;
    }
    fin.close();
}
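Since boost::algorithm::split is header-only, no Boost library needs to be linked – a plain compiler call suffices (filename assumed):

g++ -o read-tsv read-tsv.cpp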

Posted by Uli Köhler in C/C++

C++ read file line by line minimal example

This minimal example reads a file line-by-line using std::getline and prints out each line on stdout.

#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main(int argc, char** argv) {
    ifstream fin("test.tsv");
    string line;
    while (getline(fin, line)) {
        // TODO Your code goes here. This is just an example!
        cout << line << endl;
    }
    fin.close();
}

Posted by Uli Köhler in C/C++

RocksDB minimal example in C++

This minimal example shows how to open a RocksDB database, write a key-value pair and read it back.

#include <cassert>
#include <string>
#include <rocksdb/db.h>

using namespace std;

int main(int argc, char** argv) {
    rocksdb::DB* db;
    rocksdb::Options options;
    options.create_if_missing = true;
    rocksdb::Status status = rocksdb::DB::Open(options, "/tmp/testdb", &db);
    assert(status.ok());

    // Insert value
    status = db->Put(rocksdb::WriteOptions(), "Test key", "Test value");
    assert(status.ok());

    // Read back value
    std::string value;
    status = db->Get(rocksdb::ReadOptions(), "Test key", &value);
    assert(status.ok());
    assert(!status.IsNotFound());

    // Read key which does not exist
    status = db->Get(rocksdb::ReadOptions(), "This key does not exist", &value);
    assert(status.IsNotFound());
}
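In case one of the assertions fails, rocksdb::Status can also tell you why. A minimal sketch (requires <iostream>) that you can insert after any of the calls above:

if (!status.ok()) {
    std::cerr << status.ToString() << std::endl; // Human-readable error description
}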

Build using this CMakeLists.txt:

add_executable(rocksdb-example rocksdb-example.cpp)
target_link_libraries(rocksdb-example rocksdb dl)

Compile using

cmake .
make
./rocksdb-example

Posted by Uli Köhler in C/C++, Databases

How to install RocksDB on Ubuntu

deb-buildscripts provides a convenient build script for building RocksDB as a deb package. Since RocksDB optimizes for the current computer’s CPU instruction set extensions (-march=native), you need to build RocksDB on the computer where you will run it, or at least on one with the same CPU generation.

First install the prerequisites:

sudo apt-get -y install devscripts debhelper build-essential fakeroot zlib1g-dev libbz2-dev libsnappy-dev libgflags-dev libzstd-dev

then build RocksDB:

git clone https://github.com/ulikoehler/deb-buildscripts.git
cd deb-buildscripts
./deb-rocksdb.py

This will build the librocksdb and librocksdb-dev packages in the deb-buildscripts directory.

Posted by Uli Köhler in C/C++, Linux

How to save Matplotlib plot to string as SVG

You can use StringIO to save a Matplotlib plot to a string without writing it to an intermediate file:

from matplotlib import pyplot as plt
plt.plot([0, 1], [2, 3]) # Just a minimal showcase

# Save plot to StringIO
from io import StringIO
i = StringIO()
plt.savefig(i, format="svg")

# How to access the string
print(i.getvalue())

Note that unless you save to a file, you need to set the format=... parameter when calling plt.savefig(). If saving to a file, Matplotlib will try to derive the format from the filename extension (like .svg).
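For binary formats such as PNG, use BytesIO instead of StringIO – a minimal sketch:

from matplotlib import pyplot as plt
from io import BytesIO

plt.plot([0, 1], [2, 3]) # Just a minimal showcase

# Save plot to BytesIO
buf = BytesIO()
plt.savefig(buf, format="png")
png_bytes = buf.getvalue() # Raw PNG file content as bytes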

Posted by Uli Köhler in Python

How to map MeSH ID to MeSH term using Python

Our MeSH-JSON project provides a pre-compiled MeSH-ID-to-term map as JSON:

Download it using this command:

wget "http://techoverflow.net/downloads/mesh.json.gz"

 

How to use in Python:

#!/usr/bin/env python3
import json
import gzip

# How to load
with gzip.open("mesh.json.gz", "rb") as infile:
    mesh_id_to_term = json.load(infile)

# Usage example
print(mesh_id_to_term["D059630"]) # Prints 'Mesenchymal Stem Cells'
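If you need the reverse mapping (term to MeSH ID), you can simply invert the dictionary loaded above – a minimal sketch (this assumes the terms are unique):

mesh_term_to_id = {term: mesh_id for mesh_id, term in mesh_id_to_term.items()}
print(mesh_term_to_id["Mesenchymal Stem Cells"]) # Prints 'D059630'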

Posted by Uli Köhler in Bioinformatics, Python

How to download & sync PubMed baseline + updates

In our previous post How to download PubMed baseline data using rsync we showed how to download PubMed’s baseline data. That dataset is only updated yearly – however, you can also download the updatefiles, which are typically updated once per day.

The commands to download & sync both sets of files into the PubMed directory:

rsync -Pav --delete ftp.ncbi.nlm.nih.gov::pubmed/baseline/\*.xml.gz PubMed/
rsync -Pav --delete ftp.ncbi.nlm.nih.gov::pubmed/updatefiles/\*.xml.gz PubMed/

The --delete option will ensure that files that are deleted on the server will also be deleted locally. For example, when a new baseline dataset is being published, you need to delete the old year’s files to avoid having to process duplicate data.

Posted by Uli Köhler in Bioinformatics, C/C++

How to control boost::iostreams gzip compression level

In our previous post How to gzip-compress on-the-fly in C++ using boost::iostreams we showed how to create a gzip-compressing output stream using the boost::iostreams library.

This example shows how to control the compression level of gzip_compressor:

Instead of constructing boost::iostreams::gzip_compressor() without arguments, pass boost::iostreams::gzip_params(level) as the argument, where level (1..9) is the compression level, 9 being the highest and 1 the lowest. Higher compression levels lead to smaller files but consume more CPU time during compression.

If filesize matters to you, I recommend choosing level 9, since compression even at the highest level is extremely fast on modern computers.

Full example:

#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
using namespace std;

int main(int argc, char** argv) {
    if(argc < 2) {
        cerr << "Usage: " << argv[0] << " <output .gz file>" << endl;
    }
    //Read filename from the first command line argument
    ofstream file(argv[1], ios_base::out | ios_base::binary);
    boost::iostreams::filtering_streambuf<boost::iostreams::output> outbuf;
    outbuf.push(boost::iostreams::gzip_compressor(
        boost::iostreams::gzip_params(9)
    ));
    outbuf.push(file);
    //Convert streambuf to ostream
    ostream out(&outbuf);
    //Write some test data
    out << "This is a test text!\n";
    //Cleanup
    boost::iostreams::close(outbuf); // Don't forget this!
    file.close();
}

Build using this CMakeLists.txt:

cmake_minimum_required(VERSION 3.0)
find_package(Boost 1.36.0 COMPONENTS iostreams)

include_directories(${Boost_INCLUDE_DIRS})
add_executable(iostreams-gz-compress iostreams-gz-compress.cpp)
target_link_libraries(iostreams-gz-compress ${Boost_LIBRARIES})
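Compile and run using (the same pattern as in the RocksDB example above):

cmake .
make
./iostreams-gz-compress test.gz

You can then verify the compressed output, e.g. using zcat test.gz.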

Posted by Uli Köhler in C/C++