How to fix Bitnami MariaDB ‘mkdir: cannot create directory ‘/bitnami/mariadb’: Permission denied’

Problem:

You are trying to run a Docker bitnami/mariadb container, but when you start it, you see an error message like

mariadb_1   | mkdir: cannot create directory '/bitnami/mariadb': Permission denied

Solution:

Most Bitnami containers are non-root containers, so you need to adjust the permissions of the data directory mapped onto the host.

First, find out what directory your /bitnami is mapped to on the host. For example, for

services:
  mariadb:
    image: 'bitnami/mariadb:latest'
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
    volumes:
      - '/var/lib/my_docker/mariadb_data:/bitnami'

it is mapped to /var/lib/my_docker/mariadb_data.

Now chown this directory to 1001:1001, since the image runs as UID 1001:

sudo chown -R 1001:1001 [directory]

for example

sudo chown -R 1001:1001 /var/lib/my_docker/mariadb_data

 

Posted by Uli Köhler in Docker

RocksDB minimal example in C++

This minimal example shows how to open a RocksDB database, write a key, and read it back.

#include <cassert>
#include <string>
#include <rocksdb/db.h>

using namespace std;

int main(int argc, char** argv) {
    rocksdb::DB* db;
    rocksdb::Options options;
    options.create_if_missing = true;
    rocksdb::Status status =
    rocksdb::DB::Open(options, "/tmp/testdb", &db);
    assert(status.ok());

    // Insert value
    status = db->Put(rocksdb::WriteOptions(), "Test key", "Test value");
    assert(status.ok());

    // Read back value
    std::string value;
    status = db->Get(rocksdb::ReadOptions(), "Test key", &value);
    assert(status.ok());
    assert(!status.IsNotFound());

    // Read a key which does not exist
    status = db->Get(rocksdb::ReadOptions(), "This key does not exist", &value);
    assert(status.IsNotFound());

    delete db; // Close the database and release resources
}

Build using this CMakeLists.txt:

cmake_minimum_required(VERSION 3.0)
add_executable(rocksdb-example rocksdb-example.cpp)
target_link_libraries(rocksdb-example rocksdb dl)

Compile using

cmake .
make
./rocksdb-example

 

Posted by Uli Köhler in C/C++, Databases

How to install RocksDB on Ubuntu


deb-buildscripts provides a convenient build script for building RocksDB yourself.

First install the prerequisites:

sudo apt-get -y install devscripts debhelper build-essential fakeroot zlib1g-dev libbz2-dev libsnappy-dev libgflags-dev libzstd-dev

then build RocksDB:

git clone https://github.com/ulikoehler/deb-buildscripts.git
cd deb-buildscripts
./deb-rocksdb.py

This will build the librocksdb and librocksdb-dev packages in the deb-buildscripts directory.

Posted by Uli Köhler in C/C++, Linux

How to fix nginx FastCGI error ‘upstream sent too big header while reading response header from upstream’

Problem:

You’re getting 502 Bad Gateway errors in your nginx + FastCGI (PHP) setup. You see error messages like

2020/01/28 11:58:19 [error] 9728#9728: *1 upstream sent too big header while reading response header from upstream, client: 2001:16b8:2681:7600:bc28:b49d:3318:e9c4, server: techoverflow.net, request: "GET /category/calculators/ HTTP/2.0", upstream: "fastcgi://unix:/var/run/php/php7.2-fpm.sock:", host: "techoverflow.net", referrer: "https://techoverflow.net/?s=calcul"

in your error log.

Solution:

You need to increase your FastCGI buffers by adding

fastcgi_buffers 32 256k;
fastcgi_buffer_size 512k;

next to every fastcgi_pass directive in your nginx config, then restart nginx:

sudo service nginx restart
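For reference, here is a sketch of where the directives go inside a typical PHP location block (the location pattern and socket path are assumptions; use the values from your own config):

```nginx
location ~ \.php$ {
    # Increased FastCGI buffers to avoid 'upstream sent too big header'
    fastcgi_buffers 32 256k;
    fastcgi_buffer_size 512k;
    fastcgi_pass unix:/var/run/php/php7.2-fpm.sock;
    include fastcgi_params;
}
```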

Note that the buffer sizes in this example are only recommendations and can be adjusted up or down depending on your requirements. However, these values tend to work well on modern server hardware, although many administrators prefer smaller buffers.

Posted by Uli Köhler in nginx

How to install x11vnc on DISPLAY=:0 as a systemd service

First, install x11vnc using e.g.

sudo apt -y install x11vnc

Now run this script as the user that is running the X11 session (the script needs to know which user to start x11vnc as):

wget -qO- https://techoverflow.net/scripts/install-x11vnc.sh | sudo bash -s $USER

This will install a systemd service like

[Unit]
Description=VNC Server for X11
Requires=display-manager.service
After=display-manager.service

[Service]
Type=simple
User=uli
Group=uli
ExecStart=/usr/bin/x11vnc -display :0 -norc -forever -shared -autoport 5900 -o /var/log/x11vnc.log
Restart=always

[Install]
WantedBy=multi-user.target

and automatically enable it on boot and start it.

You can now connect to the computer via VNC, e.g. using:

vncviewer [hostname]

Posted by Uli Köhler in Linux

How to fix PlatformIO “No tasks to run found. Configure tasks…”

If you see this message while trying to run a PlatformIO task like Build or Upload:

No tasks to run found. Configure tasks...

you can fix that easily: open Preferences: Open Settings (JSON) in Visual Studio Code (the default keybinding to open the command palette is Ctrl+Shift+P).

Then look for this line:

"task.autoDetect": "off"

and delete it.

Now save the file. You can run PlatformIO tasks immediately after saving settings.json, without restarting Visual Studio Code!

Posted by Uli Köhler in PlatformIO

How to connect to your 3D printer using picocom

Use this command to connect to your Marlin-based 3D printer:

picocom -b 115200 /dev/ttyUSB0 --imap lfcrlf --echo

This command might also work for firmware other than Marlin.

By default, picocom uses character mappings that prevent newlines from being displayed correctly. --imap lfcrlf maps line feeds sent by the printer to CR + LF on the terminal. --echo enables local echo, so you can see what you are typing.

 

Posted by Uli Köhler in Hardware, Linux

How to extract South Korean patent application number from PDF

Note: This approach will only work if the patent PDF contains text and is not a scanned image. If you can select the text in your PDF reader, it’s likely a suitable PDF. Note that patent PDFs downloaded from Espacenet do not contain selectable text.

South Korean patents like this example show the application number on the front page.

In this example, the application number is 10-2019-0094876.

In order to automatically extract the number, we can use pdftotext together with the ubiquitous Linux tools grep and tail.

First, download the original PDF (e.g. from Google Patents). In this example, the downloaded file KR20190098928A_Original_document_20200123004431.pdf has been renamed to KR20190098928A.pdf.

Now run pdftotext on this file:

pdftotext KR20190098928A.pdf

This will produce a text file named KR20190098928A.txt
containing all the text from the original PDF.

Now we can grep for 출원번호, the Korean term for application number, together with (21), the standard field code for the application number. Just seeing boxes instead of Korean characters? Don’t worry, your computer knows what they mean; you just don’t have a South Korean font installed – just ignore it.

Now we can filter out only the information we want:

grep --after=1 "(21) 출원번호" KR20190098928A.txt | tail -n 1

In our example, this will print 10-2019-0094876.

How does it work?

The relevant section in KR20190098928A.txt looks like this:

(21) 출원번호
10-2019-0094876

We grep for the content of the first line and tell grep to also print one line after each match (--after=1, an unambiguous abbreviation of --after-context=1). This prints both the matching line and the line containing the application number. tail -n 1 then keeps just the last line (-n 1) of that output.
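You can reproduce the grep/tail logic on a synthetic file (the filename sample.txt is just for illustration):

```shell
# Create a two-line sample mimicking the relevant section of the patent text
printf '(21) 출원번호\n10-2019-0094876\n' > sample.txt
# Print the matching line plus one line after it, then keep only the last line
grep --after=1 "출원번호" sample.txt | tail -n 1
# → 10-2019-0094876
```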

Need any other information from the patent metadata? Often you can use a similar approach and just modify the grep statement. In some cases, consider passing -layout or -raw to pdftotext.

Need professional software engineering services for automatically extracting data from your PDFs? Check out TechOverflow consulting.

 

Posted by Uli Köhler in Patents

How to save Matplotlib plot to string as SVG

You can use StringIO to save a Matplotlib plot to a string without saving it to an intermediary file:

from matplotlib import pyplot as plt
plt.plot([0, 1], [2, 3]) # Just a minimal showcase

# Save plot to StringIO
from io import StringIO
i = StringIO()
plt.savefig(i, format="svg")

# How to access the string
print(i.getvalue())

Note that unless you save to a file, you need to set the format=... parameter when calling plt.savefig(). When saving to a file, Matplotlib will try to derive the format from the filename extension (like .svg). Also note that for binary formats like PNG you need io.BytesIO instead of io.StringIO.

Posted by Uli Köhler in Python

How to map MeSH ID to MeSH term using Python

Our MeSH-JSON project provides a pre-compiled MeSH-ID-to-term map as JSON:

Download it using this command:

wget "http://techoverflow.net/downloads/mesh.json.gz"

 

How to use in Python:

#!/usr/bin/env python3
import json
import gzip

# How to load
with gzip.open("mesh.json.gz", "rb") as infile:
    mesh_id_to_term = json.load(infile)

# Usage example
print(mesh_id_to_term["D059630"]) # Prints 'Mesenchymal Stem Cells'

 

Posted by Uli Köhler in Bioinformatics, Python

How to download & sync PubMed baseline + updates

In our previous post How to download PubMed baseline data using rsync we showed how to download PubMed’s baseline data. The baseline is only updated yearly; however, you can also download the updatefiles, which are typically updated once per day.

The commands to download & sync both sets of files into the PubMed directory:

rsync -Pav --delete ftp.ncbi.nlm.nih.gov::pubmed/baseline/\*.xml.gz PubMed/
rsync -Pav --delete ftp.ncbi.nlm.nih.gov::pubmed/updatefiles/\*.xml.gz PubMed/

The --delete option ensures that files deleted on the server are also deleted locally. For example, when a new baseline dataset is published, you need to delete the previous year’s files to avoid processing duplicate data.

Posted by Uli Köhler in Bioinformatics, C/C++

How to control boost::iostreams gzip compression level

In our previous post How to gzip-compress on-the-fly in C++ using boost::iostreams we showed how to create a gzip-compressing output stream using the boost::iostreams library.

This example shows how to control the compression rate of gzip_compressor:

Instead of constructing boost::iostreams::gzip_compressor() without arguments, pass boost::iostreams::gzip_params(level) as the argument, where level (1..9) is the compression level: 9 is the highest, 1 the lowest. Higher levels produce smaller files but are slower (i.e. consume more CPU time) during compression.

If filesize matters to you, I recommend level 9, since even the highest level is extremely fast on modern computers.

Full example:

#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
using namespace std;

int main(int argc, char** argv) {
    if(argc < 2) {
        cerr << "Usage: " << argv[0] << " <output .gz file>" << endl;
        return 1;
    }
    //Read filename from the first command line argument
    ofstream file(argv[1], ios_base::out | ios_base::binary);
    boost::iostreams::filtering_streambuf<boost::iostreams::output> outbuf;
    outbuf.push(boost::iostreams::gzip_compressor(
        boost::iostreams::gzip_params(9)
    ));
    outbuf.push(file);
    //Convert streambuf to ostream
    ostream out(&outbuf);
    //Write some test data
    out << "This is a test text!\n";
    //Cleanup
    boost::iostreams::close(outbuf); // Don't forget this!
    file.close();
}

Build using this CMakeLists.txt:

cmake_minimum_required(VERSION 3.0)
find_package(Boost 1.36.0 COMPONENTS iostreams)

include_directories(${Boost_INCLUDE_DIRS})
add_executable(iostreams-gz-compress iostreams-gz-compress.cpp)
target_link_libraries(iostreams-gz-compress ${Boost_LIBRARIES})

 

Posted by Uli Köhler in C/C++

How to check if XML element exists in PugiXML

Checking if an element exists in PugiXML is simple: Just call bool(element) or use the element directly inside an if clause:

// Example on using bool(element)
cout << "<root-element> exists: " << std::boolalpha
     << bool(doc.child("root-element")) << endl;
cout << "<not-root-element> exists: " << std::boolalpha
     << bool(doc.child("not-root-element")) << endl;

// Example on using the element directly inside an if clause
if(doc.child("root-element")) {
    cout << "Yes, <root-element> exists!" << endl;
}

Full example:

#include <iostream>
#include <pugixml.hpp>
using namespace std;
using namespace pugi;

int main() {
    xml_document doc;
    xml_parse_result result = doc.load_file("test.xml");

    // Example on using bool(element)
    cout << "<root-element> exists: " << std::boolalpha
         << bool(doc.child("root-element")) << endl;
    cout << "<not-root-element> exists: " << std::boolalpha
         << bool(doc.child("not-root-element")) << endl;

    // Example on using the element directly inside an if clause
    if(doc.child("root-element")) {
        cout << "Yes, <root-element> exists!" << endl;
    }
}

Build using this CMakeLists.txt:

add_executable(pugixml-example pugixml-example.cpp)
target_link_libraries(pugixml-example pugixml)

and this test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<root-element>Test text</root-element>

Compile using

cmake .
make
./pugixml-example

Posted by Uli Köhler in C/C++

How to gzip-compress on-the-fly in C++ using boost::iostreams

This minimal example shows you how to write data to a .gz file in C++, compressing the data on-the-fly using boost::iostreams. Using the modern iostreams layer, as opposed to a block-based approach like zlib's, allows you to use the full power and ease of use of std::ostream.

#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
using namespace std;

int main(int argc, char** argv) {
    if(argc < 2) {
        cerr << "Usage: " << argv[0] << " <output .gz file>" << endl;
        return 1;
    }
    //Read filename from the first command line argument
    ofstream file(argv[1], ios_base::out | ios_base::binary);
    boost::iostreams::filtering_streambuf<boost::iostreams::output> outbuf;
    outbuf.push(boost::iostreams::gzip_compressor());
    outbuf.push(file);
    //Convert streambuf to ostream
    ostream out(&outbuf);
    //Write some test data
    out << "This is a test text!\n";
    //Cleanup
    boost::iostreams::close(outbuf); // Don't forget this!
    file.close();
}

Build using this CMakeLists.txt:

cmake_minimum_required(VERSION 3.0)
find_package(Boost 1.36.0 COMPONENTS iostreams)

include_directories(${Boost_INCLUDE_DIRS})
add_executable(iostreams-gz-compress iostreams-gz-compress.cpp)
target_link_libraries(iostreams-gz-compress ${Boost_LIBRARIES})

 

Posted by Uli Köhler in Boost, C/C++

How to decompress GZ files on-the-fly in C++ using boost::iostreams

This minimal example shows you how to open a .gz file in C++, decompress it on-the-fly using boost::iostreams and then copy its contents to stdout:

#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
using namespace std;

int main(int argc, char** argv) {
    if(argc < 2) {
        cerr << "Usage: " << argv[0] << " <gzipped input file>" << endl;
        return 1;
    }
    //Read from the first command line argument, assume it's gzipped
    ifstream file(argv[1], ios_base::in | ios_base::binary);
    boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
    inbuf.push(boost::iostreams::gzip_decompressor());
    inbuf.push(file);
    //Convert streambuf to istream
    istream instream(&inbuf);
    //Copy everything from instream to stdout
    cout << instream.rdbuf();
    //Cleanup
    file.close();
}

Build using this CMakeLists.txt:

cmake_minimum_required(VERSION 3.0)
find_package(Boost 1.36.0 COMPONENTS iostreams)

include_directories(${Boost_INCLUDE_DIRS})
add_executable(iostreams-gz-decompress iostreams-gz-decompress.cpp)
target_link_libraries(iostreams-gz-decompress ${Boost_LIBRARIES})

Posted by Uli Köhler in Boost, C/C++

How to parse .xml.gz using PugiXML and boost::iostreams

In our previous post Minimal PugiXML file reader example we provided a short example of how to read from an uncompressed XML file using PugiXML. In practice, many large XML files are distributed as .xml.gz packages.

Since you can use boost::iostreams to decompress gzipped data on the fly and pipe it directly into PugiXML, you don’t need to store the uncompressed data on your hard drive.

#include <iostream>
#include <fstream>
#include <pugixml.hpp>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
using namespace std;
using namespace pugi;

int main() {
    // Open "raw" gzipped data stream
    ifstream file("test.xml.gz", ios_base::in | ios_base::binary);
    // Configure decompressor filter
    boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
    inbuf.push(boost::iostreams::gzip_decompressor());
    inbuf.push(file);
    //Convert streambuf to istream
    istream instream(&inbuf);
    // Parse from stream
    xml_document doc;
    xml_parse_result result = doc.load(instream);
    // Print content of root element
    cout << "Load result: " << result.description() << "\n"
         << doc.child("root-element").child_value() // "Test text"
         << endl;
}

Build using this CMakeLists.txt:

cmake_minimum_required(VERSION 3.0)
find_package(Boost 1.36.0 COMPONENTS iostreams)

include_directories(${Boost_INCLUDE_DIRS})
add_executable(pugixml-example pugixml-example.cpp)
target_link_libraries(pugixml-example pugixml ${Boost_LIBRARIES})

and this test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<root-element>Test text</root-element>

Download all three files and then run

gzip test.xml
cmake .
make
./pugixml-example

You should see an output like

Load result: No error
Test text

 

Posted by Uli Köhler in Boost, C/C++

How to fix RapidJSON segmentation faults when building nested Documents

Problem:

You want to build a RapidJSON application that builds a JSON from scratch and is using Documents nested inside other documents, but when you try to run it, you see an error message like

zsh: segmentation fault (core dumped)  ./rapidjson-example

Solution:

Segmentation faults (i.e. illegal memory accesses) can have many causes, but the most common one here is using local allocators that go out of scope while the document still references their memory.

In order to fix the issue, use one allocator for your entire application.

MemoryPoolAllocator<> jsonAlloc; // I recommend declaring this statically

// ...
doc.AddMember("text", Value().SetString("Hello JSON!"), jsonAlloc);

Note that MemoryPoolAllocator never releases any memory from its memory pool.


Posted by Uli Köhler in C/C++

How to fix RapidJSON Assertion `!hasRoot_’ failed.

Problem:

Your program is using RapidJSON but when running it you see an error message like

rapidjson-example: /usr/include/rapidjson/writer.h:452: void rapidjson::Writer<OutputStream, SourceEncoding, TargetEncoding, StackAllocator, writeFlags>::Prefix(rapidjson::Type) [with OutputStream = rapidjson::BasicOStreamWrapper<std::basic_ostream<char> >; SourceEncoding = rapidjson::UTF8<>; TargetEncoding = rapidjson::UTF8<>; StackAllocator = rapidjson::CrtAllocator; unsigned int writeFlags = 0]: Assertion `!hasRoot_' failed.

Solution:

You are using a Writer for more than one Document. While you can use the Stream backing the Writer for any number of documents, each Writer must only be used once!

To fix the issue, create a Writer instance (on the same output Stream) for each document you intend to write.


Posted by Uli Köhler in C/C++

How to create and serialize a document in RapidJSON

RapidJSON is a JSON library optimized for speed, so it lacks some convenience features and easy-to-use documentation on how to create JSON documents from scratch.

Here’s how you can create a Document:

// Generate document: {"text": "Hello JSON!"}
Document doc;
doc.SetObject(); // Make doc an object !
doc.AddMember("text", "Hello JSON!", doc.GetAllocator());

Full example, which prints to cout:

#include <iostream>
#include <rapidjson/document.h>
#include <rapidjson/writer.h>
#include <rapidjson/ostreamwrapper.h>
using namespace rapidjson;
using namespace std;

int main() {
    // Generate document: {"text": "Hello JSON!"}
    Document doc;
    doc.SetObject(); // Make doc an object !
    doc.AddMember("text", "Hello JSON!", doc.GetAllocator());
    // Write to stdout
    OStreamWrapper out(cout);
    Writer<OStreamWrapper> writer(out);
    doc.Accept(writer);
}

 

Posted by Uli Köhler in C/C++

How to write JSON to cout in RapidJSON

RapidJSON does not provide a straightforward way of serializing JSON to cout (= stdout), but you can use OStreamWrapper to do that:

#include <rapidjson/writer.h>
#include <rapidjson/ostreamwrapper.h>
// ... 
OStreamWrapper out(cout);
Writer<OStreamWrapper> writer(out);
doc.Accept(writer);

Full example:

#include <iostream>
#include <rapidjson/document.h>
#include <rapidjson/writer.h>
#include <rapidjson/ostreamwrapper.h>
using namespace rapidjson;
using namespace std;

int main() {
    // Generate document: {"text": "Hello JSON!"}
    Document doc;
    doc.SetObject(); // Make doc an object !
    doc.AddMember("text", "Hello JSON!", doc.GetAllocator());
    // Write to stdout
    OStreamWrapper out(cout);
    Writer<OStreamWrapper> writer(out);
    doc.Accept(writer);
}

 

Posted by Uli Köhler in C/C++