Allgemein A replacement for

Recently we had to work with the tool to convert CONTRAST and TwinScan GTF output to the GFF format, which can be read by many annotation tools.

Working with that script was really hard: it did not report errors at all, and it is not programmatically reusable. There are different versions of the Perl script on the internet, but what we needed was a standardized, short, readable version that does proper command line parsing using a standard tool like argparse, plus a conversion function that is usable from other scripts.


Posted by Uli Köhler in Allgemein

Building LevelDB Debian (.deb) packages


You intend to install LevelDB, but you don’t want to manually install & compile it as described here.

Instead, you just want to use the Debian packaging system and a reproducible method of creating a DEB package from LevelDB.

Reasons for preferring not to compile & install manually could be:

  • You want to deploy LevelDB to one or more environments that don’t have a complete build environment
  • You prefer a clean install-uninstall-purge package lifetime management
  • You need a reproducible process to deploy LevelDB


ffmpeg / avconv : List supported codecs

You can list all codecs supported by libavcodec (the codec library used by ffmpeg / avconv) by using this command:

ffmpeg -codecs

If you don’t have the ffmpeg executable, simply use

avconv -codecs

Note that avconv and ffmpeg are closely related: avconv comes from Libav, a project that was forked from FFmpeg. Starting from Ubuntu 15.04, ffmpeg is available in the repositories again, whereas previously ffmpeg was replaced by avconv.

For details, see this StackOverflow thread and this detailed post about the ffmpeg/libav situation.


Efficiently encoding variable-length integers in C/C++

Using fixed width integers is space-inefficient in many cases, especially if the majority of values are low and only use the less-significant bytes.

This guide describes the basics of varint (variable-length integer) encoding while focusing on C++ as the programming language, but the basic concepts apply to any language.

Varint encodings use only the bytes that are needed to represent your integer value appropriately. A varint algorithm can represent the number 10 in only one byte while using 4 bytes to encode 800000000 (800 million). In many applications this yields a significant overhead reduction, since you would otherwise need to use larger integers if there is a slight chance that your values grow beyond the boundary of the integer type that is applicable for the majority of your values. Additionally, you usually can only use 8, 16, 32 or 64 bit integers, while 48 bit integers need to be coded manually in most languages. For example, if most of your values are between 0 and 100, but a few might be larger than 65535 (the maximum for unsigned 16-bit integers), you would usually use a full 32-bit integer, even if most values could be represented by a single byte.
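To make the idea concrete, here is a minimal sketch of one common varint scheme, the base-128 encoding (7 payload bits per byte, with the highest bit signalling that another byte follows). This is an assumption chosen for illustration, not necessarily the scheme the full article uses, and the function names encodeVarint, decodeVarint and checkRoundTrip are hypothetical. Note that this particular scheme spends one bit per byte on the continuation flag, so 800000000 occupies five bytes here; other varint schemes distribute the length information differently and may need a different number of bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode an unsigned 64-bit value as a base-128 varint:
// 7 payload bits per byte, least significant group first.
// The high bit of each byte is set if more bytes follow.
std::vector<std::uint8_t> encodeVarint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Decode a base-128 varint from the start of `in`.
// Returns false if the input ends before the final byte
// or is longer than a 64-bit value allows.
bool decodeVarint(const std::vector<std::uint8_t>& in, std::uint64_t& value) {
    value = 0;
    int shift = 0;
    for (std::uint8_t byte : in) {
        if (shift >= 64) {
            return false;  // more than 10 bytes: not a valid 64-bit varint
        }
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return true;   // high bit clear: this was the last byte
        }
        shift += 7;
    }
    return false;          // ran out of input before the last byte
}

// Small self-check helper: encodes, decodes and verifies the round trip.
void checkRoundTrip(std::uint64_t value) {
    std::uint64_t decoded = 0;
    const bool ok = decodeVarint(encodeVarint(value), decoded);
    assert(ok && decoded == value);
}
```

With this sketch, encodeVarint(10) yields a single byte, while encodeVarint(300) yields the two bytes 0xAC 0x02.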


Move Minimize, Maximize and Close to the right in Ubuntu Unity

In more recent Ubuntu versions, the minimize, maximize and close icons have moved to the upper left corner of the window.

If you want them to show up on the right side instead, follow this guide:

  1. Open a terminal (e.g. click on Ubuntu Dashboard and type Terminal, then click on Terminal)
  2. Copy and paste this text into the terminal (Ctrl+V doesn’t work here, use right-click -> Paste)
    gconftool-2 -s /apps/metacity/general/button_layout --type=string "menu:minimize,maximize,close"
  3. Press Return / Enter
  4. The icons should shift to the right immediately

Basic Tutorial on Bitfield Arithmetic

This article describes basic operations for manipulating bitfields using boolean operations. Although this article focuses on Java, most programming languages use the same syntax.

What is a Bitfield?

All modern computers use binary arithmetic. That means the most basic unit of information is a bit, and its value can be either 0 or 1. On almost all hardware implementations of binary arithmetic, you can’t address and modify bits directly; you have to use bytes (= 8 bits) instead. Bitfields are vectors of bits where each bit expresses a specific piece of information that can be either true or false. Depending on the application, the bitfield can occupy more or less space.
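As a minimal sketch of what this means in practice, the following C++ snippet treats a single byte as a bitfield of eight flags and sets, clears and tests individual bits with bitwise operators. The article itself focuses on Java, where the operators `|`, `&`, `~` and `<<` are written the same way; the alias Bitfield and the helper names setBit, clearBit and testBit are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// One byte used as a bitfield: eight independent true/false flags.
using Bitfield = std::uint8_t;

// Return a copy of `field` with bit `index` (0 = least significant) set to 1.
inline Bitfield setBit(Bitfield field, unsigned index) {
    assert(index < 8);
    return static_cast<Bitfield>(field | (1u << index));
}

// Return a copy of `field` with bit `index` set to 0.
inline Bitfield clearBit(Bitfield field, unsigned index) {
    assert(index < 8);
    return static_cast<Bitfield>(field & ~(1u << index));
}

// Check whether bit `index` is 1.
inline bool testBit(Bitfield field, unsigned index) {
    assert(index < 8);
    return (field & (1u << index)) != 0;
}
```

Because the byte is the smallest addressable unit, each helper reads the whole byte, combines it with a mask that has exactly one bit set, and returns the whole byte again.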

Possible applications for bitfields include, but are not limited to, Bloom filters and game artificial intelligence. In the latter case they are called bitboards if they represent a specific state of a game board.

You might also use words (2 bytes = short in most configurations), double words (4 bytes = int in most configurations) or quad words (8 bytes = long in most configurations) instead of single bytes for addressing. On 64-bit platforms, double words or quad words are usually most efficient, but anything beyond quad words (i.e. 64 bits) needs more than one CPU instruction, which usually makes computation less efficient (there are some tricks involving SIMD, but this is beyond the scope of this article).
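When a bitfield needs more than 64 bits, one straightforward approach is to spread it over several quad words and compute which word and which bit inside that word a given index refers to. The following C++ sketch illustrates this under that assumption; the class name WideBitfield and its member names are hypothetical, and it makes no attempt at the SIMD tricks mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A bitfield of arbitrary size, stored as a sequence of 64-bit words.
class WideBitfield {
public:
    // Allocate enough 64-bit words to hold `bitCount` bits, all cleared.
    explicit WideBitfield(std::size_t bitCount)
        : bitCount_(bitCount), words_((bitCount + 63) / 64, std::uint64_t{0}) {}

    // Set bit `index` to 1.
    void set(std::size_t index) {
        assert(index < bitCount_);
        words_[index / 64] |= std::uint64_t{1} << (index % 64);
    }

    // Set bit `index` to 0.
    void clear(std::size_t index) {
        assert(index < bitCount_);
        words_[index / 64] &= ~(std::uint64_t{1} << (index % 64));
    }

    // Check whether bit `index` is 1.
    bool test(std::size_t index) const {
        assert(index < bitCount_);
        return (words_[index / 64] >> (index % 64)) & 1u;
    }

    // Number of 64-bit words backing this bitfield.
    std::size_t wordCount() const { return words_.size(); }

private:
    std::size_t bitCount_;
    std::vector<std::uint64_t> words_;
};
```

Each single-bit operation still touches only one word, but operations over the whole field (for example combining two such bitfields) have to loop over all words, which is where the extra instructions come from.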
