Efficiently encoding variable-length integers in C/C++

Using fixed-width integers is space-inefficient in many cases, especially if the majority of values are low and only use the least significant bytes.

This guide describes the basics of varint (variable-length integer) encoding. It focuses on C++ as the programming language, but the basic concepts apply to any language.

Varint encodings use only as many bytes as are needed to represent your integer value. A varint algorithm can represent the number 10 in only one byte while using 4 bytes to encode 800000000 (800 million). In many applications this significantly reduces overhead: with fixed-width types, you have to pick a larger integer type whenever there is a slight chance that some values grow beyond the range of the type that suffices for the majority of your values. Additionally, you can usually only use 8-, 16-, 32- or 64-bit integers, while 48-bit integers need to be coded manually in most languages. For example, if most of your values are between 0 and 100, but a few might be larger than 16384 (for unsigned integers), you would usually use a full 32-bit integer, even though most values could be represented by a single byte.
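
The excerpt does not show which varint scheme the full article uses, so the following is only a minimal sketch under the assumption of the common base-128 layout (7 payload bits per byte, with the top bit marking "more bytes follow"). The function names varint_encode, varint_decode and varint_demo are made up for illustration. Note that the size boundaries depend on the scheme: in this particular one, 800000000 needs 5 bytes because each byte carries only 7 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode `value` into `out`, which must have room for 10 bytes
 * (the maximum for a 64-bit value). Returns the number of bytes written. */
size_t varint_encode(uint64_t value, uint8_t *out) {
    size_t n = 0;
    while (value >= 0x80) {
        /* Low 7 bits of the value plus the continuation flag */
        out[n++] = (uint8_t)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value; /* Last byte: continuation flag is clear */
    return n;
}

/* Decode one varint from `in` (at most `len` bytes) into `*value`.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than 10 bytes. */
size_t varint_decode(const uint8_t *in, size_t len, uint64_t *value) {
    uint64_t result = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        result |= (uint64_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}

/* Round-trip illustration: small values take one byte, larger ones more. */
void varint_demo(void) {
    uint8_t buf[10];
    uint64_t decoded = 0;
    assert(varint_encode(10, buf) == 1);
    size_t n = varint_encode(16384, buf); /* 2^14 no longer fits in 2 bytes */
    assert(n == 3);
    assert(varint_decode(buf, n, &decoded) == n && decoded == 16384);
}
```

With this layout, values up to 127 take one byte and values up to 16383 take two, which is why a data set dominated by small numbers shrinks considerably compared to fixed 32-bit storage.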

Posted by Uli Köhler in Allgemein

Move Minimize, Maximize and Close to the right in Ubuntu Unity

In more recent Ubuntu versions, the minimize, maximize and close icons have moved to the upper-left corner of the window.

If you want them to show up on the right side instead, follow this guide:

  1. Open a terminal (e.g. click on Ubuntu Dashboard and type Terminal, then click on Terminal)
  2. Copy and paste this text into the terminal (Ctrl+V doesn’t work here, use right-click -> Paste)
    gconftool-2 -s /apps/metacity/general/button_layout --type=string "menu:minimize,maximize,close"
  3. Press Return / Enter
  4. The icons should shift to the right immediately
Posted by Uli Köhler in Allgemein

A text-to-Brainfuck/RNA converter in ANSI C99

Brainfuck encoder

The following small C99 program reads a string from stdin and prints out a Brainfuck program that prints the same string on stdout.

Compile using gcc -o bf bf.c

Use it like this:

cat my.txt | ./bf > my.bf

Source code:

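#include <stdio.h>

int main(void) {
    int c;                     // int, not unsigned char, so EOF stays distinguishable
    unsigned char curval = 0;  // value currently held in the Brainfuck cell
    // Read stdin byte by byte until EOF or a read error
    while ((c = getchar()) != EOF) {
        // Step the cell up or down until it equals the next input byte
        while (curval != c) {
            if (curval < c) {
                putchar('+');
                curval++;
            } else {
                putchar('-');
                curval--;
            }
        }
        // Emit the Brainfuck output instruction for this byte
        putchar('.');
    }
    return 0;
}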

How does it work?

Basically, it uses just one of the registers (cells) of the Brainfuck Turing machine and increments or decrements that register until it holds the next byte to print. It doesn’t use any of the more ‘advanced’ features of Brainfuck, like loops.
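
To look at the same idea from the decoding side, here is a small sketch that interprets only the three instructions the encoder emits. It is not part of the original program; the names bf_run_subset and bf_demo are hypothetical, and the sketch assumes the usual 8-bit wrapping cell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: runs a program consisting only of '+', '-' and '.'
 * on a single 8-bit cell and stores every printed byte in `out`.
 * Returns the number of bytes stored (at most `cap`). */
size_t bf_run_subset(const char *prog, unsigned char *out, size_t cap) {
    unsigned char cell = 0; /* the one cell the encoder relies on */
    size_t n = 0;
    for (; *prog != '\0'; prog++) {
        switch (*prog) {
        case '+': cell++; break; /* unsigned char wraps from 255 to 0 */
        case '-': cell--; break; /* and from 0 to 255 */
        case '.': if (n < cap) out[n++] = cell; break;
        default: break;          /* every other character is ignored */
        }
    }
    return n;
}

/* "+++." prints the byte 3; the following "--." reuses the cell and prints 1. */
void bf_demo(void) {
    unsigned char out[2];
    assert(bf_run_subset("+++.--.", out, sizeof out) == 2);
    assert(out[0] == 3 && out[1] == 1);
}
```

Because the cell keeps its value between outputs, each byte only costs the distance to the previous byte: for the input AB the encoder emits 65 plus signs, a dot, one more plus sign and another dot.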

Posted by Uli Köhler in C/C++, Fun

How to resolve ADB Sideload “error: closed”

Problem:

You are using adb sideload to communicate with your Android device (e.g. to flash the Nexus 4 using Clockworkmod recovery), but every time you try to execute `adb sideload` you get this error message:

error: closed

Posted by Uli Köhler in Android

Compiling & Installing LevelDB on Linux

Update: Please also take a look at this followup article for an automatic compilation script that builds Ubuntu DEB packages!

Problem:

You want to compile and install LevelDB (including development headers) on your Linux computer. ./configure && make && make install does not work so you don’t know how to do this.

or:

You have successfully compiled LevelDB, but make install doesn’t work (there is no official installation procedure yet) and you don’t know how to install it on your system.

Posted by Uli Köhler in Databases

Scalar vs packed operations in SSE

If you look at any SSE instruction table, you might notice that there are two basic types of operations:

  • Packed instructions (the assembly instruction ends with PS)
  • Scalar instructions (the assembly instruction ends with SS)

For most operations, there are two versions, one packed and one scalar.

What’s the difference between them? It’s pretty simple:

  • Scalar operations operate on only one element, for example a single 32-bit float.
  • Packed operations operate on every element of the vector in parallel, e.g. they multiply four 32-bit floats in a single instruction.

SSE gains its performance from packed operations, which implement the SIMD paradigm (a single instruction processes multiple values). However, it is occasionally useful to avoid expensive copying by using scalar operations that operate on the SSE registers.
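
The difference can be sketched with the compiler intrinsics from xmmintrin.h, assuming an x86 target with SSE support. _mm_add_ps corresponds to the packed ADDPS instruction and _mm_add_ss to the scalar ADDSS instruction; the wrapper names add_packed, add_scalar and sse_demo are made up for this example.

```c
#include <assert.h>
#include <xmmintrin.h> /* SSE intrinsics */

/* Packed add (ADDPS): all four float lanes are added in one instruction. */
void add_packed(const float a[4], const float b[4], float out[4]) {
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Scalar add (ADDSS): only lane 0 is added, lanes 1 to 3 are copied from a. */
void add_scalar(const float a[4], const float b[4], float out[4]) {
    _mm_storeu_ps(out, _mm_add_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Same inputs, different results depending on the instruction type. */
void sse_demo(void) {
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4];
    add_packed(a, b, out);
    assert(out[0] == 11.0f && out[3] == 44.0f);
    add_scalar(a, b, out);
    assert(out[0] == 11.0f && out[3] == 4.0f);
}
```

In the scalar case the value already sits in an SSE register, so a single-element computation can continue there without moving data to another register file first.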

Also see the Original source

Posted by Uli Köhler in Performance

Shell: Strip directory from path

Problem:

In the Linux shell, you have a file path and you want to strip everything but the filename, for example you have the path ../images/photo.jpg and want only photo.jpg.

Posted by Uli Köhler in Shell

Basic Tutorial on Bitfield Arithmetic

This article describes basic operations for manipulating bitfields using boolean operations. Although this article focuses on Java, most programming languages use the same syntax.

What is a Bitfield?

All modern computers use binary arithmetic, which means that the most basic unit of information is a bit whose value can be either 0 or 1. On almost all hardware implementations of binary arithmetic, you can’t address and modify bits directly; you have to use bytes (= 8 bits) instead. Bitfields are vectors of bits where each bit expresses a specific piece of information that can be either true or false. Depending on the application, the bitfield can occupy more or less space.

Possible applications for bitfields include, but are not limited to, Bloom filters and game artificial intelligence. In the latter case they are called bitboards if they represent a specific state of a game board.

You might also use words (2 bytes = short in most configurations), double words (4 bytes = int in most configurations) or quad words (8 bytes = long in most configurations) instead of single bytes for addressing. On 64-bit platforms, double words or quad words are usually most efficient but anything beyond quad words (i.e. 64 bits) needs more than one instruction in the CPU, which usually makes computation inefficient (there are some tricks involving SIMD, but this is beyond the scope of this article).
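
As a rough sketch of what this looks like in practice, the following stores 64 flags in one quad word and sets, clears and tests individual bits. It is written in C rather than Java to match the other examples on this page; the operators |, &, ~, << and >> are spelled the same way in Java. The names bitfield64, bit_set, bit_clear, bit_test and bitfield_demo are chosen for illustration only, and the bit index is assumed to be in the range 0 to 63.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t bitfield64; /* one quad word = 64 independent flags */

/* Return a copy of f with bit i set to 1. */
bitfield64 bit_set(bitfield64 f, unsigned i) {
    return f | ((bitfield64)1 << i);
}

/* Return a copy of f with bit i cleared to 0. */
bitfield64 bit_clear(bitfield64 f, unsigned i) {
    return f & ~((bitfield64)1 << i);
}

/* Report whether bit i of f is 1. */
bool bit_test(bitfield64 f, unsigned i) {
    return ((f >> i) & 1) != 0;
}

/* Set two flags, clear one again, then check what is left. */
void bitfield_demo(void) {
    bitfield64 f = 0;
    f = bit_set(f, 3);
    f = bit_set(f, 63);
    f = bit_clear(f, 3);
    assert(!bit_test(f, 3) && bit_test(f, 63));
}
```

Each of these helpers is a shift plus one bitwise operation on a single machine word, which is the reason a 64-bit bitfield stays cheap on 64-bit platforms.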

Posted by Uli Köhler in Allgemein

Othello in Java: Part 1: Data structures

You’ve got a big problem. Someone encourages you to implement a complete Othello UI+AI in Java, but you don’t have any idea how to do that. If you already know how to implement the basics and you are interested in more advanced strategy concepts, you might be interested in the other parts of this series (yet to come).

In this multi-part series I will not provide a complete solution to any of the standard Othello tasks. Instead, I will provide (hopefully) helpful hints on how to get your coding going and explain how your code works.

Posted by Uli Köhler in Java