ElasticSearch equivalent to MongoDB .distinct(…)

Let’s say we have an ElasticSearch index called strings with a field pattern of {"type": "keyword"}.

Now we want to do the equivalent of MongoDB db.getCollection('...').distinct('pattern'):

Solution:

In Python you can use the iterate_distinct_field() helper from this previous post on ElasticSearch distinct. Full example:

from elasticsearch import Elasticsearch

es = Elasticsearch()

def iterate_distinct_field(es, fieldname, pagesize=250, **kwargs):
    """
    Helper to get all distinct values from ElasticSearch
    (ordered by number of occurrences)
    """
    compositeQuery = {
        "size": pagesize,
        "sources": [{
                fieldname: {
                    "terms": {
                        "field": fieldname
                    }
                }
            }
        ]
    }
    # Iterate over pages
    while True:
        result = es.search(**kwargs, body={
            "aggs": {
                "values": {
                    "composite": compositeQuery
                }
            }
        })
        # Yield each bucket
        for aggregation in result["aggregations"]["values"]["buckets"]:
            yield aggregation
        # Set "after" field
        if "after_key" in result:
            compositeQuery["after_key"] = result["after_key"]
        else: # Finished!
            break

# Usage example
for result in iterate_distinct_field(es, fieldname="pattern.keyword", index="strings"):
    print(result) # e.g. {'key': {'pattern': 'mypattern'}, 'doc_count': 315}

How to query distinct field values in ElasticSearch

Let’s say we have an ElasticSearch index called strings with a field pattern of {"type": "keyword"}.

Get the top N values of the column

If we want to get the top N ( 12 in our example) entries, i.e. the patterns that are present in the most documents, we can use this query:

{
    "aggs" : {
        "patterns" : {
            "terms" : {
                "field" : "pattern.keyword",
                "size": 12
            }
        }
    }
}

Full example in Python:

from elasticsearch import Elasticsearch

es = Elasticsearch()

result = es.search(index="strings", body={
    "aggs" : {
        "patterns" : {
            "terms" : {
                "field" : "pattern.keyword",
                "size": 12
            }
        }
    }
})
for aggregation in result["aggregations"]["patterns"]["buckets"]:
    print(aggregation) # e.g. {'key': 'mypattern, 'doc_count': 2802}

See the terms aggregation documentation for more infos.

Get all the distinct values of the column

Getting all the values is slightly more complicated since we need to use a composite aggregation that returns an after_key to paginate the query.

This Python helper function will automatically paginate the query with configurable page size:

from elasticsearch import Elasticsearch

es = Elasticsearch()

def iterate_distinct_field(es, fieldname, pagesize=250, **kwargs):
    """
    Helper to get all distinct values from ElasticSearch
    (ordered by number of occurrences)
    """
    compositeQuery = {
        "size": pagesize,
        "sources": [{
                fieldname: {
                    "terms": {
                        "field": fieldname
                    }
                }
            }
        ]
    }
    # Iterate over pages
    while True:
        result = es.search(**kwargs, body={
            "aggs": {
                "values": {
                    "composite": compositeQuery
                }
            }
        })
        # Yield each bucket
        for aggregation in result["aggregations"]["values"]["buckets"]:
            yield aggregation
        # Set "after" field
        if "after_key" in result:
            compositeQuery["after_key"] = result["after_key"]
        else: # Finished!
            break

# Usage example
for result in iterate_distinct_field(es, fieldname="pattern.keyword", index="strings"):
    print(result) # e.g. {'key': {'pattern': 'mypattern'}, 'doc_count': 315}

How to fix ElasticSearch ‘Fielddata is disabled on text fields by default’ for keyword field

Problem:

You have a field in ElasticSearch named e.g. patterns of type keyword. However, when you query for an aggregation of this field e.g.

es.search(index="strings", body={
    "size": 0,
    "aggs" : {
        "patterns" : {
            "terms" : { "field" : "pattern" }
        }
    }
})

you see this error message:

elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'Fielddata is disabled on text fields by default. Set fielddata=true on [pattern] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.'

Solution:

This error message is confusing since you already have a keyword field. However, the ElasticSearch fielddata documentation tells us that you need to to use pattern.keyword in the query instead of just pattern.

Full example:

es.search(index="strings", body={
    "size": 0,
    "aggs" : {
        "patterns" : {
            "terms" : { "field" : "pattern.keyword" }
        }
    }
})

How to fix ModuleNotFoundError: No module named ‘grpc’ in Python

Problem:

You want to run a Python script that is using some Google Cloud services. However you see an error message similar to this:

[...]
  File "/usr/local/lib/python3.6/dist-packages/google/api_core/gapic_v1/__init__.py", line 16, in <module>
    from google.api_core.gapic_v1 import config
  File "/usr/local/lib/python3.6/dist-packages/google/api_core/gapic_v1/config.py", line 23, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'

Solution:

Install the grpcio Python module:

sudo pip3 install grpcio

or, for Python 2.x

sudo pip install grpcio

How to solve permission denied error when trying to echo into a file

Problem:

You are trying to echo a string into a file only accessible to root, e.g.:

sudo echo "foo" > /etc/my.cnf

but you only see this error message:

-bash: /etc/my.cnf: Permission denied

Solution:

Use tee instead (this will also echo the string to stdout)

echo "foo" | sudo tee /etc/my.cfg # Overwrite: Equivalent of echo "foo" > /etc/my.cnf
echo "foo" | sudo tee -a /etc/my.cfg # Append: Equivalent of echo "foo" >> /etc/my.cnf

The reason for the error message is that while echo is executed using sudo, >> /etc/my.cnf is run as normal user (not root).

An alternate approach is to run a sub-shell as sudo:

sudo bash -c 'echo "foo" > /etc/my.cnf'

but this has several caveats e.g. related to escaping so I usually don’t recommend this approach.

How to fix ERROR: Couldn’t connect to Docker daemon at http+docker://localhost – is it running?

Problem:

You want to run a docker-container or docker-compose application, but once you try to start it, you see this error message:

ERROR: Couldn't connect to Docker daemon at http+docker://localhost - is it running?

If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.

Solution:

There are two possible reasons for this error message.

The common reason is that the user you are running the command as does not have the permissions to access docker.

You can fix this either by running the command as root using sudo (since root has the permission to access docker) or adding your user to the docker group:

sudo usermod -a -G docker $USER

and then logging out and logging back in completely (or restarting the system/server).

The other reason is that you have not started docker. On Ubuntu, you can start it using

sudo systemctl enable docker # Auto-start on boot
sudo systemctl start docker # Start right now

 

TechOverflow’s Docker install instructions automatically takes care of starting & enabling the service

How to configure OctoPrint/OctoPi with Ethernet using a static IP

In order to configure OctoPrint/OctoPi to use the Raspberry Pi Ethernet interface with a static IP, first open the rootfs partition on the SD card. After that, open etc/network/interfaces in your preferred text editor (you might need to open it as root, e.g. sudo nano etc/network/interfaces – ensure that you don’t edit your local computer’s /etc/network/interfaces but the one on the SD card).

Now copy the following text to the end of etc/network/interfaces:

auto eth0
allow-hotplug eth0
iface eth0 inet static
    address 192.168.1.234
    netmask 255.255.255.0
    gateway 192.168.1.1
    network 192.168.1.0
    broadcast 192.168.0.255
    dns-nameservers 8.8.8.8 8.8.4.4

You might need to adjust the IP addresses so they match your router.

Save the file and insert the SD card into your Raspberry Pi. You should be able to ping in – in our example, ping 192.168.1.234.

Tested with OctoPrint 0.16.0

Original source: OctoPrint forum

How to fix ‘elasticsearch exited with code 78’

Problem:

You want to run ElasticSearch using docker, but the container immediately stops again using this error message

elasticsearch exited with code 78

or

elasticsearch2 exited with code 78

Solution:

If you look through the entire log message, you’ll find lines like

elasticsearch     | [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Therefore we need to increase the vm.max_map_count limit:

sudo sysctl -w vm.max_map_count=524288

Now we need to edit /etc/sysctl.conf so the setting will also be in effect after a reboot.

Look for any vm.max_map_count line in /etc/sysctl.conf. If you find one, set its value to 524288. If there is no such line present, add the line

vm.max_map_count=524288

to the end of /etc/sysctl.conf

Original source: GitHub

 

MongoDB: How to run db.adminCommand() in NodeJS

Problem:

You want to run a db.adminCommand() in NodeJS using the node-mongodb-native client, e.g. you want to run the NodeJS equivalent of

db.adminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes: 100151432});

Solution:

Use conn.executeDbAdminCommand() where db is a MongoDB database object.

db.executeDbAdminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes: 100151432});

Full example:

// To install, use npm i --save mongodb
const MongoClient = require('mongodb').MongoClient;

async function configureMongoDB() {
    // Connect to MongoDB
    const conn = await MongoClient.connect('mongodb://localhost:27017/', { useNewUrlParser: true });
    const db = await conn.db('mydb');
    // Configure MongoDB settings
    await db.executeDbAdminCommand({
        setParameter: 1,
        internalQueryExecMaxBlockingSortBytes: 100151432
    });
    // Cleanup
    return conn.close();
}

// Run configureMongoDB()
configureMongoDB().then(() => {}).catch(console.error)

 

How to find absolute path on webserver using PHP

The following script shows you the absolute path on the webserver which often can’t be found using FTP alone.

<?php /* path.php */
list($scriptPath) = get_included_files();
echo $scriptPath;
?>

Upload this script to your webspace using FTP and then access it using the browser. It will show you a path like

/var/www/httpdocs/webmail.techoverflow.net/path.php

How to install docker and docker-compose on Ubuntu in 30 seconds

Copy and paste these command blocks into your Linux shell. You need to copy & paste one block at a time – you can paste the next block once the previous block is finished!

# Install prerequisites
sudo apt-get update
sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
# Add docker's package signing key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Add repository
sudo add-apt-repository -y "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Install latest stable docker stable version
sudo apt-get update
sudo apt-get -y install docker-ce
# Install docker-compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.23.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod a+x /usr/local/bin/docker-compose
# Enable & start docker
sudo systemctl enable docker
sudo systemctl start docker

Note that this will install Docker as deb package whereas docker-compose will be downloaded to /usr/local/bin.

How to read NodeJS child-process.exec stdout/stderr using async/await Promises

You want to run a command like file my.pdf using NodeJS child-process.exec and get its stdout after it’s finished.

Solution:

TL;DR: (await exec('file my.pdf')).stdout

We’re using child-process-promise here in order to simplify our implementation. Install it using npm i --save child-process-promise !

const { exec } = require('child-process-promise');

async function run () {
    const ret = await exec(`file my.pdf`);
    return ret.stdout;
}

run().then(console.log).catch(console.error);

You can also use .stderr instead of .stdout to get the stderr output as a string

How to fix MicroPython I2C no data

Problem:

You’ve configured MicroPython’s I2C similar to this (in my case on the ESP8266 but this applies to many MCUs):

i2c = machine.I2C(-1, machine.Pin(5), machine.Pin(4))

but you can’t find any devices on the bus:

>>> i2c.scan()
[]

Solution:

Likely you forgot to configure the pins as pullups. I2C needs pullups to work, and many MCUs (like the ESP8266) provide support for integrated (weak) pull-ups.

p4 = machine.Pin(4, mode=machine.Pin.OUT, pull=machine.Pin.PULL_UP)
p5 = machine.Pin(5, mode=machine.Pin.OUT, pull=machine.Pin.PULL_UP)
i2c = machine.I2C(-1, p5, p4)

i2c.scan() # [47]

You can also verify this by checking with a multimeter or an oscilloscope: When no communication is going on on the I2C bus, the voltage should be equivalent to the supply voltage of your MCU (usually 3.3V or 5V – 0V indicates a missing pullup or some other error).