In our previous post, "Elasticsearch Python minimal index() / insert example", we showed how to insert a document into Elasticsearch.
When inserting a large number of documents into Elasticsearch, you will notice that it is extremely slow to wait for each API call to finish before inserting the next document.
In this post we’ll show a simple way of running many requests in parallel, so that multiple index operations are in flight while your code is processing more documents. For this, we’ll use concurrent.futures.ThreadPoolExecutor and, after submitting all documents to the executor, call concurrent.futures.wait to wait for all requests to finish before exiting.
#!/usr/bin/env python3
from elasticsearch import Elasticsearch
from concurrent.futures import ThreadPoolExecutor
import concurrent.futures

index_executor = ThreadPoolExecutor(64)
futures = []
es = Elasticsearch()
for i in range(1000):
    future = index_executor.submit(es.index, index="test-index", id=i, body={"test": 123})
    futures.append(future)
print("Waiting for requests to complete...")
concurrent.futures.wait(futures)
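One thing to keep in mind: concurrent.futures.wait only blocks until the futures are done. It does not raise if an individual request failed, so errors can go unnoticed. The following is a minimal sketch of how failures could be collected afterwards using Future.exception(). To keep it runnable without a cluster, it uses a hypothetical stand-in function, fake_index, in place of es.index. The helper name index_all and the simulated failure rule are also assumptions made for illustration, not part of the original script.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def fake_index(index, id, body):
    """Hypothetical stand-in for es.index: fails for every 250th id."""
    if id % 250 == 0:
        raise RuntimeError(f"simulated failure for document {id}")
    return {"_index": index, "_id": id, "result": "created"}


def index_all(index_fn, count, workers=64):
    """Submit `count` index calls; return (number succeeded, {id: exception})."""
    with ThreadPoolExecutor(workers) as executor:
        # Map each future back to the document id it was submitted for
        futures = {
            executor.submit(index_fn, index="test-index", id=i, body={"test": 123}): i
            for i in range(count)
        }
        # Block until every request has either finished or failed
        wait(list(futures))

    failures = {}
    for future, doc_id in futures.items():
        # exception() returns None if the call completed without raising
        error = future.exception()
        if error is not None:
            failures[doc_id] = error
    return count - len(failures), failures


succeeded, failures = index_all(fake_index, 1000)
print(f"{succeeded} succeeded, {len(failures)} failed: {sorted(failures)}")
```

To try this against a real cluster, pass es.index instead of fake_index. The failed ids can then be retried or logged as needed.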