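Important note: This simple approach only works for up to ~10000 documents, because by default Elasticsearch rejects searches where "from" + "size" exceeds the index.max_result_window setting (10000). Prefer using our scroll-based solution: See ElasticSearch: How to iterate / scroll through all documents in index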
Use this helper function to iterate over all the documents in an index:
def es_iterate_all_documents(es, index, pagesize=250, **kwargs):
    """
    Helper to iterate ALL values from an index.
    Yields all the documents.
    """
    offset = 0
    while True:
        result = es.search(index=index, **kwargs, body={
            "size": pagesize,
            "from": offset
        })
        hits = result["hits"]["hits"]
        # Stop after no more docs
        if not hits:
            break
        # Yield each entry
        yield from (hit['_source'] for hit in hits)
        # Continue from there
        offset += pagesize
Usage example:
for entry in es_iterate_all_documents(es, 'my_index'):
    print(entry) # Prints the document as stored in the DB
How it works
You can iterate over all documents in an index in ElasticSearch by using queries like
{ "size": 250, "from": 0 }
and increasing "from" by "size" after each iteration, until a query returns no more hits.
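To make the paging sequence concrete, here is a minimal sketch that only generates the request bodies such a loop would send, without talking to a cluster. The function name paging_bodies and the total_docs parameter are hypothetical and exist only for illustration. The limit of 10000 assumes the default index.max_result_window setting. Note that the real helper does not know the document count in advance and sends one extra query that comes back empty.

```python
from typing import Dict, Iterator

# Assumption: the index uses Elasticsearch's default index.max_result_window.
MAX_RESULT_WINDOW = 10000


def paging_bodies(total_docs: int, pagesize: int = 250,
                  max_window: int = MAX_RESULT_WINDOW) -> Iterator[Dict[str, int]]:
    """Yield the from/size request bodies needed to cover total_docs documents.

    Hypothetical illustration only: it models the offsets, not the search itself.
    """
    offset = 0
    while offset < total_docs:
        # A search is rejected once from + size exceeds the result window.
        if offset + pagesize > max_window:
            raise ValueError(
                f"from ({offset}) + size ({pagesize}) exceeds {max_window}; "
                "use the scroll-based approach instead"
            )
        yield {"size": pagesize, "from": offset}
        # Continue with the next page
        offset += pagesize


if __name__ == "__main__":
    for body in paging_bodies(600):
        print(body)
```

For 600 documents and the default page size this yields three bodies with "from" set to 0, 250 and 500. For more than 10000 documents it raises once the next page would cross the window, which is the point where the simple approach stops working.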