PDFJS: Read PDF from memory Buffer in NodeJS
Note: This post uses async/await and therefore requires NodeJS 8+.
This is how to read a PDF from a file on disk, e.g. mypdf.pdf:
pdfjs.getDocument('mypdf.pdf');
Full example:
const pdfjs = require('pdfjs-dist');
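async function readPDF() {
    // getDocument() returns a loading task; in recent pdfjs-dist releases
    // the document is obtained by awaiting its .promise property
    const pdf = await pdfjs.getDocument('mypdf.pdf').promise;
    // ...
}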
Here’s how you can read the PDF from a memory buffer:
pdfjs.getDocument({data: buffer});
Full example:
const fs = require('mz/fs');
const pdfjs = require('pdfjs-dist');
async function readPDF() {
    // Read file into buffer
    const buffer = await fs.readFile('mypdf.pdf');
    // Parse PDF from buffer; in recent pdfjs-dist releases the document
    // is obtained by awaiting the loading task's .promise property
    const pdf = await pdfjs.getDocument({data: buffer}).promise;
    // ...
}
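As far as I know, some newer pdfjs-dist releases reject a NodeJS Buffer and ask for a plain Uint8Array instead. Here's a minimal sketch that copies the buffer first. The helper name loadPdfFromBuffer is hypothetical, and the pdfjs module is passed in as a parameter only so the snippet has no hard dependency on a particular pdfjs-dist version:

```javascript
// Hypothetical helper (not part of pdfjs-dist): open a PDF from an in-memory buffer.
// `pdfjs` is the object returned by require('pdfjs-dist'), passed in by the caller.
async function loadPdfFromBuffer(pdfjs, buffer) {
    // Copy the bytes into a plain Uint8Array.
    // Assumption: the installed pdfjs-dist version does not accept a Buffer directly.
    const data = new Uint8Array(buffer);
    // getDocument() returns a loading task; the document is delivered via .promise
    const loadingTask = pdfjs.getDocument({data: data});
    return loadingTask.promise;
}
```

You would call it as `const pdf = await loadPdfFromBuffer(pdfjs, buffer);` inside an async function. The copy costs some extra memory, so if your pdfjs-dist version accepts a Buffer you can skip it.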
Using mz/fs is not required; it's just a utility library that makes it possible to use await with file operations.
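If you'd rather avoid the extra dependency, here's a sketch of the same file read using only NodeJS built-ins. It relies on util.promisify, which is available from NodeJS 8 onwards. The helper name readIntoBuffer is hypothetical and only used for illustration:

```javascript
const fs = require('fs');
const { promisify } = require('util');

// Promise-returning wrapper around the callback-based fs.readFile
const readFile = promisify(fs.readFile);

// Hypothetical helper: read a whole file into memory
async function readIntoBuffer(filePath) {
    // Without an encoding argument, readFile resolves to a Buffer
    return readFile(filePath);
}
```

The resulting buffer can then be handed to pdfjs.getDocument({data: buffer}) just like the one read with mz/fs.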