Generate PDF with AWS Lambda

Franco Berton - May 21, 2021

#pdf #aws #lambda #nodejs #puppeteer

This article explains how to generate a PDF with AWS Lambda in Node.js using the Puppeteer library.

Introduction: This article is an introductory guide to generating a PDF with AWS Lambda in Node.js using the Puppeteer library.

Prerequisite: This guide assumes that the following modules are integrated into your project:

  • serverless: a framework for building applications composed of microservices and simplifying their deployment to the platform;
  • serverless-bundle: a plugin that optimally packages your ES6 or TypeScript Lambda functions, relying internally on the serverless-webpack plugin;
  • serverless-offline: a plugin that emulates AWS Lambda and API Gateway on your local machine to speed up your development cycles;
  • puppeteer-serverless: the key package for generating a PDF with AWS Lambda. It includes chrome-aws-lambda and puppeteer-core as dependencies.

The dependencies of our package.json will look like this:

"dependencies": {
  "puppeteer-serverless": "^2.0.0"
},
"devDependencies": {
  "serverless": "^2.11.1",
  "serverless-bundle": "^4.0.1",
  "serverless-offline": "^6.8.0",
  "typescript": "^3.9.7"
}

1. Serverless file

The first step to generate a PDF with AWS Lambda in Node.js involves defining a microservice within the serverless file, which contains the configuration for the deployment.

The microservice will be reachable externally through a simple GET API. Here is the definition of the serverless file:

service: pdf

plugins:
  - serverless-bundle
  - serverless-offline

package:
  individually: true

custom: 
  serverless-offline:
    location: .webpack/service
  bundle:
    sourcemaps: false  
provider: 
  name: aws
  runtime: nodejs12.x
  region: eu-central-1
  stage: test
  apiGateway: 
    shouldStartNameWithService: true
    binaryMediaTypes:
      - '*/*'
  tracing:
    apiGateway: true
    lambda: true
functions:
  downloadPdf:
    handler: lambdas/download-pdf.main
    events:
      - http:
          path: download-pdf
          method: get
          cors: true
    timeout: 180

2. Creation of the lambda function

At this point we need to define the Lambda function that does the actual work. The PDF generation process includes:

  • definition of the HTML content to be printed;
  • loading the HTML content into a page with Puppeteer;
  • generation of the PDF file with Puppeteer;
  • conversion of the content to base64.

import puppeteer from "puppeteer-serverless";

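// `event` and `context` are unused here; they are kept to match the Lambda handler signature.
export const main = async (event: any, context: any): Promise<any> => {
    let browser = null;
    let pdf = null;

    try {
      // Launch headless Chromium and load the HTML content to print.
      browser = await puppeteer.launch({});
      const page = await browser.newPage();
      await page.setContent("<html><body><p>Test</p></body></html>", {
        waitUntil: "load",
      });

      // Render the page as an A4 PDF with an empty header and a page-number footer.
      pdf = await page.pdf({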
        format: "A4",
        printBackground: true,
        displayHeaderFooter: true,
        margin: {
          top: 40,
          right: 0,
          bottom: 40,
          left: 0,
        },
        headerTemplate: `
          <div style="border-bottom: solid 1px gray; width: 100%; font-size: 11px;
                padding: 5px 5px 0; color: gray; position: relative;">
          </div>`,
        footerTemplate: `
          <div style="border-top: solid 1px gray; width: 100%; font-size: 11px;
              padding: 5px 5px 0; color: gray; position: relative;">
              <div style="position: absolute; right: 20px; top: 2px;">
                <span class="pageNumber"></span>/<span class="totalPages"></span>
              </div>
          </div>
        `,
      });
    } finally {
      if (browser !== null) {
        await browser.close();
      }
    }

    return {
      headers: {
        'Content-type': 'application/pdf',
        'content-disposition': 'attachment; filename=test.pdf'
      },
      statusCode: 200,
      body: pdf.toString('base64'),
      isBase64Encoded: true
    };
}
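
The last step of the list above, the base64 conversion, can also be isolated in a small helper. This is only a minimal sketch, not part of the original code: `buildPdfResponse` and its `filename` parameter are hypothetical names, and it assumes that `page.pdf()` resolved to a Node.js Buffer, as in the handler above.

```typescript
// Shape of the proxy response returned by the Lambda function.
interface PdfResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
  isBase64Encoded: boolean;
}

// Hypothetical helper: wraps raw PDF bytes in a base64-encoded response.
function buildPdfResponse(pdf: Buffer, filename: string): PdfResponse {
  return {
    statusCode: 200,
    headers: {
      "Content-Type": "application/pdf",
      "Content-Disposition": `attachment; filename=${filename}`,
    },
    // The body must be a string, so the binary PDF is sent as base64;
    // isBase64Encoded signals that it should be decoded back to binary.
    body: pdf.toString("base64"),
    isBase64Encoded: true,
  };
}
```

With a helper like this, the handler could end with `return buildPdfResponse(pdf, "test.pdf");` and the response shape could be tested without starting Chromium.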

3. Including the Chromium binaries in the bundle

As a last step, we update the serverless file to include the Chromium binary files in the bundle, so that the PDF download works on AWS.

The binary files are located inside the chrome-aws-lambda module:

  • node_modules/chrome-aws-lambda/bin/aws.tar.br
  • node_modules/chrome-aws-lambda/bin/chromium.br
  • node_modules/chrome-aws-lambda/bin/swiftshader.tar.br

The updated serverless file adds a copyFiles entry for each of them under custom.bundle:

service: pdf

plugins:
  - serverless-bundle
  - serverless-offline

package:
  individually: true

custom: 
  serverless-offline:
    location: .webpack/service
  bundle:
    sourcemaps: false  
    copyFiles:                       
      - from: 'node_modules/chrome-aws-lambda/bin/aws.tar.br'
        to: './bin' 
      - from: 'node_modules/chrome-aws-lambda/bin/chromium.br'
        to: './bin'   
      - from: 'node_modules/chrome-aws-lambda/bin/swiftshader.tar.br'
        to: './bin'   
provider: 
  name: aws
  runtime: nodejs12.x
  region: eu-central-1
  stage: test
  apiGateway: 
    shouldStartNameWithService: true
    binaryMediaTypes:
      - '*/*'
  tracing:
    apiGateway: true
    lambda: true
functions:
  downloadPdf:
    handler: lambdas/download-pdf.main
    events:
      - http:
          path: download-pdf
          method: get
          cors: true
    timeout: 180
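
Before deploying, it can help to confirm that the three archives actually ended up in the packaged output. The sketch below is a hypothetical diagnostic, not part of the original project: `missingChromiumBinaries` is an invented helper name, and the directory to pass in (for example a `bin` folder inside the build output) is an assumption that depends on how serverless-bundle lays out your build.

```typescript
import * as fs from "fs";
import * as path from "path";

// File names taken from the chrome-aws-lambda bin directory listed above.
const CHROMIUM_BINARIES = ["aws.tar.br", "chromium.br", "swiftshader.tar.br"];

// Hypothetical helper: returns the expected binaries that are absent from binDir.
function missingChromiumBinaries(binDir: string): string[] {
  return CHROMIUM_BINARIES.filter(
    (name) => !fs.existsSync(path.join(binDir, name))
  );
}
```

An empty array means all three files were found in the given directory; any names returned point to a copyFiles entry worth double-checking.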

4. Result
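
One way to sanity-check the result is to decode the base64 body returned by the function and confirm that it starts with the PDF signature `%PDF-`. This is a minimal sketch under that assumption; `looksLikePdf` is a hypothetical helper, not part of the original code.

```typescript
// Hypothetical helper: decodes a base64 response body and checks for the
// "%PDF-" signature found at the start of a PDF file.
function looksLikePdf(base64Body: string): boolean {
  const bytes = Buffer.from(base64Body, "base64");
  return bytes.subarray(0, 5).toString("latin1") === "%PDF-";
}
```

A check like this only confirms that the payload has the right signature, not that the document renders as intended, so opening the downloaded file remains the real test.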

5. Conclusion

The proposed solution shows how simple and quick it is to generate a PDF with AWS Lambda.

As an alternative to this approach, you can use the PDFKit library (https://www.npmjs.com/package/pdfkit), but I consider it a more complex and costly option.

The code of the proposed solution can be viewed in this GitHub repository.

If you like my article, share it.

Your support and feedback mean so much to me.