aws-lambda-libreoffice

Utility to work with Docker version of LibreOffice in Lambda

Install

$ yarn add @shelf/aws-lambda-libreoffice

Features

  • Includes CJK and X11 fonts bundled in the base Docker image!
  • Relies on LibreOffice 7.4, which, unlike the previous layer-based version of this package, is not stripped of features
  • Requires the Node.js 16.x runtime (x86_64)

Requirements

Lambda Docker Image

First, create a Docker image for your Lambda function. See the example in the libreoffice-lambda-base-image repo.

Example:

FROM public.ecr.aws/shelf/lambda-libreoffice-base:7.4-node16-x86_64

COPY ./ ${LAMBDA_TASK_ROOT}/

RUN yarn install

CMD [ "handler.handler" ]

Lambda Configuration

  • At least 3008 MB of RAM is recommended
  • At least 45 seconds of Lambda timeout is necessary
  • To support larger files, you can extend Lambda's /tmp space using the ephemeral-storage parameter
  • Set environment variable HOME to /tmp
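
The settings above can be checked programmatically before deploying. The following is a minimal sketch and not part of this package: checkLambdaConfig and the shape of its argument (memoryMB, timeoutSeconds, env) are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper (not part of @shelf/aws-lambda-libreoffice):
// compares a function's settings against the recommendations listed above.
function checkLambdaConfig(config) {
  const problems = [];

  if (!(config.memoryMB >= 3008)) {
    problems.push('memory: at least 3008 MB of RAM is recommended');
  }
  if (!(config.timeoutSeconds >= 45)) {
    problems.push('timeout: at least 45 seconds is necessary');
  }
  if (!config.env || config.env.HOME !== '/tmp') {
    problems.push('env: HOME must be set to /tmp');
  }

  return problems;
}

// Example: a function with too little memory and no HOME override.
console.log(checkLambdaConfig({memoryMB: 1024, timeoutSeconds: 60, env: {}}));
```

An empty array means all three checks passed. Ephemeral storage is left out because the list above gives no fixed minimum for it.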

Usage (For version 4.x; based on a Lambda Docker Image)

Once you have packaged your Lambda function as a Docker image, you can use this package:

const {convertTo, canBeConvertedToPDF} = require('@shelf/aws-lambda-libreoffice');

module.exports.handler = async () => {
  // assuming there is a document.docx file inside /tmp dir
  // original file will be deleted afterwards

  // invoking this function is optional; skip it if you're sure about the file format
  if (!canBeConvertedToPDF('document.docx')) {
    return false;
  }

  return convertTo('document.docx', 'pdf'); // returns /tmp/document.pdf
};
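
The handler above assumes the document is already in /tmp. As a rough sketch of how it might get there, the example below writes an incoming payload to /tmp before converting it. The event shape ({filename, contentBase64}), the makeHandler factory, and the returned object are all assumptions made for illustration. The conversion functions are passed in as parameters so the sketch can be exercised with stubs.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical factory: in a real handler, convertTo and canBeConvertedToPDF
// would come from require('@shelf/aws-lambda-libreoffice').
function makeHandler({convertTo, canBeConvertedToPDF}, tmpDir = '/tmp') {
  return async event => {
    // Assumed event shape: {filename: 'document.docx', contentBase64: '...'}
    const filename = path.basename(event.filename); // drop any directory part

    if (!canBeConvertedToPDF(filename)) {
      return {ok: false, reason: 'unsupported format'};
    }

    // The package expects the input file to be inside /tmp.
    fs.writeFileSync(
      path.join(tmpDir, filename),
      Buffer.from(event.contentBase64, 'base64')
    );

    // `await` works whether convertTo returns a plain value or a promise.
    const pdfPath = await convertTo(filename, 'pdf');

    return {ok: true, pdfPath};
  };
}

// Possible wiring inside the Lambda image:
// const lo = require('@shelf/aws-lambda-libreoffice');
// module.exports.handler = makeHandler(lo);
```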

Usage (For version 3.x; based on a Lambda Layer)

This version requires Node 12.x or higher.

NOTE: Since version 2.0.0, the npm package no longer ships the 85 MB LibreOffice but relies on libreoffice-lambda-layer instead. Follow the instructions in that repo on how to add a Lambda layer.

const {convertTo, canBeConvertedToPDF} = require('@shelf/aws-lambda-libreoffice');

module.exports.handler = async () => {
  // assuming there is a document.docx file inside /tmp dir
  // original file will be deleted afterwards

  if (!canBeConvertedToPDF('document.docx')) {
    return false;
  }

  return convertTo('document.docx', 'pdf'); // returns /tmp/document.pdf
};

Or if you want more control:

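const {execSync} = require('child_process');
const {unpack, defaultArgs} = require('@shelf/aws-lambda-libreoffice');

module.exports.handler = async () => {
  // unpack LibreOffice first; default path is /tmp/instdir/program/soffice.bin
  await unpack();

  // run the conversion directly, writing the resulting PDF to /tmp
  execSync(
    `/tmp/instdir/program/soffice.bin ${defaultArgs.join(
      ' '
    )} --convert-to pdf file.docx --outdir /tmp`
  );
};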
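
The command above is assembled as a single shell string, so a file name containing spaces or shell metacharacters would need quoting. One possible alternative is to build an argument array and pass it to execFileSync from Node's child_process module, which does not go through a shell. This sketch assumes defaultArgs is an array of strings, as the .join(' ') call above suggests. buildConvertArgs and convertWithSoffice are hypothetical helpers, not part of the package.

```javascript
const {execFileSync} = require('child_process');

// Hypothetical helper: returns the argument list for one conversion,
// in the same order as the shell command shown above.
function buildConvertArgs(defaultArgs, format, inputFile, outDir = '/tmp') {
  return [...defaultArgs, '--convert-to', format, inputFile, '--outdir', outDir];
}

// Runs soffice without a shell, so the file name is passed through verbatim.
function convertWithSoffice(sofficePath, defaultArgs, format, inputFile) {
  return execFileSync(
    sofficePath,
    buildConvertArgs(defaultArgs, format, inputFile)
  );
}

// Possible usage inside a handler, after `await unpack()`:
// convertWithSoffice('/tmp/instdir/program/soffice.bin', defaultArgs, 'pdf', 'my file.docx');
```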

Troubleshooting

  • Please allocate at least 3008 MB of RAM for your Lambda function.
  • If a file fails to convert to PDF, try converting it to PDF on your computer first. The issue might be with LibreOffice itself.

Test

Besides the unit tests, which can be run via yarn test, there are integration tests.

Run a smoke test to check that it works:

cd test
./test.sh

# copy converted PDF file from container to the host to see if it's ok
export CID=$(cat ./cid)
docker cp $CID:/tmp/test.pdf ./test.pdf

Publish

$ git checkout master
$ yarn version
$ yarn publish
$ git push origin master --tags

License

MIT © Shelf
