pompelmi

Fast file‑upload malware scanning for Node.js — optional YARA integration, ZIP deep‑inspection, and drop‑in adapters for Express, Koa, and Next.js. Private by design. Typed. Tiny.

Coverage badge reflects core library (src/**); adapters are measured separately.

Documentation · Install · Quick‑start · GitHub Action · Adapters · Diagrams · Config · Production checklist · YARA · Quick test · Security · FAQ

🚀 Overview

pompelmi scans untrusted file uploads before they hit disk. It is a tiny, TypeScript-first toolkit for Node.js with composable scanners, deep ZIP inspection, and optional signature engines.

Private by design — no outbound calls; bytes never leave your process
Composable scanners — mix heuristics + signatures; set stopOn and timeouts
ZIP hardening — traversal/bomb guards, polyglot & macro hints
Drop-in adapters — Express, Koa, Fastify, Next.js
Typed & tiny — modern TS, minimal surface

✨ Highlights

Block risky uploads early — classify uploads as clean, suspicious, or malicious and stop them at the edge.
Real guards — extension allow‑list, server‑side MIME sniff (magic bytes), per‑file size caps, and deep ZIP traversal with anti‑bomb limits.
Built‑in scanners — drop‑in CommonHeuristicsScanner (PDF risky actions, Office macros, PE header) and Zip‑bomb Guard; add your own or YARA via a tiny { scan(bytes) } contract.
Compose scanning — run multiple scanners in parallel or sequentially with timeouts and short‑circuiting via composeScanners().
Zero cloud — scans run in‑process. Keep bytes private.
DX first — TypeScript types, ESM/CJS builds, tiny API, adapters for popular web frameworks.

Keywords: file upload security, malware scanning, YARA, Node.js, Express, Koa, Next.js, ZIP scanning, ZIP bomb, PDF JavaScript, Office macros

🔧 Installation

# core library
npm i pompelmi
# or
pnpm add pompelmi
# or
yarn add pompelmi

Optional dev deps used in the examples:
npm i -D tsx express multer @koa/router @koa/multer koa next

⚡ Quick‑start

At a glance (policy + scanners)

// Compose built‑in scanners (no EICAR). Optionally add your own/YARA.
import { CommonHeuristicsScanner, createZipBombGuard, composeScanners } from 'pompelmi';

export const policy = {
  includeExtensions: ['zip','png','jpg','jpeg','pdf'],
  allowedMimeTypes: ['application/zip','image/png','image/jpeg','application/pdf','text/plain'],
  maxFileSizeBytes: 20 * 1024 * 1024,
  timeoutMs: 5000,
  concurrency: 4,
  failClosed: true,
  onScanEvent: (ev: unknown) => console.log('[scan]', ev)
};

export const scanner = composeScanners(
  [
    ['zipGuard', createZipBombGuard({ maxEntries: 512, maxTotalUncompressedBytes: 100 * 1024 * 1024, maxCompressionRatio: 12 })],
    ['heuristics', CommonHeuristicsScanner],
    // ['yara', YourYaraScanner],
  ],
  { parallel: false, stopOn: 'suspicious', timeoutMsPerScanner: 1500, tagSourceName: true }
);
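
The `{ scan(bytes) }` contract is small enough to implement by hand. The following is a minimal sketch of a custom scanner that flags ELF binaries by their leading bytes. The `Match` shape and the names `hasElfHeader` and `ElfHeaderScanner` are illustrative assumptions, not part of pompelmi's API.

```typescript
// Illustrative match shape; check pompelmi's exported types for the real one.
type Match = { rule: string; tags: string[] };

// ELF binaries start with 0x7F followed by the ASCII letters "ELF".
const ELF_MAGIC = [0x7f, 0x45, 0x4c, 0x46];

function hasElfHeader(bytes: Uint8Array): boolean {
  return bytes.length >= ELF_MAGIC.length && ELF_MAGIC.every((b, i) => bytes[i] === b);
}

// A scanner only needs an async scan(bytes) that resolves to [] when clean.
const ElfHeaderScanner = {
  async scan(bytes: Uint8Array): Promise<Match[]> {
    return hasElfHeader(bytes) ? [{ rule: 'elf_header', tags: ['executable'] }] : [];
  },
};
```

Such a scanner could presumably be added to the list above as `['elf', ElfHeaderScanner]`, in the same way as the commented-out YARA entry.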

Express

import express from 'express';
import multer from 'multer';
import { createUploadGuard } from '@pompelmi/express-middleware';
import { policy, scanner } from './security'; // the snippet above

const app = express();
const upload = multer({ storage: multer.memoryStorage(), limits: { fileSize: policy.maxFileSizeBytes } });

app.post('/upload', upload.any(), createUploadGuard({ ...policy, scanner }), (req, res) => {
  res.json({ ok: true, scan: (req as any).pompelmi ?? null });
});

app.listen(3000, () => console.log('http://localhost:3000'));

Koa

import Koa from 'koa';
import Router from '@koa/router';
import multer from '@koa/multer';
import { createKoaUploadGuard } from '@pompelmi/koa-middleware';
import { policy, scanner } from './security';

const app = new Koa();
const router = new Router();
const upload = multer({ storage: multer.memoryStorage(), limits: { fileSize: policy.maxFileSizeBytes } });

router.post('/upload', upload.any(), createKoaUploadGuard({ ...policy, scanner }), (ctx) => {
  ctx.body = { ok: true, scan: (ctx as any).pompelmi ?? null };
});

app.use(router.routes()).use(router.allowedMethods());
app.listen(3003, () => console.log('http://localhost:3003'));

Next.js (App Router)

// app/api/upload/route.ts
import { createNextUploadHandler } from '@pompelmi/next-upload';
import { policy, scanner } from '@/lib/security';

export const runtime = 'nodejs';
export const dynamic = 'force-dynamic';

export const POST = createNextUploadHandler({ ...policy, scanner });

🤖 GitHub Action

Run pompelmi in CI to scan repository files or built artifacts.

Minimal usage

name: Security scan (pompelmi)
on: [push, pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Scan repository with pompelmi
        uses: pompelmi/pompelmi/.github/actions/pompelmi-scan@v1
        with:
          path: .
          deep_zip: true
          fail_on_detect: true

Scan a single artifact

- uses: pompelmi/pompelmi/.github/actions/pompelmi-scan@v1
  with:
    artifact: build.zip
    deep_zip: true
    fail_on_detect: true

Inputs

Input	Default	Description
`path`	`.`	Directory to scan.
`artifact`	`""`	Single file/archive to scan.
`yara_rules`	`""`	Glob path to YARA rules (e.g. `rules/*.yar`).
`deep_zip`	`true`	Enable deep nested-archive inspection.
`max_depth`	`3`	Max nested-archive depth.
`fail_on_detect`	`true`	Fail the job if detections occur.

The Action lives in this repo at .github/actions/pompelmi-scan. When published to the Marketplace, consumers can copy the snippets above as-is.

🧩 Adapters

Use the adapter that matches your web framework. All adapters share the same policy options and scanning contract.

Framework	Package	Status
Express	`@pompelmi/express-middleware`	alpha
Koa	`@pompelmi/koa-middleware`	alpha
Next.js (App Router)	`@pompelmi/next-upload`	alpha
Fastify	`@pompelmi/fastify-plugin`	alpha
NestJS	nestjs	planned
Remix	remix	planned
hapi	hapi plugin	planned
SvelteKit	sveltekit	planned

🗺️ Diagrams

Upload scanning flow

flowchart TD
  A["Client uploads file(s)"] --> B["Web App Route"]
  B --> C{"Pre-filters<br/>(ext, size, MIME)"}
  C -- fail --> X["HTTP 4xx"]
  C -- pass --> D{"Is ZIP?"}
  D -- yes --> E["Iterate entries<br/>(limits & scan)"]
  E --> F{"Verdict?"}
  D -- no --> F{"Scan bytes"}
  F -- malicious/suspicious --> Y["HTTP 422 blocked"]
  F -- clean --> Z["HTTP 200 ok + results"]
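
The decision path in the flowchart can be sketched in a few lines of TypeScript. This is an illustration of the flow only, not the adapters' actual implementation; the types `PreFilterPolicy` and `UploadInfo` and the functions `preFilter` and `statusFor` are hypothetical names.

```typescript
type Verdict = 'clean' | 'suspicious' | 'malicious';

interface PreFilterPolicy {
  includeExtensions: string[];
  allowedMimeTypes: string[];
  maxFileSizeBytes: number;
}

interface UploadInfo {
  name: string;
  sniffedMime: string; // MIME type detected from magic bytes, not the client header
  sizeBytes: number;
}

// Returns a reason string when a pre-filter fails, or null when the file may be scanned.
function preFilter(file: UploadInfo, policy: PreFilterPolicy): string | null {
  const ext = file.name.includes('.') ? file.name.split('.').pop()!.toLowerCase() : '';
  if (!policy.includeExtensions.map((e) => e.toLowerCase()).includes(ext)) return 'extension';
  if (file.sizeBytes > policy.maxFileSizeBytes) return 'size';
  if (!policy.allowedMimeTypes.includes(file.sniffedMime)) return 'mime';
  return null;
}

// Maps the outcome to the status codes used in the diagram.
function statusFor(preFilterFailure: string | null, verdict: Verdict): number {
  if (preFilterFailure !== null) return 400; // any 4xx; 400 is an arbitrary choice here
  return verdict === 'clean' ? 200 : 422;
}
```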

Sequence (App ↔ pompelmi ↔ YARA)

sequenceDiagram
  participant U as User
  participant A as App Route (/upload)
  participant P as pompelmi (adapter)
  participant Y as YARA engine

  U->>A: POST multipart/form-data
  A->>P: guard(files, policies)
  P->>P: MIME sniff + size + ext checks
  alt ZIP archive
    P->>P: unpack entries with limits
  end
  P->>Y: scan(bytes)
  Y-->>P: matches[]
  P-->>A: verdict (clean/suspicious/malicious)
  A-->>U: 200 or 4xx/422 with reason

Components (monorepo)

flowchart LR
  subgraph Repo
    core["pompelmi (core)"]
    express["@pompelmi/express-middleware"]
    koa["@pompelmi/koa-middleware"]
    next["@pompelmi/next-upload"]
    fastify(("fastify-plugin · planned"))
    nest(("nestjs · planned"))
    remix(("remix · planned"))
    hapi(("hapi-plugin · planned"))
    svelte(("sveltekit · planned"))
  end
  core --> express
  core --> koa
  core --> next
  core -.-> fastify
  core -.-> nest
  core -.-> remix
  core -.-> hapi
  core -.-> svelte

⚙️ Configuration

All adapters accept a common set of options:

Option	Type (TS)	Purpose
`scanner`	`{ scan(bytes: Uint8Array): Promise<Match[]> }`	Your scanning engine. Return `[]` when clean; non‑empty to flag.
`includeExtensions`	`string[]`	Allow‑list of file extensions. Evaluated case‑insensitively.
`allowedMimeTypes`	`string[]`	Allow‑list of MIME types after magic‑byte sniffing.
`maxFileSizeBytes`	`number`	Per‑file size cap. Oversize files are rejected early.
`timeoutMs`	`number`	Per‑file scan timeout; guards against stuck scanners.
`concurrency`	`number`	How many files to scan in parallel.
`failClosed`	`boolean`	If `true`, errors/timeouts block the upload.
`onScanEvent`	`(event: unknown) => void`	Optional telemetry hook for logging/metrics.

Common recipes

Allow only images up to 5 MB:

includeExtensions: ['png','jpg','jpeg','webp'],
allowedMimeTypes: ['image/png','image/jpeg','image/webp'],
maxFileSizeBytes: 5 * 1024 * 1024,
failClosed: true,
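
To illustrate what "MIME types after magic-byte sniffing" means for `allowedMimeTypes`, here is a minimal sketch that detects a type from a file's leading bytes instead of trusting the client's `Content-Type`. It covers only PNG, JPEG, PDF, and ZIP; the names `SIGNATURES`, `sniffMime`, and `isAllowedMime` are assumptions for this example and do not describe pompelmi's internal sniffer.

```typescript
// Leading-byte signatures for a few of the types used in the examples.
const SIGNATURES: Array<{ mime: string; magic: number[] }> = [
  { mime: 'image/png', magic: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', magic: [0xff, 0xd8, 0xff] },
  { mime: 'application/pdf', magic: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
  { mime: 'application/zip', magic: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
];

// Returns the first MIME type whose signature matches, or null if none does.
function sniffMime(bytes: Uint8Array): string | null {
  for (const { mime, magic } of SIGNATURES) {
    if (bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b)) return mime;
  }
  return null;
}

// A file passes only if its sniffed type is on the allow-list.
function isAllowedMime(bytes: Uint8Array, allowedMimeTypes: string[]): boolean {
  const mime = sniffMime(bytes);
  return mime !== null && allowedMimeTypes.includes(mime);
}
```

With this approach, a PDF renamed to `photo.png` is still detected as `application/pdf` and fails an images-only allow-list.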

✅ Production checklist

🧬 YARA Getting Started

YARA lets you detect suspicious or malicious content using pattern‑matching rules.
pompelmi treats YARA matches as signals that you can map to your own verdicts
(e.g., mark high‑confidence rules as malicious, heuristics as suspicious).

Status: Optional. You can run without YARA. If you adopt it, keep your rules small, time‑bound, and tuned to your threat model.

Starter rules

Below are three example rules you can adapt:

rules/starter/eicar.yar

rule EICAR_Test_File
{
    meta:
        description = "EICAR antivirus test string (safe)"
        reference   = "https://www.eicar.org"
        confidence  = "high"
        verdict     = "malicious"
    strings:
        $eicar = "X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"
    condition:
        $eicar
}

rules/starter/pdf_js.yar

rule PDF_JavaScript_Embedded
{
    meta:
        description = "PDF contains embedded JavaScript (heuristic)"
        confidence  = "medium"
        verdict     = "suspicious"
    strings:
        $magic = { 25 50 44 46 } // "%PDF"
        $js1 = "/JavaScript" ascii
        $js2 = "/JS" ascii
        $open = "/OpenAction" ascii
        $aa = "/AA" ascii
    condition:
        $magic at 0 and ( $js1 or $js2 ) and ( $open or $aa )
}

rules/starter/office_macros.yar

rule Office_Macro_Suspicious_Words
{
    meta:
        description = "Heuristic: suspicious VBA macro keywords"
        confidence  = "medium"
        verdict     = "suspicious"
    strings:
        $s1 = /Auto(Open|Close)/ nocase
        $s2 = "Document_Open" nocase ascii
        $s3 = "CreateObject(" nocase ascii
        $s4 = "WScript.Shell" nocase ascii
        $s5 = "Shell(" nocase ascii
        $s6 = "Sub Workbook_Open()" nocase ascii
    condition:
        2 of ($s*)
}

These are examples. Expect some false positives; tune to your app.

Minimal integration (adapter contract)

If you use a YARA binding (e.g., @automattic/yara), wrap it behind the scanner contract:

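// Example YARA scanner adapter (pseudo-code; the engine calls are placeholders)
// import * as Y from '@automattic/yara'; // uncomment once the binding is installed

// Compile your rules once at boot (recommended). Note that fs.readFile does not
// expand globs, so read each rule file explicitly or list the directory first.
// const sources = await fs.readFile('rules/starter/eicar.yar', 'utf8');
// const compiled = await Y.compile(sources); // placeholder; check your binding's API

// Shape of a raw engine match as assumed by this example.
type RawMatch = { rule: string; meta?: Record<string, unknown>; tags?: string[] };

export const YourYaraScanner = {
  async scan(bytes: Uint8Array) {
    // const matches = await compiled.scan(bytes, { timeout: 1500 });
    const matches: RawMatch[] = []; // plug your engine here
    // Map to the structure your app expects; return [] when clean.
    return matches.map((m) => ({
      rule: m.rule,
      meta: m.meta ?? {},
      tags: m.tags ?? [],
    }));
  }
};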

Then include it in your composed scanner:

import { composeScanners, CommonHeuristicsScanner } from 'pompelmi';
// import { YourYaraScanner } from './yara-scanner';

export const scanner = composeScanners(
  [
    ['heuristics', CommonHeuristicsScanner],
    // ['yara', YourYaraScanner],
  ],
  { parallel: false, stopOn: 'suspicious', timeoutMsPerScanner: 1500, tagSourceName: true }
);

Policy suggestion (mapping matches → verdict)

malicious: high‑confidence rules (e.g., EICAR_Test_File)
suspicious: heuristic rules (e.g., PDF JavaScript, macro keywords)
clean: no matches

Combine YARA with MIME sniffing, ZIP safety limits, and strict size/time caps.
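
The mapping suggested above can be sketched as a small function that reads the `verdict` field from each rule's `meta` section, as used in the starter rules. The `YaraMatch` shape and the name `verdictFor` are assumptions for illustration; adapt them to whatever your YARA binding returns.

```typescript
type Verdict = 'clean' | 'suspicious' | 'malicious';

// Assumed match shape: the rule name plus the rule's meta fields (as in the starter rules).
interface YaraMatch {
  rule: string;
  meta?: { verdict?: string; confidence?: string };
}

// The overall verdict is the most severe verdict among all matches.
function verdictFor(matches: YaraMatch[]): Verdict {
  if (matches.length === 0) return 'clean';
  const malicious = matches.some((m) => m.meta?.verdict === 'malicious');
  // Matches without an explicit verdict are treated as suspicious, not clean.
  return malicious ? 'malicious' : 'suspicious';
}
```

Treating unlabeled matches as suspicious is a conservative default chosen for this sketch; a stricter or looser policy may suit your threat model better.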

🧪 Quick test (no EICAR)

Use the examples above, then send a minimal PDF that contains risky tokens (this triggers the built‑in heuristics).

1) Create a tiny PDF with risky actions

Linux / macOS:

printf '%%PDF-1.7\n1 0 obj\n<< /OpenAction 1 0 R /AA << /JavaScript (alert(1)) >> >>\nendobj\n%%EOF\n' > risky.pdf

2) Send it to your endpoint

Express (default from the Quick‑start):

curl -F "file=@risky.pdf;type=application/pdf" http://localhost:3000/upload -i

You should see an HTTP 422 Unprocessable Entity (blocked by policy). Clean files return 200 OK. Pre‑filter failures (size/ext/MIME) should return a 4xx. Adapt these conventions to your app as needed.

🔒 Security notes

The library reads bytes; it never executes files.
YARA detections depend on the rules you provide; expect some false positives/negatives.
ZIP scanning applies limits (entries, per‑entry size, total uncompressed, nesting) to reduce archive‑bomb risk.
Prefer running scans in a dedicated process/container for defense‑in‑depth.
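
The ZIP limits mentioned above (entry count, total uncompressed size, compression ratio) amount to a few comparisons. This is a minimal sketch that reuses the option names from the Quick-start's `createZipBombGuard` call; the `ZipEntrySizes` input shape and the `checkZipLimits` function are hypothetical and do not describe the guard's real implementation.

```typescript
interface ZipLimits {
  maxEntries: number;
  maxTotalUncompressedBytes: number;
  maxCompressionRatio: number;
}

// Sizes as declared in the archive's central directory (hypothetical input shape).
interface ZipEntrySizes {
  compressedBytes: number;
  uncompressedBytes: number;
}

// Returns the name of the first limit exceeded, or null if the archive is within limits.
function checkZipLimits(entries: ZipEntrySizes[], limits: ZipLimits): string | null {
  if (entries.length > limits.maxEntries) return 'maxEntries';
  let total = 0;
  for (const e of entries) {
    total += e.uncompressedBytes;
    if (total > limits.maxTotalUncompressedBytes) return 'maxTotalUncompressedBytes';
    // Clamp the divisor to 1 so zero-byte compressed entries cannot hide a huge ratio.
    const ratio = e.uncompressedBytes / Math.max(e.compressedBytes, 1);
    if (ratio > limits.maxCompressionRatio) return 'maxCompressionRatio';
  }
  return null;
}
```

Declared sizes can be forged, so a production guard should also count bytes while decompressing and abort once a limit is crossed.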

💬 FAQ

Do I need YARA?
No. scanner is pluggable. The examples use a minimal scanner for clarity; you can call out to a YARA engine or any other detector you prefer.

Where do the results live?
In the examples, the guard attaches scan data to the request context (e.g. req.pompelmi in Express, ctx.pompelmi in Koa). In Next.js, include the results in your JSON response as you see fit.

Why 422 for blocked files?
Using 422 to signal a policy violation keeps it distinct from transport errors; it’s a common pattern. Use the codes that best match your API guidelines.

Are ZIP bombs handled?
Archives are traversed with limits to reduce archive‑bomb risk. Keep your size limits conservative and prefer failClosed: true in production.

🧪 Tests & Coverage

Run tests locally with coverage:

pnpm vitest run --coverage --passWithNoTests

The badge tracks the core library (src/**). Adapters and engines are reported separately for now and will be folded into global coverage as their suites grow.

If you integrate Codecov in CI, upload coverage/lcov.info and you can use this Codecov badge:

[![codecov](https://codecov.io/gh/pompelmi/pompelmi/branch/main/graph/badge.svg?flag=core)](https://codecov.io/gh/pompelmi/pompelmi)

🤝 Contributing

PRs and issues welcome! Start with:

pnpm -r build
pnpm -r lint
