pompelmi • Grepedia

Pompelmi is a minimal, high-performance Node.js wrapper for ClamAV that scans file uploads for malware. It targets several JavaScript runtimes, including Node.js, Bun, Deno, and Cloudflare Workers, and exposes a simple, dependency-free interface that integrates with major web frameworks such as Express, Fastify, NestJS, and Next.js. Because it hides subprocess management and daemon connectivity, a scan is a single, type-safe function call. The tool works in three modes: invoking a local clamscan binary, connecting to clamd over TCP, or connecting over a UNIX domain socket. This makes it adaptable to infrastructure ranging from Docker sidecars to local development machines. Scans return typed Verdict symbols (Clean, Malicious, ScanError) rather than strings, which avoids common string-comparison bugs and makes results predictable to handle in application logic.
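The benefit of symbol verdicts over strings can be illustrated with a minimal sketch. The symbols below are hypothetical stand-ins that reuse the names from this article; they are not imported from the library, and the decide function and its return values are assumptions made for the example.

```typescript
// Hypothetical stand-ins for the verdict symbols named in the article.
// They are declared locally; this is not the library's verified API.
const Clean = Symbol("Clean");
const Malicious = Symbol("Malicious");
const ScanError = Symbol("ScanError");

type Verdict = typeof Clean | typeof Malicious | typeof ScanError;
type UploadDecision = "accept" | "reject" | "quarantine";

// Maps a verdict to an application decision. Passing a string such as
// "clean" or "Clean" is a compile-time error, so a typo cannot silently
// fall through to the wrong branch.
function decide(verdict: Verdict): UploadDecision {
  if (verdict === Clean) return "accept";
  if (verdict === Malicious) return "reject";
  // ScanError: the scan did not complete, so fail closed instead of accepting.
  return "quarantine";
}
```

In this sketch an incomplete scan is quarantined rather than accepted. Whether to reject or quarantine in that case is a policy choice for the application.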

The tool scans files, buffers, and streams for malicious signatures. On top of these inputs it provides abstractions for common web development tasks: scanning multipart uploads via middleware, scanning whole directories, and streaming S3 objects directly to the scanner to minimize disk I/O. A scan cache keyed by SHA-256 content hashes skips rescanning identical files, which reduces latency and resource consumption in high-traffic applications. Users can also define scan policies that combine size limits, MIME type filtering, extension validation, and virus checking into a single, reusable security gate.
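A policy gate of the kind described above might look like the following sketch. Every name here (UploadCandidate, UploadPolicy, checkStaticRules, runGate) is hypothetical and chosen for illustration; the library's actual policy interface may differ. The cheap checks run first so that the scanner is only invoked for files that already satisfy the static rules.

```typescript
import { extname } from "node:path";

// Hypothetical types; they illustrate the idea, not the library's API.
interface UploadCandidate {
  filename: string;
  mimeType: string;
  size: number; // size in bytes
}

interface UploadPolicy {
  maxBytes: number;
  allowedMimeTypes: readonly string[];
  allowedExtensions: readonly string[]; // lowercase, with leading dot
}

type GateResult = { allowed: true } | { allowed: false; reason: string };

// Size, MIME type, and extension checks that need no scanner.
function checkStaticRules(file: UploadCandidate, policy: UploadPolicy): GateResult {
  if (file.size > policy.maxBytes) {
    return { allowed: false, reason: "too-large" };
  }
  if (!policy.allowedMimeTypes.includes(file.mimeType)) {
    return { allowed: false, reason: "mime-not-allowed" };
  }
  const extension = extname(file.filename).toLowerCase();
  if (!policy.allowedExtensions.includes(extension)) {
    return { allowed: false, reason: "extension-not-allowed" };
  }
  return { allowed: true };
}

// Full gate: static rules first, then the (caller-supplied) virus check.
async function runGate(
  file: UploadCandidate,
  policy: UploadPolicy,
  isMalicious: () => Promise<boolean>,
): Promise<GateResult> {
  const staticResult = checkStaticRules(file, policy);
  if (!staticResult.allowed) return staticResult;
  if (await isMalicious()) {
    return { allowed: false, reason: "malware-detected" };
  }
  return { allowed: true };
}
```

The isMalicious callback stands in for whichever scan call the application uses. Keeping it as a parameter lets the same gate be reused with different scanners or with a test double.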

Some of the key features are:

One-function scanning: Requires only a single call to run a scan, with no complex configuration.
Typed verdicts: Employs TypeScript-friendly symbols to categorize scan results, ensuring reliable handling of clean, malicious, or error states.
Zero runtime dependencies: Built using only standard library features, which reduces the amount of third-party code to audit and manage.
Cross-platform compatibility: Supports deployment on macOS, Linux, and Windows, provided ClamAV is installed.
Multiple scan modes: Operates efficiently via local processes, TCP connections to clamd, or UNIX socket communication for low-latency production setups.
Built-in caching: Includes an LRU cache with configurable TTL and file-backed persistence to avoid redundant scans of known files.
Framework middleware: Offers dedicated adapters for popular frameworks like Express and NestJS to facilitate rapid security implementation.
Multi-engine support: Allows combining results from multiple sources like ClamAV and VirusTotal for enhanced security consensus.
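The caching feature listed above combines content hashing, least-recently-used eviction, and a time-to-live. A minimal in-memory sketch of that combination follows. The ScanCache class and its methods are hypothetical, written for this example rather than taken from the library, and file-backed persistence is left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical cache; class and method names are illustrative only.
class ScanCache<V> {
  private readonly entries = new Map<string, { verdict: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful in tests
  }

  // Identical bytes always produce the same key, whatever the filename.
  static keyFor(content: Uint8Array): string {
    return createHash("sha256").update(content).digest("hex");
  }

  get(content: Uint8Array): V | undefined {
    const key = ScanCache.keyFor(content);
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: force a rescan
      return undefined;
    }
    // Re-insert so the key moves to the most-recently-used end of the Map.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.verdict;
  }

  set(content: Uint8Array, verdict: V): void {
    const key = ScanCache.keyFor(content);
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // A Map iterates in insertion order, so its first key is the
      // least recently used entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { verdict, expiresAt: this.now() + this.ttlMs });
  }
}
```

A caller would typically check get before scanning and call set only for definitive verdicts. Caching a failed scan would hide a file that was never actually checked.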

Operationally, Pompelmi uses Node.js's native streams to pipe data directly to ClamAV, so buffers and streams are scanned without first being written to disk. This suits large payloads and cloud-native environments where temporary storage is limited or restricted. Users can configure connection pools for concurrent scanning, set timeouts for network-bound scans, and use event-based interfaces for more complex processing pipelines. Infrastructure failures are reported as a distinct error result rather than as a clean or malicious verdict, so developers can decide whether to reject or quarantine a file when a scan fails to complete.
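Streaming to the scanner without touching disk can be sketched against clamd's documented INSTREAM command, in which each chunk is sent with a 4-byte big-endian length prefix and a zero-length chunk ends the stream. The code below is an independent illustration of that protocol, not Pompelmi's implementation. The function names and the ClamdResult type are hypothetical, and backpressure and timeouts are omitted for brevity.

```typescript
import { connect } from "node:net";
import type { Readable } from "node:stream";

type ClamdResult =
  | { status: "clean" }
  | { status: "infected"; signature: string }
  | { status: "error"; detail: string };

// Prefixes a chunk with its length as a 4-byte big-endian integer.
function frameChunk(data: Uint8Array): Buffer {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(data.length, 0);
  return Buffer.concat([header, data]);
}

// Interprets a clamd reply such as "stream: OK" or "stream: <name> FOUND".
function parseReply(raw: string): ClamdResult {
  const reply = raw.replace(/\0+$/, "").trim();
  if (reply === "stream: OK") return { status: "clean" };
  const prefix = "stream: ";
  const suffix = " FOUND";
  if (reply.startsWith(prefix) && reply.endsWith(suffix)) {
    return { status: "infected", signature: reply.slice(prefix.length, -suffix.length) };
  }
  // Anything else (for example a size-limit error) is not a verdict.
  return { status: "error", detail: reply };
}

// Pipes a readable stream of Buffers to clamd over TCP.
function scanStream(input: Readable, host: string, port: number): Promise<ClamdResult> {
  return new Promise((resolve, reject) => {
    const socket = connect(port, host);
    let reply = "";
    socket.on("data", (chunk: Buffer | string) => { reply += chunk.toString(); });
    socket.on("end", () => resolve(parseReply(reply)));
    socket.on("error", reject);
    input.on("error", (err) => { socket.destroy(); reject(err); });
    socket.on("connect", () => {
      socket.write("zINSTREAM\0"); // null-terminated command form
      input.on("data", (chunk: Buffer) => { socket.write(frameChunk(chunk)); });
      input.on("end", () => { socket.write(frameChunk(Buffer.alloc(0))); });
    });
  });
}
```

Assuming a clamd daemon listening on its default TCP port, a call such as scanStream(createReadStream(path), "127.0.0.1", 3310) would return one of the three result shapes. The error shape is kept separate so that a failed scan is never mistaken for a clean one.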

Some common use cases include:

Securing user file uploads: Implementing an automatic malware check for profile pictures, documents, or media files uploaded via web forms to prevent malicious content from reaching storage.
CI/CD pipeline protection: Integrating the GitHub Action or CLI to scan repository files during build or deployment stages.
S3 storage sanitization: Triggering scans for incoming files stored in AWS S3 buckets to ensure that only verified clean objects are processed by backend services.
Compliance auditing: Using the policy and multi-engine capabilities to build structured security gates for applications subject to SOC 2 or HIPAA requirements.
Large-scale batch processing: Utilizing the pool and directory scanning features to audit large existing datasets for security threats during maintenance or system migrations.