Last updated: Jul 25, 2025, 10:08 AM UTC

PRD: NPM Package Distribution

Generated: 2025-07-23 00:00 UTC
Status: Complete
Verified:

Executive Summary

This PRD defines the requirements for distributing the document conversion functionality as NPM packages. Packaged this way, the converters run inside a developer's own Node.js application with no calls to an external API. That allows offline conversion, removes network round-trips from the conversion path, and keeps sensitive documents on the local machine.

Key Objectives

  • Distribute core conversion libraries as installable NPM packages
  • Provide TypeScript support with comprehensive type definitions
  • Maintain API compatibility with the cloud service
  • Enable offline document conversion capabilities
  • Support both CommonJS and ES modules

User Stories

As a Node.js Developer

  • I want to install converters via npm for local use
  • I want TypeScript definitions for type safety
  • I want the same API as the cloud service
  • I want minimal dependencies

As an Enterprise Developer

  • I want to process documents without external API calls
  • I want to run conversions in air-gapped environments
  • I want source code access for security audits
  • I want long-term version support

As a Library Maintainer

  • I want tree-shakeable modules
  • I want peer dependency management
  • I want semantic versioning
  • I want clear migration guides

As a Security-Conscious User

  • I want to process sensitive documents locally
  • I want to verify package integrity
  • I want no telemetry or external calls
  • I want security vulnerability scanning

Functional Requirements

Package Structure

1. Core Packages

@docconverter/core          # Core utilities and interfaces
@docconverter/xlsx          # Excel conversion
@docconverter/pdf           # PDF conversion
@docconverter/docx          # Word conversion
@docconverter/ppt           # PowerPoint conversion
@docconverter/cli           # Command-line interface

2. Meta Package

{
  "name": "@docconverter/all",
  "version": "1.0.0",
  "description": "All document converters in one package",
  "dependencies": {
    "@docconverter/xlsx": "^1.0.0",
    "@docconverter/pdf": "^1.0.0",
    "@docconverter/docx": "^1.0.0",
    "@docconverter/ppt": "^1.0.0"
  }
}

3. Package Features

Individual Converters:

  • Standalone functionality
  • Minimal dependencies
  • Format-specific options
  • Optimized for size

Shared Core:

  • Common interfaces
  • Utility functions
  • Error handling
  • Type definitions
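
The PRD does not define the shared core's contents yet. As a rough sketch of what "common interfaces" and "error handling" could look like, the block below assumes a single error class with a machine-readable code and a generic converter contract. `TargetFormat`, `ConversionErrorCode`, `ConversionError`, `Converter`, and `isConversionError` are illustrative names, not part of any published API.

```typescript
// Hypothetical shapes for @docconverter/core. None of these names come from
// the PRD; they only illustrate the "common interfaces" and "error handling" bullets.
export type TargetFormat = 'json' | 'markdown' | 'html';

export type ConversionErrorCode =
  | 'FILE_TOO_LARGE'
  | 'TIMEOUT'
  | 'UNSUPPORTED_FORMAT'
  | 'CORRUPT_INPUT';

// One error class shared by every converter, so callers can branch on `code`.
export class ConversionError extends Error {
  readonly code: ConversionErrorCode;

  constructor(code: ConversionErrorCode, message: string) {
    super(message);
    this.name = 'ConversionError';
    this.code = code;
  }
}

// Contract each format package would implement (assumed, not specified).
export interface Converter<TOptions extends { to: TargetFormat }> {
  convert(input: Uint8Array, options: TOptions): Promise<{ content: string | object }>;
}

// Type guard for catch blocks, where the caught value is `unknown`.
export function isConversionError(error: unknown): error is ConversionError {
  return error instanceof ConversionError;
}
```

Keeping the error class in the core package would let a caller handle failures from any converter with one `catch` branch instead of one per format.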

API Design

1. Conversion API

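// Excel Converter
import { createReadStream } from 'node:fs';
import { readFile } from 'node:fs/promises';
import { ExcelConverter } from '@docconverter/xlsx';

const converter = new ExcelConverter({
  maxFileSize: 50 * 1024 * 1024, // 50MB
  timeout: 30000, // 30 seconds
});

// Promise-based API: the whole workbook is loaded into memory first
const buffer = await readFile('input.xlsx');
const result = await converter.convert(buffer, {
  to: 'json',
  sheets: ['Sheet1', 'Sheet2'],
  includeFormulas: false
});

// Stream API: the input is read in chunks instead of being buffered
const inputStream = createReadStream('input.xlsx');
const stream = converter.convertStream(inputStream, {
  to: 'markdown',
  chunkSize: 1024 * 1024
});

// Validation API: inspect a file before converting it
const validation = await converter.validate(buffer);
if (validation.isValid) {
  console.log(`Excel file with ${validation.sheets.length} sheets`);
}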
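
The constructor options `maxFileSize` and `timeout` imply two guards around every conversion. The PRD does not say how they are enforced; the sketch below is one possible approach, and `assertWithinSizeLimit` and `withTimeout` are hypothetical helper names.

```typescript
// Hypothetical guards behind the `maxFileSize` and `timeout` options.

// Reject oversized input before any parsing work starts.
export function assertWithinSizeLimit(input: Uint8Array, maxFileSize: number): void {
  if (input.byteLength > maxFileSize) {
    throw new RangeError(
      `Input is ${input.byteLength} bytes; the limit is ${maxFileSize} bytes`,
    );
  }
}

// Settle with the work's result, or reject once `timeoutMs` has elapsed.
// Note: this only stops waiting; it does not cancel the underlying work.
export function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Conversion timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A real implementation would probably also need a way to abort the parser itself (for example with an `AbortSignal`), since a timed-out conversion that keeps running still consumes CPU and memory.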

2. TypeScript Support

// Type definitions
export interface ConversionOptions {
  to: 'json' | 'markdown' | 'html';
  sheets?: string[] | number[] | 'all'; // sheet names, sheet indexes, or 'all'
  includeFormulas?: boolean;
  preserveFormatting?: boolean;
  dateFormat?: string;
  nullValue?: string;
}

export interface ConversionResult {
  content: string | object;
  metadata: {
    sourceFormat: string;
    targetFormat: string;
    processingTime: number;
    pageCount?: number;
    sheetCount?: number;
  };
  warnings?: ConversionWarning[];
}

export interface ConversionWarning {
  type: 'unsupported_feature' | 'data_loss' | 'format_issue';
  message: string;
  location?: string;
}
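
Because `content` is a union and `warnings` is optional, callers need a little narrowing before they can use a result. The following sketch shows one way to do that. It redeclares trimmed copies of the interfaces above so it stands alone, and `contentAsText` and `countWarnings` are hypothetical helpers, not part of the proposed API.

```typescript
// Trimmed local copies of the PRD's types, so this sketch is self-contained.
type WarningType = 'unsupported_feature' | 'data_loss' | 'format_issue';

interface ConversionWarning {
  type: WarningType;
  message: string;
  location?: string;
}

interface ConversionResult {
  content: string | object;
  warnings?: ConversionWarning[];
}

// Narrow the `string | object` union: JSON output arrives as an object,
// markdown and HTML output as a string.
function contentAsText(result: ConversionResult): string {
  return typeof result.content === 'string'
    ? result.content
    : JSON.stringify(result.content, null, 2);
}

// Tally warnings by type; a missing `warnings` array counts as no warnings.
function countWarnings(result: ConversionResult): Record<WarningType, number> {
  const counts: Record<WarningType, number> = {
    unsupported_feature: 0,
    data_loss: 0,
    format_issue: 0,
  };
  for (const warning of result.warnings ?? []) {
    counts[warning.type] += 1;
  }
  return counts;
}
```

A caller could, for instance, refuse to write output when `countWarnings(result).data_loss` is greater than zero.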

3. Plugin System

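// Custom formatter plugin
import type { FormatterPlugin } from '@docconverter/core';

class CustomMarkdownFormatter implements FormatterPlugin {
  name = 'custom-markdown';

  format(data: any): string {
    // Custom formatting logic goes here; this placeholder renders the data as indented JSON
    const customMarkdown = '## Data\n\n' + JSON.stringify(data, null, 2);
    return customMarkdown;
  }
}

// Register the plugin on an existing converter instance
converter.registerFormatter(new CustomMarkdownFormatter());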
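
The PRD leaves the behaviour of `registerFormatter` open. One plausible design is a name-keyed registry that rejects duplicates and fails loudly on unknown names, sketched below. `FormatterRegistry` and the local `FormatterPlugin` interface are assumptions made for illustration.

```typescript
// Local stand-in for the FormatterPlugin interface from @docconverter/core.
interface FormatterPlugin {
  name: string;
  format(data: unknown): string;
}

// Hypothetical registry a converter could hold internally.
class FormatterRegistry {
  private readonly formatters = new Map<string, FormatterPlugin>();

  // Refuse duplicate names so one plugin cannot silently replace another.
  register(plugin: FormatterPlugin): void {
    if (this.formatters.has(plugin.name)) {
      throw new Error(`Formatter "${plugin.name}" is already registered`);
    }
    this.formatters.set(plugin.name, plugin);
  }

  // Look up a formatter by name and apply it to the converted data.
  format(name: string, data: unknown): string {
    const plugin = this.formatters.get(name);
    if (plugin === undefined) {
      throw new Error(`No formatter registered as "${name}"`);
    }
    return plugin.format(data);
  }
}

// Usage: a formatter that renders an array as a markdown bullet list.
const registry = new FormatterRegistry();
registry.register({
  name: 'bullet-list',
  format: (data) => (Array.isArray(data) ? data.map((item) => `- ${String(item)}`).join('\n') : String(data)),
});
```

Whether duplicate names should throw or override is a product decision; throwing is the more conservative default.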

CLI Package

1. Command Structure

# Install globally
npm install -g @docconverter/cli

# Basic usage
docconvert input.xlsx -o output.json

# Advanced usage
docconvert *.xlsx \
  --to markdown \
  --output-dir ./converted \
  --sheets "Sheet1,Summary" \
  --parallel 4

# Batch processing
docconvert batch process.yaml

# Interactive mode
docconvert --interactive

2. CLI Features

# process.yaml - Batch configuration
conversions:
  - input: ./reports/*.xlsx
    output: ./markdown/
    format: markdown
    options:
      sheets: [0, 1]
      
  - input: ./documents/*.pdf
    output: ./text/
    format: markdown
    options:
      ocr: true
      
settings:
  parallel: 4
  on-error: continue
  log-level: info
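
The `parallel` and `on-error` settings describe a bounded worker pool. The sketch below shows one way such a pool could behave, assuming each conversion is wrapped as an async task. `runBatch`, `BatchSettings`, and `TaskOutcome` are hypothetical names, and `'stop'` is an assumed alternative to `continue` that the PRD does not list.

```typescript
// Hypothetical in-memory form of the `settings` block in process.yaml.
type OnError = 'continue' | 'stop'; // 'stop' is an assumed second value
interface BatchSettings {
  parallel: number;
  onError: OnError;
}

interface TaskOutcome<T> {
  index: number;
  ok: boolean;
  value?: T;
  error?: unknown;
}

// Run tasks with at most `parallel` in flight, recording one outcome per task.
async function runBatch<T>(
  tasks: ReadonlyArray<() => Promise<T>>,
  settings: BatchSettings,
): Promise<TaskOutcome<T>[]> {
  const outcomes: TaskOutcome<T>[] = [];
  let next = 0;
  let stopped = false;

  // Each worker pulls the next unclaimed task until none remain.
  async function worker(): Promise<void> {
    while (!stopped && next < tasks.length) {
      const index = next++;
      const task = tasks[index];
      if (task === undefined) continue;
      try {
        outcomes[index] = { index, ok: true, value: await task() };
      } catch (error) {
        outcomes[index] = { index, ok: false, error };
        if (settings.onError === 'stop') stopped = true;
      }
    }
  }

  const workerCount = Math.max(1, Math.min(settings.parallel, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  // With 'stop', unstarted tasks leave holes; filter skips them.
  return outcomes.filter((outcome) => outcome !== undefined);
}
```

With `onError: 'continue'`, a failed file is recorded and the remaining files are still processed, which matches the `on-error: continue` setting in the sample configuration.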

Package Management

1. Versioning Strategy

Semantic Versioning:

  • Major: Breaking API changes
  • Minor: New features, backwards compatible
  • Patch: Bug fixes

Version Matrix:

@docconverter/xlsx@1.0.0 requires:
  - @docconverter/core@^1.0.0
  - xlsx@^0.18.0
  - Node.js >= 16.0.0
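
The caret ranges in the version matrix behave differently for `1.x` and `0.x` versions: `^1.0.0` accepts any `1.y.z`, while `^0.18.0` accepts only `0.18.z`. The sketch below reimplements that rule for plain `x.y.z` versions to make the difference concrete. It ignores pre-release tags, and in practice the `semver` package would be used instead. `parseVersion`, `compareVersions`, and `satisfiesCaret` are illustrative names.

```typescript
type Version = [major: number, minor: number, patch: number];

// Parse a plain x.y.z version; pre-release and build suffixes are not supported here.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) throw new Error(`Not a plain x.y.z version: ${text}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// Caret rule: changes are allowed only to the right of the left-most non-zero component.
function satisfiesCaret(version: string, range: string): boolean {
  if (!range.startsWith('^')) throw new Error(`Not a caret range: ${range}`);
  const candidate = parseVersion(version);
  const lower = parseVersion(range.slice(1));
  const [major, minor, patch] = lower;
  const upper: Version =
    major > 0 ? [major + 1, 0, 0] : minor > 0 ? [0, minor + 1, 0] : [0, 0, patch + 1];
  return compareVersions(candidate, lower) >= 0 && compareVersions(candidate, upper) < 0;
}
```

Under this rule a `0.19.0` release of a `0.x` dependency would not be picked up by `^0.18.0` and would need an explicit range bump.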

2. Distribution Channels

NPM Registry:

  • Public NPM registry
  • Scoped packages
  • Tagged releases (latest, next, lts)

Private Registry (Enterprise):

  • Self-hosted registry
  • Artifactory/Nexus support
  • Air-gapped installation
  • License validation

3. Package Security

{
  "scripts": {
    "prepublishOnly": "npm audit && npm test",
    "postinstall": "node scripts/verify-integrity.js"
  },
  "publishConfig": {
    "access": "public",
    "registry": "https://registry.npmjs.org/"
  }
}
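
The contents of `scripts/verify-integrity.js` are not specified. A minimal sketch, assuming the package ships a manifest that maps file paths to SHA-256 digests, could look like the block below. `Manifest`, `sha256Hex`, and `findMismatches` are hypothetical, and the file reader is injected so the logic can be exercised without touching disk.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical manifest: relative file path -> expected SHA-256 hex digest.
type Manifest = Record<string, string>;

function sha256Hex(data: Uint8Array | string): string {
  return createHash('sha256').update(data).digest('hex');
}

// Return the paths whose contents are missing or do not match the manifest.
function findMismatches(
  manifest: Manifest,
  readFile: (path: string) => Uint8Array | string,
): string[] {
  const mismatches: string[] = [];
  for (const [path, expected] of Object.entries(manifest)) {
    let actual: string;
    try {
      actual = sha256Hex(readFile(path));
    } catch {
      mismatches.push(path); // unreadable or missing file
      continue;
    }
    if (actual !== expected.toLowerCase()) mismatches.push(path);
  }
  return mismatches;
}
```

In the real script the reader would likely be `fs.readFileSync`, and a non-empty result would set a non-zero exit code. Two caveats apply: installs run with `--ignore-scripts` skip `postinstall` entirely, and a check shipped inside the package cannot detect tampering that also rewrites the check, so this complements rather than replaces the tarball integrity hashes npm records in the lockfile.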

Documentation

1. Package Documentation

README Structure:

# @docconverter/xlsx

Excel document conversion library.

## Installation
```bash
npm install @docconverter/xlsx
```

## Quick Start
```javascript
const { ExcelConverter } = require('@docconverter/xlsx');
// ... examples
```

## API Reference
[Full API documentation](https://docs.docconverter.com/npm/xlsx)

## Examples
- [Basic conversion](./examples/basic.js)
- [Streaming large files](./examples/streaming.js)
- [Custom formatting](./examples/formatting.js)

2. Code Examples

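// examples/basic.js
const { ExcelConverter } = require('@docconverter/xlsx');
const fs = require('fs').promises;

// Read a workbook, convert every sheet to JSON, and write the result to disk
async function convertExcelToJson() {
  const converter = new ExcelConverter();
  const buffer = await fs.readFile('input.xlsx');

  const result = await converter.convert(buffer, {
    to: 'json',
    sheets: 'all'
  });

  await fs.writeFile('output.json',
    JSON.stringify(result.content, null, 2)
  );
}

// Run the example and report any failure through the exit code
convertExcelToJson().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});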

Testing & Quality

1. Test Coverage

{
  "scripts": {
    "test": "jest --coverage",
    "test:integration": "jest --testMatch=\"**/*.integration.test.js\"",
    "test:e2e": "jest --testMatch=\"**/*.e2e.test.js\""
  },
  "jest": {
    "coverageThreshold": {
      "global": {
        "branches": 90,
        "functions": 90,
        "lines": 90,
        "statements": 90
      }
    }
  }
}

2. Compatibility Testing

Test Matrix:

  • Node.js: 16.x, 18.x, 20.x, 21.x
  • OS: Linux, macOS, Windows
  • Architectures: x64, arm64
  • Package managers: npm, yarn, pnpm

Non-Functional Requirements

Performance Requirements

  • Package size < 10MB (individual converters)
  • Installation time < 30 seconds
  • First conversion < 1 second
  • Memory usage < 500MB for typical files

Compatibility Requirements

  • Node.js 16+ support
  • CommonJS and ESM support
  • Browser compatibility (with bundlers)
  • TypeScript 4.5+ support

Security Requirements

  • No external API calls
  • No telemetry collection
  • Regular dependency updates
  • Security audit compliance

Technical Specifications

Build System

1. Build Configuration

// rollup.config.js
import typescript from '@rollup/plugin-typescript';
import terser from '@rollup/plugin-terser';
import dts from 'rollup-plugin-dts';

export default [
  // ESM build
  {
    input: 'src/index.ts',
    output: {
      file: 'dist/index.mjs',
      format: 'es'
    },
    plugins: [typescript(), terser()]
  },
  // CommonJS build
  {
    input: 'src/index.ts',
    output: {
      file: 'dist/index.js',
      format: 'cjs'
    },
    plugins: [typescript(), terser()]
  },
  // TypeScript definitions (bundled into a single .d.ts file)
  {
    input: 'src/index.ts',
    output: {
      file: 'dist/index.d.ts',
      format: 'es'
    },
    plugins: [dts()]
  }
];

2. Package Optimization

{
  "sideEffects": false,
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.mjs",
      "require": "./dist/index.js"
    }
  },
  "files": [
    "dist",
    "scripts/verify-integrity.js",
    "README.md",
    "LICENSE"
  ]
}
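
The key order inside `exports` matters: Node.js checks conditions in the order they appear and uses the first one that applies, which is why `types` is listed first for TypeScript. The sketch below models that first-match rule for a flat condition map. It is a simplification of the real resolver (no nested conditions or subpath patterns), and `resolveExport` is an illustrative name.

```typescript
// A flat conditional-exports entry, such as the "." entry above.
type ConditionMap = Record<string, string>;

// Walk the conditions in declaration order and return the first active one.
// "default" always matches, as it does in Node.js.
function resolveExport(
  entry: ConditionMap,
  activeConditions: ReadonlySet<string>,
): string | undefined {
  for (const [condition, target] of Object.entries(entry)) {
    if (condition === 'default' || activeConditions.has(condition)) {
      return target;
    }
  }
  return undefined;
}

const rootEntry: ConditionMap = {
  types: './dist/index.d.ts',
  import: './dist/index.mjs',
  require: './dist/index.js',
};

// An `import` statement in Node.js activates "node" and "import".
const esmTarget = resolveExport(rootEntry, new Set(['node', 'import']));
// A `require()` call activates "node" and "require".
const cjsTarget = resolveExport(rootEntry, new Set(['node', 'require']));
```

Here `esmTarget` is `./dist/index.mjs` and `cjsTarget` is `./dist/index.js`, so each module system loads the build produced for it.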

Release Process

1. Automated Release

name: Release
on:
  push:
    tags: ['v*']

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          registry-url: 'https://registry.npmjs.org'
      
      - name: Install and Build
        run: |
          npm ci
          npm run build
          npm test
      
      - name: Publish to NPM
        run: npm publish
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

2. Version Management

# Version bump (run only the command that matches the change)
npm version patch -m "Release v%s"
npm version minor -m "Release v%s - New features"
npm version major -m "Release v%s - Breaking changes"

# Pre-release channels (published under a dist-tag other than latest)
npm publish --tag next
npm publish --tag beta
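
`npm publish` without `--tag` assigns the `latest` dist-tag, so publishing a pre-release version that way would make it the default install. A release script can avoid this by deriving the tag from the version string. The mapping below is an assumption for illustration (`beta` pre-releases go to `beta`, every other pre-release goes to `next`), and `distTagFor` is a hypothetical helper.

```typescript
type DistTag = 'latest' | 'next' | 'beta';

// Pick the dist-tag from the first pre-release identifier, if there is one.
function distTagFor(version: string): DistTag {
  const match = /^\d+\.\d+\.\d+(?:-([0-9A-Za-z-]+))?/.exec(version);
  if (match === null) throw new Error(`Not a semantic version: ${version}`);
  const prerelease = match[1];
  if (prerelease === undefined) return 'latest'; // stable release
  return prerelease === 'beta' ? 'beta' : 'next'; // assumed mapping
}

// Build the publish command a release script would run.
function publishCommand(version: string): string {
  return `npm publish --tag ${distTagFor(version)}`;
}
```

For example, `publishCommand('1.1.0-beta.0')` yields `npm publish --tag beta`, while a stable `1.1.0` is published under `latest`.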

Success Metrics

Adoption Metrics

  • NPM downloads > 10K/month
  • GitHub stars > 1K
  • Active installations > 5K
  • Community contributors > 20

Quality Metrics

  • Test coverage > 90%
  • Bundle size reduction > 20%
  • Performance improvement > 30%
  • Zero security vulnerabilities

Developer Experience

  • Installation success rate > 99%
  • API satisfaction > 4.5/5
  • Documentation rating > 4.5/5
  • Issue resolution time < 48 hours

Dependencies

Runtime Dependencies

  • Minimal external dependencies
  • Peer dependencies clearly defined
  • Optional dependencies for features
  • No native dependencies (pure JS)

Development Dependencies

  • Testing frameworks
  • Build tools
  • Linting tools
  • Documentation generators

Timeline & Milestones

Phase 1: Core Package (Month 1)

  • Basic package structure
  • Excel converter package
  • TypeScript definitions
  • Basic documentation

Phase 2: All Formats (Month 2)

  • PDF converter package
  • Word converter package
  • PowerPoint converter
  • CLI package

Phase 3: Enhancement (Month 3)

  • Plugin system
  • Streaming support
  • Performance optimization
  • Advanced examples

Phase 4: Ecosystem (Month 4)

  • Framework integrations
  • Bundler plugins
  • VS Code extension
  • Community packages

Risk Mitigation

Technical Risks

  • Large package size: Code splitting and tree-shaking
  • Performance issues: Benchmarking and optimization
  • Compatibility problems: Extensive testing matrix

Business Risks

  • Cannibalization of the cloud API: Different pricing model
  • Support burden: Comprehensive documentation
  • Version fragmentation: Clear upgrade paths

Future Considerations

Package Evolution

  • WebAssembly modules for performance
  • Browser-native support
  • Deno compatibility
  • Edge runtime support

Ecosystem Growth

  • Framework-specific packages
  • Cloud function templates
  • Docker images
  • Kubernetes operators