DISCLAIMER ⚠️

  • THE DOCUMENTATION BELOW IS OUTDATED. PLEASE REFER TO Atomic Bulk Operations FOR HOW THE NEW BULK OPERATIONS WORK ON THE BACKEND.

1. Overview

The bulk operations feature in Nimbly allows organizations to efficiently manage large-scale data operations, including the bulk upload, validation, and editing of questionnaires, users, departments, sites, schedules, and non-operational days.

The feature is distributed across two main services:

  1. api-questionnaires: Handles questionnaire-specific bulk operations
  2. api-bulk-operations: Manages cross-entity bulk operations with comprehensive validation

2. Libraries and Dependencies

2.1 Common Libraries Across Both Services

  • @nimbly-technologies/entity-node: Entity management and repository interfaces
  • @nimbly-technologies/nimbly-backend-utils: Backend utilities including logging, middlewares, and error handling
  • @nimbly-technologies/nimbly-common: Common types, interfaces, and utilities
  • @sheet/image: Excel file processing library (v1.20201208.1)
  • express: Web framework for Node.js
  • mongoose: MongoDB object modeling
  • jsonwebtoken: JWT authentication
  • moment/moment-timezone: Date manipulation and timezone handling
  • uuid: UUID generation
  • dotenv: Environment variable management
  • firebase-admin: Firebase services integration
  • ramda: Functional programming utilities
  • joi: Schema validation
  • cors: Cross-origin resource sharing
  • busboy: Form data parsing

2.2 api-questionnaires Specific Libraries

  • jszip: ZIP file creation/manipulation
  • @sendgrid/mail: Email service integration
  • file-saver: Client-side file saving
  • normalize-url: URL normalization
  • @hapi/joi: Data validation (legacy)

2.3 api-bulk-operations Specific Libraries

  • @google-cloud/tasks: Google Cloud Tasks integration
  • lodash: Utility functions
  • sanitize-phone: Phone number sanitization
  • shortid: Short unique ID generation
  • slugify: URL-friendly string conversion

3. API Endpoints

3.1 api-questionnaires Bulk Operation Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| /questionnaires/bulk-upload | POST | Legacy bulk upload for questionnaires |
| /questionnaires/v2/bulk-upload/validate | POST | Validates questionnaire data for V2 bulk upload |
| /questionnaires/v2/bulk-upload | POST | Performs V2 bulk upload of questionnaires |
| /questionnaires/v2/bulk-upload/questionnaire-dept/validate | POST | Validates questionnaire-department mappings |
| /questionnaires/v2/bulk-upload/questionnaire-dept | POST | Bulk uploads questionnaire-department mappings |
| /questionnaires/v2/bulk-upload/questionnaire-issue-owner/validate | POST | Validates questionnaire issue owner mappings |
| /questionnaires/v2/bulk-upload/questionnaire-issue-owner | POST | Bulk uploads questionnaire issue owner mappings |
| /questionnaires/v2/bulk-upload/category-level-escalation/validate | POST | Validates category level escalation data |
| /questionnaires/v2/bulk-upload/category-level-escalation | POST | Bulk uploads category level escalation data |
| /questionnaires/v2/bulk-upload/category-configurations/validate | POST | Validates category attributes/configurations |
| /questionnaires/v2/bulk-upload/category-configurations | POST | Bulk uploads category attributes/configurations |
| /questionnaires/v2/bulk-upload/questionnaire-configurations/validate | POST | Validates questionnaire configurations |
| /questionnaires/v2/bulk-upload/questionnaire-configurations | POST | Bulk uploads questionnaire configurations |
| /questionnaires/v2/bulk-edit/:questionnaireID | GET | Gets data for bulk editing a questionnaire |
| /questionnaires/v2/bulk-edit/issue-owners/:questionnaireID | GET | Gets issue owner data for bulk editing |
| /questionnaires/v2/bulk-edit/category-escalations/:questionnaireID | GET | Gets category escalation data for bulk editing |
| /questionnaires/v2/bulk-edit/category-configurations/:questionnaireID | GET | Gets category configuration data for bulk editing |
| /questionnaires/v2/bulk-edit/questionnaire-configurations/:questionnaireID | GET | Gets questionnaire configuration data for bulk editing |

3.2 api-bulk-operations Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| / | POST | Processes bulk operations across multiple entities |
| /bulk-edit/download | POST | Generates a bulk edit template for download |
| /bulk-onboard-template | GET | Gets the bulk onboarding template |
| /:bulkOperationsID | GET | Gets bulk operation details by ID |

4. Authentication and Authorization Flow

4.1 Authentication Implementation

Both services implement JWT-based authentication through a middleware layer. The authentication flow ensures that:

  1. Every request contains a valid JWT token in the Authorization header
  2. The token is verified against the configured secret key
  3. Token expiration is checked to prevent use of expired tokens
  4. User context is extracted from the token and passed to subsequent handlers
  5. Unauthorized requests are rejected with a 401 status code

4.2 Authentication Sequence

```mermaid
sequenceDiagram
    participant Client
    participant API Gateway
    participant Auth Middleware
    participant Controller
    participant Use Case

    Client->>API Gateway: Request with JWT Token
    API Gateway->>Auth Middleware: Validate Token
    Auth Middleware->>Auth Middleware: Verify JWT Signature
    Auth Middleware->>Auth Middleware: Check Token Expiry
    Auth Middleware->>Auth Middleware: Extract User Context
    alt Token Valid
        Auth Middleware->>Controller: Pass Request with User Context
        Controller->>Use Case: Execute Business Logic
        Use Case-->>Controller: Return Result
        Controller-->>Client: Return Response
    else Token Invalid
        Auth Middleware-->>Client: Return 401 Unauthorized
    end
```

The authentication middleware performs several critical functions:

  • Token Validation: Verifies the JWT signature using the configured secret
  • Expiration Check: Ensures the token hasn’t expired
  • Context Extraction: Extracts user information (userID, organizationID, role) from the token
  • Request Enhancement: Adds the user context to the request object for downstream use
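
A minimal sketch of such a middleware, assuming Express and the jsonwebtoken package; the `UserContext` shape and the way the secret is loaded are illustrative, not taken from the actual services:

```typescript
import { Request, Response, NextFunction } from 'express';
import jwt from 'jsonwebtoken';

// Illustrative shape of the user context extracted from the token.
interface UserContext {
  userID: string;
  organizationID: string;
  role: string;
}

// Placeholder for however the services actually load their secret.
const JWT_SECRET = process.env.JWT_SECRET ?? '';

export function authMiddleware(
  req: Request & { user?: UserContext },
  res: Response,
  next: NextFunction,
): void {
  const header = req.headers.authorization;
  if (!header?.startsWith('Bearer ')) {
    res.status(401).json({ message: 'Unauthorized' });
    return;
  }
  try {
    // jwt.verify checks both the signature and the exp claim.
    const payload = jwt.verify(header.slice('Bearer '.length), JWT_SECRET) as unknown as UserContext;
    // Request enhancement: downstream handlers read the user context from req.
    req.user = {
      userID: payload.userID,
      organizationID: payload.organizationID,
      role: payload.role,
    };
    next();
  } catch {
    // Covers invalid signatures and expired tokens alike.
    res.status(401).json({ message: 'Unauthorized' });
  }
}
```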

5. Bulk Operations Architecture

5.1 High-Level Architecture

```mermaid
graph TB
    subgraph "Client Layer"
        Excel[Excel File]
        WebUI[Web Interface]
    end
    
    subgraph "API Layer"
        QuestAPI[api-questionnaires]
        BulkAPI[api-bulk-operations]
    end
    
    subgraph "Service Layer"
        QuestService[Questionnaire Service]
        BulkService[Bulk Operations Service]
        ValidService[Validation Service]
    end
    
    subgraph "Data Layer"
        MongoDB[(MongoDB)]
        Firebase[(Firebase)]
        CloudStorage[Cloud Storage]
    end
    
    Excel --> QuestAPI
    Excel --> BulkAPI
    WebUI --> QuestAPI
    WebUI --> BulkAPI
    
    QuestAPI --> QuestService
    BulkAPI --> BulkService
    
    QuestService --> ValidService
    BulkService --> ValidService
    
    QuestService --> MongoDB
    QuestService --> Firebase
    BulkService --> MongoDB
    BulkService --> CloudStorage
```

5.2 Request Processing Stages

The bulk operations follow a carefully orchestrated multi-stage processing pattern:

Stage 1: Upload and Parsing

  • Client uploads Excel file through multipart form data
  • Busboy middleware streams the file content
  • Excel parser extracts data from different sheets
  • Data is structured into appropriate formats for processing

Stage 2: Validation

  • Schema Validation: Ensures all required fields are present and properly formatted
  • Business Rule Validation: Checks for duplicates, validates relationships, enforces business constraints
  • Dependency Validation: Verifies that referenced entities exist (departments, users, sites)
  • Cross-Entity Validation: Ensures consistency across related data

Stage 3: Entity Creation

  • Database transactions are initiated to ensure atomicity
  • Entities are created or updated in the correct order to maintain referential integrity
  • Indexes are updated for efficient querying
  • Related entities are linked appropriately

Stage 4: Completion and Cleanup

  • Transaction is committed if all operations succeed
  • Events are published for downstream systems
  • Status is updated to reflect completion
  • Temporary data is cleaned up

6. api-questionnaires Implementation Details

6.1 Controller Layer Architecture

The controller layer in api-questionnaires serves as the entry point for all bulk operation requests. It implements several key responsibilities:

Request Handling Pattern

The controller receives requests from the router, extracts necessary data from the request payload, and delegates to the appropriate use case. Each controller method follows a consistent pattern:

  1. Extract data from request context and payload
  2. Call the corresponding use case method
  3. Handle any errors with appropriate logging
  4. Format and return the response
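
As a hedged illustration of that four-step pattern (the use case interface, the request fields populated by upstream middleware, and the status codes here are assumptions, not the real signatures):

```typescript
import { Request, Response } from 'express';

// Hypothetical use-case interface; real method names vary per endpoint.
interface QuestionnaireBulkUsecase {
  bulkUploadV2(organizationID: string, rows: unknown[]): Promise<{ questionnaireIDs: string[] }>;
}

export const makeBulkUploadHandler =
  (usecase: QuestionnaireBulkUsecase) =>
  async (req: Request, res: Response): Promise<void> => {
    try {
      // 1. Extract data from request context and payload
      const { organizationID } = (req as any).user;    // set by the auth middleware
      const rows: unknown[] = (req as any).parsedRows; // set by the Excel-parsing middleware
      // 2. Call the corresponding use case method
      const result = await usecase.bulkUploadV2(organizationID, rows);
      // 4. Format and return the response
      res.status(200).json(result);
    } catch (err) {
      // 3. Handle any errors with appropriate logging
      console.error('bulk upload failed', err);
      res.status(500).json({ message: 'Internal server error' });
    }
  };
```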

Validation vs Upload Separation

The controller provides separate endpoints for validation and actual upload operations. This separation allows users to:

  • Validate their data before committing changes
  • Receive detailed error messages for correction
  • Ensure data quality before processing
  • Reduce failed upload attempts

6.2 Middleware Layer Design

Excel File Processing

The parseXlsx middleware handles the complexity of Excel file processing:

  1. Uses Busboy to stream file uploads, preventing memory issues with large files
  2. Buffers the stream data for processing by the Excel parser
  3. Extracts data from multiple sheets within the workbook
  4. Structures the data according to expected formats
  5. Passes parsed data to the controller for processing
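
A sketch of how such a middleware can be wired together. The actual services use @sheet/image; the community SheetJS package (`xlsx`) is used here as a stand-in with the same read/sheet_to_json surface, and the size limit and attached field names are illustrative:

```typescript
import { Request, Response, NextFunction } from 'express';
import busboy from 'busboy';
import * as XLSX from 'xlsx'; // stand-in for @sheet/image

export function parseXlsx(
  req: Request & { parsedSheets?: Record<string, unknown[]> },
  res: Response,
  next: NextFunction,
): void {
  const bb = busboy({ headers: req.headers, limits: { fileSize: 10 * 1024 * 1024 } });
  const chunks: Buffer[] = [];
  let failed = false;

  bb.on('file', (_name, file, info) => {
    // File type validation: only Excel uploads are processed.
    if (!info.filename.endsWith('.xlsx')) {
      failed = true;
      file.resume(); // drain the rejected stream so busboy can finish
      return next(new Error('Only .xlsx files are supported'));
    }
    // Buffer the streamed chunks for the parser.
    file.on('data', (chunk: Buffer) => chunks.push(chunk));
  });

  bb.on('close', () => {
    if (failed) return;
    try {
      const workbook = XLSX.read(Buffer.concat(chunks), { type: 'buffer' });
      // Extract every sheet as an array of row objects, keyed by sheet name.
      req.parsedSheets = Object.fromEntries(
        workbook.SheetNames.map((name) => [name, XLSX.utils.sheet_to_json(workbook.Sheets[name])]),
      );
      next();
    } catch (err) {
      next(err); // parsing errors become user-friendly error responses upstream
    }
  });

  bb.on('error', next); // stream error handling prevents crashes
  req.pipe(bb);
}
```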

Error Handling in Middleware

The middleware implements comprehensive error handling:

  • File type validation to ensure only Excel files are processed
  • Size limits to prevent resource exhaustion
  • Parsing error capture and user-friendly error messages
  • Stream error handling to prevent crashes

6.3 Use Case Layer Business Logic

The use case layer implements the core business logic for questionnaire bulk operations:

Validation Logic Flow

The validation process follows a specific sequence to ensure data integrity:

  1. Department Mapping Validation

    • Verifies that all referenced departments exist in the system
    • Checks user permissions for the specified departments
    • Validates assignment types (primary/secondary)
    • Ensures no circular dependencies
  2. Issue Owner Validation

    • Confirms that all specified users exist and are active
    • Verifies users belong to the correct organization
    • Checks that users have appropriate roles for issue ownership
    • Validates email formats and uniqueness
  3. Category Configuration Validation

    • Ensures all referenced categories exist
    • Validates escalation levels are sequential
    • Checks time intervals for reasonableness
    • Verifies notification email formats

Transaction Management Strategy

The use case implements a robust transaction strategy:

  1. Initiates a MongoDB session for transaction support
  2. Processes all operations within the transaction boundary
  3. Implements rollback on any failure to maintain consistency
  4. Publishes events only after successful commit
  5. Ensures proper cleanup in finally blocks
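
A condensed sketch of that strategy using the Mongoose session API; `createEntities` and `publishEvents` are placeholders for the real steps:

```typescript
import mongoose from 'mongoose';

async function runBulkUploadTransaction(
  createEntities: (session: mongoose.ClientSession) => Promise<void>,
  publishEvents: () => Promise<void>,
): Promise<void> {
  const session = await mongoose.startSession(); // 1. initiate a session
  try {
    session.startTransaction();
    await createEntities(session);               // 2. all writes share the session
    await session.commitTransaction();
    await publishEvents();                       // 4. events only after a successful commit
  } catch (err) {
    await session.abortTransaction();            // 3. rollback on any failure
    throw err;
  } finally {
    await session.endSession();                  // 5. cleanup in the finally block
  }
}
```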

6.4 Data Processing Flow Details

```mermaid
flowchart TD
    A[Excel File Upload] --> B[Parse Excel Data]
    B --> C{Validation Type}
    
    C -->|Department Mapping| D[Validate Departments]
    C -->|Issue Owners| E[Validate Users]
    C -->|Category Escalation| F[Validate Categories]
    C -->|Questionnaire Config| G[Validate Questions]
    
    D --> H{Validation Result}
    E --> H
    F --> H
    G --> H
    
    H -->|Errors Found| I[Return Validation Errors]
    H -->|No Errors| J[Start Transaction]
    
    J --> K[Create/Update Entities]
    K --> L[Update Indexes]
    L --> M[Publish Events]
    M --> N[Commit Transaction]
    
    N --> O[Return Success]
    
    K -->|Error| P[Rollback Transaction]
    P --> Q[Return Error]
```

Processing Logic Explanation

  1. Parse Excel Data: Extracts structured data from uploaded Excel file
  2. Validation Type: Routes to appropriate validation based on endpoint
  3. Validate Entities: Performs specific validation for each entity type
  4. Validation Result: Aggregates all validation errors
  5. Transaction Management: Ensures all-or-nothing processing
  6. Update Indexes: Maintains search indexes for performance
  7. Publish Events: Notifies other systems of changes
  8. Error Handling: Rollback and cleanup on any failure

7. api-bulk-operations Implementation Details

7.1 Controller Layer Design

The bulk operations controller implements a different pattern from api-questionnaires, focusing on multi-entity operations:

Asynchronous Processing Pattern

The controller immediately returns a tracking ID to the client while processing continues in the background:

  1. Creates a bulk operation entity with unique ID
  2. Returns the ID to the client for tracking
  3. Triggers asynchronous processing
  4. Allows clients to poll for status updates

This pattern prevents timeout issues with large operations and provides a better user experience.
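
A sketch of this pattern, assuming the background work is dispatched through @google-cloud/tasks (which the service depends on); the project, queue, and worker URL values are placeholders:

```typescript
import { v4 as uuidv4 } from 'uuid';
import { CloudTasksClient } from '@google-cloud/tasks';

const tasksClient = new CloudTasksClient();

async function startBulkOperation(payload: object): Promise<{ bulkOperationsID: string }> {
  const bulkOperationsID = uuidv4();

  // 1. Persist a tracking entity (repository call elided) with status "uploading".

  // 3. Hand the heavy lifting to a Cloud Task so the HTTP request returns immediately.
  const parent = tasksClient.queuePath('my-project', 'my-region', 'bulk-operations');
  await tasksClient.createTask({
    parent,
    task: {
      httpRequest: {
        httpMethod: 'POST',
        url: `https://example.com/internal/bulk-operations/${bulkOperationsID}/process`,
        body: Buffer.from(JSON.stringify(payload)).toString('base64'),
        headers: { 'Content-Type': 'application/json' },
      },
    },
  });

  // 2 & 4. The client polls GET /:bulkOperationsID with this ID for status updates.
  return { bulkOperationsID };
}
```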

7.2 Use Case Layer Implementation

The bulk operations use case orchestrates complex multi-entity processing:

Dependency-Ordered Processing

The system processes entities in a specific order to maintain referential integrity:

  1. Users, Departments, Sites: Base entities processed first
  2. Reference Building: Creates lookup maps from processed entities
  3. Questionnaires: Validated against user/department/site references
  4. Schedules: Validated against all previous entities
  5. Non-Ops Days: Final validation using all references

Validation Aggregation Strategy

The use case aggregates validation errors from multiple services:

  • Each service validates its specific domain
  • Errors are collected with entity name and line numbers
  • All errors are returned together for comprehensive feedback
  • Processing stops if any validation errors exist
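
The aggregation step can be sketched as follows; the `ValidationError` shape and validator interface are illustrative:

```typescript
interface ValidationError {
  entity: string; // e.g. "users", "schedules"
  line: number;   // Excel row the error refers to
  message: string;
}

interface EntityValidator {
  entity: string;
  validate(): Promise<Array<Omit<ValidationError, 'entity'>>>;
}

async function validateAll(validators: EntityValidator[]): Promise<ValidationError[]> {
  const errors: ValidationError[] = [];
  // Dependency order matters: base entities (users, departments, sites) are
  // validated first so later validators can rely on their reference maps.
  for (const v of validators) {
    const errs = await v.validate();
    // Tag each error with the entity it came from before aggregating.
    errors.push(...errs.map((e) => ({ ...e, entity: v.entity })));
  }
  return errors; // returned together for comprehensive feedback
}
```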

7.3 Service Layer Architecture

```mermaid
classDiagram
    class BulkOperationsUsecase {
        -bulkOperationsRepo: IBulkOperationsRepository
        -questionnaireBulkUploadService: QuestionnaireBulkUploadService
        -userDeptSiteBulkUploadService: UserDeptSiteBulkUploadService
        -scheduleUsecase: IScheduleUsecase
        -nonOpsUsecase: INonOpsUsecase
        +createBulkOperationsEntity()
        +processBulkUpload()
        +generateBulkEditTemplate()
    }

    class QuestionnaireBulkUploadService {
        -questionnaireRepo: IQuestionnaireRepository
        -categoryRepo: ICategoryRepository
        +extractQuestionnaires()
        +validate()
        +create()
    }

    class UserDeptSiteBulkUploadService {
        -userRepo: IUserRepository
        -departmentRepo: IDepartmentRepository
        -siteRepo: ISiteRepository
        +extractUserSeptSiteEntity()
        +validate()
        +create()
    }

    BulkOperationsUsecase --> QuestionnaireBulkUploadService
    BulkOperationsUsecase --> UserDeptSiteBulkUploadService
```

Service Responsibilities

  • BulkOperationsUsecase: Orchestrates the entire process
  • QuestionnaireBulkUploadService: Handles questionnaire-specific logic
  • UserDeptSiteBulkUploadService: Manages user, department, and site operations
  • ScheduleUsecase: Processes schedule-related bulk operations
  • NonOpsUsecase: Handles non-operational days configuration

7.4 Validation Framework Design

The validation framework implements a comprehensive multi-stage approach:

Three-Stage Validation Process

  1. Schema Validation

    • Validates data types and formats
    • Ensures required fields are present
    • Checks field lengths and constraints
    • Uses Joi schema definitions for consistency
  2. Business Rule Validation

    • Enforces domain-specific rules
    • Checks for duplicate entries
    • Validates conditional relationships
    • Ensures business logic compliance
  3. Dependency Validation

    • Verifies referenced entities exist
    • Checks permissions and access rights
    • Validates cross-entity relationships
    • Ensures referential integrity

8. Data Models and Structures

8.1 Questionnaire Bulk Upload Models

The system uses different models for basic and advanced questionnaires:

Basic Questionnaire Model

  • Contains essential fields: question type, text, category
  • Includes flags for attachments and comments
  • Supports multiple choice options
  • Enables conditional pathing

Advanced Questionnaire Model

  • Extends basic model with additional fields
  • Includes department assignment configuration
  • Supports issue owner specification
  • Enables escalation level settings
  • Allows score weighting for analytics

8.2 Bulk Operations Entity Model

The bulk operations entity tracks the entire lifecycle:

Status Tracking

  • uploading: Initial file processing
  • validating: Running validation checks
  • validation-failed: Errors found during validation
  • creating-entities: Processing entity creation
  • completed: Successfully finished
  • failed: Error during processing

Metadata Storage

  • Validation errors with line numbers and field names
  • Entity creation statistics (created, updated, failed counts)
  • Timestamp tracking for performance monitoring
  • User information for audit trails
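
Putting the status values and metadata together, a hypothetical TypeScript shape of the tracking entity (field names are assumptions, not the production schema):

```typescript
type BulkOperationStatus =
  | 'uploading'
  | 'validating'
  | 'validation-failed'
  | 'creating-entities'
  | 'completed'
  | 'failed';

interface BulkOperationEntity {
  bulkOperationsID: string;
  organizationID: string;
  status: BulkOperationStatus;
  // Validation errors with line numbers and field names.
  errors: Array<{ entity: string; line: number; field: string; message: string }>;
  // Entity creation statistics.
  stats: { created: number; updated: number; failed: number };
  createdBy: string; // audit trail
  createdAt: Date;   // timestamp tracking for performance monitoring
  updatedAt: Date;
}
```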

9. Processing Flows

9.1 Questionnaire Bulk Upload Flow

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant Controller
    participant UseCase
    participant Validator
    participant Repository
    participant EventBus

    Client->>API: POST /v2/bulk-upload (Excel file)
    API->>Controller: Parse Excel data
    Controller->>UseCase: bulkUploadV2(data)
    
    UseCase->>Validator: Validate questionnaire data
    Validator->>Repository: Check existing questionnaires
    Repository-->>Validator: Return existing data
    Validator-->>UseCase: Return validation results
    
    alt Validation Failed
        UseCase-->>Controller: Return validation errors
        Controller-->>Client: 400 Bad Request with errors
    else Validation Passed
        UseCase->>Repository: Start transaction
        UseCase->>Repository: Create/Update questionnaires
        UseCase->>Repository: Update indexes
        UseCase->>EventBus: Publish questionnaire events
        UseCase->>Repository: Commit transaction
        UseCase-->>Controller: Return success
        Controller-->>Client: 200 OK with questionnaire IDs
    end
```

Flow Explanation

  1. Initial Request: Client sends Excel file with questionnaire data
  2. Data Parsing: API layer parses Excel into structured format
  3. Validation Phase: Comprehensive validation against schemas and business rules
  4. Decision Point: Process continues only if validation passes
  5. Transaction Phase: All database operations occur within transaction
  6. Event Publishing: Success events notify other systems
  7. Response: Client receives success confirmation or detailed errors

9.2 Multi-Entity Bulk Operations Flow

```mermaid
flowchart TD
    subgraph "Client Upload"
        A[Excel File with Multiple Sheets]
        A --> B[Users Sheet]
        A --> C[Departments Sheet]
        A --> D[Sites Sheet]
        A --> E[Questionnaires Sheet]
        A --> F[Schedules Sheet]
        A --> G[Non-Ops Days Sheet]
    end
    
    subgraph "API Processing"
        H[Parse Excel File]
        B --> H
        C --> H
        D --> H
        E --> H
        F --> H
        G --> H
        
        H --> I[Create Bulk Operation Entity]
        I --> J[Extract Entity Data]
        
        J --> K[Validate Users/Depts/Sites]
        J --> L[Build Reference Data]
        L --> M[Validate Questionnaires]
        L --> N[Validate Schedules]
        L --> O[Validate Non-Ops Days]
    end
    
    subgraph "Validation Results"
        K --> P{Any Errors?}
        M --> P
        N --> P
        O --> P
        
        P -->|Yes| Q[Update Status: validation-failed]
        P -->|No| R[Update Status: creating-entities]
    end
    
    subgraph "Entity Creation"
        R --> S[Create Users]
        S --> T[Create Departments]
        T --> U[Create Sites]
        U --> V[Create Questionnaires]
        V --> W[Create Schedules]
        W --> X[Create Non-Ops Days]
        X --> Y[Update Status: completed]
    end
    
    Q --> Z[Return Validation Errors]
    Y --> AA[Return Success with Entity Details]
```

Multi-Entity Processing Logic

  1. Sheet Extraction: Different entity types from separate Excel sheets
  2. Tracking Entity: Creates bulk operation record for monitoring
  3. Ordered Validation: Validates in dependency order
  4. Reference Building: Creates lookup maps for cross-validation
  5. Error Aggregation: Collects all errors before failing
  6. Sequential Creation: Creates entities maintaining referential integrity
  7. Status Updates: Provides real-time progress tracking

10. Validation Strategies

10.1 Schema Validation Strategy

Schema validation serves as the first line of defense against invalid data:

Implementation Approach

  • Uses Joi library for declarative schema definition
  • Validates data types, formats, and constraints
  • Provides clear error messages for invalid fields
  • Supports conditional validation rules

Key Validation Rules

  • Required Fields: Ensures essential data is present
  • Data Types: Validates strings, numbers, booleans, dates
  • Format Validation: Checks emails, phone numbers, URLs
  • Length Constraints: Enforces min/max character limits
  • Enum Validation: Restricts values to predefined options
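
A hedged example of such a Joi schema for a single questionnaire row; the field names, enum values, and limits are illustrative, not the production schema:

```typescript
import Joi from 'joi';

// Hypothetical schema for one questionnaire row.
const questionnaireRowSchema = Joi.object({
  questionText: Joi.string().min(1).max(500).required(), // required field + length constraints
  questionType: Joi.string()
    .valid('binary', 'multiple-choice', 'open-ended')    // enum validation
    .required(),
  category: Joi.string().required(),
  notificationEmail: Joi.string().email(),               // format validation
  allowAttachments: Joi.boolean().default(false),        // data type validation
  // Conditional rule: choices are required only for multiple-choice questions.
  choices: Joi.when('questionType', {
    is: 'multiple-choice',
    then: Joi.array().items(Joi.string()).min(2).required(),
    otherwise: Joi.forbidden(),
  }),
});

const row = { questionText: 'Is the floor clean?', questionType: 'binary', category: 'Cleanliness' };
// abortEarly: false collects every field error so users get the full list at once.
const { error } = questionnaireRowSchema.validate(row, { abortEarly: false });
```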

10.2 Business Rule Validation Logic

Business rule validation ensures data meets domain-specific requirements:

Duplicate Detection Strategy

The system prevents duplicate entries using composite keys:

  • Combines multiple fields to create unique identifiers
  • Checks against existing data in the database
  • Provides specific error messages indicating duplicates
  • Allows same values in different contexts
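
A sketch of the composite-key check; the key fields and message format are illustrative:

```typescript
interface Row { questionnaireTitle: string; department: string; line: number }

function findDuplicates(rows: Row[]): string[] {
  const seen = new Set<string>();
  const errors: string[] = [];
  for (const row of rows) {
    // The same title is allowed in different departments (different context),
    // but the title+department pair must be unique.
    const key = `${row.questionnaireTitle}::${row.department}`;
    if (seen.has(key)) {
      errors.push(
        `Line ${row.line}: duplicate questionnaire "${row.questionnaireTitle}" for department "${row.department}"`,
      );
    }
    seen.add(key);
  }
  return errors;
}
```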

Conditional Logic Validation

  • Validates conditional pathing references exist
  • Ensures circular dependencies don’t exist
  • Checks condition syntax and logic
  • Validates target questions are reachable

10.3 Dependency Validation Approach

Dependency validation maintains referential integrity across entities:

Reference Verification Process

  1. Build Reference Maps: Creates lookup structures from existing data
  2. Check References: Validates each reference against maps
  3. Permission Validation: Ensures user has access to referenced entities
  4. Cascade Impact: Checks impact of changes on dependent data
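
Steps 1 and 2 can be sketched like this, using departments as an example (the names are illustrative):

```typescript
interface Department { departmentID: string; name: string }

// 1. Build Reference Maps: one lookup per entity type, built once per operation.
function buildReferenceMap(departments: Department[]): Map<string, Department> {
  return new Map(departments.map((d) => [d.name.toLowerCase(), d]));
}

// 2. Check References: each row's reference resolves in O(1) against the map.
function checkReferences(
  rows: Array<{ line: number; department: string }>,
  deptMap: Map<string, Department>,
): string[] {
  return rows
    .filter((row) => !deptMap.has(row.department.toLowerCase()))
    .map((row) => `Line ${row.line}: department "${row.department}" does not exist`);
}
```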

Cross-Entity Consistency

  • Validates users belong to referenced departments
  • Ensures sites are linked to valid departments
  • Checks questionnaires reference existing categories
  • Validates schedules reference active questionnaires

11. Error Handling and Recovery

11.1 Error Classification System

The system classifies errors for appropriate handling:

Error Categories

  • Validation Errors: Data doesn’t meet requirements
  • Processing Errors: Issues during entity creation
  • Dependency Errors: Referenced entities missing
  • Transaction Errors: Database operation failures
  • System Errors: Infrastructure or service failures

Error Severity Levels

  • Critical: Operation cannot continue
  • Error: Specific item failed but others may succeed
  • Warning: Non-blocking issues that should be addressed
  • Info: Informational messages for user awareness

11.2 Transaction Management Philosophy

The system implements strict transaction management for data consistency:

Transaction Principles

  1. Atomicity: All operations succeed or all fail
  2. Consistency: Data remains in valid state
  3. Isolation: Concurrent operations don’t interfere
  4. Durability: Committed changes persist

Rollback Strategy

When errors occur during processing:

  • Transaction is immediately aborted
  • All changes within transaction are reversed
  • System state returns to pre-operation condition
  • Error details are logged for debugging
  • User receives comprehensive error information

11.3 Recovery Mechanisms

```mermaid
stateDiagram-v2
    [*] --> Uploading
    Uploading --> Validating
    Validating --> ValidationFailed: Errors Found
    Validating --> CreatingEntities: No Errors
    CreatingEntities --> Failed: Error
    CreatingEntities --> Completed: Success
    ValidationFailed --> [*]
    Failed --> Retry: Manual Retry
    Retry --> Validating
    Completed --> [*]
```

Recovery Options

  • Validation Failed: User corrects data and re-uploads
  • Processing Failed: System admin investigates and retries
  • Partial Success: Rollback ensures no partial data
  • Retry Logic: Clean state allows safe retry

12. Performance Optimization

12.1 Batch Processing Strategy

The system optimizes performance through intelligent batching:

Batching Logic

  1. Dynamic Batch Sizing: Adjusts based on data complexity
  2. Parallel Processing: Multiple batches processed concurrently
  3. Resource Management: Limits concurrent operations
  4. Memory Optimization: Processes data in chunks
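
A simplified sketch of the chunk-and-limit approach; the batch size, concurrency bound, and `processBatch` callback are illustrative:

```typescript
async function processInBatches<T>(
  items: T[],
  batchSize: number,
  concurrency: number,
  processBatch: (batch: T[]) => Promise<void>,
): Promise<void> {
  // Memory optimization: split the dataset into fixed-size chunks.
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  // Resource management: run at most `concurrency` batches at a time
  // to protect the database connection pool.
  for (let i = 0; i < batches.length; i += concurrency) {
    await Promise.all(batches.slice(i, i + concurrency).map(processBatch));
  }
}
```

Dynamic batch sizing would adjust `batchSize` per entity type rather than using a fixed value, as described above.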

Benefits of Batching

  • Reduces memory footprint for large datasets
  • Improves response times through parallelization
  • Prevents database connection exhaustion
  • Enables progress tracking at batch level

12.2 Caching Strategy

Strategic caching reduces redundant operations:

Cache Implementation

  • Validation Cache: Stores results of expensive validation queries
  • Reference Data Cache: Keeps frequently accessed lookup data
  • Temporary Cache: Exists only during operation lifecycle
  • Cache Invalidation: Clears after operation completes
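
A minimal sketch of such an operation-scoped cache (the class and method names are assumptions):

```typescript
// Lives only for the lifetime of one bulk operation and is discarded afterwards.
class OperationCache {
  private store = new Map<string, unknown>();

  async getOrLoad<T>(key: string, load: () => Promise<T>): Promise<T> {
    if (this.store.has(key)) return this.store.get(key) as T; // cache hit: no DB query
    const value = await load();
    this.store.set(key, value);
    return value;
  }

  clear(): void {
    this.store.clear(); // cache invalidation after the operation completes
  }
}

// Usage: repeated validation lookups for the same department hit the cache.
// const dept = await cache.getOrLoad(`dept:${name}`, () => departmentRepo.findByName(name));
```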

Performance Impact

  • Reduces database queries by up to 80%
  • Improves validation speed significantly
  • Enables larger batch processing
  • Reduces overall operation time

13. Conclusion

The Nimbly bulk operations feature represents a sophisticated system for handling large-scale data operations across multiple entities. The architecture demonstrates several key strengths:

  1. Separation of Concerns: Clear separation between API layers, business logic, and data access
  2. Robust Validation: Multi-stage validation ensures data integrity
  3. Transaction Management: Proper transaction handling prevents partial updates
  4. Scalability: Batch processing and async operations support large datasets
  5. Error Handling: Comprehensive error tracking and recovery mechanisms
  6. Security: Input validation, authentication, and authorization at multiple levels

The system’s modular design allows for easy extension and maintenance, while the event-driven architecture enables loose coupling between services. The comprehensive validation framework ensures data quality, and the performance optimizations handle large-scale operations efficiently.