External Integrations Architecture & Operations

1. Introduction

Consistent with our other pipeline architectures, we have developed a set of pipelines to handle integrations with external third-party API providers. These pipelines provide a consistent, secure, and operationally robust mechanism for invoking external services such as credit reference agencies, while ensuring that consuming systems do not need to handle provider-specific authentication, API behaviour, retries, or error handling.

The integrations are implemented as Ruby-based pipelines that can be executed in two primary ways:

  • Command-line execution, supporting operational use cases, testing, and controlled re-runs
  • HTTP-triggered execution, enabling invocation from upstream applications and user interfaces (for example, via Power Automate)

At present, the following external integration pipelines are in scope:

  • Experian Pipeline – supporting company and individual credit-related API calls
  • Creditsafe Pipeline – supporting company, director, and consumer credit-related API calls

Each pipeline follows a shared architectural pattern covering:

  • explicit user intent
  • authentication and token management
  • execution control and retry handling
  • structured logging and auditability
  • secure persistence of responses where required

This document describes the architecture, processing model, security controls (such as authentication, authorisation, and data protection), and operational considerations for these external integrations. It is intentionally focused on the integration layer and does not attempt to fully document downstream processes unless they are directly impacted by the integration process.


2. Scope & Objectives

2.1 In Scope

  • Architecture of the external integration pipelines
  • Interaction patterns with Experian and Creditsafe
  • Error handling, retries, and reprocessing controls
  • Security, audit, and compliance considerations
  • Operational run-book and support model

2.2 Out of Scope

  • Business intelligence and reporting models
  • Detailed UI implementation (covered in separate documentation)
  • Integration with Alph4 APIs (currently in planning stage)

3. Architecture

3.1 Detailed Architecture Overview

The external integration pipelines provide a dedicated integration layer between Leasepath and third-party API providers. This layer is responsible for executing external API calls in a controlled, secure, and observable manner, while shielding upstream systems from provider-specific concerns.

The integration layer is responsible for:

  • authentication and token management
  • execution control (timeouts, retries, backoff)
  • provider-specific request and response handling
  • structured logging and auditability
  • persistence of external responses for downstream consumption

Upstream systems interact with the integration layer via an HTTP interface or controlled command-line execution. They do not interact directly with third-party providers.

flowchart LR
    A[Upstream System / UI] --> B[HTTP Layer]
    B --> C[Integration Pipeline]
    C --> D[External API Provider]
    C --> E[Logging & Monitoring]
    C --> F[Persistent Storage Dynamics / Data Lake]

3.2 Execution Environment & Deployment Model

The integration pipelines are implemented as Ruby services and are deployed on Windows Server. They are designed to support asynchronous invocation via an HTTP interface, where incoming requests initiate a pipeline execution and return immediately, with results persisted for later consumption. Pipelines may also be executed directly via the command line for operational and support purposes.

Key characteristics include:

  • environment-specific configuration supplied via environment variables
  • no hard-coded credentials or endpoints
  • consistent execution behaviour regardless of invocation method

3.3 Authentication & Provider Isolation

Authentication with third-party providers is handled within the integration layer using provider-specific token mechanisms. Token acquisition, caching, and refresh logic is encapsulated within the pipelines and is not exposed to upstream callers.

This ensures that:

  • consuming systems are not coupled to provider authentication models
  • credentials are centrally managed and rotated
  • provider-specific behaviour is isolated to the relevant pipeline

3.4 Observability, Audit & Persistence

All pipeline executions are fully observable and auditable. Each execution is associated with correlation identifiers that link:

  • the originating request
  • external API calls
  • logs and metrics
  • persisted response artefacts

External API responses are persisted to Dynamics tables and, where appropriate, to additional storage (e.g. SharePoint).

3.5 Architectural Constraints & Assumptions

The integration architecture operates under the following explicit constraints:

  • User-initiated execution only: Credit-impacting API calls are executed only as a result of explicit user actions. Automated or scheduled credit checks are not permitted.

  • Provider dependency: The integration layer depends on the availability and contractual stability of third-party provider APIs.

  • Fail-fast behaviour: Invalid input, authentication failures, or unrecoverable errors result in immediate failure rather than silent degradation.

  • Incremental evolution: The architecture is designed to evolve incrementally, with improvements to idempotency, observability, and audit integration introduced without changing upstream contracts.

4. External Integration Execution Model

4.1 Pipeline Execution

Each pipeline follows a consistent execution lifecycle:

  1. Input validation and intent verification
  2. Authentication token acquisition
  3. External API request execution
  4. Response validation and normalisation
  5. Persistence (where applicable)
  6. Structured logging

sequenceDiagram
    participant Caller
    participant Pipeline
    participant Provider
    participant Log

    Caller->>Pipeline: Execute request
    Pipeline->>Provider: API call
    Provider-->>Pipeline: Response
    Pipeline->>Log: Write audit & metrics
    Pipeline-->>Caller: Result / Error

The caller does not wait for the external API call to complete. Once the initial request is accepted, the caller receives an acknowledgement (or an immediate error) and execution continues asynchronously.
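As a rough illustration of the lifecycle above, the six steps might be composed as follows. This is a minimal sketch rather than the pipelines' actual code: the `PipelineRun` class and its injected callables are hypothetical names used only to make the ordering and the fail-fast behaviour visible.

```ruby
# Hypothetical skeleton of one pipeline execution. Each step is an injected
# callable, so the order of the lifecycle is explicit and easy to test.
class PipelineRun
  def initialize(validate:, fetch_token:, call_provider:, normalise:, persist:, log:)
    @validate = validate
    @fetch_token = fetch_token
    @call_provider = call_provider
    @normalise = normalise
    @persist = persist
    @log = log
  end

  def call(request)
    @validate.call(request)                   # 1. input validation and intent verification
    token = @fetch_token.call                 # 2. authentication token acquisition
    raw = @call_provider.call(request, token) # 3. external API request execution
    result = @normalise.call(raw)             # 4. response validation and normalisation
    @persist.call(request, result)            # 5. persistence (where applicable)
    # 6. structured logging
    @log.call({ event: "completed", correlation_id: request[:correlation_id] })
    result
  rescue StandardError => e
    # Fail fast: record the failure, then re-raise rather than degrade silently.
    @log.call({ event: "failed", correlation_id: request[:correlation_id],
                error_class: e.class.name })
    raise
  end
end
```

Injecting the steps keeps provider-specific logic out of the skeleton, which is one way to reuse the same lifecycle across the Experian and Creditsafe pipelines.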

4.2 Request, Response & Execution Model

The external integration pipelines expose a well-defined execution model for invoking third-party APIs. Each pipeline defines the inputs it accepts, the external services it interacts with, and the controls applied during execution.

Each integration is defined in terms of:

  • Request inputs: Business identifiers and parameters required to determine the external API call to be made.

  • Execution options: Runtime controls that govern how the integration behaves, including timeouts, retry limits, backoff strategy, and diagnostic modes.

  • External API interactions: The third-party endpoints invoked and any provider-specific constraints or behaviours.

  • Response handling: How responses are returned, logged, and optionally persisted for audit or reprocessing purposes.

  • Correlation and traceability: Identifiers used to link requests, external API calls, logs, and persisted artefacts across the execution lifecycle.

This model ensures that each external API call is executed in a controlled, observable, and auditable manner, with clear separation between business intent, execution behaviour, and operational concerns.

4.2.1 Request Inputs & Identifiers

Each pipeline requires a minimal set of business identifiers to determine the external API call to be executed. Examples include:

  • Company registration number
  • Provider-specific identifiers (e.g. connectId, peopleId)
  • Explicit action or intent (e.g. search vs credit report)
  • Correlation ID

These identifiers are validated prior to execution and are included in structured logs to support traceability.


4.2.2 Execution Options (Runtime Controls)

In addition to business identifiers, each pipeline exposes a set of execution options that control how API calls are made. These options are consistent across providers to ensure predictable operational behaviour.

Common execution options include:

| Option                    | Description                                    | Purpose                       |
|---------------------------|------------------------------------------------|-------------------------------|
| http_timeout_seconds      | HTTP open/read timeout for external API calls  | Prevents indefinite blocking  |
| http_max_attempts         | Maximum number of retry attempts               | Bounds retry behaviour        |
| http_backoff_base_seconds | Base delay for exponential backoff             | Controls retry pacing         |
| http_backoff_max_seconds  | Maximum backoff delay                          | Prevents excessive wait times |
| debug                     | Enables logging of raw API responses           | Diagnostic use only           |
| dry_run                   | Logs intended API calls without executing them | Safe testing and validation   |

These options are available whether the pipeline is invoked via the command line or through the HTTP layer, ensuring consistent behaviour across execution contexts.
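One way to keep these options identical across the command line and the HTTP layer is to resolve them through a single helper. The sketch below is illustrative only: `build_execution_options` is a hypothetical name, the timeout and attempt defaults are the ones stated in this document, and the two backoff defaults are assumed values.

```ruby
# Defaults for the common execution options.
DEFAULT_EXECUTION_OPTIONS = {
  http_timeout_seconds: 30,     # default stated under Timeout Configuration
  http_max_attempts: 4,         # default stated under Retry & Backoff Configuration
  http_backoff_base_seconds: 1, # ASSUMPTION: not specified in this document
  http_backoff_max_seconds: 30, # ASSUMPTION: not specified in this document
  debug: false,
  dry_run: false
}.freeze

# Merges caller-supplied overrides onto the defaults and rejects unknown keys,
# so a mistyped option name fails early instead of being silently ignored.
def build_execution_options(**overrides)
  unknown = overrides.keys - DEFAULT_EXECUTION_OPTIONS.keys
  raise ArgumentError, "unknown option(s): #{unknown.join(', ')}" unless unknown.empty?

  DEFAULT_EXECUTION_OPTIONS.merge(overrides).freeze
end

options = build_execution_options(dry_run: true, http_timeout_seconds: 10)
puts options[:http_max_attempts] # falls back to the default
```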


4.2.3 Provider-Specific Requests

For each provider, the pipeline maps validated inputs and execution options to one or more external API calls.

Examples include:

  • Experian
    • Company search and credit-related endpoints
    • Authentication via token-based access
    • Strict timeout guidance as per provider recommendations
  • Creditsafe
    • Company, director, and consumer endpoints
    • Distinct identifiers for companies and individuals
    • Explicit handling of ambiguous search results

Provider-specific request logic is encapsulated within the relevant pipeline to prevent leakage of provider concerns into upstream systems.


4.2.4 Response Artefacts & Persistence

The integration pipelines are not designed to stream full third-party responses synchronously back to callers.

For each successful external API invocation, the response is written to a Dynamics table associated with the originating request. This enables downstream systems and user interfaces to retrieve results asynchronously and ensures a durable audit trail.

The HTTP interface returns an acknowledgement of execution rather than response data.


4.2.5 Correlation, Audit & Traceability

Every pipeline execution is assigned a correlation identifier that is propagated across:

  • Incoming request
  • External API calls
  • Logs and metrics
  • Persisted artefacts (where applicable)

This enables:

  • End-to-end traceability
  • Safe operational investigation
  • Controlled reprocessing without unintended duplicate actions

4.2.6 Validation & Guardrails

Prior to execution, the pipelines enforce a set of guardrails, including:

  • Mandatory input validation (e.g. registration number required)
  • Validation of execution option ranges (e.g. timeouts > 0)
  • Explicit failure for invalid or incomplete requests

These controls ensure that invalid requests fail early and predictably, before any external API calls are made.
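A guardrail of this kind might look like the following minimal sketch. The `validate_request!` helper, the `InvalidRequestError` class, and the hash keys are hypothetical; the real pipelines may validate additional fields.

```ruby
# Raised when a request fails validation before any external call is made.
class InvalidRequestError < StandardError; end

# Collects every validation problem, then fails once with a combined message.
def validate_request!(request, options)
  errors = []
  errors << "registration_number is required" if request[:registration_number].to_s.strip.empty?
  errors << "action is required" if request[:action].to_s.strip.empty?

  %i[http_timeout_seconds http_max_attempts].each do |key|
    value = options[key]
    errors << "#{key} must be greater than 0" unless value.is_a?(Numeric) && value > 0
  end

  raise InvalidRequestError, errors.join("; ") unless errors.empty?

  true
end
```

Reporting all problems in one error, rather than stopping at the first, tends to make operator-driven corrections quicker.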


4.3 Error Handling & Reprocessing

Failure scenarios handled include:

  • Network timeouts
  • Authentication failures
  • Provider-side errors
  • Ambiguous or multiple search results

Key controls:

  • Bounded retry logic
  • No automatic retries for credit report actions without safeguards
  • Operator-driven reprocessing using correlation identifiers
  • Clear distinction between transient and terminal failures
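The distinction between transient and terminal failures could be expressed as a small classifier. The sketch below rests on assumptions not stated in this document: it treats timeouts, HTTP 429, and HTTP 5xx as transient and everything else as terminal, and it simplifies the credit report safeguard to "never retry automatically". The method names and the `:credit_report` action symbol are hypothetical.

```ruby
require "timeout"

# ASSUMPTION: timeouts, HTTP 429 and HTTP 5xx are treated as transient;
# all other failures (including 401/403 authentication failures) are terminal.
def transient_failure?(failure)
  case failure
  when Timeout::Error then true
  when Integer then failure == 429 || (500..599).cover?(failure)
  else false
  end
end

# Credit report actions are never retried automatically in this sketch;
# they are left for operator-driven reprocessing via the correlation identifier.
def automatic_retry_allowed?(failure, action:)
  return false if action == :credit_report

  transient_failure?(failure)
end
```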

4.4 Authentication & Token Management

External API providers typically require short-lived access tokens. The integration pipelines centralise token handling so that individual API calls do not need to implement provider-specific token lifecycle logic.

Token Provider Pattern

Each provider pipeline composes a TokenProvider responsible for obtaining access tokens. Token providers follow a common interface:

  • get() returns a valid access token
  • tokens are cached in-memory for the lifetime of the running process
  • refresh occurs automatically when the token is expired (or close to expiry)

This pattern ensures:

  • consistent authentication behaviour across providers
  • fewer token endpoint calls (reduced load and reduced failure surface)
  • separation of concerns (auth logic not duplicated across API calls)

Cached Token Provider

Token caching is implemented using a shared component:

  • The token is stored in-memory (cached_token)
  • The expiry time is tracked as an epoch timestamp (cached_expiry_epoch)
  • If a token is still valid, get() returns it immediately
  • Otherwise, the provider-specific fetch function is invoked to obtain a new token

A small safety buffer is applied to avoid using tokens that are close to expiry:

  • cached_expiry_epoch = now + expires_in - 60

This “refresh 60 seconds early” behaviour helps reduce race conditions where a token expires mid-request.

If token acquisition fails, the error is:

  • logged
  • re-raised to ensure the pipeline fails fast and predictably
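The caching behaviour described above can be sketched as follows. This is an illustrative reconstruction, not the shared component itself: the class name, the `fetcher` and `clock` parameters, and the shape of the fetcher's return value (`{ token:, expires_in: }`) are assumptions, while `get`, `cached_token`, `cached_expiry_epoch`, and the 60-second buffer come from the description.

```ruby
class CachedTokenProvider
  EXPIRY_BUFFER_SECONDS = 60 # refresh 60 seconds before the stated expiry

  # fetcher: callable returning { token: String, expires_in: Integer } (assumed shape)
  # clock:   callable returning the current epoch seconds (injectable for tests)
  def initialize(fetcher:, clock: -> { Time.now.to_i })
    @fetcher = fetcher
    @clock = clock
    @cached_token = nil
    @cached_expiry_epoch = 0
  end

  # Returns the cached token while it is still valid, otherwise fetches a new one.
  def get
    now = @clock.call
    return @cached_token if @cached_token && now < @cached_expiry_epoch

    result = @fetcher.call
    @cached_token = result.fetch(:token)
    @cached_expiry_epoch = now + result.fetch(:expires_in) - EXPIRY_BUFFER_SECONDS
    @cached_token
  rescue StandardError => e
    # Log the error class and message only (never the token), then fail fast.
    warn "token acquisition failed: #{e.class}: #{e.message}"
    raise
  end
end
```

Because the cache lives in instance variables, it is scoped to the running process, matching the per-process behaviour noted later in this section.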

Provider-Specific Token Acquisition (Example: Creditsafe)

Creditsafe authentication is implemented via a provider-specific TokenProvider that:

  • reads the token endpoint and credentials from environment configuration
  • performs a POST to the provider token endpoint
  • extracts the token from the JSON response
  • delegates caching/refresh behaviour to the shared cached token provider

This keeps Creditsafe-specific details isolated while still using the common caching behaviour.

Configuration inputs (Creditsafe)

  • CREDITSAFE_TOKEN_URL
  • CREDITSAFE_USERNAME
  • CREDITSAFE_PASSWORD
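A provider-specific fetch function along these lines could be handed to the shared cached token provider. The sketch below makes several assumptions that should be checked against the Creditsafe documentation and the real pipeline: that credentials are sent as a JSON body with `username` and `password` fields, that the response carries the token in a `token` field, and that Ruby's standard `Net::HTTP` client is used. The method names are hypothetical.

```ruby
require "json"
require "net/http"
require "uri"

# Builds the token request from environment configuration.
# ASSUMPTION: credentials are posted as a JSON body.
def creditsafe_token_request(env = ENV)
  uri = URI(env.fetch("CREDITSAFE_TOKEN_URL"))
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate({ username: env.fetch("CREDITSAFE_USERNAME"),
                                 password: env.fetch("CREDITSAFE_PASSWORD") })
  [uri, request]
end

# Extracts the token from the JSON response.
# ASSUMPTION: the token is in a "token" field; expires_in may be absent (nil).
def parse_token_response(body)
  parsed = JSON.parse(body)
  { token: parsed.fetch("token"), expires_in: parsed["expires_in"] }
end

# Performs the POST with open/read timeouts and returns the parsed token data.
def fetch_creditsafe_token(env = ENV, timeout_seconds: 30)
  uri, request = creditsafe_token_request(env)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: timeout_seconds,
                             read_timeout: timeout_seconds) { |http| http.request(request) }
  raise "token request failed with HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  parse_token_response(response.body)
end
```

Splitting request construction and response parsing from the network call keeps the provider-specific parts testable without contacting the provider.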

Notes and Considerations

In-memory scope

  • Token caching is per-process. Tokens are not shared across multiple running pipeline processes or servers.

Expiry handling

  • The cached token provider expects token metadata including expires_in (seconds) to calculate expiry.
  • Where a provider does not supply expires_in, a default or conservative expiry strategy should be defined to prevent tokens being treated as valid indefinitely.
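A conservative fallback for a missing `expires_in` might be as simple as the sketch below. The helper name and the 300-second default are assumptions for illustration; the appropriate value depends on each provider's documented token lifetime.

```ruby
# ASSUMPTION: 300 seconds is an illustrative conservative default, not a
# value taken from either provider's documentation.
DEFAULT_TOKEN_TTL_SECONDS = 300

# Returns the provider-supplied lifetime when it is a positive integer,
# otherwise the conservative default, so a token is never cached indefinitely.
def effective_expires_in(token_metadata, default: DEFAULT_TOKEN_TTL_SECONDS)
  value = token_metadata[:expires_in]
  value.is_a?(Integer) && value.positive? ? value : default
end
```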

Logging

  • Token acquisition failures are logged with error class and message.
  • Tokens and credentials must never be logged.
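One defensive measure that supports the "never logged" rule is to redact sensitive fields before any structured payload reaches the logger. The following is a sketch under assumptions: the `redact` helper and the list of sensitive key names are hypothetical and would need to match the fields the pipelines actually handle.

```ruby
# ASSUMPTION: illustrative list of key names to treat as sensitive.
SENSITIVE_KEYS = %w[token access_token password client_secret authorization].freeze

# Returns a copy of the fields with sensitive values masked, including
# values inside nested hashes. Keys are matched case-insensitively.
def redact(fields)
  fields.to_h do |key, value|
    if SENSITIVE_KEYS.include?(key.to_s.downcase)
      [key, "[REDACTED]"]
    elsif value.is_a?(Hash)
      [key, redact(value)]
    else
      [key, value]
    end
  end
end
```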

5. Security, Controls & Compliance

5.1 Access & Role Model

  • Separation between:
    • users initiating requests
    • services executing integrations

5.2 Data Protection

  • All data in transit encrypted using TLS
  • Data at rest encrypted in:
    • Azure Data Lake
    • SQL databases (where applicable)
  • Sensitive fields are handled as PII. No sensitive data is persisted beyond what is required for operational use (e.g. PDF credit reports stored in SharePoint, API response data stored in Dynamics)

5.3 Audit & Logging

  • Every request assigned a correlation identifier
  • Logs capture:
    • request metadata
    • provider response status
    • execution timing
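A structured log entry carrying these fields could be produced as in the sketch below. The field names, the JSON line format, and the `timed` and `audit_log_line` helpers are assumptions for illustration; the actual log schema is not specified in this document.

```ruby
require "json"
require "securerandom"
require "time"

# Runs the block and returns [result, elapsed milliseconds], using a
# monotonic clock so the measurement is unaffected by wall-clock changes.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round]
end

# Builds one JSON log line. Field names are hypothetical.
def audit_log_line(correlation_id:, provider:, action:, provider_status:, duration_ms:,
                   at: Time.now.utc)
  JSON.generate({
    timestamp: at.iso8601,
    correlation_id: correlation_id,
    provider: provider,
    action: action,
    provider_status: provider_status,
    duration_ms: duration_ms
  })
end

correlation_id = SecureRandom.uuid # generated here only for the example
status, elapsed_ms = timed { 200 } # stands in for the external API call
puts audit_log_line(correlation_id: correlation_id, provider: "creditsafe",
                    action: "search", provider_status: status, duration_ms: elapsed_ms)
```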

6. Technology Stack & Environments

6.1 Technology Inventory

  • Ruby (integration pipelines)
  • Windows Server hosting
  • External APIs (Experian, Creditsafe)
  • Azure Data Lake
  • SQL Database
  • Centralised logging and alerting

6.2 Integration Configuration

External integrations are configured entirely via environment variables. This ensures that sensitive information is not embedded in code and that configuration can vary cleanly between environments (e.g. sandbox vs production).

The following configuration categories are relevant to the Experian and Creditsafe pipelines.


Provider Endpoints

Each provider exposes one or more base URLs that define the external API surfaces used by the integration pipelines.

Experian

  • EXPERIAN_API_BASE_URL: Base URL for Experian API requests.
  • EXPERIAN_TOKEN_URL: OAuth token endpoint used for authentication.

Creditsafe

  • CREDITSAFE_API_BASE_URL: Base URL for Creditsafe Connect API requests.
  • CREDITSAFE_TOKEN_URL: Authentication endpoint used to obtain access tokens.

These values are environment-specific and differ between sandbox and production deployments.


Authentication Credentials

Provider credentials are supplied via environment variables and are read at runtime by the relevant pipeline.

Experian

  • EXPERIAN_USERNAME
  • EXPERIAN_PASSWORD
  • EXPERIAN_CLIENT_ID
  • EXPERIAN_CLIENT_SECRET

Creditsafe

  • CREDITSAFE_USERNAME
  • CREDITSAFE_PASSWORD
  • CREDITSAFE_CLIENT_ID
  • CREDITSAFE_CLIENT_SECRET

Credentials are never logged and are only used within the integration layer for token acquisition and API authentication.
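Reading these settings at runtime might look like the following sketch. The `ProviderConfig` struct and `load_provider_config` helper are hypothetical names; the variable names are those listed above. Using `fetch` means a missing variable raises a `KeyError` at start-up rather than surfacing later as a failed API call.

```ruby
# Hypothetical container for one provider's endpoint and credential settings.
ProviderConfig = Struct.new(:api_base_url, :token_url, :username, :password,
                            :client_id, :client_secret, keyword_init: true)

# Loads settings for a provider prefix ("EXPERIAN" or "CREDITSAFE").
# The env argument defaults to ENV but accepts any Hash, which keeps it testable.
def load_provider_config(prefix, env = ENV)
  ProviderConfig.new(
    api_base_url: env.fetch("#{prefix}_API_BASE_URL"),
    token_url: env.fetch("#{prefix}_TOKEN_URL"),
    username: env.fetch("#{prefix}_USERNAME"),
    password: env.fetch("#{prefix}_PASSWORD"),
    client_id: env.fetch("#{prefix}_CLIENT_ID"),
    client_secret: env.fetch("#{prefix}_CLIENT_SECRET")
  )
end
```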


Timeout Configuration

All outbound HTTP calls to third-party providers are subject to a configurable open/read timeout.

  • Default timeout: 30 seconds
  • Controlled via runtime execution options
  • Applies to both authentication and API request calls

Timeouts ensure that external provider latency does not cause indefinite blocking within the integration service.
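With Ruby's standard `Net::HTTP` client, the open and read timeouts can be applied as shown below. This is a sketch that assumes `Net::HTTP` is the client in use, which this document does not state; `build_http_client` is a hypothetical helper name.

```ruby
require "net/http"
require "uri"

# Builds an HTTP client with the same timeout applied to connection opening
# and to reads. No connection is made until a request is issued.
def build_http_client(base_url, timeout_seconds: 30)
  uri = URI(base_url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http.open_timeout = timeout_seconds # seconds to wait for the connection to open
  http.read_timeout = timeout_seconds # seconds to wait for each read
  http
end
```

When a timeout elapses, `Net::HTTP` raises `Net::OpenTimeout` or `Net::ReadTimeout`, which the retry logic can treat as transient failures.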


Retry & Backoff Configuration

Integration pipelines apply bounded retry logic for transient failures (for example, network issues or temporary provider unavailability).

  • Default maximum retry attempts: 4
  • Backoff strategy: exponential backoff with a bounded maximum delay
  • Retries are applied only where safe and appropriate

Credit-impacting actions (such as credit report retrieval) are explicitly guarded to prevent unintended repeated calls.
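Bounded exponential backoff can be sketched as below. Several details are assumptions: the delay doubles on each attempt, the maximum counts total attempts including the first, no jitter is applied, and the helper names and the injectable `sleeper` are hypothetical.

```ruby
require "timeout"

# Delay before the next attempt: base * 2^(attempt - 1), capped at max_seconds.
def backoff_delay(attempt, base_seconds:, max_seconds:)
  [base_seconds * (2**(attempt - 1)), max_seconds].min
end

# Runs the block, retrying only on the listed error classes until
# max_attempts is reached; the final failure is re-raised to the caller.
def with_retries(max_attempts:, base_seconds:, max_seconds:, retry_on:,
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *retry_on
    raise if attempt >= max_attempts

    sleeper.call(backoff_delay(attempt, base_seconds: base_seconds, max_seconds: max_seconds))
    retry
  end
end
```

With a base of 1 second and a maximum of 8 seconds, successive delays would be 1, 2, 4, 8, 8 seconds. A caller would wrap only retry-safe calls in `with_retries`, leaving credit-impacting actions outside it.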


Feature Flags & Execution Modes

The integration pipelines support a small number of execution modes controlled via runtime options rather than environment configuration.

Examples include:

  • Dry-run mode, which logs intended external API calls without executing them
  • Debug mode, which enables additional diagnostic logging under controlled conditions

These modes are intended for testing, troubleshooting, and operational support and are not enabled by default.
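Dry-run mode can be implemented as a single branch in front of the HTTP call, as in this minimal sketch. The `dispatch_call` helper, the shape of the `call` hash, and the injected `logger` and `transport` callables are hypothetical.

```ruby
# Logs the intended call and returns without contacting the provider when
# dry_run is true; otherwise delegates to the injected transport.
def dispatch_call(call, dry_run:, logger:, transport:)
  if dry_run
    logger.call({ event: "dry_run", http_method: call[:http_method], url: call[:url] })
    return { executed: false, response: nil }
  end

  { executed: true, response: transport.call(call) }
end
```

Placing the branch at the last step before the network means validation, option handling, and logging all run exactly as they would for a real execution.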


Configuration Management Principles

  • All configuration is externalised via environment variables
  • No provider endpoints or credentials are hard-coded
  • Defaults are conservative and aligned with provider guidance
  • Configuration changes do not require code changes