The example oracle ships a single config that runs unit tests by default and switches to integration mode with --mode int. Copy it.
File: vitest.config.ts
import 'reflect-metadata';

import { fileURLToPath } from 'node:url';
import { dirname, resolve } from 'node:path';
import { config as dotenvConfig } from 'dotenv';
import { expect } from 'vitest';
import { langchainMatchers } from '@langchain/core/testing';
import { Logger } from '@nestjs/common';

const __dirname = dirname(fileURLToPath(import.meta.url));
const appRoot = resolve(__dirname, '../..');

dotenvConfig({ path: resolve(appRoot, '.env') });
dotenvConfig({ path: resolve(appRoot, '.env.integration'), override: true });

expect.extend(langchainMatchers);

process.env.LOG_LEVEL ??= 'warn';
Logger.overrideLogger(['error', 'warn']);
The setup file loads .env first (runtime config), then layers .env.integration on top with override: true (test-only credentials). Then it registers LangChain matchers and quiets logs.
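To make the layering rule concrete, here is a minimal, dependency-free sketch of the precedence it relies on. `applyEnvLayer` is a hypothetical helper written only for illustration; it approximates how dotenv treats the `override` flag and is not part of dotenv or the oracle runtime. The variable names and URLs are made up.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: copies `values` into `target`.
// Without `override`, keys already present in `target` keep their value
// (dotenv's default behaviour). With `override`, incoming values win.
function applyEnvLayer(
  target: Env,
  values: Record<string, string>,
  override = false,
): Env {
  for (const [key, value] of Object.entries(values)) {
    if (override || !(key in target)) {
      target[key] = value;
    }
  }
  return target;
}

const env: Env = {};

// First layer: runtime config (stands in for .env).
applyEnvLayer(env, { API_URL: 'https://api.example.test', LOG_LEVEL: 'info' });

// Second layer: test-only values (stands in for .env.integration), applied with override.
applyEnvLayer(env, { API_URL: 'https://staging.example.test' }, true);
```

After both layers, `API_URL` holds the integration value while `LOG_LEVEL` still comes from the first layer, which is the behaviour the setup file depends on.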
Each .int.test.ts file declares its required env up front and throws on missing values — see the next step. No silent skips.
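A fail-fast env check could look roughly like the sketch below. `requireEnv` is a hypothetical helper name, not an export of the oracle runtime, and treating empty strings as missing is an assumption of this sketch.

```typescript
type EnvSource = Record<string, string | undefined>;

// Hypothetical helper: returns the requested variables, or throws one error
// that lists every missing name. Empty strings count as missing.
function requireEnv<K extends string>(
  names: readonly K[],
  source: EnvSource,
): Record<K, string> {
  const missing = names.filter((name) => !source[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env: ${missing.join(', ')}`);
  }

  const resolved = {} as Record<K, string>;
  for (const name of names) {
    resolved[name] = source[name] as string;
  }
  return resolved;
}

// In a test file this would be called once at the top, for example:
//   const env = requireEnv(['TEST_USER_DID'], process.env);
```

Throwing at module load makes a misconfigured run fail loudly instead of reporting a misleading green result from skipped tests.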
import { describe, expect, it } from 'vitest';
import { createTestRuntime } from '@ixo/oracle-runtime/testing';
import { WeatherPlugin } from '../src/plugins/weather/index.js';

describe('WeatherPlugin', () => {
  it('registers get_current_weather at boot', async () => {
    const { runtime } = await createTestRuntime({
      plugins: [new WeatherPlugin()],
      env: { WEATHER_DEFAULT_UNITS: 'celsius' },
    });

    expect(runtime.toolRegistry.toolNames()).toContain('get_current_weather');
  });
});
createTestRuntime resolves plugins, populates registries, and builds a RuntimeContext you can hand straight to tool handlers, but it does not boot Nest, talk to Matrix, or call the LLM. Use it for fast, focused tests.
import { afterAll, beforeAll, describe, expect, test } from 'vitest';
import {
  createIntegrationRuntime,
  type IntegrationRuntime,
} from '@ixo/oracle-runtime/testing/integration';
import { WeatherPlugin } from '../../src/plugins/weather/index.js';

describe('Tier A — direct invoke', () => {
  let runtime: IntegrationRuntime | undefined;

  beforeAll(async () => {
    runtime = await createIntegrationRuntime({
      plugins: [new WeatherPlugin()],
      user: { did: process.env.TEST_USER_DID! },
    });
  }, 60_000);

  afterAll(async () => {
    if (runtime) await runtime.close();
  });

  test('get_current_weather({ city: "Berlin" }) returns numeric temperature', async () => {
    const raw = await runtime!.invokeTool('get_current_weather', { city: 'Berlin' });
    const result = JSON.parse(raw as string) as { temp: number; city: string };

    expect(result.city.toLowerCase()).toContain('berlin');
    expect(Number.isFinite(result.temp)).toBe(true);
  });
});
Tier A boots the runtime registries against your plugin list but skips the agent loop entirely. Use it to verify env wiring, upstream-API contracts, and config threading.
Integration tests share a single Matrix admin user. Two test files booting in parallel collide on Matrix’s one-time key uploads at the homeserver, so run integration test files sequentially.
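One way to enforce sequential files in Vitest is the `test.fileParallelism` option (available in Vitest 1.1 and later). The fragment below is a sketch under that assumption, not the example oracle's actual config; the `include` glob is a guess based on the `.int.test.ts` naming used here.

```typescript
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // Assumed glob: pick up only integration test files.
    include: ['**/*.int.test.ts'],
    // Run test files one at a time so two files never boot against
    // the shared Matrix admin user concurrently.
    fileParallelism: false,
  },
});
```

Tests inside a single file already run in order by default, so disabling file-level parallelism is enough to keep boots from overlapping.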
120s timeouts
Real Nest boot, Matrix sync, and LLM round-trips take 5-30s each. 120s leaves headroom for retries; if a test needs more than that, cut its scope rather than raise the cap.
Share one session across tests in a describe
client.createSession() is a server-side round-trip. Create once in beforeAll, reuse for every test. Only mint per-test sessions when the test’s whole point is session isolation (first-contact, cross-session recall).
Assert structurally on streamed events
Tier B uses a real model — response wording drifts. Collect tool_call events from the stream and assert which tools fired with which args, not on text output.
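The structural assertion can be sketched as follows. The `StreamEvent` shape and the `collectToolCalls` helper are hypothetical stand-ins for illustration; the real stream's event types and field names may differ.

```typescript
// Hypothetical event shape, assumed for this sketch only.
type StreamEvent =
  | { type: 'text'; delta: string }
  | { type: 'tool_call'; name: string; args: Record<string, unknown> };

type ToolCall = { name: string; args: Record<string, unknown> };

// Drains the stream and keeps only tool_call events, ignoring model wording.
async function collectToolCalls(
  stream: AsyncIterable<StreamEvent>,
): Promise<ToolCall[]> {
  const calls: ToolCall[] = [];
  for await (const event of stream) {
    if (event.type === 'tool_call') {
      calls.push({ name: event.name, args: event.args });
    }
  }
  return calls;
}

// Stand-in for a real model stream, so the sketch runs without any services.
async function* fakeStream(): AsyncGenerator<StreamEvent> {
  yield { type: 'text', delta: 'Checking the weather' };
  yield { type: 'tool_call', name: 'get_current_weather', args: { city: 'Berlin' } };
  yield { type: 'text', delta: 'It is mild in Berlin today.' };
}
```

In a Vitest test, the collected calls could then be checked with something like `expect(calls).toContainEqual({ name: 'get_current_weather', args: { city: 'Berlin' } })`, which stays stable even when the surrounding text changes between runs.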
Don’t loosen assertions to mask failures. Broadening a regex, adding “or” clauses, or raising tolerances to make a flaky test pass discards the check that catches the bug. Investigate the real failure.
Don’t edit plugin code to make tests pass. Make at most two test-side retry attempts per failure, then stop and ask. Plugins are presumed-working production code; tests describe behaviour, they don’t dictate it.
Don’t add skip-real-services flags (skipMatrixInit, skipGracefulShutdown) to integration tests as a speed-up. Integration tests must boot the same way production does — that’s their point.