warning

🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.

Guide to Adding a Third-Party Engine to Cortex

Introduction​

This guide outlines the steps to integrate a custom engine with Cortex. We hope this helps developers understand the integration process.

Implementation Steps​

1. Implement the Engine Interface​

First, create an engine that implements the EngineI interface, defined in EngineI.h. Here's the interface definition:


class EngineI {
 public:
  struct RegisterLibraryOption {
    std::vector<std::filesystem::path> paths;
  };

  struct EngineLoadOption {
    // engine
    std::filesystem::path engine_path;
    std::filesystem::path cuda_path;
    bool custom_engine_path;
    // logging
    std::filesystem::path log_path;
    int max_log_lines;
    trantor::Logger::LogLevel log_level;
  };

  struct EngineUnloadOption {
    bool unload_dll;
  };

  virtual ~EngineI() {}

  virtual void RegisterLibraryPath(RegisterLibraryOption opts) = 0;
  virtual void Load(EngineLoadOption opts) = 0;
  virtual void Unload(EngineUnloadOption opts) = 0;

  // Cortex.llamacpp interface methods
  virtual void HandleChatCompletion(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;
  virtual void HandleEmbedding(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;
  virtual void LoadModel(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;
  virtual void UnloadModel(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;
  virtual void GetModelStatus(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  // Compatibility and model management
  virtual bool IsSupported(const std::string& f) = 0;
  virtual void GetModels(
      std::shared_ptr<Json::Value> jsonBody,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  // Logging configuration
  virtual bool SetFileLogger(int max_log_lines,
                             const std::string& log_path) = 0;
  virtual void SetLogLevel(trantor::Logger::LogLevel logLevel) = 0;
};
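
For illustration, a third-party engine can start from a skeleton that subclasses EngineI. The class and member names below (MyEngine, mutex_, search_paths_, loaded_) are hypothetical, not part of Cortex; the sketches later in this guide build on them:

#include <filesystem>
#include <mutex>
#include <vector>

// Hypothetical third-party engine skeleton. Only the lifecycle methods
// are listed here; every other pure virtual method of EngineI must be
// overridden as well before the class can be instantiated.
class MyEngine : public EngineI {
 public:
  void RegisterLibraryPath(RegisterLibraryOption opts) override;
  void Load(EngineLoadOption opts) override;
  void Unload(EngineUnloadOption opts) override;
  // ... HandleChatCompletion, HandleEmbedding, LoadModel, UnloadModel,
  // GetModelStatus, IsSupported, GetModels, SetFileLogger, SetLogLevel ...

 private:
  std::mutex mutex_;                                   // guards engine state
  std::vector<std::filesystem::path> search_paths_;    // registered library paths
  bool loaded_ = false;                                // set by Load/Unload
};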

Lifecycle Management​

RegisterLibraryPath​

virtual void RegisterLibraryPath(RegisterLibraryOption opts) = 0;

This method is called during engine initialization to set up dynamic library search paths. On Linux, for example, CUDA dependencies still have to be added to the search path via LD_LIBRARY_PATH. A sketch of a possible implementation follows the requirements list below.

Parameters:

  • opts.paths: Vector of filesystem paths that the engine should register

Implementation Requirements:

  • Register the provided paths for dynamic library loading
  • Handle invalid paths gracefully
  • Be thread-safe
  • Do not let exceptions escape the method
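
A minimal sketch of how the hypothetical MyEngine from above could satisfy these requirements. On Linux it prepends the registered paths to LD_LIBRARY_PATH, which only affects libraries opened after this call:

#include <cstdlib>
#include <string>
#include <system_error>

void MyEngine::RegisterLibraryPath(RegisterLibraryOption opts) {
  std::lock_guard<std::mutex> lock(mutex_);  // keep registration thread-safe
  for (const auto& p : opts.paths) {
    std::error_code ec;
    if (!std::filesystem::exists(p, ec) || ec)
      continue;  // skip invalid paths instead of throwing
    search_paths_.push_back(p);
  }
#if defined(__linux__)
  // Prepend the registered paths (e.g. CUDA directories) to LD_LIBRARY_PATH.
  std::string joined;
  for (const auto& p : search_paths_)
    joined += p.string() + ":";
  if (const char* old = std::getenv("LD_LIBRARY_PATH"))
    joined += old;
  setenv("LD_LIBRARY_PATH", joined.c_str(), /*overwrite=*/1);
#endif
}
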
Load​

virtual void Load(EngineLoadOption opts) = 0;

Initializes the engine with the provided configuration options.

Parameters:

  • engine_path: Base path for engine files
  • cuda_path: Path to CUDA installation
  • custom_engine_path: Flag for using custom engine location
  • log_path: Location for log files
  • max_log_lines: Maximum number of lines per log file
  • log_level: Logging verbosity level

Implementation Requirements:

  • Validate all paths before use
  • Initialize engine components
  • Set up logging configuration
  • Handle missing dependencies gracefully
  • Roll back to a clean state if initialization fails
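
Continuing the hypothetical MyEngine sketch, a Load implementation might look like this; the actual initialization work is engine-specific and only outlined in comments:

void MyEngine::Load(EngineLoadOption opts) {
  std::lock_guard<std::mutex> lock(mutex_);
  // Validate paths before use; fail cleanly rather than crash later.
  std::error_code ec;
  if (!std::filesystem::exists(opts.engine_path, ec) || ec)
    return;  // missing dependency: leave the engine cleanly unloaded
  // Set up logging first so that any later failures are captured.
  SetFileLogger(opts.max_log_lines, opts.log_path.string());
  SetLogLevel(opts.log_level);
  // Initialize engine components here (backend contexts, thread pools, ...);
  // on any failure, roll back so the engine stays unloaded.
  loaded_ = true;
}
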
Unload​

virtual void Unload(EngineUnloadOption opts) = 0;

Performs cleanup and shutdown of the engine.

Parameters:

  • unload_dll: Boolean flag indicating whether to unload dynamic libraries

Implementation Requirements:

  • Clean up all allocated resources
  • Close file handles and connections
  • Release memory
  • Ensure proper shutdown of running models
  • Handle cleanup in a thread-safe manner
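
A matching Unload sketch for the hypothetical MyEngine:

void MyEngine::Unload(EngineUnloadOption opts) {
  std::lock_guard<std::mutex> lock(mutex_);  // thread-safe cleanup
  // Shut down running models and release their memory and file handles
  // first; the exact steps are engine-specific and omitted here.
  loaded_ = false;
  if (opts.unload_dll) {
    // Also release any dynamic libraries the engine opened itself.
  }
  search_paths_.clear();
}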

2. Create a Dynamic Library​

We recommend using the dylib library, which provides helpful tools for building cross-platform dynamic libraries.
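
For example, the engine can expose a C-style factory function so that Cortex can resolve a stable, unmangled symbol (cortex.llamacpp names its factory get_engine). The loader-side lookup below is illustrative and uses dylib's get_function:

// Engine side: export a factory with C linkage so the symbol name is
// not mangled (cortex.llamacpp names this symbol "get_engine").
extern "C" EngineI* get_engine() {
  return new MyEngine();  // the hypothetical engine class from above
}

// Loader side (illustrative): resolve the factory with dylib. The dylib
// constructor adds the platform prefix/suffix, so "engine" resolves to
// libengine.so, libengine.dylib, or engine.dll.
#include <dylib.hpp>

EngineI* LoadEngine() {
  // Keep the handle alive; dylib unloads the library when it is destroyed.
  static dylib lib("./engines/myengine/", "engine");
  auto factory = lib.get_function<EngineI*()>("get_engine");
  return factory();
}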

3. Package Dependencies​

Please ensure all dependencies are included with your dynamic library. This allows us to create a single, self-contained package for distribution.

4. Publication and Integration​

4.1 Publishing Your Engine (Optional)​

If you wish to make your engine publicly available, you can publish it through GitHub. For reference, examine the cortex.llamacpp releases structure:

  • Each release tag should represent your version
  • Include all variants within the same release
  • Cortex will automatically select the most suitable variant or allow users to specify their preferred variant
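
For instance, a single release tag (say v0.1.40) could carry each build variant as a separate asset; the asset names below are purely illustrative:

myengine-v0.1.40-linux-amd64-avx2.tar.gz
myengine-v0.1.40-mac-arm64.tar.gz
myengine-v0.1.40-windows-amd64-cuda-12-0.tar.gz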

4.2 Integration with Cortex​

Once your engine is ready, we encourage you to:

  1. Notify the Cortex team about your engine for potential inclusion in our default supported engines list
  2. Allow us to help test and validate your implementation

5. Local Testing Guide​

To test your engine locally:

  1. Create a directory structure following this hierarchy:

engines/
└── cortex.llamacpp/
    └── mac-arm64/
        └── v0.1.40/
            ├── libengine.dylib
            └── version.txt

  2. Configure your engine:

    • Edit the ~/.cortexrc file to register your engine name
    • Add your model with the appropriate engine field in model.yaml (see the sketch after this list)
  3. Testing:

    • Start the engine
    • Load your model
    • Verify functionality
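
For illustration, a minimal model.yaml entry could look like the following; the field names other than engine are placeholders and may differ between Cortex versions:

# Illustrative only: exact fields vary by model and Cortex version.
model: my-model
engine: myengine   # must match the engine name registered in ~/.cortexrc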

Future Development​

We're currently working on expanding support for additional release sources to make distribution more flexible.

Contributing​

We welcome suggestions and contributions to improve this integration process. Please feel free to submit issues or pull requests through our repository.