Installation and Configuration¶
Installation¶
Install RDAgent according to your scenario:

- For regular users: install with pip install rdagent.
- For dev users: see the development guide.
Install Docker: RDAgent is designed for research and development, acting like a human researcher and developer. It can write and run code in various environments, primarily using Docker for code execution. This keeps the remaining dependencies simple. Users must ensure Docker is installed before attempting most scenarios. Please refer to the official 🐳Docker page for installation instructions. Ensure the current user can run Docker commands without using sudo. You can verify this by executing docker run hello-world.
LiteLLM Backend Configuration (Default)¶
Note
🔥 Attention: We now provide experimental support for DeepSeek models! You can use DeepSeek’s official API for cost-effective and high-performance inference. See the configuration example below for DeepSeek setup.
Option 1: Unified API base for both models¶
```
# Set to any model supported by LiteLLM.
CHAT_MODEL=gpt-4o
EMBEDDING_MODEL=text-embedding-3-small

# Configure unified API base.
# The backend api_key fully follows the convention of litellm.
OPENAI_API_BASE=<your_unified_api_base>
OPENAI_API_KEY=<replace_with_your_openai_api_key>
```
Option 2: Separate API bases for Chat and Embedding models¶
```
# Set to any model supported by LiteLLM.

# CHAT MODEL:
CHAT_MODEL=gpt-4o
OPENAI_API_BASE=<your_chat_api_base>
OPENAI_API_KEY=<replace_with_your_openai_api_key>

# EMBEDDING MODEL:
# Take SiliconFlow as an example; you can use other providers.
# Note: embedding requires the litellm_proxy prefix.
EMBEDDING_MODEL=litellm_proxy/BAAI/bge-large-en-v1.5
LITELLM_PROXY_API_KEY=<replace_with_your_siliconflow_api_key>
LITELLM_PROXY_API_BASE=https://api.siliconflow.cn/v1
```
Configuration Example: DeepSeek Setup¶
Many users encounter configuration errors when setting up DeepSeek. Here’s a complete working example:
```
# CHAT MODEL: Using DeepSeek Official API
CHAT_MODEL=deepseek/deepseek-chat
DEEPSEEK_API_KEY=<replace_with_your_deepseek_api_key>

# EMBEDDING MODEL: Using SiliconFlow for embedding, since DeepSeek has no embedding model.
# Note: embedding requires the litellm_proxy prefix.
EMBEDDING_MODEL=litellm_proxy/BAAI/bge-m3
LITELLM_PROXY_API_KEY=<replace_with_your_siliconflow_api_key>
LITELLM_PROXY_API_BASE=https://api.siliconflow.cn/v1
```
Necessary parameters include:
CHAT_MODEL: The model name of the chat model.
EMBEDDING_MODEL: The model name of the embedding model.
OPENAI_API_BASE: The base URL of the API. If EMBEDDING_MODEL does not start with litellm_proxy/, this is used for both chat and embedding models; otherwise, it is used for CHAT_MODEL only.
Optional parameters (required if your embedding model is provided by a different provider than CHAT_MODEL):
LITELLM_PROXY_API_KEY: The API key for the embedding model, required if EMBEDDING_MODEL starts with litellm_proxy/.
LITELLM_PROXY_API_BASE: The base URL for the embedding model, required if EMBEDDING_MODEL starts with litellm_proxy/.
Note: If you are using an embedding model from a provider different from the chat model, remember to add the litellm_proxy/ prefix to the EMBEDDING_MODEL name.
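The prefix convention above can be sketched as a small helper. This is a hypothetical illustration of the routing rule, not RD-Agent's actual implementation; RD-Agent resolves this internally:

```python
def resolve_embedding_backend(embedding_model, openai_api_base, proxy_api_base=None):
    """Illustrates the convention: embedding models prefixed with
    'litellm_proxy/' use the LITELLM_PROXY_* settings, everything
    else shares OPENAI_API_BASE with the chat model."""
    if embedding_model.startswith("litellm_proxy/"):
        if proxy_api_base is None:
            raise ValueError("LITELLM_PROXY_API_BASE must be set")
        return proxy_api_base
    return openai_api_base

# Embedding served by a different provider than the chat model:
base = resolve_embedding_backend(
    "litellm_proxy/BAAI/bge-m3",
    "https://api.openai.com/v1",
    "https://api.siliconflow.cn/v1",
)
print(base)  # https://api.siliconflow.cn/v1
```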
The CHAT_MODEL and EMBEDDING_MODEL parameters will be passed into LiteLLM’s completion function.
Therefore, when utilizing models provided by different providers, first review the interface configuration of LiteLLM. The model names must match those allowed by LiteLLM.
Additionally, you need to set the additional parameters for the respective model provider, and the parameter names must align with those required by LiteLLM.
For example, if you are using a DeepSeek model, you need to set as follows:
```
# For some models LiteLLM requires a prefix to the model name.
CHAT_MODEL=deepseek/deepseek-chat
DEEPSEEK_API_KEY=<replace_with_your_deepseek_api_key>
```
Besides, when you are using reasoning models, the response might include the thought process. For this case, you need to set the following environment variable:
REASONING_THINK_RM=True
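Reasoning models commonly wrap their chain of thought in `<think>...</think>` tags before the final answer; the flag above tells RD-Agent to discard that section. A minimal sketch of the idea (the tag format is an assumption about the model's output, not RD-Agent's exact implementation):

```python
import re

def remove_think(text):
    # Strip a <think>...</think> block as emitted by some reasoning
    # models (assumed format; actual model output may differ).
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)

print(remove_think("<think>Let me reason step by step...</think>The answer is 42."))
# The answer is 42.
```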
For more details on LiteLLM requirements, refer to the official LiteLLM documentation.
Configuration Example 2: Azure OpenAI Setup¶
If you’re using Azure OpenAI, below is a working example using the Python SDK, following the official LiteLLM Azure OpenAI documentation:
```python
from litellm import completion
import os

# Set Azure OpenAI environment variables
os.environ["AZURE_API_KEY"] = "<your_azure_api_key>"
os.environ["AZURE_API_BASE"] = "<your_azure_api_base>"
os.environ["AZURE_API_VERSION"] = "<version>"

# Make a request to your Azure deployment
response = completion(
    "azure/<your_deployment_name>",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)
```
To align with the Python SDK example above, set CHAT_MODEL to the model name used in the completion call and write the corresponding os.environ variables into your local .env file as follows:
```sh
cat << EOF > .env
# CHAT MODEL: Azure OpenAI via LiteLLM
CHAT_MODEL=azure/<your_deployment_name>
AZURE_API_BASE=https://<your_azure_base>.openai.azure.com/
AZURE_API_KEY=<your_azure_api_key>
AZURE_API_VERSION=<version>

# EMBEDDING MODEL: Using SiliconFlow via litellm_proxy
EMBEDDING_MODEL=litellm_proxy/BAAI/bge-large-en-v1.5
LITELLM_PROXY_API_KEY=<your_siliconflow_api_key>
LITELLM_PROXY_API_BASE=https://api.siliconflow.cn/v1
EOF
```
This configuration allows you to call Azure OpenAI through LiteLLM while using an external provider (e.g., SiliconFlow) for embeddings.
If your Azure OpenAI API key supports an embedding model, you can refer to the following configuration example.
```sh
cat << EOF > .env
EMBEDDING_MODEL=azure/<Model deployment supporting embedding>
CHAT_MODEL=azure/<your deployment name>
AZURE_API_KEY=<replace_with_your_azure_api_key>
AZURE_API_BASE=<your_unified_api_base>
AZURE_API_VERSION=<azure api version>
EOF
```
Execution Environment Configuration¶
Coder Environment Configuration (Docker vs. Conda)
RD-Agent’s coders can execute code in different environments. You can control this behavior by setting environment variables in your .env file. This is useful for switching between a local Conda environment and an isolated Docker container.
To configure the environment, add the corresponding line to your .env file based on the scenario you are running.
For the Model (Quant) Scenario:
The execution environment is determined by the MODEL_COSTEER_ENV_TYPE variable, which is read from rdagent/components/coder/model_coder/conf.py.
To use Docker (recommended for isolated execution):
MODEL_COSTEER_ENV_TYPE=docker
To use Conda (for running in a local Conda environment):
MODEL_COSTEER_ENV_TYPE=conda
For the Data Science Scenario:
The execution environment is determined by the DS_CODER_COSTEER_ENV_TYPE variable, which is read from rdagent/components/coder/data_science/conf.py.
To use Docker (recommended for isolated execution):
DS_CODER_COSTEER_ENV_TYPE=docker
To use Conda (for running in a local Conda environment):
DS_CODER_COSTEER_ENV_TYPE=conda
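Conceptually, both settings follow the usual environment-variable pattern: the coder reads the variable and falls back to a default when it is unset. A hedged sketch of that pattern (illustrative only; the actual defaults live in the conf.py files referenced above):

```python
import os

def get_env_type(var_name, default="docker"):
    # Read the coder environment type, falling back to the given
    # default when the variable is unset (default is an assumption).
    value = os.environ.get(var_name, default).lower()
    if value not in {"docker", "conda"}:
        raise ValueError(f"{var_name} must be 'docker' or 'conda', got {value!r}")
    return value

os.environ["DS_CODER_COSTEER_ENV_TYPE"] = "conda"
print(get_env_type("DS_CODER_COSTEER_ENV_TYPE"))  # conda
```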
Custom Time Segment Configuration (Train / Valid / Test)¶
RD-Agent now supports user-defined time segments for training, validation,
and testing (backtesting). Users can customize these segments via environment
variables in the .env file, depending on the scenario being executed.
This feature allows greater flexibility when running experiments on different time ranges without modifying code or YAML configurations.
Fin-Factor Scenario¶
When running the fin_factor scenario, you can configure the time segments using the following environment variables. These variables are read by the Factor-related PropSettings and directly affect the execution process.
Add the following entries to your .env file as needed:
QLIB_FACTOR_TRAIN_START=<train start date, default is 2008-01-01>
QLIB_FACTOR_TRAIN_END=<train end date, default is 2014-12-31>
QLIB_FACTOR_VALID_START=<valid start date, default is 2015-01-01>
QLIB_FACTOR_VALID_END=<valid end date, default is 2016-12-31>
QLIB_FACTOR_TEST_START=<test / backtest start date, default is 2017-01-01>
QLIB_FACTOR_TEST_END=<test / backtest end date, default is 2020-12-31>
Fin-Model Scenario¶
When running the fin_model scenario, the model training, validation, and testing time segments can be configured independently via the following environment variables:
QLIB_MODEL_TRAIN_START=<train start date, default is 2008-01-01>
QLIB_MODEL_TRAIN_END=<train end date, default is 2014-12-31>
QLIB_MODEL_VALID_START=<valid start date, default is 2015-01-01>
QLIB_MODEL_VALID_END=<valid end date, default is 2016-12-31>
QLIB_MODEL_TEST_START=<test / backtest start date, default is 2017-01-01>
QLIB_MODEL_TEST_END=<test / backtest end date, default is 2020-12-31>
These settings are used during model training and evaluation and directly impact the execution workflow.
Fin-Quant Scenario¶
When running the fin_quant scenario, RD-Agent supports configuring time segments for factor, model, and quant stages simultaneously.
Note: The QLIB_QUANT_* variables are only used for front-end UI display
purposes and do not affect the actual execution process.
You may configure the following variables in your .env file:
QLIB_FACTOR_TRAIN_START=<train start date, default is 2008-01-01>
QLIB_FACTOR_TRAIN_END=<train end date, default is 2014-12-31>
QLIB_FACTOR_VALID_START=<valid start date, default is 2015-01-01>
QLIB_FACTOR_VALID_END=<valid end date, default is 2016-12-31>
QLIB_FACTOR_TEST_START=<test / backtest start date, default is 2017-01-01>
QLIB_FACTOR_TEST_END=<test / backtest end date, default is 2020-12-31>
QLIB_MODEL_TRAIN_START=<train start date, default is 2008-01-01>
QLIB_MODEL_TRAIN_END=<train end date, default is 2014-12-31>
QLIB_MODEL_VALID_START=<valid start date, default is 2015-01-01>
QLIB_MODEL_VALID_END=<valid end date, default is 2016-12-31>
QLIB_MODEL_TEST_START=<test / backtest start date, default is 2017-01-01>
QLIB_MODEL_TEST_END=<test / backtest end date, default is 2020-12-31>
QLIB_QUANT_TRAIN_START=<train start date, default is 2008-01-01>
QLIB_QUANT_TRAIN_END=<train end date, default is 2014-12-31>
QLIB_QUANT_VALID_START=<valid start date, default is 2015-01-01>
QLIB_QUANT_VALID_END=<valid end date, default is 2016-12-31>
QLIB_QUANT_TEST_START=<test / backtest start date, default is 2017-01-01>
QLIB_QUANT_TEST_END=<test / backtest end date, default is 2020-12-31>
This setup allows the front-end to display consistent segment information across different stages while keeping execution logic unchanged.
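All of the segment variables above follow the same pattern: an optional override with a documented default. A small sketch of how such defaults can be resolved (illustrative only; RD-Agent's own settings classes handle this internally):

```python
import os

# Defaults as documented above for the factor segments.
DEFAULTS = {
    "QLIB_FACTOR_TRAIN_START": "2008-01-01",
    "QLIB_FACTOR_TRAIN_END": "2014-12-31",
    "QLIB_FACTOR_VALID_START": "2015-01-01",
    "QLIB_FACTOR_VALID_END": "2016-12-31",
    "QLIB_FACTOR_TEST_START": "2017-01-01",
    "QLIB_FACTOR_TEST_END": "2020-12-31",
}

def resolve_segments():
    # Environment variables override the documented defaults.
    return {k: os.environ.get(k, v) for k, v in DEFAULTS.items()}

os.environ["QLIB_FACTOR_TEST_END"] = "2023-12-31"  # override one segment
segs = resolve_segments()
print(segs["QLIB_FACTOR_TEST_END"])     # 2023-12-31 (overridden)
print(segs["QLIB_FACTOR_TRAIN_START"])  # 2008-01-01 (default)
```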
Configuration (deprecated)¶
To run the application, please create a .env file in the root directory of the project and add environment variables according to your requirements.
If you are using this deprecated version, you should set BACKEND to rdagent.oai.backend.DeprecBackend.
BACKEND=rdagent.oai.backend.DeprecBackend
Here are some other configuration options that you can use:
OpenAI API¶
Here is a standard configuration for the user using the OpenAI API.
```
OPENAI_API_KEY=<your_api_key>
EMBEDDING_MODEL=text-embedding-3-small
CHAT_MODEL=gpt-4-turbo
```
Azure OpenAI¶
The following environment variables are standard configuration options for users of the Azure OpenAI API.
```
USE_AZURE=True
EMBEDDING_OPENAI_API_KEY=<replace_with_your_azure_openai_api_key>
EMBEDDING_AZURE_API_BASE=  # The endpoint for the Azure OpenAI API.
EMBEDDING_AZURE_API_VERSION=  # The version of the Azure OpenAI API.
EMBEDDING_MODEL=text-embedding-3-small
CHAT_OPENAI_API_KEY=<replace_with_your_azure_openai_api_key>
CHAT_AZURE_API_BASE=  # The endpoint for the Azure OpenAI API.
CHAT_AZURE_API_VERSION=  # The version of the Azure OpenAI API.
CHAT_MODEL=  # The model name of the Azure OpenAI API.
```
Use Azure Token Provider¶
If you are using the Azure token provider, set the CHAT_USE_AZURE_TOKEN_PROVIDER and EMBEDDING_USE_AZURE_TOKEN_PROVIDER environment variables to True, then use the environment variables provided in the Azure Configuration section.
☁️ Azure Configuration

Install Azure CLI:

```sh
curl -L https://aka.ms/InstallAzureCli | bash
```

Log in to Azure:

```sh
az login --use-device-code
```

Exit and re-login to your environment (this step may not be necessary).
Configuration List¶
OpenAI API Setting
| Configuration Option | Meaning | Default Value |
|---|---|---|
| OPENAI_API_KEY | API key for both chat and embedding models | None |
| EMBEDDING_OPENAI_API_KEY | Use a different API key for embedding model | None |
| CHAT_OPENAI_API_KEY | Set to use a different API key for chat model | None |
| EMBEDDING_MODEL | Name of the embedding model | text-embedding-3-small |
| CHAT_MODEL | Name of the chat model | gpt-4-turbo |
| EMBEDDING_AZURE_API_BASE | Base URL for the Azure OpenAI API | None |
| EMBEDDING_AZURE_API_VERSION | Version of the Azure OpenAI API | None |
| CHAT_AZURE_API_BASE | Base URL for the Azure OpenAI API | None |
| CHAT_AZURE_API_VERSION | Version of the Azure OpenAI API | None |
| USE_AZURE | True if you are using Azure OpenAI | False |
| CHAT_USE_AZURE_TOKEN_PROVIDER | True if you are using an Azure Token Provider for the chat model | False |
| EMBEDDING_USE_AZURE_TOKEN_PROVIDER | True if you are using an Azure Token Provider for the embedding model | False |
Global Setting

| Configuration Option | Meaning | Default Value |
|---|---|---|
| max_retry | Maximum number of times to retry | 10 |
| retry_wait_seconds | Number of seconds to wait before retrying | 1 |
| log_trace_path | Path to log trace file | None |
| log_llm_chat_content | Flag to indicate if chat content is logged | True |
Cache Setting
| Configuration Option | Meaning | Default Value |
|---|---|---|
| dump_chat_cache | Flag to indicate if chat cache is dumped | False |
| dump_embedding_cache | Flag to indicate if embedding cache is dumped | False |
| use_chat_cache | Flag to indicate if chat cache is used | False |
| use_embedding_cache | Flag to indicate if embedding cache is used | False |
| prompt_cache_path | Path to prompt cache | ./prompt_cache.db |
| max_past_message_include | Maximum number of past messages to include | 10 |
Loading Configuration¶
For users’ convenience, we provide a CLI interface called rdagent, which automatically runs load_dotenv() to load environment variables from the .env file. However, this feature is not enabled by default for other scripts. We recommend users load the environment with the following steps:
- ⚙️ Environment Configuration
- Place the .env file in the same directory as the .env.example file. The .env.example file contains the environment variables required for users using the OpenAI API. (Please note that .env.example is an example file; .env is the one that will actually be used.)
Export each variable in the .env file:
```sh
export $(grep -v '^#' .env | xargs)
```
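The same result can be achieved from Python. Below is a minimal sketch of parsing the .env format that the shell one-liner handles, skipping blanks and comments; in practice, python-dotenv's load_dotenv() (which the rdagent CLI calls) does this more robustly:

```python
import os

def load_env(text):
    # Parse KEY=VALUE lines, skipping blanks and '#' comments,
    # mirroring: export $(grep -v '^#' .env | xargs)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        os.environ[key.strip()] = value.strip()

load_env("# comment\nCHAT_MODEL=gpt-4o\nEMBEDDING_MODEL=text-embedding-3-small\n")
print(os.environ["CHAT_MODEL"])  # gpt-4o
```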
If you want to change the default environment variables, you can refer to the above configuration and edit the .env file.