
feat: SearXNG self-hosted search + OPENAI_EMBEDDING_BASE_URL for custom embeddings#1644

Open
mareurs wants to merge 1 commit into assafelovic:main from mareurs:searxng-upstream

Conversation

@mareurs
Contributor

@mareurs mareurs commented Feb 26, 2026

Summary

Two additions for users who want to avoid paid external APIs:

1. SearXNG as a self-hosted search backend (no API key required)

GPT-Researcher already supports RETRIEVER=searx, but running SearXNG alongside required manual Docker setup. This PR adds it as a first-class --profile searxng option in docker-compose.yml.

# Start with self-hosted search (no Tavily key needed)
docker compose --profile searxng up -d

Then in .env:

RETRIEVER=searx
SEARX_URL=http://searxng:8080

What's included:

  • searxng service in docker-compose.yml under --profile searxng
  • searxng/settings.yml with JSON format enabled (required by the searx retriever) and sensible defaults for Google, Bing, and DuckDuckGo
  • RETRIEVER and SEARX_URL forwarded to the gpt-researcher service
  • Default remains RETRIEVER=tavily — no breaking change for existing users

SearXNG aggregates results from multiple engines without any per-query API costs, making it ideal for local/private deployments.
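For reference, a profile-gated service of this shape could look roughly like the following in docker-compose.yml. This is a sketch under stated assumptions, not the exact file from this PR: the image tag and host-port mapping are assumptions (the 4000→8080 mapping follows the test plan below).

```yaml
# Sketch only — the PR's actual compose entry may differ.
services:
  searxng:
    image: searxng/searxng:latest      # assumed image tag
    profiles: ["searxng"]              # started only with `docker compose --profile searxng up`
    ports:
      - "4000:8080"                    # host 4000 -> container 8080 (assumed, per the test plan)
    volumes:
      - ./searxng/settings.yml:/etc/searxng/settings.yml:ro
```

With `profiles` set, the service is skipped by a plain `docker compose up`, which is what keeps the default Tavily path unchanged for existing users.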

2. OPENAI_EMBEDDING_BASE_URL for the custom embedding provider

When running a dedicated embedding service (e.g. HuggingFace TEI, Infinity) alongside a separate LLM API, a single OPENAI_BASE_URL can't address both endpoints.

The custom embedding provider now checks OPENAI_EMBEDDING_BASE_URL first, falling back to OPENAI_BASE_URL and then the LM Studio default:

OPENAI_BASE_URL=http://localhost:8000/v1        # LLM endpoint
OPENAI_EMBEDDING_BASE_URL=http://localhost:8080/v1  # Embedding endpoint
EMBEDDING=custom:BAAI/bge-large-en-v1.5

No behaviour change when OPENAI_EMBEDDING_BASE_URL is unset.
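The lookup order described above can be sketched as a small helper. This is an illustrative reconstruction, not the PR's actual code: the function name and the LM Studio default URL shown here are assumptions.

```python
import os

# LM Studio's conventional local endpoint (assumed default; the PR may differ).
LM_STUDIO_DEFAULT = "http://localhost:1234/v1"

def resolve_embedding_base_url(env=None):
    """Hypothetical helper mirroring the fallback chain described above:
    OPENAI_EMBEDDING_BASE_URL -> OPENAI_BASE_URL -> LM Studio default."""
    if env is None:
        env = os.environ
    return (
        env.get("OPENAI_EMBEDDING_BASE_URL")
        or env.get("OPENAI_BASE_URL")
        or LM_STUDIO_DEFAULT
    )
```

When only `OPENAI_BASE_URL` is set, the helper returns it unchanged, which is the "no behaviour change" guarantee above.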

Test plan

  • docker compose --profile searxng up -d starts SearXNG on port 4000
  • curl "http://localhost:4000/search?q=test&format=json" returns JSON results
  • RETRIEVER=searx SEARX_URL=http://localhost:4000 produces research results
  • RETRIEVER=tavily (default) still works — no regression
  • Setting OPENAI_EMBEDDING_BASE_URL routes custom embeddings to the specified endpoint
  • Omitting OPENAI_EMBEDDING_BASE_URL falls back to OPENAI_BASE_URL (no regression)

🤖 Generated with Claude Code

@assafelovic
Owner

@mareurs can you please add a file under the docs directory explaining how to use this?
