u/Durovilla — reddlx

▲ 0 r/cursor

LLMs are trained on a snapshot of the web: APIs change, libraries update, and models confidently generate code that no longer works. The problem gets worse with newer or more niche devtools.

Some platforms are solving this by publishing llms.txt - AI-friendly versions of their docs that are always up-to-date. The catch is that there's no good way for agents to search across or within them.

So I built Statespace, the first search engine for llms.txt sites. It fetches relevant links from millions of pages, leaving the context retrieval up to your agent. And it's 100% free to use via web, SDK, MCP, or CLI.

You can run plain queries to search across all docs:

mcp server setup
vector database embeddings
oauth2 token refresh

Or scope your queries to a specific site with site: query

stripe: webhook verification
mistral.ai: function calling
docs.supabase.com: edge functions auth

Quotes work like Google for exact phrases:

"context window limit"
vector database "semantic search"
stripe: "webhook signature verification"
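
If you want to build or validate these queries in your own agent code, the three forms above (plain terms, a leading site: scope, and quoted exact phrases) can be split apart with a few lines of Python. This is only a minimal sketch and not Statespace's actual parser or SDK: `ParsedQuery` and `parse_query` are hypothetical names, and it assumes a scope is a leading `name:` followed by a space.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedQuery:
    """Hypothetical container for the parts of a search query."""
    site: Optional[str] = None                         # e.g. "stripe" or "docs.supabase.com"
    phrases: List[str] = field(default_factory=list)   # exact phrases taken from quotes
    terms: List[str] = field(default_factory=list)     # remaining free-text terms


# Assumption: a scope is a leading "name:" followed by whitespace.
_SITE_RE = re.compile(r'^([A-Za-z0-9.-]+):\s+(.*)$')
# Anything between a pair of double quotes is treated as an exact phrase.
_PHRASE_RE = re.compile(r'"([^"]+)"')


def parse_query(raw: str) -> ParsedQuery:
    """Split a raw query into an optional site scope, quoted phrases, and plain terms."""
    rest = raw.strip()
    site = None
    match = _SITE_RE.match(rest)
    if match:
        site, rest = match.group(1).lower(), match.group(2)
    phrases = _PHRASE_RE.findall(rest)
    # Remove the quoted phrases, then split what is left on whitespace.
    terms = _PHRASE_RE.sub(" ", rest).split()
    return ParsedQuery(site=site, phrases=phrases, terms=terms)


if __name__ == "__main__":
    for q in ['oauth2 token refresh',
              'docs.supabase.com: edge functions auth',
              'stripe: "webhook signature verification"']:
        print(parse_query(q))
```

The parsed pieces are only useful on the client side, for example to check that an agent produced a well-formed query before sending the raw string to the search endpoint.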

u/Durovilla — 14 days ago

▲ 1 r/mcp

LLMs are trained on a snapshot of the web: APIs change, libraries update, and models confidently generate code that no longer works. The problem gets worse with newer or more niche devtools.

You can run plain queries to search across all docs:

mcp server setup
vector database embeddings
oauth2 token refresh

Or scope your queries to a specific site with site: query

stripe: webhook verification
mistral.ai: function calling
docs.supabase.com: edge functions auth

Quotes work like Google for exact phrases:

"context window limit"
vector database "semantic search"
stripe: "webhook signature verification"

---

Search for humans (website): statespace.com
Search for agents (CLI, SDK, and MCP): https://github.com/statespace-tech/statespace

u/Durovilla — 14 days ago

▲ 1 r/Rag

LLMs are trained on a snapshot of the web: APIs change, libraries update, and models confidently generate code that no longer works. The problem gets worse with newer or more niche tools.

Some developer platforms (e.g. Mintlify, Vercel, Auth0) are solving this by publishing llms.txt - AI-friendly versions of their docs that are always up-to-date. The catch is that there's no good way for agents to run RAG across them.

So I built Statespace, the first search engine for llms.txt docs and sites. And it's free to use via web, SDK, MCP, or CLI.

You can run plain queries to search across all llms.txt sites:

mcp server setup
vector database embeddings
oauth2 token refresh

Or scope your queries to a specific site with site: query

stripe: webhook verification
mistral.ai: function calling
docs.supabase.com: edge functions auth

Quotes work like Google for exact phrases:

"context window limit"
vector database "semantic search"
stripe: "webhook signature verification"

Search for humans (website): statespace.com
Search for agents (CLI, SDK, and MCP): https://github.com/statespace-tech/statespace

u/Durovilla — 15 days ago

▲ 0 r/SEO

I've been feeling more and more that llms.txt skeptics aren't wrong: AI agents rarely reach for llms.txt on their own, and even Google has said it has no plans to support it (although that may be because they see it as a threat to their search monopoly).

At this point, "add llms.txt to rank better in AI search" is mostly wishful thinking. Though it doesn't stop decision makers from nagging people about it.

But in my line of work (developer tools), llms.txt files are quietly becoming a standard. This is precisely because coding agents need the latest developer docs, straight from the source, but reading raw HTML generally consumes massive amounts of tokens. Companies like Vercel, Auth0, Cursor, Anthropic, and even OpenAI itself already have some of the most comprehensive llms.txt sites out there. And their (already existing) customers are using them.

IIRC, this was Jeremy Howard's original use case when he proposed the standard in 2024. Not SEO or AI search, but rather agents reading docs.

I'd be curious to get your thoughts and experience on this.

u/Durovilla — 15 days ago

▲ 2 r/LangChain

LLMs are trained on a snapshot of the web: APIs change, libraries update, and models confidently generate code that no longer works. The problem gets worse with newer or more niche tools.

So I built Statespace, the first search engine for llms.txt docs and sites. And it's free to use via web, SDK, MCP, or CLI.

You can run plain queries to search across all llms.txt sites:

mcp server setup
vector database embeddings
oauth2 token refresh

Or scope your queries to a specific site with site: query

stripe: webhook verification
mistral.ai: function calling
docs.supabase.com: edge functions auth

Quotes work like Google for exact phrases:

"context window limit"
vector database "semantic search"
stripe: "webhook signature verification"

Search for humans (website): statespace.com
Search for agents (CLI, SDK, and MCP): https://github.com/statespace-tech/statespace

Looking for beta testers and feedback!

u/Durovilla — 15 days ago

▲ 6 r/aiagents

More companies, especially devtools, are publishing AI-friendly versions of their websites and docs with llms.txt.

However, there's still no good way for developers or AI agents to search across these sites. So I built Statespace, the first search engine for llms.txt sites - and it's 100% free.

You can run plain queries to search across all llms.txt sites:

mcp server setup
vector database embeddings
oauth2 token refresh
rate limiting middleware

Or scope your queries to a specific site with site: query

stripe: webhook verification
mistral.ai: function calling
docs.supabase.com: edge functions auth

Quotes work like Google for exact phrases:

"context window limit"
vector database "semantic search"
stripe: "webhook signature verification"

Search from statespace.com, or use with your agent via CLI, SDK, MCP, or Skill.

This is still a work in progress, as there are plenty of llms.txt files out there I haven't crawled yet. Looking for beta testers and feedback!

---

GitHub: https://github.com/statespace-tech/statespace

Discord: https://discord.com/invite/rRyM7zkZTf

u/Durovilla — 16 days ago

▲ 3 r/LLMDevs

More companies, especially devtools, are publishing AI-friendly versions of their websites and docs with llms.txt.

However, there's still no good way for developers or AI agents to search across these sites. So I built Statespace, the first search engine for llms.txt sites - and it's 100% free.

You can run plain queries to search across all llms.txt sites:

mcp server setup
vector database embeddings
oauth2 token refresh
rate limiting middleware

Or scope your queries to a specific site with site: query:

stripe: webhook verification
mistral.ai: function calling
docs.supabase.com: edge functions auth

Quotes work like Google for exact phrases:

"context window limit"
vector database "semantic search"
stripe: "webhook signature verification"

Search from statespace.com, or use with your agent via CLI, SDK, MCP, or Skill.

This is still a work in progress, as there are plenty of llms.txt files out there I haven't crawled yet. Looking for beta testers and feedback!

---

GitHub: https://github.com/statespace-tech/statespace

Discord: https://discord.com/invite/rRyM7zkZTf

u/Durovilla — 16 days ago