We built a retrieval system that can do analyst-style SEC filing research in seconds. Need advice from finance and RAG builders.

Hi everyone,

Looking for advice from people who either:
- work with SEC filings professionally
- build AI/retrieval systems for finance
- have experience with tools like AlphaSense, Hebbia, Deep Research, internal RAG stacks, etc.

My co-founder and I come from information retrieval backgrounds (drug discovery and government/legal information systems).

Over the last 7 months we’ve been exploring a different retrieval architecture based on a simple idea:

Instead of forcing an agent to repeatedly rediscover the same relationships at query time, can more of that work be done once at ingestion and then reused?

We designed a fairly powerful system with a complex agentic ingestion pipeline that automatically restructures and logically connects information into a graph-like form (not the classical knowledge graph approach and not GraphRAG; I have worked with both and am aware of their issues 😵‍💫).
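As a toy illustration of the general "do the work once at ingestion, reuse it at query time" idea (this is not the system described above, and the company names, the supplier relationship, and the function names are all made up for the example), a minimal Python sketch could look like this:

```python
from collections import defaultdict

# Hypothetical toy data: (company, supplier named in its 10-K).
FILINGS = [
    ("AcmeCorp", "ChipCo"),
    ("BetaInc", "ChipCo"),
    ("BetaInc", "SteelWorks"),
]

def build_index(filings):
    """Ingestion step: walk every filing once and store the reverse links."""
    customers_of = defaultdict(set)
    for company, supplier in filings:
        customers_of[supplier].add(company)
    return customers_of

def query_customers(index, supplier):
    """Query step: a dictionary lookup, with no re-reading of the filings."""
    return sorted(index.get(supplier, set()))

index = build_index(FILINGS)
print(query_customers(index, "ChipCo"))
```

The relationship "who depends on this supplier" is computed once in `build_index`, so every later query pays only for a lookup instead of rediscovering the link across documents.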

To test the system we chose densely connected data and processed the latest 10-K filings of S&P 500 companies.

We were quite surprised by how much faster and cheaper retrieval can be when you shift compute to ingestion and use a different information structure.
Queries that would normally require deep-research-style retrieval taking 10, 15, or 20+ minutes now take a few seconds (<5).

Now we’re looking for realistic, complex queries that would impress people building financial AI agents.

If you are building AI agents in finance or using AI tools to run research across S&P 500 company filings such as 10-Ks, 8-Ks and 10-Qs, we would really appreciate it if you could share queries that these systems usually struggle with.

Thank you.

reddit.com
u/Ancient-Estimate-346 — 14 days ago
▲ 4 r/Rag

▲ 0 r/CFO+1 crossposts

AI in Finance - Advice

Hi all,
I’m hoping this is an appropriate question for the community.

My co-founder and I come from information retrieval and knowledge systems rather than finance, and we’re trying to understand how professionals actually work with SEC filings today, because we have developed a retrieval system and are testing it in this domain.

We want to run test queries against the 10-K filings of S&P 500 companies (our first batch), and it would be great to learn:

- What are the most typical, and the most complex, questions we could ask of this data?
- For those who have tried using AI tools for this kind of work: what questions does AI consistently struggle with?

Thanks a lot!

reddit.com
u/Ancient-Estimate-346 — 14 days ago

We’ve been working on a retrieval system for teams building AI agents in finance.

(mainly around workflows that need to do in-depth web research).

A few patterns we keep running into:

- cost per query gets high quickly with deep research flows

- latency makes it hard to use in real workflows (as opposed to quick, superficial search)

- bloated context windows

For anyone here who is running AI agents in production or using deep research APIs regularly:

- what has your experience been with using them to automate financial research tasks?

Would really appreciate any examples of a better approach, or any other challenges you think we are likely to run into.

reddit.com
u/Ancient-Estimate-346 — 2 months ago
