r/n8n
Help: How to scrape dynamic websites using n8n
Hi everyone,
I’m working with n8n and trying to scrape data from dynamic websites (JavaScript-rendered pages), but I’m running into some limitations.
For example, I’m trying to extract content from pages like this:
http://www.iort.gov.tn/WD120AWP/WD120Awp.exe/CTX_9648-63-ijOAVgefuu/CodesJuridiques/SYNC_218121892
The issue is that:
- The page content is loaded dynamically (not fully available in the initial HTML)
- The URL changes on every visit (it looks session-based or generated), so I can't hard-code it
- Using the HTTP Request node in n8n doesn’t return the actual rendered content
- I suspect it relies on JavaScript execution or internal requests
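To make the URL problem concrete, here's a rough Python sketch that splits the example URL into the part that seems to change and the part that might be stable. I'm assuming the `CTX_...` path segment is the per-session piece; I haven't confirmed that, and I don't know whether the trailing `SYNC_...` segment also changes. The function name `split_session_url` is just something I made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Assumption: the "CTX_..." path segment is the session-specific part.
CTX_SEGMENT = re.compile(r"^CTX_[^/]+$")


def split_session_url(url: str) -> dict:
    """Separate the suspected session segment from the rest of the path."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # First segment that looks like a session context, or None if absent.
    session = next((s for s in segments if CTX_SEGMENT.match(s)), None)
    stable = [s for s in segments if s != session]
    return {
        "host": parts.netloc,
        "session": session,
        "stable_path": "/".join(stable),
    }


if __name__ == "__main__":
    example = (
        "http://www.iort.gov.tn/WD120AWP/WD120Awp.exe/"
        "CTX_9648-63-ijOAVgefuu/CodesJuridiques/SYNC_218121892"
    )
    print(split_session_url(example))
```

If the assumption holds, a workflow would need to fetch a fresh `CTX_...` value at the start of each run instead of reusing a saved link.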
What I’ve tried so far:
- Basic HTTP Request node → only returns partial/empty HTML
- Comparing page source vs inspected DOM → content mismatch
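In case it helps, the source-vs-DOM comparison can be sketched roughly like this in Python: take the raw HTML that a plain HTTP request returns, strip out scripts and styles, and check which strings I can see in the browser are missing from it. The helper names (`visible_text`, `missing_from_source`) and the sample strings are hypothetical, not taken from the real page.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(raw_html: str) -> str:
    """Return the text a reader would see in the un-rendered HTML."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


def missing_from_source(raw_html: str, expected_snippets: list[str]) -> list[str]:
    """List the snippets (copied from the rendered page) absent from the raw HTML."""
    text = visible_text(raw_html)
    return [s for s in expected_snippets if s not in text]


if __name__ == "__main__":
    # Made-up HTML standing in for an HTTP Request node response.
    raw = "<html><body><p>Header</p><div id='app'></div><script>var t='Article 1';</script></body></html>"
    print(missing_from_source(raw, ["Header", "Article 1"]))
```

Anything this reports as missing is content that only shows up after JavaScript runs or after a follow-up request, which matches what I'm seeing.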
My questions:
- What’s the best way to scrape this kind of dynamic website using n8n?
- Is there a way to integrate a headless browser (like Puppeteer or Playwright) with n8n?
- How do you handle scraping when URLs are dynamic/session-based like this?
- Should I try to replicate the underlying API calls from the Network tab instead?
- Any recommended workflow architecture for handling this reliably?
I’d really appreciate any tips, best practices, or examples 🙏
Thanks!
u/Tricky_Literature397 — 22 hours ago