u/automata_n8n

Just published an Avito.ma (Morocco) Scraper: Bypassing brittle CSS by tapping directly into Next.js JSON payloads


Hey, fellow scrapers! 👋

I’ve been working on a data pipeline for the MENA region and just published my first major Actor: Avito Maroc Scraper. Avito is the absolute giant of classifieds in Morocco (cars, real estate, electronics, jobs), but its UI structure changes frequently, making traditional CSS selectors a nightmare to maintain.

The Technical Approach: Instead of fighting the DOM, I built this scraper to intercept the underlying `__NEXT_DATA__` JSON payload that Next.js embeds in every server-rendered page.

  • Zero CSS Reliance: If the data is on the page, the actor gets it directly from the backend state. It’s incredibly stable.
  • Dynamic Attribute Parsing: Avito has vastly different attributes per category (e.g., Mileage and Transmission for cars vs. Rooms and Square Meters for apartments). The actor dynamically maps these into clean, structured JSON objects.
  • HD Images: It bypasses the compressed UI thumbnails and extracts the full high-res image URLs.

AI-Ready Output: I specifically designed the output to be ingested into LLM context windows and RAG pipelines. It spits out pristine, standardized JSON that you can immediately pipe into your vector databases or autonomous agents.

Quick Note on Proxies: Avito has some pretty aggressive anti-bot protection. While datacenter proxies might work for tiny runs, you really need Apify Residential Proxies if you want to scale this for thousands of items.

👇 How to get started:

Don't want to deal with code or infrastructure? You can run it directly from the cloud and download the data in Excel/CSV/JSON. Just paste an Avito link and click start: 👉 https://apify.com/scraper_guru/avito-maroc-scraper

I’m on a mission to build out the "Data Mine" for the MENA region. I'd love your feedback!

u/automata_n8n — 1 day ago

We deployed a 24/7 Dutch AI chatbot, here's what happened to their support tickets

Customer Support AI Chatbot Deployment

Built a support chatbot for a Dutch loyalty program. The kind where customers ask "how do I save points" and "what's my balance" a hundred times a day. Every question needed a human agent, there was no after-hours support, and when agents did answer, quality varied wildly depending on who picked up.

What we built: a RAG-powered chatbot that answers in native Dutch, pulling from a single PDF stored in Google Drive. That last part is the bit I like most. The entire knowledge base is one PDF. When content needs updating, the client edits the PDF, an n8n workflow detects the change, re-chunks it, generates new embeddings with OpenAI, and re-indexes into Pinecone. No CMS, no migration scripts, no developer needed.
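The re-chunking step in that pipeline is the only part with any logic to it, so here's a rough sketch of what the n8n workflow would do after detecting a PDF change. This is my own minimal illustration, not the actual workflow: the chunk size, overlap, and the commented-out OpenAI/Pinecone calls are all assumptions:

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split extracted PDF text into overlapping windows for embedding.

    Overlap keeps sentences that straddle a boundary retrievable from
    both neighboring chunks.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

# After chunking, each chunk would be embedded and upserted, roughly:
#   vectors = openai_client.embeddings.create(model="text-embedding-3-small",
#                                             input=chunks)
#   index.upsert([(f"doc-{i}", v.embedding) for i, v in enumerate(vectors.data)])
# (model name and ID scheme are placeholders)

chunks = chunk_text("x" * 2000)
print(len(chunks))
```

The nice property of re-indexing the whole PDF on every change is that you never have to diff chunks; with a single document the cost is negligible.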

Every response comes back with a confidence score and an escalation flag. If the bot isn't sure or the topic is sensitive, it hands off to a human with contact info instead of guessing. Felt important to build that in from the start.
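The escalation gate is simple enough to sketch. Something along these lines, where the threshold, the sensitive-topic list, the Dutch hand-off message, and the contact address are all hypothetical values I've made up for illustration:

```python
# Topics that always go to a human, regardless of confidence (hypothetical list)
SENSITIVE_TOPICS = {"billing dispute", "account deletion", "legal"}

def route_response(answer: str, confidence: float, topic: str,
                   threshold: float = 0.7) -> dict:
    """Return the bot answer, or an escalation payload when unsure or sensitive."""
    if confidence < threshold or topic in SENSITIVE_TOPICS:
        return {
            "escalate": True,
            "message": "Ik verbind je door met een medewerker.",  # hand-off text
            "contact": "support@example.com",  # placeholder contact info
        }
    return {"escalate": False, "message": answer, "confidence": confidence}

print(route_response("You have 120 points.", 0.92, "balance"))
print(route_response("Maybe?", 0.41, "balance"))
```

Checking the sensitive-topic list before trusting a high confidence score matters: the model can be very confident and still be the wrong party to answer a legal question.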

Stack: Next.js frontend, n8n for all backend orchestration (no traditional server), GPT-4.1-mini, Pinecone, Supabase.

Full case study: https://www.ayautomate.com/case-studies/customer-support-automation

For anyone who's built something similar, how do you handle the knowledge base updates? The single-PDF approach works but I'm curious if there's something better for larger docs.

u/automata_n8n — 3 days ago