r/bigquery

▲ 6 r/bigquery+2 crossposts

BigQuery - larger dataset issue

Has anyone had an issue when trying to fetch 20k+ records from BigQuery to a Postgres DB? Everything works fine if I keep it under 10k, using Table Input + SQL, but as soon as I try more records the pipeline fails with an odd Java error message. Ultimately, I am looking to move around 500k records from BQ to Postgres.
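For what it's worth, the workaround I'm experimenting with is to stream the result in pages and insert in batches instead of pulling everything into memory at once. A rough sketch, assuming google-cloud-bigquery and psycopg2 are installed; the table names, connection string, and 5k batch size are placeholders:

```python
from itertools import islice

def batched(rows, size):
    """Yield lists of up to `size` rows from any iterator."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def copy_bq_to_postgres(batch_size=5000):
    # Placeholder project/table/connection details; adjust for your setup.
    from google.cloud import bigquery
    import psycopg2
    from psycopg2.extras import execute_values

    bq = bigquery.Client()
    rows = bq.query(
        "SELECT id, created_at, payload FROM `my_project.my_dataset.my_table`"
    ).result(page_size=batch_size)  # stream pages instead of one huge fetch

    with psycopg2.connect("dbname=target") as conn, conn.cursor() as cur:
        for chunk in batched(rows, batch_size):
            execute_values(
                cur,
                "INSERT INTO my_table (id, created_at, payload) VALUES %s",
                [(r["id"], r["created_at"], r["payload"]) for r in chunk],
            )
            conn.commit()  # commit per batch so a failure loses at most one chunk
```

No idea yet whether this fixes the specific Java error, but keeping each batch small sidesteps drivers that try to buffer the whole result set.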

reddit.com
u/zadrogasauce — 4 days ago
▲ 22 r/bigquery+5 crossposts

Hey everyone,

u/pacingagency here, we’re a London-based marketing team with analytics in BigQuery and client reporting in Looker Studio.

We’ve got dashboard and modeling work coming up (project-based freelance, not full-time). We’d love to expand our talent pool so that when a build spikes or needs deep SQL + reporting chops, we can pull in someone who can actually help.

Typical asks look like:

  • Connecting BigQuery → Looker Studio (tables, views, custom SQL — sensible live vs extract choices).
  • Building client-ready dashboards (filters, clear KPIs, definitions that survive handover).
  • Helping shape a reporting layer in BigQuery when raw data isn’t chart-friendly (nested fields, attribution-style joins, sensible grain).

Concrete example: we’re shaping a lead report - reconciling leads our client sends us with behavioural data in BigQuery (starting with form submission date/time matching; moving toward stronger user-id joins when the data supports it). The report needs things like first / last touch platform, click counts tied to gclid and other ad platform click IDs where we capture them, plus session count and how many calendar days those sessions span.
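To make the grain concrete, the per-user rollup we're after looks roughly like this (a hypothetical sketch; the field names are illustrative, not our actual schema):

```python
from datetime import datetime

def session_summary(events):
    """Summarise one user's sessions.

    `events` is a list of dicts with 'session_id', 'platform', and 'ts'
    (an ISO-8601 string), in any order.  Returns first/last touch
    platform, distinct session count, and how many calendar days the
    sessions span (inclusive).
    """
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["ts"]))
    first, last = ordered[0], ordered[-1]
    span_days = (
        datetime.fromisoformat(last["ts"]).date()
        - datetime.fromisoformat(first["ts"]).date()
    ).days + 1  # inclusive: same-day sessions span 1 day
    return {
        "first_touch_platform": first["platform"],
        "last_touch_platform": last["platform"],
        "session_count": len({e["session_id"] for e in events}),
        "span_days": span_days,
    }
```

In practice this lives as a BigQuery view rather than Python, but it shows the shape of the output we need per lead.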

Requirements (strong overlap is important):

  • Hands-on BigQuery SQL: views / scheduled transforms are part of normal life for you.
  • Looker Studio: you’ve delivered real dashboards from BigQuery, not “I’ve played with it.”
  • Comfortable discussing GCP access / sharing basics (least privilege, how you’d onboard client viewers safely).

Notes:
This is freelance / as-needed. Filling out the form adds you to our pool; we’ll reach out when there’s a project that fits.

Interested? Please apply here https://form.pacing.agency/forms/designer-application-2askqd

Questions welcome in the thread!

Thanks!

u/pacingagency — 8 days ago

Is BigQuery late to the AI game?

I've used BigQuery for a few years now and this past year I've seen so many different AI tools that help with everything from text-to-SQL to actually building reports and other features.

On one hand I understand they make their bread and butter from the actual warehouse and processing, but as a user I would've liked to see more AI features integrated into the product. The new Gemini features work alright, but they feel like an afterthought: there's no way to build reports or visualizations, integrate with messaging apps, or connect your context and semantic layers.

That was one of the reasons why I joined Bruin as a Developer Advocate recently: I wanted to be involved in building the tools I wished I'd had as a data engineer. We just made our AI data analyst generally available. It connects to warehouses like BigQuery, imports the metadata of your datasets, and builds a mental map of your data. You can also connect your dbt, Airflow, Dagster, or Bruin pipeline repos to add additional context about your models.

The whole point is to have an agent that lives right inside your team and acts like a team member - from answering quick questions to preparing reports and even troubleshooting data & pipeline issues.

I was quite skeptical at first but we have dozens of clients using it and the more they use it the better the agent gets because it is self-correcting - every conversation and every correction further improves the context.

While I'm speaking about Bruin here, this is the general blueprint and framework for any organization to build themselves an AI data agent that does more than just text-to-SQL.

u/uncertainschrodinger — 3 days ago

I'm exploring Agentic BI workflows using Vertex AI and the Model Context Protocol (MCP) to let agents query BQ directly. It works great in a sandbox, but handing an LLM the keys to run potentially massive, unoptimized queries on production tables feels terrifying.

Has anyone built reliable guardrails or a bulletproof semantic layer for this, or are we all just hoping the AI doesn't accidentally scan a 10TB unpartitioned table?
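One idea we're considering is a dry-run cost gate: BigQuery can dry-run a query and report estimated bytes scanned without executing it, so the agent's SQL can be rejected before it ever touches a big table. A minimal sketch, assuming google-cloud-bigquery; the 100 GiB limit is an arbitrary placeholder:

```python
def cost_gate(sql, estimate_bytes, max_bytes=100 * 1024**3):
    """Raise if the query would scan more than `max_bytes`.

    `estimate_bytes` is a callable that dry-runs `sql` and returns the
    estimated bytes scanned -- injected so the gate is testable offline.
    """
    scanned = estimate_bytes(sql)
    if scanned > max_bytes:
        raise ValueError(
            f"query would scan {scanned / 1024**3:.1f} GiB, "
            f"over the {max_bytes / 1024**3:.0f} GiB limit"
        )
    return True

def bq_estimator(client):
    """Build an estimator from a google.cloud.bigquery Client."""
    from google.cloud import bigquery

    def estimate(sql):
        cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
        job = client.query(sql, job_config=cfg)  # dry run: nothing executes
        return job.total_bytes_processed

    return estimate
```

As a backstop, BigQuery also supports `maximum_bytes_billed` on the job config, which makes the service itself fail queries over the cap, though none of this replaces a real semantic layer.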

u/netcommah — 9 days ago