r/mavenanalytics

What hiring managers actually look for in a data portfolio (and what they ignore)

I've been on both sides of this one... trying to get hired, and trying to find great talent. So I've got some opinions here 😄

There always seems to be a gap between what candidates think matters and what actually gets attention on the employer side.

What doesn't move the needle as much as people think...

-- The number of projects. Three strong projects beat ten mediocre ones every time. I also always tell people that a few **industry-specific** projects will move the needle a lot more for you, as long as you know which type of role you're going for. Think about it... hiring managers aren't posting jobs for a "generic data analyst". They want someone who can solve their specific problems, in their function and their industry. Lean into your domain expertise and flaunt it. That's my #1 tip.

-- Flashy visuals. A beautiful dashboard will get you some attention. It doesn't hurt. But if that dashboard doesn't answer a real business question, it won't HOLD the attention or inspire follow-up interest. Business skills matter more than design skills (design doesn't hurt, just prioritize accordingly).

-- Using the buzziest tools. Nobody gets hired because they crammed in a neural network where a pivot table would do. I remember a former CEO saying "John, do some data science on this"... which I laughed at because the problem could be solved with third-grade math. The tool matters less than getting the job done well. That's the real skill.

What actually matters, in my experience...

-- Evidence that you can frame a problem. The best portfolios start with a real question about a business function, not "I analyzed this dataset." What were you trying to find out? What did you find? Why does it matter? What action should be taken as a result?

-- Messy, real-world data. If every dataset in your portfolio was already clean and structured, it raises questions. Show that you can handle real data. Personally, I always recommend you put the business info up front and include your data cleaning process AFTER it. That's because data cleaning is boring. No one will get sucked in by it. Hook them with the business problem, then once they are interested in you, NOW it's time to dazzle them with more technical work like code and data cleaning.

-- Clear communication. Can someone non-technical look at your project and understand what you found and why they should care? If the answer is no, the analysis doesn't matter. Also, not everyone in your audience is a data pro. Think HR team members, external recruiters, maybe a business partner like a marketing leader. You need to get through these gatekeepers too. Poor communication skills are a red flag, and they really act like a barrier to getting hired. Lots of data pros skip this, but you need to build it like any other muscle.

At its core, a portfolio is really a communication and marketing exercise, not a technical one. Do you know who you are trying to impress and what you are trying to show them? If not, that's where you should start. Then figure out what to put in front of them.

What's one thing you did, or wish you had done differently, with your portfolio?

(I'll share my biggest mistake after we get two good comments from other folks)

u/johnthedataguy — 1 day ago

▲ 13 r/mavenanalytics+2 crossposts

Most people learn SELECT, WHERE, GROUP BY, and call it done. Then they hit a problem that GROUP BY can't solve — and that's usually when window functions finally click.

Here's the short version of what they do:

They let you perform calculations across a set of rows related to the current row, without collapsing the data into groups. So you can calculate a running total, rank records, or compare each row to a previous one...

...all while keeping every row intact.
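To make the "no collapsing" point concrete, here's a minimal sketch using Python's built-in sqlite3 module. The employees table, names, and salaries are made up for illustration, and it assumes your Python ships with SQLite 3.25 or newer (the first release with window functions).

```python
import sqlite3

# Hypothetical table and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", 50),
        ("Ben", "Sales", 70),
        ("Cy", "Ops", 40),
        ("Di", "Ops", 50),
        ("Ed", "Ops", 60),
    ],
)

# GROUP BY collapses each department into one summary row.
grouped = conn.execute(
    "SELECT dept, AVG(salary) FROM employees GROUP BY dept ORDER BY dept"
).fetchall()

# The window version keeps all five rows, so each salary can be
# compared against its own department's average on the same row.
windowed = conn.execute(
    """
    SELECT name, dept, salary,
           salary - AVG(salary) OVER (PARTITION BY dept) AS diff_from_dept_avg
    FROM employees
    ORDER BY dept, name
    """
).fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(windowed), "rows from the window query")
for row in windowed:
    print(row)
```

The GROUP BY query returns one row per department, while the window query returns one row per employee with the department average already applied.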

The functions you'll use most often:

ROW_NUMBER(): assigns a unique sequential number to each row within a partition. Great for deduplication.

RANK() / DENSE_RANK(): similar, but they handle ties differently. Tied rows share a rank; RANK() then skips the next numbers, while DENSE_RANK() does not. Useful for leaderboards or top-N problems.

LAG() / LEAD(): pulls a value from the previous or next row. The fastest way to do period-over-period comparisons.

SUM() / AVG() OVER(): running totals and moving averages without subqueries.
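Here's a rough sketch of all four in one place, again run through Python's sqlite3 so it works without a database server. The sales table, rep names, and amounts are hypothetical, and the "2-month" moving average window is just one choice of frame. It assumes SQLite 3.25+.

```python
import sqlite3

# Hypothetical table and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Ana", "2024-01", 100),
        ("Ana", "2024-02", 120),
        ("Ana", "2024-03", 120),
        ("Ben", "2024-01", 90),
        ("Ben", "2024-02", 150),
    ],
)

# ROW_NUMBER(): keep only the most recent row per rep (the dedup pattern).
latest = conn.execute(
    """
    SELECT rep, month, amount
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY rep ORDER BY month DESC) AS rn
        FROM sales
    ) AS numbered
    WHERE rn = 1
    ORDER BY rep
    """
).fetchall()

# RANK() vs DENSE_RANK(): the two 120s tie; only RANK() leaves a gap after them.
ranks = conn.execute(
    """
    SELECT amount,
           RANK()       OVER (ORDER BY amount DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rnk
    FROM sales
    ORDER BY amount DESC
    """
).fetchall()

# LAG(): month-over-month change per rep (NULL for each rep's first month).
changes = conn.execute(
    """
    SELECT rep, month,
           amount - LAG(amount) OVER (PARTITION BY rep ORDER BY month) AS change
    FROM sales
    ORDER BY rep, month
    """
).fetchall()

# SUM() / AVG() OVER(): running total and a 2-month moving average per rep.
running = conn.execute(
    """
    SELECT rep, month,
           SUM(amount) OVER (PARTITION BY rep ORDER BY month) AS running_total,
           AVG(amount) OVER (
               PARTITION BY rep ORDER BY month
               ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM sales
    ORDER BY rep, month
    """
).fetchall()

for name, rows in [("latest", latest), ("ranks", ranks),
                   ("changes", changes), ("running", running)]:
    print(name, rows)
```

Note that the dedup pattern still needs a wrapping query, because in standard SQL you can't filter on a window function directly in WHERE.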

The syntax that trips people up at first is the OVER() clause — that's where you define the window. PARTITION BY works like GROUP BY (but doesn't collapse rows), and ORDER BY sets the order within each partition.
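One way to see what each piece of OVER() does is to run the same SUM() with three different windows side by side. This is only a sketch on a made-up sales table (region, month, and amount are hypothetical), run through Python's sqlite3 and assuming SQLite 3.25+.

```python
import sqlite3

# Hypothetical table and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "2024-01", 100),
        ("East", "2024-02", 150),
        ("West", "2024-01", 200),
        ("West", "2024-02", 75),
    ],
)

rows = conn.execute(
    """
    SELECT region, month, amount,
           -- Empty OVER(): the window is the whole result set.
           SUM(amount) OVER () AS grand_total,
           -- PARTITION BY: one window per region, like GROUP BY without collapsing.
           SUM(amount) OVER (PARTITION BY region) AS region_total,
           -- Adding ORDER BY: the sum accumulates in month order within each region.
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS region_running
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in rows:
    print(row)
```

The last column restarts at each new region, which is the "window" resetting at every partition boundary.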

Once this clicks, you'll start seeing problems differently. A lot of things that used to require subqueries or self-joins become one clean window function.
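As a small illustration of that swap, here's the same month-over-month change written both ways. The monthly table and its numbers are hypothetical, and the self-join version assumes month numbers are consecutive with no gaps. Run through Python's sqlite3, assuming SQLite 3.25+.

```python
import sqlite3

# Hypothetical table and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly (month_no INTEGER, revenue INTEGER)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?)",
    [(1, 100), (2, 130), (3, 120)],
)

# Old way: join the table to itself to line up each month with the prior one.
self_join = conn.execute(
    """
    SELECT cur.month_no, cur.revenue - prev.revenue AS change
    FROM monthly AS cur
    LEFT JOIN monthly AS prev ON prev.month_no = cur.month_no - 1
    ORDER BY cur.month_no
    """
).fetchall()

# Window way: LAG() reads the previous row directly, no join needed.
with_lag = conn.execute(
    """
    SELECT month_no, revenue - LAG(revenue) OVER (ORDER BY month_no) AS change
    FROM monthly
    ORDER BY month_no
    """
).fetchall()

print(self_join)
print(with_lag)
```

One difference to keep in mind: LAG() takes the previous row in the ordering even if a month is missing, while the join only matches an exact prior month number.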

What's your go-to window function, and what problem does it solve for you?

u/Dakota_from_Maven — 2 days ago

▲ 2 r/mavenanalytics

Hey everyone! Now that we're already almost a week into May, we want to know:

What are you working on this month?

It could be anything, like:

Finishing a Maven course you started
Building or improving a portfolio project
Practicing SQL / Python / BI tools
Experimenting more with AI in your workflow
Prepping for interviews or job applications
Or just staying consistent with your learning

A lot of people in this space are feeling the pace of change right now. New tools, new expectations, more noise.

Our take: the goal isn’t to chase everything; it’s to stay consistent, build real skills, and learn how to use AI as an advantage.

If you’re up for it, drop your May goal(s) below.

Even better if you share where you’re starting from.

We’ll be in here with you, and cheering you on!

u/Dakota_from_Maven — 7 days ago