From the Data Warehouse to GA4, AI, and Beyond: Mixpanel Paves the Way with Event-Based Analytics

Mammoth Growth Podcast | Insights From The Trenches

Welcome to the latest episode of our podcast! In this new interview led by Mammoth Growth CEO Ryan Koonce, Mixpanel SVP Product Neil Rahilly establishes Mixpanel’s value for event-based analytics. Neil makes the important point that Google Analytics (GA) is more focused on website performance and traffic analysis, while Mixpanel allows for more in-depth analysis of user behavior and complex questions. Ryan adds context, sharing that while GA can tell you what happened in different website sessions, it’s not designed to tell you who did what in those sessions. That means you can’t ask questions about particular users or cohorts in GA, since it wasn’t built to be a true cross-platform behavioral UX tool.

When Google sunsetted its popular Universal Analytics (UA) on July 1, 2023, and forced growth marketers onto Google Analytics 4 (GA4), Mixpanel took the opportunity to define a very specific niche:

  • Whereas many people found GA easy to use, oftentimes you couldn’t get the answers you needed for even the most basic questions about user engagement and conversion. GA4 has many of the same limitations with an even steeper learning curve.
  • On the other hand, while SQL is both extremely powerful and flexible, and it can give you answers to just about any question related to business intelligence (BI), it is too difficult for most people to use, and much too slow.

Mixpanel has positioned their event-based analytics solution at the sweet spot between GA and BI, offering everyone from growth marketers to product developers the opportunity to quickly get meaningful insights from their customer data. 
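
To make the event-based model concrete, here is a minimal sketch of how a funnel can be computed from a raw event stream. The event names, user IDs, and the `funnel_counts` helper are hypothetical and chosen only for illustration; this is not Mixpanel's implementation, just the general idea of counting users who complete each step in order.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event stream: (user_id, event_name, timestamp).
events = [
    ("u1", "Visit", datetime(2023, 7, 1, 9, 0)),
    ("u1", "Sign Up", datetime(2023, 7, 1, 9, 5)),
    ("u1", "Purchase", datetime(2023, 7, 2, 10, 0)),
    ("u2", "Visit", datetime(2023, 7, 1, 11, 0)),
    ("u2", "Sign Up", datetime(2023, 7, 1, 11, 30)),
    ("u3", "Purchase", datetime(2023, 7, 1, 8, 0)),  # happens before any visit
    ("u3", "Visit", datetime(2023, 7, 1, 12, 0)),
]


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order."""
    # Group events per user so each journey can be replayed separately.
    by_user = defaultdict(list)
    for user_id, name, ts in events:
        by_user[user_id].append((ts, name))

    counts = [0] * len(steps)
    for user_events in by_user.values():
        user_events.sort()  # replay the journey chronologically
        next_step = 0
        for _, name in user_events:
            # Only the next expected step advances the user through the funnel.
            if next_step < len(steps) and name == steps[next_step]:
                counts[next_step] += 1
                next_step += 1
    return dict(zip(steps, counts))


print(funnel_counts(events, ["Visit", "Sign Up", "Purchase"]))
# {'Visit': 3, 'Sign Up': 2, 'Purchase': 1}
```

Because every row carries a user ID and a timestamp, the same stream can answer "who did what" questions that session-level reporting cannot. A real funnel would also need a conversion window and identity resolution, both left out of this sketch.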

Neil concludes the conversation with a taste of Mixpanel’s AI tool, Spark. While Mixpanel has done everything they can to make event-based analytics as accessible as possible, Spark aims to reduce any remaining barrier to entry even further. Ask Spark a question about your customer data, and it will automatically build a chart to answer that question, and you can see exactly how it did it. Neil hints that Mixpanel’s next application for AI could be a tool that looks for spikes, drops, or anomalies that might indicate a broken data pipeline.

Transcript

Ryan Koonce (00:05):
Hey everybody, I'm Ryan Koonce, one of the co-founders at Mammoth Growth, and today I'm happy to have with me Neil Rahilly, the SVP of Product at Mixpanel. Hey Neil, how's it going?


Neil Rahilly (00:16):
Great, thanks for having me.


RK (00:19):
Yeah, absolutely. So one of the big things that I think people get confused about is what is analytics? So maybe you can start by chiming in on how Mixpanel thinks about what analytics is today, how that might differ from where you were, I don't know, a couple years ago, and we can talk about where you're going.


NR (00:41):
Cool. So analytics is for us a way for people to get visibility into what's going on with their users and their business and to track their progress by building metrics that describe the performance of the product and the business and to get answers to questions, to make good decisions. And really you're trying to get into this loop of defining goals that you want to see improve, being able to quantify them and see a line that actually moves when you make progress and then ship changes and you can really see what the impact of those changes was, which things were working, which things didn't work, and sometimes things that you thought would make things better made things worse. And just working in that way with the real visibility of the effects of the changes that you're making and making sort of more informed decisions gives you a big productivity lift, and you want everyone in your company to be able to self-serve, being able to do that so that they can work in that style.


RK (02:04):
We call that benchmark test and optimize. And so we have these loops and the idea is you have to have the underlying baseline data, but then what are we going to do to make it better? And then let's measure it and actually go do it. And so the thing is though is I know what we can do with Mixpanel. I think one of the things that comes up in our conversations a lot is why not just do this with GA? Why shouldn't companies just use GA to do this? Which when I say GA, I'm talking about Google Analytics.


NR (02:32):
Yeah. So I think that Google Analytics is really built to help you understand how your website is performing and how the traffic that you're driving from Google ad spend is landing on your website. I think once you get into wanting to look at more in-depth aspects of how users are really using your product and asking more complex questions about how they go through certain workflows and how well they retain, and once you start trying to dig into that sort of stuff in Google Analytics, you kind of hit limits in it. You can't go as deep. And so that's where a true event-based analytics tool is going to give you a lot more flexibility and let you answer a lot more questions.


RK (03:39):
Well, you can't follow the user through the full journey in Google Analytics. So I think when we think about it, it's not really a true cross-platform behavioral UX tool. It just tells you what happened in a session, not who did it in a session. And so I think when we think about, well, I dunno, maybe you chime in on this. I mean, look, with GA I can test, but I can't really test to a cohort or to particular users, for example. And I guess how do you guys think about that as it relates to the analytics market and other analytics tools, right? I mean, you might have people that say building dashboards in Looker is analytics. What's the difference? I mean, where you have GA on one side and Mixpanel in the middle and Looker or Sigma or Tableau on the other end.


NR (04:30):
So I think bringing up SQL BI on the other side, you've got sort of the opposite, or a different set of issues there. Ultimately SQL gives you almost unlimited ability to transform and query the data. But


RK (04:50):
Hold on, you're not going to build a funnel report in Tableau,


NR (04:53):
Right? Well, that's where I'm going with this is that it takes a lot of time and it's difficult, and so it becomes really expensive and slow. And so as a result, a lot of the time people either end up waiting a really long time or they don't get an answer. And in lots of companies, there's an analysis team that has to build and configure dashboards and queries in SQL and in BI for the actual product and marketing people to get an answer. And so we often find, we go into companies and we see, okay, the product and marketing teams are blocked, and the data science team is swamped because they're just inundated with so many questions and they can't get to them all. And actually most of the time they hate doing it. They didn't become a data scientist to tell you how many users signed up for this new feature. It should be something people can look up on their own.


(05:55):
And so you kind of have on one side SQL BI, theoretically unlimited depth, but very slow, very expensive, and you need experts. And then on the other side, you've got Google Analytics, it's sort of a solution for web traffic and ads, but it's not a full, so you're stuck between something. GA is easy to use, but you can't get the answers that you need. BI SQL, you can get the answers, but it's too difficult for most people to use and too slow. And so Mixpanel, event-based analytics, which is what we do, is sort of this sweet spot where everyone can use it. It's pretty simple to use. You can set up a funnel report in under a minute or any number of funnels, retention, all kinds of stuff, but you can go incredibly deep with it at the same time. So we think of it as fast, easy, but powerful analytics for everyone.


RK (07:04):
Yeah, that's great. And I guess you touched on it a little bit, but as accurate, reliable, consistent, accessible data becomes table stakes at companies and there are a lot of companies with a long way to go, who do you think or what is your perspective on who should own the data in an organization today?


NR (07:25):
Well, that depends a lot on what size of company and stage you're at. For startups just adding some tracking with something like Mixpanel, it's pretty easy. The engineering team, the same team that's building the product, can easily just add that as part of what they do, and the team works pretty tightly. And so the company might be small enough that everyone has all the context of what's going on and can make use of the data and ask the person sitting next to 'em if there are questions. As companies get bigger and they're collecting more and more data and they have more and more teams working on more and more stuff, and there's also more and more consumers of the data that don't necessarily have all of that context, I think at some point it makes more sense to shift over and have a centralized data team that's focused on curating the company's data and preparing it and governing it and documenting it in ways that then let everybody trust it and understand it. And where that point is, I can't say for sure, but probably somewhere around a couple hundred employees.


RK (08:50):
And I think what we find with the data teams is just because those people were trained on Python and SQL and traditional data warehouses, sometimes they have a hard time getting their head around this idea that you can identify users and track events and have time series data in the warehouse and you can get at it in another place that's not writing mountains and mountains of Python and SQL. And so one of the, I think evolutions in the industry that we're seeing is that the data team still owns the data and maybe the data governance, but then other people are getting involved to do the change management and to provide a framework for asking the questions in the right way. I mean, it's shocking the number of times that we run into executives that ask for things that are kind of pointless, mostly because they just can't get their hands on anything.


(09:42):
And so anything's better than nothing, even if it's not the right thing. And so the data team isn't always the best team to reframe that conversation and start with, okay, well really, why are we trying to do this and how do we triage this giant spaghetti mess that you guys are throwing at me? It's interesting because that evolution's happening quickly now and the data teams are actually happy about it because now to your point, data scientists don't want to do data engineering. Now maybe if we get everything in place the way that it needs to be, they can do data science.


NR (10:13):
Yeah, yeah, exactly. They're freed up to work on higher leverage, more interesting data problems. And I think, and here's a shameless plug, Ryan, one of the things that is key in that context is the data team will more than likely put the data in a cloud data warehouse, Snowflake, BigQuery, something like that. And so if you have BI, BI tools have always just connected directly to the data warehouse. In the past, event-based analytics tools like Mixpanel kind of stood alone. You tracked data directly to them. And so what we actually just released is the ability to connect Mixpanel to your data warehouse. And so the data team can really prepare the data there. They can get it from all the different data sources across the company, from support tickets to financial data to obviously product usage data, anything that's getting pulled into that data warehouse, and use dbt or whatever they use to govern and join that data together. And then from there, it can be loaded into something like Mixpanel where people can self-serve on getting answers out of that data.


RK (11:41):
I wasn't going to go there, but I'm glad you brought it up because reverse ETL is a hot topic these days. And one of the things that we run into is that there's a sense that it's the end all be all, and you don't have to do any work to get the data anymore. And so I think it'd be good to clarify that if you don't have the time series data to start with because you didn't collect it via an SDK or some other mechanism, it's not there and you can't send it back up to Mixpanel. And so maybe speak to that briefly because I think people lose sight of the fact that some things you just don't collect in the warehouse anywhere else.


NR (12:18):
Yeah, yeah, yeah. So I think you can use a CDP like Segment or you can track with a tool like Mixpanel, but one way or another, yeah, there are things that you're going to capture in your application database, which obviously picks up a ton of state about what users are doing. So basically your users table is your signups table, you don't need to, and in fact, when that is the case, you're better off using that application data because it's in a transactional database, it's part of your application stack, your engineering team, your application depends on it being correct. And so it's a really, really high quality dataset.


RK (13:10):
That's right.


NR (13:11):
There's a lot of things like


RK (13:12):
Signups, revenue, it's anything we would think of as what's the thing we use the API to connect to in a traditional implementation?


NR (13:20):
But then, yes, you've got this whole server side. Yeah, you've got this whole giant set of things, of all these interactions with your product that you may not save that as a byproduct. You may not save any data as a byproduct of that. And so if you want to know that users swiped here or did a certain thing in your app, then you have to track that product usage, user behavior data. You've got to track those events. And for that, you need to set up something like Segment or Mixpanel, collect the data and load it into your data warehouse if you want to have it in your data warehouse alongside all that other data.


RK (14:04):
And I want to emphasize the importance of identity resolution in that front end tracking. And so we have a user that visits a website and signs up or signs in and we tag them back to their anonymous visits or they use the app next and we can tag them to that app and we can follow them across the entire journey. Most companies aren't capable of doing that just in the warehouse. I wouldn't say it's impossible because nothing is impossible, but the amount of effort that goes into doing something in one place versus another is something that you have to key in on very frequently because square hole,


NR (14:44):
Yeah, we are on version three of our ID management infrastructure at Mixpanel, and it's actually one of the hardest engineering problems that there is. It's one of those things that doesn't seem like a big deal, but if you really want to, especially if you're trying to stitch together activity from different platforms, different tools, logged out activity, and then when people log in, attributing it to the right user, it's really hard.


RK (15:16):
Lemme put it to you this way, about once every two or three months we run into a client or a potential client that says, oh, no, no problem. We built our own identity resolution system. And I'm just like, okay, well, I know why none of your data adds up now,

NR (15:32):
So

RK (15:33):
Let's go. Let's take a look at that and audit that first because I guarantee you it doesn't work at the level that it needs to do the job that you can get from an off-the-shelf system today. And there's too many edge cases that aren't accounted for.

NR (15:49):
Yeah, it'll melt your mind. Yeah.
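
The stitching Ryan and Neil describe can be illustrated with a deliberately naive sketch. The field names (`device_id`, `user_id`, `resolved_id`) and the `stitch_identities` helper are assumptions made up for this example, not Mixpanel's or Segment's API, and the "first login on a device wins" rule is exactly the kind of shortcut that breaks on the edge cases they mention, such as shared devices or account merges.

```python
def stitch_identities(events):
    """Attribute anonymous events to a known user once their device logs in."""
    # Pass 1: learn which known user each device belongs to.
    device_to_user = {}
    for event in events:
        if event.get("user_id"):
            # Naive rule: the first login seen on a device wins.
            device_to_user.setdefault(event["device_id"], event["user_id"])

    # Pass 2: resolve every event, including ones from before the login.
    resolved = []
    for event in events:
        resolved_id = (
            event.get("user_id")
            or device_to_user.get(event["device_id"])
            or event["device_id"]  # still anonymous: fall back to the device
        )
        resolved.append({**event, "resolved_id": resolved_id})
    return resolved


# Hypothetical events: an anonymous web visit, a signup, then mobile app use.
raw_events = [
    {"device_id": "web-1", "user_id": None, "event": "Page View"},
    {"device_id": "web-1", "user_id": "alice", "event": "Sign Up"},
    {"device_id": "ios-9", "user_id": "alice", "event": "App Open"},
    {"device_id": "web-2", "user_id": None, "event": "Page View"},
]

for row in stitch_identities(raw_events):
    print(row["event"], "->", row["resolved_id"])
```

Here the first anonymous page view is retroactively tied to `alice`, while the visitor on `web-2` stays anonymous. Handling several users on one device, late-arriving events, and merges across tools is where the real difficulty lies.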

RK (15:52):
I want to take a pivot real quick, and it's been on the top of everybody's mind lately. This big move to GA4 and how it's supposed to be the greatest thing ever. We haven't seen it do anything well yet, but what is your perspective on that and how are you messaging maybe the difference in the market today?

NR (16:15):
Yeah, so one of the things that, so GA4 in terms of the design of the data model, it's an event-based system. So Google was moving to an event-based analytics system, and I think that's a good thing. I think it's opening people's eyes to event-based analytics and why it's great, it's easy and intuitive and powerful. But GA4 itself, the widespread customer user sentiment that we're seeing everywhere is that it's really not a great product. It's full of all kinds of,

RK (17:10):
You're just being diplomatic. It's not there.

NR (17:13):
It's not ready yet. And so I know the record goes on and I hold back and that's been great for us. So what it's really done is it's pushed a ton of marketing teams back into the market to look at what else is out there because it's also, it's a huge amount of effort to migrate from the old GA to GA4.

RK (17:42):
Well, it's just different. I mean, look, let's be real. The old GA isn't even a real tool for analytics. I mean, it's good directional opportunities, but if you really want to know what's going on and you really want to audit your data, which you can't do in GA, you have to use something else. And so I think for a lot of people, it's just a shock that this other opportunity exists. And that's why that's another reason it's so difficult. It's like, go hold on a minute. I am going to actually have to think about what I'm measuring, and also I have to make sure it's right, which in the past if it was wrong, you just didn't know.

NR (18:19):
And so it's been great for us. It's caused tons of marketing teams especially out there to look around and say, okay, if we're going to do a migration, if we're going to move to this new thing, what else is out there? We don't like using GA4. And what it did for us is it really accelerated things. We've always wanted to go beyond just the product development use case, but that's where Mixpanel started and that's where we were primarily focused up until a year or so ago. But event-based analytics is very generic. An event is just any interaction that your user is having with the product or company. And so you can model not just their in-product stuff, but you can model seeing an ad and clicking on an ad. Those are just more events. And the beauty of it is then that event stream, you have this complete event stream where you can sort of see all the way through the journey.

(19:23):
There were some things that marketing teams needed, particularly around attribution and multi-touch attribution, around just session and page view and duration on page, certain features that they liked in GA that we added into Mixpanel, and now I think we have pretty much complete capabilities. What you could do in GA, you can do in Mixpanel, and then you can do much, much more. And so that was all an effort to create a really kind of soft landing for marketing teams in Mixpanel. And so that's been a big trend over the last few quarters for us, just more and more marketing teams and companies coming over from GA. And the last thing that's awesome about that is now you've got your product and your marketing team looking at the same numbers,

RK (20:23):
Same numbers. Well, that's a big problem

NR (20:25):
Historically. Yeah. Yeah. Big problems historically is you get these different departments have completely different views on the data.

RK (20:33):
I've only got you for a few more minutes, so I want to make sure I ask you about AI since it's the hot topic. Where's AI going with analytics? Are all my data analysts going to be out of jobs soon?

NR (20:46):
Yeah, no, I don't think so. I hope

RK (20:49):
Not.

NR (20:50):
No, I think that, well that we've started, I think there's a bunch of different places that you could apply these sort of new large language models to analytics. The place we started was letting people ask questions in natural language and then creating reports and dashboards for them based,

RK (21:12):
This is Spark or this is,

NR (21:13):
Yeah, this is called Spark. So that's our AI. And that's really great. There is still a barrier to entry in these tools, but what's cool about integrating it into an analytics product is that you're not just giving the answer and they don't know how you got there. The answer comes in the form of a chart that they can then interact with and see all the underlying data. And so you can really trust and understand where that answer came from. And you can sort of learn by doing, right? You can learn by example. You had a question, it builds a chart and you're like, oh, okay. That's how I would answer a question with this chart. And so it's a great way to just help users onboard and to save them time, but you still need the actual analytics tool in the background as well. I think it's going to help a lot on the data governance side as well. So just being able to look for things that look like they're mislabeled or sudden spikes or drops, anomalies that might represent some broken pipeline. That's probably the next space we're going to apply AI.
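
The kind of check Neil describes for governance, flagging sudden spikes or drops that may point to a broken pipeline, does not have to start with a language model. As a rough sketch only, and not a description of how Mixpanel does or will do it, a rolling z-score over daily event counts already catches the obvious cases. The `flag_anomalies` helper, its window size, and its threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(daily_counts, window=7, threshold=3.0):
    """Return indexes of days that deviate sharply from the trailing window."""
    flagged = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is suspicious.
            if daily_counts[i] != mu:
                flagged.append(i)
            continue
        # Flag days more than `threshold` standard deviations from the mean.
        if abs(daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily counts for one event; day 8 looks like a tracking outage.
signups = [100, 104, 98, 101, 99, 103, 97, 102, 5, 100]
print(flag_anomalies(signups))  # [8]
```

One limitation worth noting: the outage day then inflates the baseline for the following week, so a production check would typically exclude flagged days or use a more robust statistic such as the median.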

RK (22:35):
Well, I think we're probably in the first inning, so excited to see where everybody takes it.

NR (22:41):
Yeah, yeah, me too.

RK (22:44):
All right. Well listen, we're at time, so thank you so much for hopping on today. To recap, I've got Neil Rahilly, the SVP of product at Mixpanel, and I'm Ryan Koonce, co-founder at Mammoth Growth. Thanks everybody.

NR (22:58):
Thanks Ryan.
