Case Study · E-commerce / Marketplace · SEO Command Center
84% of crawls hitting zero-value pages
99% SEO scope reduction
10x faster strategic analysis
01

The Challenge

The marketplace's SEO team manages over 8 billion URLs with individual datasets measured in terabytes. Yet critical SEO signals were trapped in silos across multiple platforms, requiring separate logins and manual extraction. Senior analysts spent their time wrangling spreadsheets instead of optimizing strategy. When leadership asked fundamental questions like "How many pages do we actually need?", the team couldn't answer. They called it the "infinity problem": infinite pages, no prioritization framework, and no way to separate signal from noise at scale.

02

The Constraint

Standard ETL pipelines timed out against terabyte-scale datasets. Google Search Console sampling stripped away the granular signals needed for optimization. Web log data auto-archived every six months, erasing seasonal context critical for a marketplace business driven by holiday peaks. Internal link and sitemap relationships weren't captured at all, leaving blind spots across billions of interconnected pages. This wasn't just fragmentation. It was a computational wall.

03

The Approach

Mammoth Growth architected the SEO Command Center: a unified BigQuery platform consolidating GSC, Botify crawl data, server logs, and internal link data. It uses a medallion architecture optimized for terabyte-scale performance, with reusable LookML components designed for cross-team adoption.
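The medallion pattern can be sketched in miniature: raw source rows land in a bronze layer, get deduplicated and normalized in silver, and are merged into one analytics-ready record per URL in gold. This is an illustrative Python sketch only, not the production pipeline; the field names, URLs, and sample rows are all assumptions (the real system runs in BigQuery at terabyte scale).

```python
from collections import defaultdict

# Bronze layer: raw rows landed as-is from each source (hypothetical fields).
bronze_rows = [
    {"source": "gsc", "url": "https://shop.example/p/1", "clicks": 12},
    {"source": "gsc", "url": "https://shop.example/p/1", "clicks": 12},  # duplicate load
    {"source": "logs", "url": "https://shop.example/p/1", "bot_hits": 40},
    {"source": "logs", "url": "https://shop.example/p/2", "bot_hits": 75},
]

def to_silver(rows):
    """Silver layer: deduplicate records coming out of the bronze layer."""
    seen, silver = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            silver.append(row)
    return silver

def to_gold(rows):
    """Gold layer: one analytics-ready record per URL, merging all sources."""
    pages = defaultdict(lambda: {"clicks": 0, "bot_hits": 0})
    for row in rows:
        page = pages[row["url"]]
        page["clicks"] += row.get("clicks", 0)
        page["bot_hits"] += row.get("bot_hits", 0)
    return dict(pages)

gold = to_gold(to_silver(bronze_rows))
```

The same layering keeps raw history replayable (bronze) while downstream LookML models only ever read the cleaned, merged gold tables.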

04

The Outcome

The platform revealed that 84% of bot crawls targeted pages with zero visits and zero search volume, enabling crawl budget reallocation to revenue-driving pages. Of roughly 1 billion pages, only 10 million drive SEO value, giving leadership a concrete optimization target. Quality score analysis spanning 15 months of multi-source data was completed in under 3 weeks, work that previously would have taken months.
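The headline crawl-waste number comes down to joining bot-hit data against value signals and measuring what share of crawl activity lands on pages with neither visits nor search volume. A minimal sketch, assuming simplified in-memory records with hypothetical fields (the actual analysis joined terabyte-scale tables in BigQuery):

```python
def crawl_waste_share(pages):
    """Fraction of bot crawl hits landing on pages with zero visits
    and zero search volume (hypothetical per-page records)."""
    total = sum(p["bot_hits"] for p in pages)
    wasted = sum(p["bot_hits"] for p in pages
                 if p["visits"] == 0 and p["search_volume"] == 0)
    return wasted / total if total else 0.0

# Toy sample chosen for illustration, not real marketplace data.
pages = [
    {"url": "/p/hot-item", "bot_hits": 20, "visits": 500, "search_volume": 900},
    {"url": "/p/dead-1", "bot_hits": 50, "visits": 0, "search_volume": 0},
    {"url": "/p/dead-2", "bot_hits": 55, "visits": 0, "search_volume": 0},
]
share = crawl_waste_share(pages)  # 105 wasted hits / 125 total = 0.84
```

Once each page carries that zero-value flag, crawl-budget controls (robots rules, internal linking, sitemap pruning) can be pointed at the wasted share.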

What This Unlocked

Executive-level answers, on demand. The SEO team now confidently reports which pages matter, where crawl budget is wasted, and how performance shifts year over year.

Self-serve insights beyond SEO. Reusable LookML components and standardized data models enable cross-functional teams to run their own analyses.

AI-ready architecture. The platform is designed to incorporate new data sources and support AI-driven automation and optimization agents, positioning the marketplace to lead in AI-powered SEO by 2026.

From SEO project to org-wide data foundation. What started as infrastructure for one team became the analytical backbone for the company.

Services

Data Engineering · Analytics · Strategy

Tech Stack

BigQuery · Looker (LookML) · Google Search Console · Botify · Google Cloud Platform · Google Cloud Storage · Medallion Architecture

Results

Crawl waste identified: 84% hitting zero-value pages

SEO scope reduction: 99% (from ~1B to ~10M pages)

Analysis acceleration: 10x (under 3 weeks vs. months)

Cross-dataset analysis: enabled for the first time

These numbers don't happen by accident.

Talk to us about what's possible for your business.