
Olist E-Commerce Analytics & Customer Segmentation [SQL]
The Business Question: Who are our most valuable customers, and what hidden operational, financial, and strategic factors are driving our critical revenue churn?
​
Project Overview: This project represents a comprehensive, end-to-end data pipeline and strategic analysis of a massive e-commerce dataset (93,000+ customers, R$15.4M in delivered revenue). Utilizing PostgreSQL, I engineered a robust relational database and developed a dynamic RFM (Recency, Frequency, Monetary) segmentation model using advanced SQL (Window Functions, CTEs, complex JOINs). Rather than stopping at generic dashboards, I designed targeted SQL queries to answer high-stakes business questions across five distinct departments: Operations, Marketing, Finance, Profitability and Strategy. The final deliverable translated these complex database outputs into clear, cross-functional strategic directives for executive leadership.
Full Screen

01 [Operations]
Carrier Performance & Geo-Churn Risk
Investigated the true cost of logistics failures by mapping late deliveries against the "At Risk" and "Hibernating" cohorts. By aggregating revenue tied directly to SLA breaches, this query generated a geographic heat map for supply chain prioritization. The output revealed that while São Paulo loses the most raw volume, states like Bahia (BA) suffer a much higher late-delivery churn rate (11.7%), pinpointing exactly where carrier renegotiations are most urgently needed.
02 [Marketing]
Gateway Category LTV Analysis
Shifted the marketing focus from raw GMV to customer acquisition quality. This script tracked initial product purchases to calculate downstream lifetime value and a custom "Loyalty Ratio" (Champions vs. one-time Hibernators). The SQL outputs uncovered that categories like "Watches & Gifts" act as powerful gateway funnels (converting at a massive 75.4% Loyalty Ratio), whereas other high-volume categories primarily acquire one-hit-wonders, fundamentally altering target ROAS and ad spend allocation.


03 [Finance]
Cash Flow vs. Loyalty Assessment
Assessed the financial risk of restructuring credit installment policies. By joining the dynamic RFM segments with payment data, the query revealed that our most valuable "Champions" are highly dependent on credit (80.7% revenue share) with over 28% utilizing 6+ month installment plans. This conclusively proved to the Finance team that restricting high-installment plans to improve short-term cash flow would disproportionately damage our best, most defensible customer relationships.
04 [Profitability]
Margin-Bleed from Freight Ratios
Conducted a line-item profitability audit to locate structural margin degradation. By calculating shipping costs as a percentage of the item price across all states and categories, this query isolated "red zone" transactions (where freight consumes >30% of the price). The data highlighted severe margin bleeds—such as electronics in Espírito Santo (ES) hitting 95% freight ratios—providing the pricing team with the exact data needed to adjust minimum order thresholds.


05 [Strategy]
The Silent Defector Phenomenon
Addressed the board-level question of why seemingly highly satisfied customers churn. This advanced query isolated defectors who experienced flawless operations (zero late deliveries) and left glowing feedback (4-5 star reviews). The staggering output proved that 76.8% of enterprise churn (R$3.93M) was driven by external market forces, not internal failures. This successfully pivoted the executive strategy away from operations optimization and toward competitive pricing, product exclusivity, and loyalty programs.
Dynamic RFM
Segmentation
Developed a dynamic Recency, Frequency, Monetary (RFM) model to categorize over 93,000 unique buyers into actionable cohorts. The SQL outputs isolated a massive structural insight: our dormant "At Risk" segment (14,329 customers) exactly mirrors our highest-value "Champions" in spending behavior. By identifying this specific cohort, the analysis exposed a R$4.49M revenue-recovery opportunity, allowing marketing to concentrate retention resources where they have the highest ROI.


Data Architecture
& Modeling
Architected a CRM-ready relational database schema from scratch using PostgreSQL. Engineered the core data model to ingest nine interconnected tables—including orders, payments, products, and geolocation data. By defining explicit primary and foreign key constraints, this DDL script enforces strict referential integrity across millions of rows, establishing a pristine, fully relational foundation required for advanced analytical querying.
Full Screen

View the Executive Summary
Strategic Intelligence Briefing
Data is only as valuable as the decisions it drives. Download the comprehensive executive briefing below to see exactly how these SQL queries were translated into a clear, bottom-line strategy document designed for cross-functional leadership teams.
