top of page
Top - SQL Upscale.jpeg

Olist E-Commerce Analytics & Customer Segmentation [SQL]

The Business Question: Who are our most valuable customers, and what hidden operational, financial, and strategic factors are driving our critical revenue churn?

​

Project Overview: This project represents a comprehensive, end-to-end data pipeline and strategic analysis of a massive e-commerce dataset (93,000+ customers, R$15.4M in delivered revenue). Utilizing PostgreSQL, I engineered a robust relational database and developed a dynamic RFM (Recency, Frequency, Monetary) segmentation model using advanced SQL (Window Functions, CTEs, complex JOINs). Rather than stopping at generic dashboards, I designed targeted SQL queries to answer high-stakes business questions across five distinct departments: Operations, Marketing, Finance, Profitability and Strategy. The final deliverable translated these complex database outputs into clear, cross-functional strategic directives for executive leadership.

Full Screen

Black-Arrow-PNG-HD-Image.png
1.png

01 [Operations]

Carrier Performance & Geo-Churn Risk

Investigated the true cost of logistics failures by mapping late deliveries against the "At Risk" and "Hibernating" cohorts. By aggregating revenue tied directly to SLA breaches, this query generated a geographic heat map for supply chain prioritization. The output revealed that while São Paulo loses the most raw volume, states like Bahia (BA) suffer a much higher late-delivery churn rate (11.7%), pinpointing exactly where carrier renegotiations are most urgently needed.

02 [Marketing]

Gateway Category LTV Analysis

Shifted the marketing focus from raw GMV to customer acquisition quality. This script tracked initial product purchases to calculate downstream lifetime value and a custom "Loyalty Ratio" (Champions vs. one-time Hibernators). The SQL outputs uncovered that categories like "Watches & Gifts" act as powerful gateway funnels (converting at a massive 75.4% Loyalty Ratio), whereas other high-volume categories primarily acquire one-hit-wonders, fundamentally altering target ROAS and ad spend allocation.

2.png
3.png

03 [Finance]

Cash Flow vs. Loyalty Assessment

Assessed the financial risk of restructuring credit installment policies. By joining the dynamic RFM segments with payment data, the query revealed that our most valuable "Champions" are highly dependent on credit (80.7% revenue share) with over 28% utilizing 6+ month installment plans. This conclusively proved to the Finance team that restricting high-installment plans to improve short-term cash flow would disproportionately damage our best, most defensible customer relationships.

04 [Profitability]

Margin-Bleed from Freight Ratios

Conducted a line-item profitability audit to locate structural margin degradation. By calculating shipping costs as a percentage of the item price across all states and categories, this query isolated "red zone" transactions (where freight consumes >30% of the price). The data highlighted severe margin bleeds—such as electronics in Espírito Santo (ES) hitting 95% freight ratios—providing the pricing team with the exact data needed to adjust minimum order thresholds.

4.png
5.png

05 [Strategy]

The Silent Defector Phenomenon

Addressed the board-level question of why seemingly highly satisfied customers churn. This advanced query isolated defectors who experienced flawless operations (zero late deliveries) and left glowing feedback (4-5 star reviews). The staggering output proved that 76.8% of enterprise churn (R$3.93M) was driven by external market forces, not internal failures. This successfully pivoted the executive strategy away from operations optimization and toward competitive pricing, product exclusivity, and loyalty programs.

Dynamic RFM 

Segmentation

Developed a dynamic Recency, Frequency, Monetary (RFM) model to categorize over 93,000 unique buyers into actionable cohorts. The SQL outputs isolated a massive structural insight: our dormant "At Risk" segment (14,329 customers) exactly mirrors our highest-value "Champions" in spending behavior. By identifying this specific cohort, the analysis exposed a R$4.49M revenue-recovery opportunity, allowing marketing to concentrate retention resources where they have the highest ROI.

RFM.png
DDL.png

Data Architecture

& Modeling

Architected a CRM-ready relational database schema from scratch using PostgreSQL. Engineered the core data model to ingest nine interconnected tables—including orders, payments, products, and geolocation data. By defining explicit primary and foreign key constraints, this DDL script enforces strict referential integrity across millions of rows, establishing a pristine, fully relational foundation required for advanced analytical querying.

Full Screen

Black-Arrow-PNG-HD-Image.png

View the Executive Summary

Strategic Intelligence Briefing 

Data is only as valuable as the decisions it drives. Download the comprehensive executive briefing below to see exactly how these SQL queries were translated into a clear, bottom-line strategy document designed for cross-functional leadership teams.

Bottom Front - SQL upscalled.jpeg
bottom of page