Case Study

MenuPortal
Nationwide Menu API

A restaurant menu database indexing 800K+ restaurants and 60M+ menu items—built from scratch with infrastructure costs 90% lower than incumbent APIs.

Node.js · Elasticsearch · React · Stripe · Puppeteer · REST API
800K+ Restaurants
60M+ Menu Items
<150ms API Latency
6hrs Refresh Cycle
90% Cost Savings

The Challenge

Why Existing APIs Didn't Work

The only viable API charged $0.02 per call. At roughly one call per restaurant, populating the 800K-restaurant dataset a single time would cost about $16,000.
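
The arithmetic behind the $16,000 figure can be sketched as follows. It assumes one billable call per restaurant, which is an inference from the numbers (800,000 × $0.02 = $16,000) rather than a documented pricing rule, and the function name is illustrative only. The per-day figure simply repeats the one-time cost at the 6-hour refresh cadence quoted elsewhere in this case study.

```javascript
// Back-of-envelope cost of populating the dataset through a per-call API.
// Assumption (not confirmed by the vendor's pricing): one call per restaurant.
const PRICE_PER_CALL_CENTS = 2; // $0.02 per call
const RESTAURANT_COUNT = 800_000;

// Work in integer cents so the result is not affected by float rounding.
function populateCostUsd(calls, priceCents = PRICE_PER_CALL_CENTS) {
  return (calls * priceCents) / 100;
}

const onePassUsd = populateCostUsd(RESTAURANT_COUNT);
const refreshesPerDay = 24 / 6; // a 6-hour cycle means 4 full passes per day
const perDayUsd = onePassUsd * refreshesPerDay;

console.log(`One full pass: $${onePassUsd.toLocaleString('en-US')}`);
console.log(`Per day at a 6-hour refresh: $${perDayUsd.toLocaleString('en-US')}`);
```

Under these assumptions, keeping the data as fresh as the pipeline described here would multiply the one-time cost by four every day.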

01

Data Volume at Scale

Collect and manage data from 800K+ restaurants and 60M+ menu items without collapsing under storage or processing costs.

02

Data Freshness

Keep menus current by continuously detecting new restaurants and updating existing menus on a regular cycle.

03

Query Performance

Deliver search results fast enough to power real user-facing applications despite the dataset's size.

04

Cost Efficiency

Avoid infrastructure costs that would push the project into the same pricing territory as incumbents.

"Can we design an end-to-end system—data collection, search, and API—that delivers national-scale menu data at a price point early-stage products can actually use?"

Architecture

Owning the Pipeline End-to-End

Instead of paying per request, I built a system that scrapes, normalizes, indexes, and serves menu data at national scale.

Data Sources → Scrapers → Normalization → Elasticsearch → REST API

const pipeline = {
  collection: 'Distributed Node.js workers',
  storage: 'Elasticsearch indices',
  refresh: '~6 hour full cycle',
  api: 'Express.js REST endpoints'
};

Data Collection Strategy

  • Node.js backend orchestrating multiple scraping workers
  • Each worker handles a subset of restaurants
  • Horizontal scaling to meet refresh time targets
  • Continuous detection of new restaurants
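
One way to reason about the worker strategy above is to compute the throughput a refresh target implies and then shard the restaurant list across workers. The sketch below is illustrative only: the function names, the round-robin sharding, and the per-worker rate are assumptions, and only the 800K and 6-hour figures come from this case study.

```javascript
// Sketch: size a scraping fleet for a refresh target and shard the work.
// The per-worker rate and all function names are hypothetical.

// Overall throughput (restaurants per second) needed to finish in time.
function requiredRate(totalRestaurants, refreshHours) {
  return totalRestaurants / (refreshHours * 3600);
}

// Workers needed, given an assumed sustained rate per worker.
function workersNeeded(totalRestaurants, refreshHours, perWorkerRate) {
  return Math.ceil(requiredRate(totalRestaurants, refreshHours) / perWorkerRate);
}

// Deterministic round-robin partition: item i goes to shard i % workerCount.
function partition(ids, workerCount) {
  const shards = Array.from({ length: workerCount }, () => []);
  ids.forEach((id, i) => shards[i % workerCount].push(id));
  return shards;
}

const rate = requiredRate(800_000, 6); // about 37 restaurants per second
const fleetSize = workersNeeded(800_000, 6, 0.5); // assumes 1 restaurant per 2s per worker
const demoShards = partition(['r1', 'r2', 'r3', 'r4', 'r5'], 2);

console.log(rate.toFixed(1), fleetSize, demoShards);
```

Because the required rate scales linearly with the dataset and inversely with the refresh window, adding workers is the direct lever for hitting a shorter cycle, which is the horizontal scaling the list refers to.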

Search & Storage Strategy

  • Elasticsearch for low-latency full-text search over 60M+ items
  • Near-real-time indexing, so new and updated items become searchable shortly after ingest
  • Optimized queries to reduce infrastructure load
  • Full dataset refresh every ~6 hours
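
As a rough illustration of the indexing and query side, the sketch below builds request bodies for Elasticsearch's `_bulk` and `_search` endpoints. The index name and field names (`menu_items`, `name`, `restaurant_id`, `price`) are hypothetical, since the case study does not describe its mappings, and the `term` filter assumes `restaurant_id` is mapped as a keyword.

```javascript
// Sketch: request bodies for Elasticsearch _bulk and _search.
// Index and field names are assumptions made for illustration.

// _bulk takes newline-delimited JSON: an action line, then the document,
// and the payload must end with a newline.
function buildBulkBody(indexName, items) {
  const lines = [];
  for (const item of items) {
    lines.push(JSON.stringify({ index: { _index: indexName, _id: item.id } }));
    lines.push(JSON.stringify(item));
  }
  return lines.join('\n') + '\n';
}

// Full-text match on the item name. Exact constraints go in `filter`,
// which does not contribute to relevance scoring.
function buildSearchBody(text, { restaurantId, maxPrice, size = 10 } = {}) {
  const filter = [];
  if (restaurantId) filter.push({ term: { restaurant_id: restaurantId } });
  if (maxPrice != null) filter.push({ range: { price: { lte: maxPrice } } });
  return {
    size,
    query: { bool: { must: [{ match: { name: text } }], filter } },
  };
}

const bulkBody = buildBulkBody('menu_items', [
  { id: 'm1', restaurant_id: 'r1', name: 'Margherita Pizza', price: 12.5 },
]);
const searchBody = buildSearchBody('pizza', { maxPrice: 15 });

console.log(bulkBody);
console.log(JSON.stringify(searchBody, null, 2));
```

Batching writes through `_bulk` and keeping non-scoring constraints in `filter` are two common ways to reduce load on a cluster, in the spirit of the "optimized queries" bullet above.
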
Product

API, Dashboard & Monetization

Turning the data pipeline into a developer-facing product with self-service access control and subscription management.

High-Performance REST API

  • Serve menu data at scale to thousands of users
  • Response times under 150ms typical
  • Complex queries under 400ms
  • Clear endpoints for search and retrieval
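
A search endpoint along these lines might look like the sketch below. It is written without a framework so it runs on its own. The parameter names (`q`, `limit`), the page-size cap, and the response shape are assumptions, and `searchFn` is a stand-in for the actual Elasticsearch call.

```javascript
// Sketch of a search endpoint handler, framework-agnostic by design.
// Parameter names, limits, and response shape are hypothetical.

const MAX_PAGE_SIZE = 50;

// Validate and normalize query-string parameters.
function parseSearchParams(query) {
  const q = String(query.q ?? '').trim();
  if (!q) return { error: 'q is required' };
  const requested = Number.parseInt(query.limit ?? '10', 10);
  const limit = Number.isNaN(requested)
    ? 10
    : Math.min(Math.max(requested, 1), MAX_PAGE_SIZE);
  return { q, limit };
}

// Returns { status, body } so it can be adapted to Express or another router.
async function handleSearch(query, searchFn) {
  const params = parseSearchParams(query);
  if (params.error) return { status: 400, body: { error: params.error } };
  const started = Date.now();
  const items = await searchFn(params.q, params.limit);
  return {
    status: 200,
    body: { items, count: items.length, tookMs: Date.now() - started },
  };
}

// Usage with a stubbed search function in place of Elasticsearch.
const fakeSearch = async (q, limit) => [{ name: `${q} special` }].slice(0, limit);
handleSearch({ q: 'ramen', limit: '500' }, fakeSearch).then((res) => {
  console.log(res.status, res.body.count);
});
```

Capping the page size and rejecting empty queries before they reach the search cluster is one simple way to keep typical responses inside a latency budget like the one listed above.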

Access & Monetization

  • Stripe subscriptions for plan management
  • Tiered pricing linked to API quotas
  • Custom-built request tracker that counts calls against each plan's quota
  • Transparent usage logs for users
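
The request tracker itself is not described in detail, so the following is only a minimal sketch of how per-plan quotas could be enforced. The plan names and limits are invented, usage is bucketed by calendar month as a simplification of a real billing period, and counts are held in process memory where a production system would presumably use a shared store.

```javascript
// Sketch of a per-key request tracker enforcing plan quotas.
// Plan names, limits, and the monthly bucketing are all hypothetical.

const PLAN_LIMITS = { free: 1_000, starter: 50_000, pro: 500_000 }; // requests per month

class RequestTracker {
  constructor(limits = PLAN_LIMITS) {
    this.limits = limits;
    this.counts = new Map(); // "apiKey:YYYY-MM" -> request count
  }

  // Bucket usage by UTC calendar month so counts reset each period.
  static periodKey(apiKey, now = new Date()) {
    return `${apiKey}:${now.toISOString().slice(0, 7)}`;
  }

  // Record one request; reports whether it is allowed and what remains.
  record(apiKey, plan, now = new Date()) {
    const limit = this.limits[plan];
    if (limit === undefined) throw new Error(`Unknown plan: ${plan}`);
    const key = RequestTracker.periodKey(apiKey, now);
    const used = this.counts.get(key) ?? 0;
    if (used >= limit) return { allowed: false, remaining: 0 };
    this.counts.set(key, used + 1);
    return { allowed: true, remaining: limit - used - 1 };
  }
}

const demoTracker = new RequestTracker({ free: 2 });
const demoDate = new Date('2024-01-15T00:00:00Z');
console.log(demoTracker.record('key_123', 'free', demoDate)); // first call, allowed
console.log(demoTracker.record('key_123', 'free', demoDate)); // second call, allowed
console.log(demoTracker.record('key_123', 'free', demoDate)); // over quota, rejected
```

The per-period counts that drive enforcement can also back the usage logs shown to users, so one record serves both purposes.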

React Dashboard

  • Self-service subscription management
  • API request history and analytics
  • Real-time menu search and filtering
  • Elasticsearch-powered instant results

API Response Times

Standard Queries: <150ms
Complex Queries: <400ms

Results

Outcomes & Lessons Learned

What this project achieved and what it demonstrates about building scalable, cost-efficient data infrastructure.

800K+ Restaurants
60M+ Menu Items
6hrs Refresh
<150ms Latency
90% Cost Savings

Infrastructure Cost Comparison

Third-Party API: $16,000+
MenuPortal: ~90% less

Skills Demonstrated

  • Scalable, high-performance APIs over large datasets
  • Distributed data collection balancing freshness and cost
  • Monetization and access control baked into architecture
  • Pragmatic tradeoffs between speed, cost, and reliability

Lessons & Future Directions

  • Tune refresh cycles based on actual usage and customer value
  • Continuously explore alternative cloud options for cost optimization
  • Expand beyond self-serve into enterprise licensing