Market Research & Data Co-op System Summary

Overview: This document summarizes the "Business in a Box" market research pipeline built into the Vibn platform. It allows the AI to autonomously identify target markets, scrape leads, analyze competitor technology stacks, and pull SEO/Ad spend data to generate a complete Go-To-Market (GTM) strategy for users.

1. BigQuery Database Schema (vibn_market_data)

The data foundation is a relational model hosted in Google BigQuery, in the Montréal region (northamerica-northeast1) for data residency:

  • gbp_categories: 4,000+ Google Business Profile categories (e.g., gcid:dentist).
  • software_categories: 800+ SMB-relevant software categories (e.g., dental-practice-management).
  • gbp_software_links: A junction table linking Main Street business types to the software they buy (19,000+ mapped rows).
  • market_leads: The "Data Co-op" table containing exact geospatial leads (name, address, phone, website, emails).
  • software_providers: Proprietary SaaS competitors mapped to software categories (e.g., "Curve Dental").
  • open_source_repos: MIT/Apache licensed GitHub starter kits mapped to software categories.

2. MCP Tools Added (lib/ai/vibn-tools.ts)

market_research_run

  • Purpose: Fetches a list of real-world business leads for a specific category and location.
  • Data Source: DataForSEO Business Listings Live API.
  • Guardrails:
    • Requires explicit user permission (user_explicitly_approved: true).
    • Geospatial Caching: Queries BigQuery first, using its built-in geography function ST_DWITHIN (BigQuery's own GIS support, not PostGIS). If leads already exist within a 20 km radius of the target coordinates, it serves them for $0.00 instead of calling the paid API.
  • Data Co-op: Any newly fetched leads are automatically INSERTed into the BigQuery market_leads table.
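
The cache-first guardrail described above can be sketched roughly as follows. This is a minimal sketch, not the code in lib/ai/vibn-tools.ts: the Lead shape, the category_id and location columns, and the injected queryCache, fetchPaid, and insertLeads helpers are all hypothetical. Only ST_DWITHIN, the 20 km radius, the user_explicitly_approved flag, and the market_leads table come from the description. It also assumes approval is checked before any lookup.

```typescript
// Sketch only: types, column names, and injected helpers are assumptions.
interface Lead {
  name: string;
  address: string;
  phone?: string;
  website?: string;
}

interface LeadRequest {
  categoryId: string; // e.g. "gcid:dentist"
  lat: number;
  lng: number;
  user_explicitly_approved: boolean;
}

// Hypothetical dependencies, injected so the flow can be exercised without
// BigQuery or the paid API.
interface LeadDeps {
  queryCache(sql: string, params: Record<string, unknown>): Promise<Lead[]>;
  fetchPaid(req: LeadRequest): Promise<Lead[]>;
  insertLeads(leads: Lead[]): Promise<void>;
}

const CACHE_RADIUS_METERS = 20000; // 20 km

// Parameterized BigQuery SQL. ST_GEOGPOINT takes (longitude, latitude);
// ST_DWITHIN measures distance in meters. `location` is an assumed
// GEOGRAPHY column.
const CACHE_SQL = `
  SELECT name, address, phone, website
  FROM vibn_market_data.market_leads
  WHERE category_id = @categoryId
    AND ST_DWITHIN(location, ST_GEOGPOINT(@lng, @lat), @radiusMeters)`;

async function getLeads(
  req: LeadRequest,
  deps: LeadDeps,
): Promise<{ leads: Lead[]; source: "cache" | "api" }> {
  if (!req.user_explicitly_approved) {
    throw new Error("market_research_run requires user_explicitly_approved: true");
  }

  // 1. Try the shared cache first: a hit costs nothing.
  const cached = await deps.queryCache(CACHE_SQL, {
    categoryId: req.categoryId,
    lat: req.lat,
    lng: req.lng,
    radiusMeters: CACHE_RADIUS_METERS,
  });
  if (cached.length > 0) return { leads: cached, source: "cache" };

  // 2. Cache miss: call the paid API, then write the results back.
  const fresh = await deps.fetchPaid(req);
  if (fresh.length > 0) await deps.insertLeads(fresh);
  return { leads: fresh, source: "api" };
}
```

Injecting the three helpers keeps the decision logic separate from the BigQuery client and the DataForSEO call, so the cache-hit and cache-miss paths can be checked with in-memory fakes.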

tech_stack_analyze

  • Purpose: A free, native alternative to BuiltWith. Scans a list of URLs (up to 100) to determine what software, CMS, and tracking tools they use.
  • Intelligent Spidering: Loads the homepage, extracts high-intent links (/book, /contact), and dynamically crawls depth-2 subpages to find hidden booking widgets or portals.
  • Dynamic Competitor Injection: Reads the software_category_id, pulls all known competitors for that category from BigQuery, and searches the target websites' source code for traces of them.
  • Custom Checks: Lets the AI pass a custom_checks array of extra strings or domains to look for on the fly.
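
The scanning steps above could look roughly like this. It is a simplified sketch under stated assumptions, not the tool's actual implementation: the Signature shape and both function names are hypothetical, links are pulled out with a regular expression rather than a real HTML parser, and the page HTML is assumed to have been fetched already. The "wp-content" and "wp-includes" markers are well-known WordPress path fragments; the other strings in the usage example are placeholders.

```typescript
// Hypothetical shape for a competitor or CMS fingerprint.
interface Signature {
  name: string;       // e.g. a provider name loaded from BigQuery
  patterns: string[]; // substrings to look for in the page source
}

// Case-insensitive substring scan over the page source. Custom checks are
// reported under their own string.
function detectTech(
  html: string,
  signatures: Signature[],
  customChecks: string[] = [],
): string[] {
  const haystack = html.toLowerCase();
  const found = new Set<string>();
  for (const sig of signatures) {
    if (sig.patterns.some((p) => haystack.includes(p.toLowerCase()))) {
      found.add(sig.name);
    }
  }
  for (const check of customChecks) {
    if (haystack.includes(check.toLowerCase())) found.add(check);
  }
  return Array.from(found);
}

// Collects same-host links whose path starts with a high-intent hint.
// These are the candidates for the next crawl level.
function extractHighIntentLinks(
  html: string,
  baseUrl: string,
  hints: string[] = ["/book", "/contact"],
): string[] {
  const base = new URL(baseUrl);
  const links = new Set<string>();
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    let url: URL;
    try {
      url = new URL(match[1], base);
    } catch {
      continue; // skip malformed hrefs
    }
    const path = url.pathname.toLowerCase();
    if (url.host === base.host && hints.some((h) => path.startsWith(h))) {
      links.add(url.href);
    }
  }
  return Array.from(links);
}

// Usage with placeholder data:
const sampleHtml = `
  <link rel="stylesheet" href="/wp-content/themes/x/style.css">
  <a href="/book-online">Book</a>
  <a href="/about">About</a>
  <a href="https://other.example/contact">Partner</a>
  <script src="https://booking-widget.example/embed.js"></script>`;
const sampleSignatures: Signature[] = [
  { name: "WordPress", patterns: ["wp-content", "wp-includes"] },
];
const detected = detectTech(sampleHtml, sampleSignatures, ["booking-widget.example"]);
const nextPages = extractHighIntentLinks(sampleHtml, "https://clinic.example/");
```

Plain substring matching is cheap enough for a batch of up to 100 URLs, but it can produce false positives (for example, a competitor name mentioned in a blog post), so a real scanner would likely want stricter patterns.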

market_seo_analyze

  • Purpose: Analyzes a competitor's domain for SEO and Google Ads metrics.
  • Data Source: DataForSEO Labs (Domain Metrics & Ranked Keywords APIs).
  • Output: Returns estimated organic traffic, paid Google Ads traffic, estimated monthly ad spend (USD), and the domain's top paid keywords.

3. The "Business in a Box" Workflow

When a founder asks to build software for a specific niche (e.g., "Dentists in BC"):

  1. TAM & Leads: The AI runs market_research_run to get the Total Addressable Market and real contact info.
  2. Competitor Teardown: The AI identifies incumbents and runs market_seo_analyze to see their Ad Spend.
  3. Wedge Discovery: The AI runs tech_stack_analyze on the leads to find technological gaps (e.g., "70% use WordPress but lack a booking widget").
  4. Plan Generation: The AI writes a business plan to the dashboard, including a financial model, compliance warnings, a wedge strategy, and cold-email scripts.
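
As an illustration of the Wedge Discovery step, a gap statistic such as "70% use WordPress but lack a booking widget" could be derived from the scan results along these lines. The SiteScan shape, the gapShare function, and the technology labels are hypothetical and only meant to show the arithmetic.

```typescript
// Hypothetical per-site result of tech_stack_analyze.
interface SiteScan {
  url: string;
  detected: string[]; // technology labels found on the site
}

// Percentage of scanned sites that use `uses` but show no trace of `lacks`,
// rounded to the nearest whole percent. Returns 0 for an empty scan list.
function gapShare(scans: SiteScan[], uses: string, lacks: string): number {
  if (scans.length === 0) return 0;
  const matching = scans.filter(
    (s) => s.detected.includes(uses) && !s.detected.includes(lacks),
  ).length;
  return Math.round((matching / scans.length) * 100);
}

// Usage with placeholder data: 7 sites with WordPress only, 2 with both,
// and 1 with nothing detected.
const scans: SiteScan[] = [];
for (let i = 0; i < 7; i++) {
  scans.push({ url: `https://site${i}.example`, detected: ["WordPress"] });
}
for (let i = 7; i < 9; i++) {
  scans.push({ url: `https://site${i}.example`, detected: ["WordPress", "booking-widget"] });
}
scans.push({ url: "https://site9.example", detected: [] });

const wedge = gapShare(scans, "WordPress", "booking-widget");
```

The percentage is taken over all scanned leads, not only the WordPress sites, which matches the way the example claim is phrased.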