# Market Research & Data Co-op System Summary

> **Overview:** This document summarizes the "Business in a Box" market research pipeline built into the Vibn platform. It lets the AI autonomously identify target markets, scrape leads, analyze competitor technology stacks, and pull SEO and ad-spend data to generate a complete Go-To-Market (GTM) strategy for users.

## 1. BigQuery Database Schema (`vibn_market_data`)

The data foundation is a scalable relational model hosted in Google BigQuery (Montreal region for data residency):

* **`gbp_categories`**: 4,000+ Google Business Profile categories (e.g., `gcid:dentist`).
* **`software_categories`**: 800+ SMB-relevant software categories (e.g., `dental-practice-management`).
* **`gbp_software_links`**: A junction table linking Main Street business types to the software they buy (19,000+ mapped rows).
* **`market_leads`**: The "Data Co-op" table containing geolocated leads (name, address, phone, website, emails).
* **`software_providers`**: Proprietary SaaS competitors mapped to software categories (e.g., "Curve Dental").
* **`open_source_repos`**: MIT/Apache-licensed GitHub starter kits mapped to software categories.

## 2. MCP Tools Added (`lib/ai/vibn-tools.ts`)

### `market_research_run`

* **Purpose:** Fetches a list of real-world business leads for a specific category and location.
* **Data Source:** DataForSEO Business Listings Live API.
* **Guardrails:**
    * Requires explicit user permission (`user_explicitly_approved: true`).
    * **Geospatial Caching:** Queries BigQuery first, using its geography function `ST_DWITHIN`. If leads exist within a 20 km radius of the target coordinates, it serves them for $0.00 instead of calling the paid API.
    * **Data Co-op:** Any newly fetched leads are automatically `INSERT`ed into the BigQuery `market_leads` table.

### `tech_stack_analyze`

* **Purpose:** A free, native alternative to BuiltWith. Scans a list of URLs (up to 100) to determine what software, CMS, and tracking tools they use.
* **Intelligent Spidering:** Loads the homepage, extracts high-intent links (`/book`, `/contact`), and dynamically crawls depth-2 subpages to find hidden booking widgets or portals.
* **Dynamic Competitor Injection:** Reads the `software_category_id`, pulls all known competitors from BigQuery, and searches the target websites' source code for traces of those competitors.
* **Custom Checks:** Lets the AI pass a `custom_checks` array of extra strings or domains to look for on the fly.

### `market_seo_analyze`

* **Purpose:** Analyzes a competitor's domain for SEO and Google Ads metrics.
* **Data Source:** DataForSEO Labs (Domain Metrics & Ranked Keywords APIs).
* **Output:** Returns estimated organic traffic, paid Google Ads traffic, estimated monthly ad spend (USD), and the domain's top paid keywords.

## 3. The "Business in a Box" Workflow

When a founder asks to build software for a specific niche (e.g., "Dentists in BC"):

1. **TAM & Leads:** The AI runs `market_research_run` to get the Total Addressable Market and real contact info.
2. **Competitor Teardown:** The AI identifies incumbents and runs `market_seo_analyze` to see their ad spend.
3. **Wedge Discovery:** The AI runs `tech_stack_analyze` on the leads to find technological gaps (e.g., "70% use WordPress but lack a booking widget").
4. **Plan Generation:** The AI writes a business plan to the dashboard, including a financial model, compliance warnings, a wedge strategy, and cold-email scripts.
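
The wedge statistic in step 3 can be illustrated with a minimal sketch. It assumes that detection reduces to case-insensitive substring matching against page source. The `Signature` and `ScanResult` types, both function names, the needle strings, and the sample pages are all hypothetical and are not taken from `lib/ai/vibn-tools.ts`. The sketch does no crawling and no BigQuery lookup.

```typescript
// Hypothetical types for illustration only; not the actual tool's schema.
interface Signature {
  label: string;     // human-readable technology name
  needles: string[]; // substrings to look for in the page source
}

interface ScanResult {
  url: string;
  detected: string[]; // labels of every signature that matched
}

// Case-insensitive substring match of each signature against the page source.
function detectSignatures(url: string, html: string, signatures: Signature[]): ScanResult {
  const haystack = html.toLowerCase();
  const detected = signatures
    .filter((sig) => sig.needles.some((n) => haystack.includes(n.toLowerCase())))
    .map((sig) => sig.label);
  return { url, detected };
}

// Whole-number percentage of scanned sites that have `has` but lack `lacks`.
function wedgeShare(results: ScanResult[], has: string, lacks: string): number {
  if (results.length === 0) return 0;
  const matches = results.filter(
    (r) => r.detected.includes(has) && !r.detected.includes(lacks),
  ).length;
  return Math.round((matches / results.length) * 100);
}

// Assumed signatures. In the described tool, this list would presumably be
// merged from built-in checks, competitors pulled from BigQuery, and the
// `custom_checks` array.
const signatures: Signature[] = [
  { label: "WordPress", needles: ["wp-content", "wp-includes"] },
  { label: "Booking widget", needles: ["/book", "booking-widget"] },
];

// Made-up page sources standing in for crawled lead websites.
const pages: Array<[string, string]> = [
  ["https://a.example", '<link href="/wp-content/theme.css">'],
  ["https://b.example", '<script src="/wp-includes/x.js"></script><a href="/book">Book</a>'],
  ["https://c.example", "<html><body>Static site</body></html>"],
  ["https://d.example", '<img src="/wp-content/logo.png">'],
];

const results = pages.map(([url, html]) => detectSignatures(url, html, signatures));
const share = wedgeShare(results, "WordPress", "Booking widget");
console.log(`${share}% use WordPress but lack a booking widget`);
// prints "50% use WordPress but lack a booking widget"
```

Keeping detection and aggregation as separate pure functions means the same `ScanResult` list can answer several wedge questions without rescanning any site.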