Market Research & Data Co-op System Summary
Overview: This document summarizes the "Business in a Box" market research pipeline built into the Vibn platform. It allows the AI to autonomously identify target markets, scrape leads, analyze competitor technology stacks, and pull SEO/Ad spend data to generate a complete Go-To-Market (GTM) strategy for users.
1. BigQuery Database Schema (vibn_market_data)
The data foundation is a highly scalable, relational model hosted in Google BigQuery (Montreal region for data residency):
- gbp_categories: 4,000+ Google Business Profile categories (e.g., gcid:dentist).
- software_categories: 800+ SMB-relevant software categories (e.g., dental-practice-management).
- gbp_software_links: A junction table linking Main Street business types to the software they buy (19,000+ mapped rows).
- market_leads: The "Data Co-op" table containing exact geospatial leads (name, address, phone, website, emails).
- software_providers: Proprietary SaaS competitors mapped to software categories (e.g., "Curve Dental").
- open_source_repos: MIT/Apache licensed GitHub starter kits mapped to software categories.
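To illustrate how the junction table connects a business type to the competitors in its software categories, here is a minimal in-memory sketch. The field names (gbpCategoryId, softwareCategoryId, name) and the sample rows are assumptions for illustration. The real column names are not given in this summary, and in practice this lookup would be a BigQuery join rather than array filtering.

```typescript
// Hypothetical row shapes; the actual BigQuery column names may differ.
interface GbpSoftwareLink {
  gbpCategoryId: string; // e.g. "gcid:dentist"
  softwareCategoryId: string; // e.g. "dental-practice-management"
}

interface SoftwareProvider {
  name: string; // e.g. "Curve Dental"
  softwareCategoryId: string;
}

// Resolve business type -> linked software categories -> known providers.
function providersForBusinessType(
  gbpCategoryId: string,
  links: GbpSoftwareLink[],
  providers: SoftwareProvider[],
): string[] {
  const linked = links.filter((l) => l.gbpCategoryId === gbpCategoryId);
  return providers
    .filter((p) => linked.some((l) => l.softwareCategoryId === p.softwareCategoryId))
    .map((p) => p.name);
}

// Sample rows, invented for this sketch.
const demoLinks: GbpSoftwareLink[] = [
  { gbpCategoryId: "gcid:dentist", softwareCategoryId: "dental-practice-management" },
];
const demoProviders: SoftwareProvider[] = [
  { name: "Curve Dental", softwareCategoryId: "dental-practice-management" },
];

console.log(providersForBusinessType("gcid:dentist", demoLinks, demoProviders));
```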
2. MCP Tools Added (lib/ai/vibn-tools.ts)
market_research_run
- Purpose: Fetches the exact Total Addressable Market (TAM) counts and extracts the raw lead data (emails, addresses, phones) for a specific category and location.
- Data Source: DataForSEO Business Listings Live API (using the search/live endpoint).
- Quality Control: Automatically applies strict filters (is_claimed: true and current_status <> "closed_forever") to ensure only verified, active businesses are returned.
- Guardrails:
  - Requires explicit user permission (user_explicitly_approved: true).
  - Geospatial Caching: Queries BigQuery first, using its built-in geography functions (ST_DWITHIN). If leads exist within a 20 km radius of the target coordinates, it serves them for $0.00 instead of hitting the paid API.
- Data Co-op: Any newly fetched leads are automatically INSERTed into the BigQuery market_leads table.
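The guardrails can be sketched as a parameterized cache query plus a small decision function. This is an illustration under assumptions, not the code in vibn-tools.ts: the category_id and location (GEOGRAPHY) columns, the function names, and the choice to check approval before anything else are all hypothetical. ST_DWITHIN and ST_GEOGPOINT are standard BigQuery geography functions, with the distance given in meters and the point given as longitude, then latitude.

```typescript
type LeadSource = "blocked" | "cache" | "paid_api";

interface ResearchRequest {
  user_explicitly_approved: boolean;
  lat: number;
  lng: number;
}

const CACHE_RADIUS_METERS = 20000; // the 20 km radius, expressed in meters

// Build the cache lookup. `category_id` and `location` are assumed column names.
function buildCacheQuery(req: ResearchRequest, categoryId: string) {
  return {
    query: `
      SELECT name, address, phone, website, emails
      FROM vibn_market_data.market_leads
      WHERE category_id = @categoryId
        AND ST_DWITHIN(location, ST_GEOGPOINT(@lng, @lat), @radiusMeters)`,
    params: {
      categoryId,
      lng: req.lng,
      lat: req.lat,
      radiusMeters: CACHE_RADIUS_METERS,
    },
  };
}

// Decide where leads come from: refuse without approval, prefer the free cache,
// and fall back to the paid API only when the cache has nothing nearby.
function chooseLeadSource(req: ResearchRequest, cachedLeadCount: number): LeadSource {
  if (!req.user_explicitly_approved) return "blocked";
  return cachedLeadCount > 0 ? "cache" : "paid_api";
}
```

Passing the coordinates as named query parameters instead of interpolating them into the SQL string keeps user-supplied values out of the query text.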
tech_stack_analyze
- Purpose: A free, native alternative to BuiltWith. Scans a list of URLs (up to 100) to determine what software, CMS, and tracking tools they use.
- Intelligent Spidering: Loads the homepage, extracts high-intent links (/book, /contact), and dynamically crawls depth-2 subpages to find hidden booking widgets or portals.
- Dynamic Competitor Injection: Reads the software_category_id, pulls all known competitors from BigQuery, and dynamically searches the target websites' source code for traces of those competitors.
- Custom Checks: Allows the AI to pass a custom_checks array of custom strings/domains to look for on the fly.
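The spidering and signature matching could look roughly like the sketch below, which works on HTML strings that have already been fetched. The function names, the regex-based link extraction, and plain substring matching are assumptions made for illustration; the actual tool may parse and match differently.

```typescript
// Path fragments treated as high-intent; /book and /contact come from the description above.
const HIGH_INTENT_PATHS = ["/book", "/contact"];

// Collect unique href values whose path looks high-intent.
// A regex is enough for a sketch; a production crawler would likely use an HTML parser.
function extractHighIntentLinks(html: string): string[] {
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const href = match[1] ?? "";
    const lower = href.toLowerCase();
    const isHighIntent = HIGH_INTENT_PATHS.some((p) => lower.indexOf(p) !== -1);
    if (isHighIntent && found.indexOf(href) === -1) found.push(href);
  }
  return found;
}

// Search page sources for competitor traces plus any custom_checks strings.
// Returns the needles that appear in at least one page (case-insensitive).
function detectSignatures(
  pages: string[],
  competitorNeedles: string[],
  customChecks: string[],
): string[] {
  const haystack = pages.join("\n").toLowerCase();
  return competitorNeedles
    .concat(customChecks)
    .filter((needle) => haystack.indexOf(needle.toLowerCase()) !== -1);
}
```

In this sketch the competitor needles would come from the software_providers rows for the given software_category_id, and the homepage plus each crawled subpage would be passed in together as pages.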
market_aggregate_insights
- Purpose: Fetches aggregated insights for a specific market niche to uncover qualitative data before building a product.
- Data Source: DataForSEO Categories Aggregation Live API.
- Output: Returns a breakdown of sub-niches (e.g., Pediatric vs Cosmetic), the total number of businesses with/without websites (technical debt indicator), and crucially, the Top Customer Review Topics (e.g., "receptionist", "price", "wait time"). The AI uses these pain points to write the Value Proposition and positioning strategy.
market_seo_analyze
- Purpose: Analyzes a competitor's domain for SEO and Google Ads metrics.
- Data Source: DataForSEO Labs (Domain Metrics & Ranked Keywords APIs).
- Output: Returns estimated organic traffic, paid Google Ads traffic, estimated monthly Ad Spend (USD), and their top paid keywords.
3. The "Business in a Box" Workflow
When a founder asks to build software for a specific niche (e.g., "Dentists in BC"):
- TAM & Leads: The AI runs market_research_run to get the Total Addressable Market and real contact info.
- Competitor Teardown: The AI identifies incumbents and runs market_seo_analyze to see their Ad Spend.
- Wedge Discovery: The AI runs tech_stack_analyze on the leads to find technological gaps (e.g., "70% use WordPress but lack a booking widget").
- Plan Generation: The AI writes a business plan to the dashboard, including a financial model, compliance warnings, a wedge strategy, and cold-email scripts.
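A wedge statistic like the one in the Wedge Discovery step can be derived from per-site scan results with a simple share calculation. The SiteScan shape, the technology labels, and the sample data below are hypothetical, since the output format of tech_stack_analyze is not specified in this summary.

```typescript
// Hypothetical per-site result: technologies detected on one lead's website.
interface SiteScan {
  url: string;
  detected: string[];
}

// Percentage of scanned sites that have `has` but lack `lacks`, rounded to a whole number.
function wedgeShare(scans: SiteScan[], has: string, lacks: string): number {
  if (scans.length === 0) return 0;
  const matching = scans.filter(
    (s) => s.detected.indexOf(has) !== -1 && s.detected.indexOf(lacks) === -1,
  );
  return Math.round((matching.length / scans.length) * 100);
}

// Sample scans, invented for this sketch.
const demoScans: SiteScan[] = [
  { url: "https://a.example", detected: ["wordpress"] },
  { url: "https://b.example", detected: ["wordpress", "booking-widget"] },
  { url: "https://c.example", detected: ["wordpress"] },
  { url: "https://d.example", detected: ["wix"] },
];

const share = wedgeShare(demoScans, "wordpress", "booking-widget");
console.log(`${share}% use WordPress but lack a booking widget`);
```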