Methodology

OmniWebIntel is self-powered. We do not call paid intelligence APIs (Similarweb, Ahrefs, Semrush, Moz, BuiltWith, Wappalyzer, DataForSEO, PageSpeed, etc.). Every number in a report is something we either directly observed or computed ourselves from observations.

Confidence labels

Every metric in a report carries a label so you always know what kind of evidence backs it:

  • live — directly observed in this scan (DNS records, response headers, parsed HTML).
  • discovered — found in our own crawler index (e.g. backlinks from another site we have crawled).
  • estimated — produced by our own model (always presented as a range, never an exact number).
  • inferred — a likely conclusion drawn from signals, not a direct measurement.
  • unavailable — we tried, but the value cannot be known without owner cooperation.
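In code, these labels can be modeled as a small enum attached to every metric before it enters a report. This is an illustrative sketch; the names `Confidence` and `label` are hypothetical, not the product's actual API:

```python
from enum import Enum

class Confidence(Enum):
    LIVE = "live"                # directly observed in this scan
    DISCOVERED = "discovered"    # found in our own crawler index
    ESTIMATED = "estimated"      # produced by our model, always a range
    INFERRED = "inferred"        # likely conclusion from signals
    UNAVAILABLE = "unavailable"  # cannot be known without owner cooperation

def label(value, confidence: Confidence) -> dict:
    """Attach an evidence label to a metric before it enters a report."""
    return {"value": value, "confidence": confidence.value}
```

Keeping the label on the value itself, rather than in a separate legend, makes it hard to emit an unlabeled number by accident.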

Traffic estimation

Our self-powered traffic model (v1) takes seven inputs and produces a monthly visit range with explicit confidence:

  1. SEO score (technical + content quality) — weight 25%
  2. Content depth (word count) — weight 18%
  3. Internal architecture (link graph) — weight 7%
  4. Tech quality (modern frameworks, analytics, CDN) — weight 10%
  5. Backlink count from our own index — weight 20%
  6. Referring domains from our own index — weight 12%
  7. Domain age — weight 8%

We map a 0–1 base score to monthly visits via an exponential curve and widen the confidence band based on how many strong signals we have. Confidence is capped at 0.7 — we never claim high confidence on traffic without a rich backlink index.
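The pipeline above can be sketched as a weighted sum fed through an exponential mapping. The weights come from the list above; the mapping constants (`BASE_VISITS`, `CURVE`) and the band-widening and confidence formulas are illustrative placeholders, not the production values:

```python
import math

# Weights from the v1 model description (sum to 1.0).
WEIGHTS = {
    "seo_score": 0.25,
    "content_depth": 0.18,
    "internal_links": 0.07,
    "tech_quality": 0.10,
    "backlinks": 0.20,
    "referring_domains": 0.12,
    "domain_age": 0.08,
}

BASE_VISITS = 100  # visits/month at base score 0 (illustrative)
CURVE = 9.0        # exponential scale factor (illustrative)

def estimate_traffic(signals: dict) -> dict:
    """signals: each input normalized to 0-1. Returns a monthly visit range."""
    base = sum(WEIGHTS[k] * signals.get(k, 0.0) for k in WEIGHTS)
    midpoint = BASE_VISITS * math.exp(CURVE * base)
    # Fewer strong signals -> wider band and lower confidence.
    strong = sum(1 for v in signals.values() if v >= 0.6)
    spread = 0.3 + 0.1 * (len(WEIGHTS) - strong)  # fraction of midpoint
    confidence = min(0.7, 0.2 + 0.08 * strong)    # hard cap at 0.7
    return {
        "low": max(0, int(midpoint * (1 - spread))),
        "high": int(midpoint * (1 + spread)),
        "confidence": round(confidence, 2),
    }
```

Note that even a perfect signal set hits the 0.7 confidence ceiling: the cap is applied after the per-signal bonus, so no input combination can claim high confidence.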

Valuation

Valuation is computed from the traffic estimate, the inferred primary monetization model, and a multiple bracket adjusted for SEO quality, security posture, TLD strength, and domain age. Multiples are intentionally wide. This is not a financial appraisal. Real sale value depends on audited revenue, audited profit, and buyer fit — none of which we can know without owner cooperation.
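A minimal sketch of that computation, assuming hypothetical multiple brackets and a hypothetical revenue-per-mille default (neither is the product's real table):

```python
# Illustrative monthly-profit multiple brackets per monetization model.
MULTIPLES = {
    "ads": (20, 35),
    "ecommerce": (24, 40),
    "saas": (30, 48),
    "unknown": (15, 30),  # widest bracket when monetization is unclear
}

def estimate_valuation(visits_low: int, visits_high: int,
                       monetization: str, rpm: float = 2.0) -> tuple:
    """rpm: assumed revenue per 1000 visits (hypothetical default).
    Returns a deliberately wide (low, high) valuation range."""
    lo_mult, hi_mult = MULTIPLES.get(monetization, MULTIPLES["unknown"])
    rev_low = visits_low / 1000 * rpm    # low traffic x low multiple
    rev_high = visits_high / 1000 * rpm  # high traffic x high multiple
    return (int(rev_low * lo_mult), int(rev_high * hi_mult))
```

Pairing the low traffic bound with the low multiple (and high with high) is what keeps the bracket intentionally wide: the range compounds both uncertainties instead of averaging them away.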

Crawler ethics

Our crawler identifies itself as OmniWebIntelBot, respects robots.txt, rate-limits to 1 request/second/domain by default, and never attempts to authenticate or evade rate limiting. We never probe /wp-admin, /xmlrpc.php, or /wp-login.php — WordPress detection is purely passive from the public HTML the site itself emits. We never run port scans, directory enumeration, or credential probing.
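The default 1 request/second/domain limit can be enforced with a per-domain timestamp map. A minimal sketch (the class name and structure are illustrative, not the crawler's actual implementation):

```python
import time
from collections import defaultdict

class DomainRateLimiter:
    """Block until at least min_interval seconds have passed since the
    previous request to the same domain (default: 1 req/sec/domain)."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self.last_request = defaultdict(float)  # domain -> monotonic timestamp

    def wait(self, domain: str) -> None:
        elapsed = time.monotonic() - self.last_request[domain]
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_request[domain] = time.monotonic()
```

Keying on the domain rather than the IP means a site behind a shared CDN is still only fetched once per second, erring on the polite side.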

SSRF protection

Every fetch resolves DNS and validates that all returned IPs are public before connecting. We block RFC 1918 ranges, loopback, link-local, cloud metadata endpoints (including 169.254.169.254), and CGNAT — and we re-check on every redirect hop to prevent DNS rebinding attacks.
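The per-IP check can be sketched with the standard library's `ipaddress` module. The stdlib flags cover RFC 1918, loopback, and link-local (which already includes 169.254.169.254); CGNAT is listed explicitly because older Python versions do not treat 100.64.0.0/10 as private. The function name is illustrative:

```python
import ipaddress

# Blocks beyond the stdlib's is_private/is_loopback/is_link_local flags.
EXTRA_BLOCKED = [
    ipaddress.ip_network("100.64.0.0/10"),       # CGNAT (RFC 6598)
    ipaddress.ip_network("169.254.169.254/32"),  # cloud metadata endpoint
]

def is_public(ip_str: str) -> bool:
    """True only if the resolved address is safe to connect to.
    Must be re-run on every redirect hop to defeat DNS rebinding."""
    ip = ipaddress.ip_address(ip_str)
    if (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
        return False
    return not any(ip in net for net in EXTRA_BLOCKED)
```

The critical detail is in the docstring: validating once before the first connection is not enough, because a redirect target (or a rebinding DNS answer) can resolve to an internal address after the initial check passed.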

AI commentary

AI-generated commentary uses a model from Anthropic. The model only sees the structured evidence packet we built from this scan — it does not access the open web. It is explicitly instructed to never invent traffic, revenue, backlink counts, or rankings, and to treat any unavailable metric as unavailable rather than guess.