# Popularity Is Not a Proxy for Lower Risk: What a 10-Server MCP Sample Showed
Across a 10-server MCP sample, download counts and inspection verdicts did not move together. Popularity was useful as a distribution signal, but not as evidence of lower observed risk.
## Terminology
| Term | Meaning |
|---|---|
| Download count | Weekly npm download volume; a distribution metric |
| Inspection verdict | PASS / WARN / BLOCK based on inspection evidence |
| Risk proxy | An indirect signal used instead of direct security evidence |
| Provenance | Whether the publisher and distribution path can be verified |
## Lead
In real deployment decisions, teams often treat download count or GitHub stars as a shortcut for trust. The logic is familiar: if many people use a package, someone would have noticed serious problems by now.
This article examines a smaller but more useful question: when we looked at 10 MCP servers in detail, did popularity actually line up with inspection verdicts? In this sample, it did not. The important implication is not that popularity is meaningless, but that it cannot substitute for direct evidence about risk.
## Key Findings
- The 10-server sample ranged from 24 to 309,704 weekly downloads, a gap of about 12,900x.
- No consistent relationship was observed between download count and inspection verdict.
- A server with 300,000+ weekly downloads still received a WARN verdict.
- Some low-download servers received PASS verdicts.
- Popularity may expand distribution, but it can also increase attacker incentive.
## Dataset
| Item | Value |
|---|---|
| Sample size | 10 MCP servers |
| Observation window | 2026-03-26 to 2026-03-27 |
| Metrics compared | Weekly downloads and final inspection verdict |
| Scope | Exploratory sample, not a formal correlation study |
## What We Actually Observed
The sample showed wide variation in popularity:
| Metric | Value |
|---|---|
| Highest weekly downloads | 309,704 |
| Lowest weekly downloads | 24 |
| Gap (highest / lowest) | ~12,900x |
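The gap figure follows directly from the two endpoints of the sample:

```python
# Ratio between the most- and least-downloaded servers in the sample.
highest = 309_704  # weekly downloads, top server
lowest = 24        # weekly downloads, bottom server

gap = highest / lowest
print(f"{gap:,.0f}x")  # prints "12,904x", reported above as ~12,900x
```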
The corresponding inspection verdicts were:
| Verdict | Count | Share |
|---|---|---|
| PASS | 3 | 30% |
| WARN | 6 | 60% |
| BLOCK | 1 | 10% |
The practical point is simple. High downloads did not map cleanly to lower-risk verdicts, and low downloads did not automatically map to higher-risk ones.
Observed examples:
- A high-download server still landed at WARN.
- Some low-download servers landed at PASS.
- The ranking by downloads did not line up with the ranking by inspection outcome.
That is enough to reject the operational shortcut that "popular" can stand in for "already vetted."
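The mismatch between the two orderings can be made concrete with a small sketch. The server names and numbers below are invented for illustration (the actual sample is anonymized); the point is only that sorting by downloads and sorting by verdict severity need not agree:

```python
# Hypothetical data, not the actual sample: compare the ranking by weekly
# downloads with the ranking implied by inspection verdicts.
servers = {
    "server-a": (309_704, "WARN"),   # high distribution, still flagged
    "server-b": (41_000, "PASS"),
    "server-c": (1_200, "BLOCK"),
    "server-d": (24, "PASS"),        # tiny distribution, clean verdict
}

by_downloads = sorted(servers, key=lambda s: servers[s][0], reverse=True)

severity = {"PASS": 0, "WARN": 1, "BLOCK": 2}
by_risk = sorted(servers, key=lambda s: severity[servers[s][1]])

print(by_downloads)             # most-downloaded first
print(by_risk)                  # lowest-risk verdict first
print(by_downloads == by_risk)  # False: the orderings disagree
```

If popularity were a usable risk proxy, the two lists would largely coincide; here, as in the sample, they do not.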
## Why Popularity Fails as a Risk Proxy

### 1. Downloads measure reach, not review quality
A download count answers "how often was this package installed?" It does not answer:
- Was the package security-reviewed?
- Was its runtime behavior verified?
- Was the publisher identity checked?
- Did users inspect the dangerous parts or just install it in CI and move on?
Distribution and review are different things.
### 2. MCP servers widen the consequence of trust mistakes
For ordinary libraries, compromise usually lives inside an application's existing permission boundary. MCP servers are different because they expose tools and runtime operations to AI agents. That means a trust mistake can translate directly into file access, outbound requests, or command execution.
So even if popularity did offer weak reassurance in a traditional package ecosystem, it is an even weaker proxy in the MCP context.
### 3. Popularity can increase attacker interest
Widely used packages are attractive because compromise scales efficiently. A takeover of a high-distribution package can deliver more impact per unit of effort than compromising a niche tool.
From that angle, popularity is not only unhelpful as a safety label. In some cases, it is part of the attacker's targeting logic.
## What To Evaluate Instead
If download count is not enough, what should teams actually look at?
- Provenance: Is the publisher identity verifiable, and does the distribution path make sense?
- Declaration vs. behavior: Does the server do what it claims to do?
- Historical inspection evidence: Has the server repeatedly passed or required review?
- Dependency health: Are there known issues in the libraries it pulls in?
- Capability profile: Does it combine execution, outbound transmission, or filesystem access in risky ways?
These are not popularity signals. They are direct inputs into deployment risk.
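As a sketch, the signals above can be combined into a simple evidence-based gate. The field names, thresholds, and rules here are illustrative assumptions, not MCP Guard's actual policy; the structural point is that download count never appears among the inputs:

```python
# Illustrative evidence-based gate (names and rules are invented for this
# sketch, not an actual inspection policy).
from dataclasses import dataclass


@dataclass
class Evidence:
    publisher_verified: bool       # provenance
    behavior_matches_claims: bool  # declaration vs. runtime behavior
    prior_verdicts: list[str]      # e.g. ["PASS", "PASS", "WARN"]
    vulnerable_dependencies: int   # known issues in pulled-in libraries
    capabilities: set[str]         # e.g. {"exec", "network", "fs"}


def verdict(e: Evidence) -> str:
    # Execution combined with outbound network access is a risky pairing.
    risky_combo = {"exec", "network"} <= e.capabilities
    if not e.publisher_verified or not e.behavior_matches_claims:
        return "BLOCK"
    if e.vulnerable_dependencies > 0 or risky_combo or "WARN" in e.prior_verdicts:
        return "WARN"
    return "PASS"


print(verdict(Evidence(True, True, ["PASS"], 0, {"fs"})))  # prints "PASS"
```

Note that a server with millions of downloads and an unverifiable publisher would still land at BLOCK under a gate like this, while an obscure server with clean evidence can land at PASS.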
## What This Article Does Not Claim
This article does not prove that popularity and risk are never related in any larger population. The sample is too small for that.
It supports a narrower and more practical conclusion:
- In this 10-server sample, popularity was not a reliable guide to inspection outcome.
- Therefore, popularity should not be used as a deployment shortcut.
That is a strong enough operational conclusion on its own.
## Limitations
- The sample size is 10, so this is not a formal statistical correlation study.
- Only weekly download count was compared; other popularity signals such as stars or forks were not analyzed here.
- The sample is exploratory and not designed as a representative census of the full MCP ecosystem.
- Individual server names are omitted because the purpose is to explain the decision model, not to single out packages.
## Conclusion
Across this 10-server MCP sample, download count did not function as a reliable proxy for lower observed risk. Popularity explained distribution, not inspection outcome.
For real deployment decisions, teams should stop asking "How popular is it?" as the first security question and instead ask "What evidence do we actually have?" Provenance, observed behavior, historical inspection results, dependency health, and capability concentration are the signals that matter.
MCP Guard evaluates MCP servers from direct inspection evidence, not popularity metrics.
