Model Explainability¶
Understanding how Chain Sentinel's AI models make predictions.
Overview¶
Chain Sentinel uses explainable AI (XAI) techniques to show you why a token was classified as SCAM or LEGIT, not just the prediction itself.
Why Explainability Matters¶
Trust & Transparency¶
- See which features influenced the decision
- Understand the model's reasoning
- Verify predictions make sense
- Build confidence in AI decisions
Better Decision Making¶
- Identify key risk factors
- Understand token weaknesses
- Learn what makes tokens legitimate
- Make informed investment choices
Model Improvement¶
- Detect model biases
- Identify missing features
- Validate model logic
- Improve accuracy over time
SHAP Explanations¶
What is SHAP?¶
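SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining machine learning predictions. It assigns each feature a share of the difference between a given prediction and the model's average (base) output.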
Key Concepts:
- Feature Contribution: How much each feature pushed the prediction toward SCAM or LEGIT
- Positive Values: Push toward LEGIT (blue bars)
- Negative Values: Push toward SCAM (red bars)
- Magnitude: Larger bars = stronger influence
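To make the idea of a "feature contribution" concrete, here is a minimal sketch that computes exact Shapley values by brute force for a tiny, made-up scoring function. The function `toy_model` and its two boolean features are hypothetical and are not Chain Sentinel's model; they only illustrate how contributions are averaged over feature orderings and how they add up to the gap between the prediction and the baseline.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average each feature's marginal
    contribution over every possible feature ordering."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start from the baseline token
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature at a time
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical toy scoring function (an assumption for illustration only).
# It returns a LEGIT score, so positive contributions push toward LEGIT.
def toy_model(x):
    score = 0.5
    score += 0.2 if x["many_holders"] else 0.0
    score -= 0.3 if x["concentrated_supply"] else 0.0
    return score


baseline = {"many_holders": False, "concentrated_supply": False}
token = {"many_holders": True, "concentrated_supply": True}

phi = shapley_values(toy_model, token, baseline)
for name, value in phi.items():
    print(f"{name:<20} {value:+.2f}")
```

Brute force is only practical for a handful of features because the number of orderings grows factorially. Explanation libraries use much faster algorithms for tree models.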
Reading SHAP Values¶
Example SHAP Explanation:
Feature                 SHAP Value   Direction
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
holder_count            +0.15        → LEGIT
liquidity_usd           +0.12        → LEGIT
transaction_count       +0.08        → LEGIT
top_10_concentration    -0.23        → SCAM
creator_risk_score      -0.18        → SCAM
age_days                +0.05        → LEGIT
Interpretation:
- top_10_concentration (-0.23): A high concentration of tokens in the top 10 holders strongly suggests SCAM
- creator_risk_score (-0.18): The creator has a history of scams, which pushes toward SCAM
- holder_count (+0.15): Many holders is a positive sign and pushes toward LEGIT
- liquidity_usd (+0.12): Good liquidity is a positive sign and pushes toward LEGIT
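The reading order used in the interpretation (strongest influence first, sign gives the direction) can be sketched in a few lines. The values are the ones from the example explanation; the `direction` helper and its "NEUTRAL" label for a zero value are assumptions made for this illustration.

```python
# SHAP values copied from the example explanation.
shap_values = {
    "holder_count": 0.15,
    "liquidity_usd": 0.12,
    "transaction_count": 0.08,
    "top_10_concentration": -0.23,
    "creator_risk_score": -0.18,
    "age_days": 0.05,
}


def direction(value):
    """Positive values push toward LEGIT, negative values toward SCAM."""
    if value > 0:
        return "LEGIT"
    if value < 0:
        return "SCAM"
    return "NEUTRAL"  # hypothetical label for a zero contribution


# Sort by absolute value so the strongest influences come first,
# regardless of which way they push.
ranked = sorted(shap_values.items(), key=lambda item: abs(item[1]), reverse=True)

for feature, value in ranked:
    print(f"{feature:<22} {value:+.2f}  -> {direction(value)}")
```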
SHAP Waterfall Plot¶
A visual representation of how the feature contributions add up, starting from the base value, to reach the final prediction:
Base Value (50%)
↓ +15% (holder_count)
↓ +12% (liquidity_usd)
↓ +8% (transaction_count)
↓ +5% (age_days)
↓ -23% (top_10_concentration)
↓ -18% (creator_risk_score)
↓
Final Prediction: SCAM (LEGIT score 49%)
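The arithmetic behind the waterfall can be checked directly. This sketch assumes the percentages are points on a LEGIT score (consistent with positive values pushing toward LEGIT) and that scores below 50% are labeled SCAM. The 50% cutoff is an assumption, and whether the production explainer works in probability or log-odds space is not stated here.

```python
base_value = 0.50  # base value from the plot

# Contributions in the order shown in the waterfall plot.
contributions = [
    ("holder_count", 0.15),
    ("liquidity_usd", 0.12),
    ("transaction_count", 0.08),
    ("age_days", 0.05),
    ("top_10_concentration", -0.23),
    ("creator_risk_score", -0.18),
]

running = base_value
steps = []
for feature, value in contributions:
    running += value                          # each bar starts where the last ended
    steps.append((feature, round(running, 2)))

final_score = round(running, 2)
label = "LEGIT" if final_score >= 0.5 else "SCAM"  # assumed 50% cutoff

for feature, score in steps:
    print(f"{feature:<22} -> {score:.0%}")
print(f"Final: {final_score:.0%} LEGIT score ({label})")
```

The positive contributions add 40 points and the negative ones remove 41, so the score ends one point below the base value.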
Feature Importance¶
Global Feature Importance¶
Which features matter most across all predictions:
| Rank | Feature | Importance | Description |
|---|---|---|---|
| 1 | top_10_concentration | 18.5% | Token distribution |
| 2 | creator_risk_score | 15.2% | Creator reputation |
| 3 | liquidity_usd | 12.8% | Available liquidity |
| 4 | holder_count | 11.3% | Number of holders |
| 5 | transaction_count | 9.7% | Trading activity |
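This page does not say how the importance percentages are derived. One common convention is to average the absolute SHAP value of each feature across many predictions and normalize so the shares sum to 100%. The sketch below follows that convention as an assumption; the three explanation rows are made-up numbers, not real Chain Sentinel output.

```python
# Hypothetical per-token SHAP values (illustrative numbers only).
explanations = [
    {"top_10_concentration": -0.30, "creator_risk_score": -0.20, "liquidity_usd": 0.10},
    {"top_10_concentration": 0.10, "creator_risk_score": 0.05, "liquidity_usd": 0.15},
    {"top_10_concentration": -0.20, "creator_risk_score": -0.05, "liquidity_usd": -0.05},
]


def global_importance(rows):
    """Mean absolute SHAP value per feature, normalized to sum to 1."""
    features = list(rows[0])
    mean_abs = {f: sum(abs(row[f]) for row in rows) / len(rows) for f in features}
    total = sum(mean_abs.values())
    return {f: mean_abs[f] / total for f in features}


importance = global_importance(explanations)
for feature, share in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{feature:<22} {share:.1%}")
```

Absolute values are used so that pushes toward SCAM and pushes toward LEGIT do not cancel each other out when averaged.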
Local Feature Importance¶
For a specific token, which features mattered most:
Example: SCAM Token
Feature                 Impact
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
top_10_concentration    ████████████ 45%
creator_risk_score      ████████ 30%
liquidity_usd           ████ 15%
holder_count            ██ 10%
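Impact percentages like these can be obtained by expressing each feature's absolute SHAP value as a share of the total for that one token. The sketch below assumes that definition. The SHAP values are hypothetical and were chosen so that the shares come out to the percentages shown above.

```python
def impact_shares(shap_values):
    """Share of total absolute SHAP value per feature, in percent."""
    total = sum(abs(v) for v in shap_values.values())
    return {f: round(100 * abs(v) / total, 1) for f, v in shap_values.items()}


# Hypothetical SHAP values for a single SCAM token.
scam_token = {
    "top_10_concentration": -0.27,
    "creator_risk_score": -0.18,
    "liquidity_usd": -0.09,
    "holder_count": -0.06,
}

shares = impact_shares(scam_token)
for feature, percent in shares.items():
    print(f"{feature:<22} {percent:.0f}%")
```

Note that the shares drop the sign: they say how much a feature mattered for this token, not which way it pushed.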
Model Comparison¶
XGBoost vs GNN v2¶
| Aspect | XGBoost | GNN v2 |
|---|---|---|
| Accuracy | 95.59% | 96.06% |
| Explainability | ✅ SHAP values | ❌ Limited |
| Features | 20+ metrics | Graph structure |
| Speed | Fast (< 1s) | Slower (2-3s) |
| Best For | Individual tokens | Network analysis |
When to Trust Each Model¶
Trust XGBoost when:
- You need to understand WHY
- Token has clear metrics (holders, liquidity, etc.)
- You want feature-level insights
- Making investment decisions
Trust GNN v2 when:
- Accuracy is critical
- Token is part of complex network
- Creator has multiple related tokens
- Detecting scam rings
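As a rough illustration, the guidance above could be turned into a small helper that suggests which model output to look at. The function name, its two flags, and the fallback to XGBoost are assumptions for this sketch, not part of Chain Sentinel's API.

```python
def models_to_consult(need_explanation, in_complex_network):
    """Suggest which model(s) to consult, following the guidance above.

    Hypothetical helper: both arguments are booleans supplied by the caller.
    """
    models = []
    if in_complex_network:
        # Scam rings and related tokens are where GNN v2 is strongest.
        models.append("GNN v2")
    if need_explanation or not models:
        # Only XGBoost comes with SHAP explanations; it is also the
        # assumed default for standalone tokens.
        models.append("XGBoost")
    return models


print(models_to_consult(need_explanation=True, in_complex_network=True))
```

When both flags are set, the sketch returns both models so that the network-level prediction can be cross-checked against a feature-level explanation.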
Feature Descriptions¶
Token Metrics¶
holder_count
- Number of unique wallet addresses holding the token
- Higher = more distributed = more legitimate
- Scams often have few holders (< 100)
transaction_count
- Total number of transactions involving the token
- Higher = more activity = more legitimate
- Scams often have low activity after initial pump
liquidity_usd
- Total liquidity available in DEX pools (USD)
- Higher = easier to buy/sell = more legitimate
- Scams often have low liquidity (< $10K)
top_10_concentration
- Percentage of supply held by top 10 holders
- Lower = more distributed = more legitimate
- Scams often have high concentration (> 80%)
age_days
- Days since token was created
- Older = more established = more legitimate
- Scams often rug within 48 hours
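The rules of thumb above can be written as a simple checklist. This is only a sketch of those heuristics, not how the models decide: the models learn their own thresholds from data. Treating an age under two days as a flag is an interpretation of the 48-hour remark, and the `holder_count` and `age_days` values in the sample tokens are hypothetical.

```python
def token_red_flags(token):
    """Return the rule-of-thumb warnings that apply to a token's metrics."""
    flags = []
    if token["holder_count"] < 100:
        flags.append("few holders (< 100)")
    if token["liquidity_usd"] < 10_000:
        flags.append("low liquidity (< $10K)")
    if token["top_10_concentration"] > 80:
        flags.append("high top-10 concentration (> 80%)")
    if token["age_days"] < 2:
        flags.append("very new (< 48 hours)")
    return flags


suspicious = {
    "holder_count": 40,            # hypothetical value
    "liquidity_usd": 500,
    "top_10_concentration": 95,
    "age_days": 0.5,
}
established = {
    "holder_count": 125_000,
    "liquidity_usd": 2_500_000,
    "top_10_concentration": 28,
    "age_days": 400,               # hypothetical value
}

print(token_red_flags(suspicious))
print(token_red_flags(established))
```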
Creator Metrics¶
creator_risk_score
- Risk score of wallet that deployed the token (0-100)
- Lower = safer creator = more legitimate
- Based on creator's history of scams
creator_tokens_count
- Number of tokens created by this wallet
- Many tokens can be good (successful dev) or bad (serial rugger)
- Context matters: check scam_rate
creator_scam_rate
- Percentage of creator's tokens that were scams
- Lower = trustworthy creator = more legitimate
- > 50% = high risk creator
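The scam rate is a plain ratio, which the short sketch below computes together with the 50% high-risk check. The function names are hypothetical. The two sample creators use the counts quoted in the Practical Examples section (13 of 15 tokens and 1 of 3 tokens).

```python
def creator_scam_rate(scam_tokens, total_tokens):
    """Percentage of a creator's tokens that were scams, or None if no history."""
    if total_tokens == 0:
        return None  # a brand-new creator has no rate yet
    return 100 * scam_tokens / total_tokens


def is_high_risk_creator(rate):
    """Apply the > 50% rule of thumb; an unknown rate is not flagged here."""
    return rate is not None and rate > 50


serial = creator_scam_rate(13, 15)
mixed = creator_scam_rate(1, 3)

print(f"{serial:.0f}% -> high risk: {is_high_risk_creator(serial)}")
print(f"{mixed:.0f}% -> high risk: {is_high_risk_creator(mixed)}")
```

A creator with no prior tokens yields no rate at all, which is a reminder that a missing history is not the same as a clean one.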
Network Metrics¶
cluster_size
- Number of wallets in same cluster as creator
- Large clusters can indicate scam rings
- Legitimate projects may also have large teams
cluster_scam_rate
- Percentage of tokens from this cluster that were scams
- High rate = dangerous cluster
- Used by GNN v2 model
Practical Examples¶
Example 1: Clear SCAM¶
Token: SCAM (SCAM)
Prediction: SCAM (98% confidence)
Model: XGBoost
Top Contributing Features:
1. top_10_concentration: 95% (-0.35) → SCAM
   "Top 10 holders own 95% of supply"
2. creator_risk_score: 85 (-0.28) → SCAM
   "Creator has 87% scam rate (13/15 tokens)"
3. liquidity_usd: $500 (-0.15) → SCAM
   "Very low liquidity, hard to sell"
4. age_days: 0.5 (-0.08) → SCAM
   "Token created 12 hours ago"
Recommendation: AVOID - Multiple critical red flags
Example 2: Clear LEGIT¶
Token: BONK (BONK)
Prediction: LEGIT (92% confidence)
Model: XGBoost
Top Contributing Features:
1. holder_count: 125,000 (+0.25) → LEGIT
   "Large, distributed holder base"
2. liquidity_usd: $2.5M (+0.18) → LEGIT
   "Excellent liquidity"
3. transaction_count: 5M (+0.15) → LEGIT
   "High trading activity"
4. top_10_concentration: 28% (+0.12) → LEGIT
   "Fair distribution"
Recommendation: SAFE - Strong fundamentals
Example 3: Uncertain Case¶
Token: MOON (MOON)
Prediction: SCAM (62% confidence)
Model: XGBoost
Top Contributing Features:
1. creator_risk_score: 45 (-0.12) → SCAM
   "Creator has 1 previous scam (1/3 tokens)"
2. top_10_concentration: 55% (-0.10) → SCAM
   "Moderate concentration"
3. holder_count: 500 (+0.08) → LEGIT
   "Decent holder base"
4. liquidity_usd: $50K (+0.06) → LEGIT
   "Adequate liquidity"
Recommendation: CAUTION - Mixed signals, do more research
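One way to see why the third case is uncertain is to compare how far the listed features push in each direction across the three examples. The sketch below sums the SHAP values quoted above. Only the top four features of each token are listed, so these sums will not reproduce the confidence figures exactly.

```python
# SHAP values of the top contributing features, copied from the examples.
examples = {
    "Example 1 (SCAM, 98%)": [-0.35, -0.28, -0.15, -0.08],
    "Example 2 (LEGIT, 92%)": [0.25, 0.18, 0.15, 0.12],
    "Example 3 (SCAM, 62%)": [-0.12, -0.10, 0.08, 0.06],
}


def summarize(values):
    """Total push toward LEGIT, total push toward SCAM, and the net of both."""
    toward_legit = sum(v for v in values if v > 0)
    toward_scam = sum(v for v in values if v < 0)
    return {
        "toward_legit": round(toward_legit, 2),
        "toward_scam": round(toward_scam, 2),
        "net": round(toward_legit + toward_scam, 2),
    }


summaries = {name: summarize(values) for name, values in examples.items()}
for name, summary in summaries.items():
    print(name, summary)
```

In the first two examples every listed feature points the same way. In the third, the pushes nearly cancel (a net of -0.08), which is what "mixed signals" looks like in SHAP terms.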
Limitations¶
SHAP Limitations¶
Important Considerations
- Only for XGBoost: GNN v2 predictions don't have SHAP values
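- Correlation ≠ Causation: A large SHAP value shows that a feature is associated with the prediction, not that it causes a token to be a scam
- Feature Interactions: SHAP values are reported per feature, so an effect that only arises from a combination of features is split among them rather than shown separately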
- Data Quality: Explanations are only as good as the data
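The feature-interaction limitation can be demonstrated with two made-up scoring functions over two binary features. Both functions are hypothetical and exist only for this sketch. One penalizes a token only when both conditions hold together; the other penalizes each condition independently. For a token where both conditions hold, the per-feature Shapley values come out identical.

```python
def shapley_two_features(f):
    """Exact Shapley values for f(a, b) at a=1, b=1 against baseline a=0, b=0.

    With two features there are only two orderings, so each value is the
    average of the feature's contribution when added first and when added last.
    """
    phi_a = 0.5 * ((f(1, 0) - f(0, 0)) + (f(1, 1) - f(0, 1)))
    phi_b = 0.5 * ((f(0, 1) - f(0, 0)) + (f(1, 1) - f(1, 0)))
    return phi_a, phi_b


# Hypothetical LEGIT-score adjustments (negative pushes toward SCAM).
def interaction_only(high_concentration, risky_creator):
    # Penalty applies only when BOTH conditions hold.
    return -1.0 if (high_concentration and risky_creator) else 0.0


def additive(high_concentration, risky_creator):
    # Each condition carries its own independent penalty.
    return -0.5 * high_concentration - 0.5 * risky_creator


print(shapley_two_features(interaction_only))
print(shapley_two_features(additive))
```

Both calls give -0.5 per feature, so the per-feature values alone cannot tell a pure interaction apart from two independent effects.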
Model Limitations¶
XGBoost:
- Doesn't capture network relationships
- May miss coordinated scam rings
- Relies on feature engineering
GNN v2:
- Less explainable (black box)
- Requires more data
- Slower inference
Best Practices¶
Using Explanations¶
Do's
- ✅ Read SHAP values to understand predictions
- ✅ Look for multiple red flags, not just one
- ✅ Consider feature magnitudes, not just direction
- ✅ Cross-reference with network graph
- ✅ Use explanations to learn about scam patterns
Don'ts
- ❌ Don't rely on a single feature
- ❌ Don't ignore low-confidence predictions
- ❌ Don't assume the model is always right
- ❌ Don't skip manual verification
- ❌ Don't invest based solely on AI predictions
Support¶
Questions about model explainability?
- 📧 Email: support@chainsentinel.net
- 💬 Telegram: @chainsentinel_net
- 📖 FAQ: Frequently Asked Questions