
In Internet infrastructure, an IP address is just a number until it is mapped to a physical location, and the static databases that usually provide that mapping are notoriously brittle. During the AIORI-2 Hackathon, team HexaSentinel from the Heritage Institute of Technology developed GeoNex, a city-level IP geolocation system that goes beyond simple lookups by combining machine learning with real-time network measurements grounded in a suite of IETF RFCs.
1. The Core Architecture: Verifiable Geolocation
GeoNex doesn’t just guess where an IP is; it uses a “supervised” approach. By implementing RFC 2330 (Framework for IP Performance Metrics) and RFC 2681 (A Round-trip Delay Metric for IPPM), we feed real-time network performance data into our ML models. This allows the system to check whether a self-published location (from an RFC 8805 geofeed) is physically consistent with the measured signal travel time.
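To make the verification idea concrete, here is a minimal sketch of a physical-plausibility check. It is not the GeoNex implementation: the function names, the probe and claim coordinates, and the fiber propagation speed of roughly 200 km per millisecond (about two-thirds of the speed of light in vacuum) are all illustrative assumptions. The idea is that a round trip of a given duration puts a hard upper bound on how far the target can be from the probe.

```python
import math

# Assumption: signals in optical fiber cover roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(min_rtt_ms):
    """Upper bound on probe-to-target distance implied by a round-trip time."""
    one_way_ms = min_rtt_ms / 2.0
    return one_way_ms * FIBER_KM_PER_MS


def is_claim_plausible(probe, claimed, min_rtt_ms):
    """True if the claimed location is reachable within the measured RTT."""
    return haversine_km(*probe, *claimed) <= max_distance_km(min_rtt_ms)


if __name__ == "__main__":
    kolkata = (22.5726, 88.3639)  # hypothetical probe location
    mumbai = (19.0760, 72.8777)   # hypothetical self-published location
    # A 10 ms RTT allows at most 1000 km, so a Mumbai claim fails the check.
    print(is_claim_plausible(kolkata, mumbai, 10.0))
    # A 40 ms RTT allows up to 4000 km, so the same claim passes.
    print(is_claim_plausible(kolkata, mumbai, 40.0))
```

A check like this can only reject a claim, never confirm it: a large RTT is consistent with both a distant target and a congested nearby one, which is why the bound works as one feature among several rather than as a verdict.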
Key Protocol Integrations:
- Active Probing (RFC 792/4443): We built a high-performance Go-based manager to send ICMP probes and collect RTT (Round-Trip Time) samples.
- Data Enrichment (RFC 9082/9083): Integrating RDAP allowed us to pull authoritative ownership and registration data directly into our feature set.
- Routing Context (RFC 4271): By extracting BGP origin ASNs, GeoNex understands the “neighborhood” of an IP, improving prediction accuracy even for previously unseen addresses.
2. ML Calibration: Moving from Points to Radii
One of the biggest failures of current geolocation is the lack of “trust metrics.” GeoNex addresses this by providing a Confidence Radius. Instead of just saying “Kolkata,” the system uses Isotonic Regression to calculate a radius (e.g., 35 km) within which the IP is statistically likely to reside.
| Metric | Result | Operational Insight |
|---|---|---|
| City-Level Accuracy | 77% | Significantly outperforms static database baselines (approx. 63%). |
| Median Geo Error | 35 km | High precision for city-level infrastructure planning. |
| Ground Truth Coverage | 92% | Validates that our “Confidence Radius” is statistically reliable. |
| Inference Latency | Fast (Go/FastAPI) | Optimized for real-time network monitoring and fraud detection. |
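The table does not spell out how the metrics were computed. One plausible reading, sketched below, is that coverage is the fraction of ground-truth points whose actual error falls within the predicted confidence radius, and that median geo error is the median of those per-point errors. The function name and the sample numbers are illustrative assumptions, not GeoNex results.

```python
from statistics import median


def coverage(errors_km, radii_km):
    """Fraction of predictions whose true error lies within the predicted radius."""
    if not errors_km or len(errors_km) != len(radii_km):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for err, radius in zip(errors_km, radii_km) if err <= radius)
    return hits / len(errors_km)


if __name__ == "__main__":
    # Toy data: distance between predicted and true location, per test IP.
    errors_km = [10, 40, 20, 5]
    # Toy data: confidence radius the model reported for each prediction.
    radii_km = [35, 35, 35, 35]
    print(coverage(errors_km, radii_km))  # 0.75: three of four errors fit the radius
    print(median(errors_km))              # 15.0
```

Under this reading, a radius is well calibrated when the measured coverage is close to the confidence level the radius is meant to represent.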
3. Technical Implementation & Sprints
The project was executed across four intensive sprints, moving from raw data acquisition to a fully Dockerized MLOps pipeline.
- Ingestion: Mapping GeoLite2, RIPE Atlas, and RDAP data into a unified GeoJSON format.
- Model Training: Utilizing LightGBM to handle high-dimensional network features.
- Calibration: Applying isotonic regression to ensure predicted probabilities match real-world accuracy rates.
- Deployment: Exposing the model via FastAPI for easy integration into existing AIORI measurement nodes.
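The calibration step can be illustrated with a from-scratch sketch of isotonic regression using the Pool Adjacent Violators algorithm. This is not the team's code: in practice a library implementation such as scikit-learn's `IsotonicRegression` would be the usual choice, and the helper names and toy data here are assumptions. The sketch fits a non-decreasing mapping from raw model scores to observed hit rates, so that a calibrated score of 0.5 means roughly half of such predictions were correct.

```python
import bisect


def pava(values):
    """Least-squares non-decreasing fit to `values` (Pool Adjacent Violators)."""
    blocks = []  # each block is [mean, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the order constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, count_b = blocks.pop()
            mean_a, count_a = blocks.pop()
            total = count_a + count_b
            blocks.append([(mean_a * count_a + mean_b * count_b) / total, total])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_calibrator(scores, outcomes):
    """Return sorted scores and calibrated probabilities (outcomes are 0 or 1)."""
    pairs = sorted(zip(scores, outcomes))
    xs = [s for s, _ in pairs]
    ys = pava([o for _, o in pairs])
    return xs, ys


def calibrate(score, xs, ys):
    """Step-function lookup: use the fitted value of the nearest lower score."""
    i = bisect.bisect_right(xs, score) - 1
    return ys[max(i, 0)]


if __name__ == "__main__":
    raw_scores = [0.2, 0.4, 0.6, 0.8, 0.9]  # toy model confidences
    was_correct = [0, 1, 0, 1, 1]           # toy ground-truth hits
    xs, ys = fit_calibrator(raw_scores, was_correct)
    print(ys)                       # [0.0, 0.5, 0.5, 1.0, 1.0]
    print(calibrate(0.5, xs, ys))   # 0.5
```

The step-function lookup keeps the sketch short; library implementations typically interpolate between fitted points instead.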
4. Challenges & Lessons Learned
Our primary hurdle was RTT normalization. Network latency varies widely with congestion and asymmetric routing. Following the measurement principles of RFC 7679 (A One-Way Delay Metric for IPPM), we learned to filter out jitter and focus on the “minimum RTT,” which is least inflated by queuing delay and therefore tracks physical distance most closely.
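A minimal sketch of the minimum-RTT idea follows. The function name, the convention that a lost probe is recorded as `None`, and the sample values are assumptions for illustration. Congestion can only add delay to a probe, never remove it, so the smallest observed sample is the closest available estimate of pure propagation time.

```python
from statistics import mean


def min_rtt_ms(samples_ms):
    """Smallest valid RTT in milliseconds, ignoring lost probes (None) and bad values."""
    valid = [s for s in samples_ms if s is not None and s > 0]
    if not valid:
        raise ValueError("no valid RTT samples")
    return min(valid)


if __name__ == "__main__":
    # Toy samples: two are inflated by queuing, one probe was lost.
    samples = [48.2, 12.4, 13.1, 95.0, None, 12.9]
    valid = [s for s in samples if s is not None]
    print(min_rtt_ms(samples))  # 12.4
    print(mean(valid))          # much higher, pulled up by the congested samples
```

The mean is dragged upward by the two congested samples, while the minimum stays near the stable cluster around 12 to 13 ms.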
“Data reproducibility isn’t optional—it’s Internet infrastructure hygiene. GeoNex proves that when you anchor ML in IETF standards, the results become auditable and trustworthy.” — Team HexaSentinel
5. Future Work: Standards Contribution
The team is currently drafting an Internet-Draft for the IETF IPPM Working Group titled “Confidence Metrics for City-Level IP Geolocation.” Our goal is to standardize how ML models report uncertainty, making it easier for network operators to use these tools in security and routing decisions.