Model request: internet router geo-location based on hostname #53

Open
fujiapple852 opened this issue Dec 19, 2024 · 2 comments


fujiapple852 commented Dec 19, 2024

Hi,

I am the author of Trippy (https://github.com/fujiapple852/trippy), an OSS traceroute/mtr-like tool (which also uses Ratatui!).

I would like to propose a model which is trained on a combination of DNS and GeoIp data and can be used for geo-locating IPs based on hostnames (and perhaps AS names).

Trippy currently supports GeoIp lookup (i.e. IP address -> geo location) using mmdb database files from MaxMind and IPinfo. These databases are useful but are both incomplete and inaccurate for many IPs.

Another technique often used to geo-locate IPs is to look up the reverse DNS hostname (and sometimes the AS name), as these often contain clues to the location. For example, the hostname xe-11-1-0.edge1.NewYork1.Level3.net is likely to be located in New York.

Typically these are interpreted by humans eyeballing the hostnames, and sometimes tools fuzzy-match them against sets of known country/city codes and/or hostname formats. This approach is high maintenance and has limited utility. See slides 11-16 of this presentation for examples of the types of codes used in hostnames for internet routers.
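To make the code-matching approach concrete, here is a minimal sketch in Rust of what such a heuristic might look like. It is not taken from Trippy or any existing tool: the `LOCATION_HINTS` table, the `tokens` and `guess_location` functions, and the tokenisation rules (split on `.` and `-`, drop trailing digits) are all assumptions made for illustration.

```rust
/// Hypothetical lookup table of location hints. A real tool would need a far
/// larger, hand-maintained set (city names, airport codes, and so on).
const LOCATION_HINTS: &[(&str, &str)] = &[
    ("newyork", "New York, US"),
    ("nyc", "New York, US"),
    ("lon", "London, GB"),
    ("fra", "Frankfurt, DE"),
];

/// Split a hostname on '.' and '-', strip trailing digits from each piece,
/// lowercase it, and drop pieces that end up empty.
fn tokens(hostname: &str) -> Vec<String> {
    hostname
        .split(|c: char| c == '.' || c == '-')
        .map(|t| {
            t.trim_end_matches(|c: char| c.is_ascii_digit())
                .to_ascii_lowercase()
        })
        .filter(|t| !t.is_empty())
        .collect()
}

/// Return the location for the first token that exactly matches a known hint.
fn guess_location(hostname: &str) -> Option<&'static str> {
    tokens(hostname).iter().find_map(|tok| {
        LOCATION_HINTS
            .iter()
            .find(|(hint, _)| *hint == tok.as_str())
            .map(|(_, loc)| *loc)
    })
}
```

With this table, `guess_location("xe-11-1-0.edge1.NewYork1.Level3.net")` matches the `newyork` token, while a hostname that uses a code missing from the table returns `None`. Every new naming convention needs another table entry or tokenisation rule, which is where the maintenance cost comes from.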

I believe it may be possible to train a model to do this using the large quantity of DNS and GeoIp data available.

A large data set could be created which contains the following:

| Field | Example |
| --- | --- |
| IP | 171.64.64.64 |
| Hostname | CS.stanford.edu |
| AS Name | AS32 STANFORD, US |
| Geo Location | Los Altos, California, United States, North America |
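As a rough sketch of how one row of such a data set might be represented in Rust, the snippet below mirrors the four fields listed above. The `Record` struct, the `parse_record` function, and the tab-separated on-disk layout are assumptions for illustration only; the issue does not specify a storage format.

```rust
use std::net::IpAddr;

/// One training example with the four proposed fields.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    ip: IpAddr,
    hostname: String,
    as_name: String,
    geo_location: String,
}

/// Parse one tab-separated line: IP, hostname, AS name, geo location.
/// The TSV layout is a hypothetical choice; commas are avoided as a
/// separator because the AS name and geo location contain them.
fn parse_record(line: &str) -> Option<Record> {
    let mut fields = line.split('\t');
    let record = Record {
        ip: fields.next()?.trim().parse::<IpAddr>().ok()?,
        hostname: fields.next()?.trim().to_string(),
        as_name: fields.next()?.trim().to_string(),
        geo_location: fields.next()?.trim().to_string(),
    };
    // Reject lines that carry unexpected extra columns.
    if fields.next().is_some() {
        return None;
    }
    Some(record)
}
```

Lines with a malformed IP address, missing columns, or extra columns are rejected with `None` rather than being partially parsed.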

This could then be used to train a model with Burn which is able to take a hostname (or hostname + AS name) and predict the geo-location to the country/city level.
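One open question is how a hostname would be presented to a model. As a hedged sketch of one possible option (not something proposed in this issue), the snippet below hashes character trigrams of the hostname, plus an optional AS name, into a fixed-size count vector that could then be fed to a classifier. The `DIM` size, the `encode` function, the `|` separator, and the choice of FNV-1a hashing are all assumptions; the Burn model itself is deliberately left out.

```rust
/// Hypothetical feature dimension; a real model would tune this.
const DIM: usize = 1024;

/// 32-bit FNV-1a hash, chosen here because it is simple and deterministic
/// across runs and platforms.
fn fnv1a(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Encode a hostname (and optionally an AS name) as hashed counts of
/// lowercase byte trigrams. Inputs shorter than three bytes yield all zeros.
fn encode(hostname: &str, as_name: Option<&str>) -> Vec<f32> {
    let mut text = hostname.to_ascii_lowercase();
    if let Some(name) = as_name {
        text.push('|'); // assumed separator between the two fields
        text.push_str(&name.to_ascii_lowercase());
    }
    let mut features = vec![0.0f32; DIM];
    for gram in text.as_bytes().windows(3) {
        features[fnv1a(gram) as usize % DIM] += 1.0;
    }
    features
}
```

Character n-grams avoid committing to a fixed tokenisation of hostnames, which matters because operators embed location codes in many different positions and formats. A learned tokenizer or a character-level sequence model would be equally plausible starting points.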

> Don't see the model you want? Don't hesitate to open an issue, and we may prioritize it.

I don't know if this is of interest to anyone, but I thought I'd try my luck and ask! If such a model were to exist I would be keen to integrate the functionality into Trippy.

laggui (Member) commented Dec 19, 2024

Hey 👋

Thanks for your interest and for sharing Trippy - I had not heard of the project before. Looks really cool!

This looks like a really interesting application, and it definitely could be built with Burn 😄

However, this request goes a little beyond the intended scope 😅 Model requests are intended for existing architectures or pre-trained models that could be added relatively easily.

Building a custom model from scratch — especially without an existing dataset — would require a lot more effort for development and training.

I'm going to keep this open (who knows, a community-driven effort might arise around the idea), but just wanted to let you know that we are not actively going to work on this. If someone decides to lead the effort, this can serve as a discussion board 🙂 And don't hesitate to ask questions on discord! We'll be happy to help and follow the progress.

fujiapple852 (Author) commented

Thanks a lot for responding and for leaving this open, appreciate the clarity!

For anyone who sees this (and is motivated to try), I can certainly help with preparing the dataset and with the integration.
