Applied Machine-Learning for IP Network Analysis

With more than 20 years of experience in service provider IP network infrastructure delivery and architecture design, I’m interested in integrating leading-edge technologies, namely big data and machine learning, with IP network analysis across areas such as cyber security, routing policy, and quality of service.

Data used to be scarce, but many open datasets have now been released that network architects can use to build data-driven, intelligent networks. I gather them here so you don’t have to hunt for needles in the open-source haystack, load them in sample code, and develop a specific result for each. The aim is to give you an intuitive understanding of each dataset and the value you could extract from it.

As I once found myself, code is not easy for network engineers or administrators to understand. That’s why I use Python, one of the most straightforward and widely used programming languages in data science, and publish the code as Jupyter Notebooks with step-by-step execution and interactive output. I hope it inspires you to get the most out of the data, no matter how much programming experience you have.

AARYAN VERMA released this sample notebook on Kaggle, which demonstrates how to apply machine-learning models to the dataset and implement an anomaly-based detection model. In this example, the attack classes are condensed into a single category called “Attack,” which allows us to train a binary classifier that detects future network attacks from specific attributes. Random Forests achieved the better result, reaching 82% accuracy, compared with Logistic Regression. I combined the code and dataset so you can run the notebook locally on your laptop. To give you a better grasp of the model’s performance, I also added a confusion matrix at the end.
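To make the label condensing and the confusion matrix concrete, here is a minimal pure-Python sketch. It is not the notebook's code: the class names (`dos`, `probe`, `r2l`), the sample labels, and the helper names `to_binary` and `confusion_matrix` are all hypothetical, chosen only for illustration.

```python
from collections import Counter

def to_binary(label, normal_label="normal"):
    """Collapse every non-normal class into a single 'attack' label."""
    return "normal" if label == normal_label else "attack"

def confusion_matrix(y_true, y_pred, labels=("normal", "attack")):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

# Hypothetical multi-class labels (assumed names, not taken from the dataset).
raw_true = ["normal", "dos", "probe", "normal", "r2l", "dos"]
raw_pred = ["normal", "dos", "normal", "normal", "probe", "normal"]

# Condense to the binary problem: normal vs. attack.
y_true = [to_binary(label) for label in raw_true]
y_pred = [to_binary(label) for label in raw_pred]

matrix = confusion_matrix(y_true, y_pred)
# Accuracy is the share of samples on the matrix diagonal.
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)

print(matrix)
print(f"accuracy = {accuracy:.2f}")
```

The off-diagonal cells are what a single accuracy figure hides: they separate attacks that were missed from normal traffic that was flagged. In practice, scikit-learn's `sklearn.metrics.confusion_matrix` computes the same table from the two label lists.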

Network engineers rarely dig into source-code-level details of how routers and switches implement the algorithm that selects the shortest path. I came across the topic again while doing a case study on choosing the best flight routes between airports with graph machine learning. You can discover different trip plans under various criteria, for instance flying distance or time. I was curious about how to implement SPF (shortest path first) in code. The notebook is divided into two sections. The first demonstrates the mechanism of the SPF algorithm (Dijkstra's algorithm) in Python. The second illustrates how to use NetworkX, a Python graph library, to obtain the best flight route under different considerations.
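As a rough illustration of the SPF mechanism, the sketch below implements Dijkstra's algorithm with a priority queue from the standard library. It is not the notebook's code: the airport names, edge weights, and the helper names `dijkstra` and `shortest_path` are hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, prev): the lowest cost to each reachable node and its predecessor."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev

def shortest_path(prev, source, target):
    """Walk predecessors back from target; assumes target is reachable from source."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical airports; weights could stand for flying distance or time.
flights = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}

dist, prev = dijkstra(flights, "A")
path = shortest_path(prev, "A", "D")
print(path, dist["D"])
```

Changing the weights from distance to time changes which route wins without touching the algorithm, which is the idea behind comparing trip plans under different criteria. NetworkX offers the same computation through `nx.dijkstra_path(G, source, target, weight="weight")`, where the `weight` argument names the edge attribute to minimize.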