Commit 94c6346: Update node_representation_learning.md
in progress
robertdhayanturner authored Jan 2, 2024
1 parent da0af71 commit 94c6346
Showing 1 changed file with 5 additions and 3 deletions: docs/use_cases/node_representation_learning.md

## Introduction

Of the different types of information - words, pictures, and connections between things - relationships in particular are interesting; they show how things interact and create networks. Using vectors, we can take advantage of relationship data to better understand and describe the things that exist in networks.

Let's examine a real-life example of how entities can be turned into vectors using their connections - a common practice in machine learning - to solve a classification problem...
Additionally, we wanted to see if citations show up in the BoW features. So, we ...

In this plot, we divided the pairs into groups (shown on the y-axis) so that each group contains about the same number of pairs. The only exception is the 0-0.04 group, where many pairs had no similar words at all - they couldn't be split into smaller groups.

From the plot, it's clear that connected nodes usually have higher cosine similarities. This means papers that cite each other often use similar words. But when we ignore zero similarities, papers that have not cited each other also seem to have a wide range of common words.
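
To make this comparison concrete, here is a minimal sketch of how such pairwise similarities could be computed. It assumes scikit-learn and NumPy, and the `abstracts` and `citation_edges` placeholders are hypothetical stand-ins for the real paper texts and citation pairs, not the article's actual code or data.

```python
# Minimal sketch, not the article's actual code: `abstracts` and `citation_edges`
# are hypothetical placeholders for the real paper texts and citation pairs.
from itertools import combinations

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "graph neural networks for node classification",
    "bag of words features for text classification",
    "random walks on citation graphs",
]
citation_edges = [(0, 2)]  # paper 0 cites paper 2

# Bag-of-words features: one sparse word-count vector per paper.
bow = CountVectorizer().fit_transform(abstracts)

# Cosine similarity between every pair of BoW vectors.
sims = cosine_similarity(bow)

connected = {tuple(sorted(edge)) for edge in citation_edges}
connected_sims = [sims[i, j] for i, j in connected]
unconnected_sims = [
    sims[i, j]
    for i, j in combinations(range(len(abstracts)), 2)
    if (i, j) not in connected
]

print("mean similarity, cited pairs:    ", np.mean(connected_sims))
print("mean similarity, non-cited pairs:", np.mean(unconnected_sims))
```

Binning similarity values like these into groups is what produces a plot of the kind described above.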

Though some information about the connectivity is present in the BoW features, it is not sufficient to reconstruct the citation graph accurately. The BoW representation may miss additional information contained in the network structure that can be used to solve the paper classification problem. If we could extract that information, we may be able to build a more accurate classifier. In the following sections, we look at two methods for learning node representations that capture node connectivity more accurately.

## Learning node embeddings with Node2Vec

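Before diving in, here is a simplified sketch of the core idea: generate random walks over the graph and train a skip-gram (Word2Vec) model on them, so that nodes appearing near each other in walks end up with similar vectors. For brevity it uses uniform walks on a toy networkx graph - actual Node2Vec biases the walk transitions with its return and in-out parameters p and q - and the graph and hyperparameters here are assumptions, not the article's setup.

```python
# Simplified sketch of the random-walk + skip-gram idea behind Node2Vec.
# Uniform random walks are used here; proper Node2Vec biases transitions with p and q.
import random

import networkx as nx
from gensim.models import Word2Vec

# Toy graph (hypothetical; the article works with a real citation network).
G = nx.karate_club_graph()

def random_walk(graph, start, length=20):
    """Return the node ids visited by a uniform random walk, as string tokens."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = list(graph.neighbors(walk[-1]))
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return [str(node) for node in walk]  # gensim expects string "tokens"

# Several walks per node serve as the "sentences" for skip-gram training.
walks = [random_walk(G, node) for node in G.nodes() for _ in range(10)]

# Nodes that co-occur in walks are pushed toward similar vectors.
model = Word2Vec(walks, vector_size=64, window=5, min_count=0, sg=1, epochs=5)

node_vector = model.wv["0"]                 # embedding of node 0
print(model.wv.most_similar("0", topn=5))   # structurally closest nodes
```

The resulting vectors can then be fed to a downstream classifier, just like the BoW features above.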
