Learn how to integrate the KarateClub library with Neo4j to calculate various node and graph embeddings

Lately, I have been on a quest to learn as much as possible about node embedding techniques. The goal of node embedding is to encode nodes so that similarity in the embedding space approximates similarity in the original network. In layman’s terms, we encode each node as a fixed-size vector that preserves the similarity structure of the original network.

Graph Embedding — Representation Learning on Networks, snap.stanford.edu/proj/embeddings-www

Node embeddings are helpful when you want to capture network information in a fixed-size vector and use it in a downstream Machine Learning workflow.
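To make the idea of "similarity in the embedding space" concrete, here is a minimal sketch using cosine similarity. The three vectors are made up by hand for illustration and are not the output of any embedding algorithm; the node names `a`, `b`, and `c` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-written toy embeddings (an assumption, not real algorithm output):
# "a" and "b" are meant to be close in the network, "c" is meant to be far.
embeddings = {
    "a": [0.9, 0.1],
    "b": [0.8, 0.2],
    "c": [0.1, 0.9],
}

sim_ab = cosine_similarity(embeddings["a"], embeddings["b"])
sim_ac = cosine_similarity(embeddings["a"], embeddings["c"])

# A good embedding should score the similar pair higher than the dissimilar one.
print(f"a-b: {sim_ab:.3f}, a-c: {sim_ac:.3f}")
```

A downstream model would typically consume such vectors directly as features rather than compare them pairwise.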

I have come across the Karate Club package in my search for the implementation of various…


Learn how to combine Selenium and SpaCy to create a Neo4j knowledge graph of the Harry Potter universe

Most likely, you have already seen the Game of Thrones network created by Andrew Beveridge.

Andrew constructed a co-occurrence network of book characters. If two characters appear within a certain distance of each other in the text, we can assume that they are somehow related or that they interact in the book.
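The co-occurrence idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the text is split on whitespace, characters are matched by exact token, and the `window` size of 5 tokens is an arbitrary choice. The function name `cooccurrence_edges` and the sample sentence are hypothetical and do not come from either project.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(tokens, characters, window=5):
    """Count character pairs whose mentions are at most `window` tokens apart."""
    # Collect (position, name) for every token that is a known character.
    mentions = [(i, tok) for i, tok in enumerate(tokens) if tok in characters]
    edges = Counter()
    # combinations() keeps the original order, so j is always greater than i.
    for (i, a), (j, b) in combinations(mentions, 2):
        if a != b and j - i <= window:
            # Sort the pair so the edge is undirected.
            edges[tuple(sorted((a, b)))] += 1
    return edges


# Made-up sample text, used only to exercise the function.
text = (
    "Harry met Ron on the train and later Harry and Hermione "
    "talked while Ron slept"
)
edges = cooccurrence_edges(text.split(), {"Harry", "Ron", "Hermione"}, window=5)

for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

The resulting counts can serve as relationship weights when the pairs are stored as edges in a graph.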

I decided to create a similar project but chose a popular book with no known (at least to me) network extraction. So, the project to extract a network of characters from the book Harry Potter and the Philosopher’s Stone was born.

I did a lot of experiments to decide the best…


Extract the value of relationships by using the FastRP embedding algorithm to produce features for a downstream node classification task

This is the third article in my Twitchverse series. The previous two are:

  1. Twitchverse: Constructing a Twitch Knowledge Graph in Neo4j
  2. Twitchverse: A network analysis of Twitch universe using Neo4j Graph Data Science

Don’t worry. This article is standalone, so you don’t need to examine the previous ones if you aren’t interested. However, if you are interested in how I constructed the Twitch knowledge graph and performed a network analysis, check them out. You can also follow along by loading the database dump in Neo4j and all the code is available as a Jupyter notebook.

Agenda

In this blog post, I…


Learn through a practical example how to use graph theory and algorithms to gain valuable insights from connected data

Graph data science focuses on analyzing the connections and relationships in data to gain valuable insights. Every day, massive amounts of data are generated, but the connections between data points are often overlooked in data analysis. With the rise of Graph Data Science tools, the ability to analyze connections is no longer limited to huge technology companies like Google. In this blog post, I will present how to set up the Neo4j Graph Data Science environment on your computer and walk you through your (first) network analysis. We will be using the Twitch network dataset. In my previous blog post


Learn how to design and construct a knowledge graph in Neo4j that describes the Twitch universe

I was inspired by the post Insights from Visualizing Public Data on Twitch. The author uses Gephi to perform graph analysis on the Twitch network and visualize the results. Twitch is an online platform that allows users to share their content via live stream. Twitch streamers broadcast their gameplay or activity by sharing their screen with fans who can hear and watch them live. I wondered what kind of analysis we could make if we used a graph database instead of Gephi to store the network information. This blog post will show you how to design and construct a knowledge graph…


Quickly inspect graph embedding algorithm results in NEuler, the Neo4j graph data science playground application.

NEuler is a graph data science playground application designed to help you execute and understand graph algorithms in Neo4j. With only a couple of clicks, you can import example data, execute various graph algorithms, and visualize their results. It is available as an extension to Neo4j Desktop, and you can also use it in combination with Neo4j Sandbox.

In this blog post, I will use the Movies sandbox project to demonstrate how to quickly visualize graph embedding results with a t-SNE scatter plot.

Setting Up the Neo4j Sandbox Environment

You can follow this link to automatically create a Movies sandbox project. If you choose to, you…


Implementation of an information extraction pipeline that includes coreference resolution, entity linking, and relationship extraction techniques.

I am thrilled to present the latest project I have been working on. If you have been following my posts, you know that I am passionate about combining natural language processing and knowledge graphs. In this blog post, I will present my implementation of an information extraction data pipeline. Later on, I will also explain why I see the combination of NLP and graphs as one of the paths to explainable AI.

Information extraction pipeline

What exactly is an information extraction pipeline? To put it in simple terms, information extraction is the task of extracting structured information from unstructured data such as text.

Steps in my implementation of the IE pipeline. Image by author


How to combine Named Entity Linking with Wikipedia data enrichment to analyze internet news.

A wealth of information is being produced every day on the internet. Understanding the news and other content-generating websites is becoming increasingly crucial to successfully run a business. It can help you spot opportunities, generate new leads, or provide indicators about the economy.

In this blog post, I want to show you how you can create a news monitoring data pipeline that combines natural language processing (NLP) and knowledge graph technologies.

The data pipeline consists of three parts. In the first part, we scrape articles from an online news provider. Next, we run the articles through an NLP pipeline…


Learn how to use GraphSAGE embeddings in the Neo4j Graph Data Science library to improve your Machine Learning workflows

Knowledge graphs and graph analytics pipelines are becoming more and more popular. If you keep an eye on the graph analytics field, you already know that graph neural networks are trending. Unfortunately, there aren’t many tutorials out there on how to use them in a practical application. For this reason, I have decided to write this blog post, where you will learn how to train a convolutional graph neural network and integrate it into your machine learning workflow to improve downstream classification model accuracy.

Agenda

In this example, you will reproduce the protein role classification task from the…


Learn how to import, clean, and analyze the ArXiv dataset in Neo4j. In the last step, you will learn how to create a search and recommendation engine for articles.

In Europe, we are deep in the second wave of Covid lockdowns. I’ve seen some motivational speakers talk about using this time to learn a new skill set. As a child, I always liked nuclear experiments, so I decided to build a reactor in my basement and try some experiments. I’ve already got a basement, so now I only need to learn nuclear physics or maybe get some nuclear researchers to help me out.

I got the idea from Estelle Scifo, who imported and analyzed the ArXiv dataset in Neo4j. We’ll take a detailed look at the nuclear experiments category of…

Tomaz Bratanic

Data explorer. Turn everything into a graph.
