- stream: A Python script collects new retweets with the hashtag #christmas and sends them to a Kafka cluster.
- kafka: A Kafka cluster consisting of one topic named retweets.
- memgraph-mage: The graph analytics platform where we store the incoming Twitter data from Kafka and perform dynamic PageRank and dynamic community detection on all Twitter users.
- backend-app: A Flask server that sends all the data we query from memgraph-mage to the frontend-app. It also consumes the Kafka stream and sends it to the frontend-app.
- frontend-app: A React app that visualizes the Twitter network using the D3.js library.

|   docker-compose.yml
|
+---backend
|   |   Dockerfile
|   +---server
|   +---tests
|
+---frontend
|   |   .dockerignore
|   |   Dockerfile
|   |   package.json
|   |   package-lock.json
|   +---node_modules
|   +---public
|   +---src
|
+---memgraph
|   |   Dockerfile
|   |   requirements.txt
|   +---procedures
|   +---transformations
|
+---stream
|   |   Dockerfile
|   |   kafka_utils.py
|   |   produce.py
|   |   requirements.txt
|   +---data

The frontend folder was created using the create-react-app npm package. If you are starting from scratch and want to create a React app, follow these steps:

1. Run npm install -g create-react-app (if you don't want to install the latest version, you can specify the version of create-react-app, for example, [email protected]).
2. Run npm init react-app frontend --use-npm, which will initialize the react-app package in the frontend folder.
3. Move into the frontend folder by running cd frontend and start the app with npm start.

Alternatively, you can use npx, a package runner tool that comes with npm 5.2+. Then you just have to run:

npx create-react-app frontend
cd frontend
npm start

I used a pinned version of the socket.io-client library ([email protected]), since I had issues with the latest version. I am going to explain the process on the CommunityDetection component, since it's very similar to the PageRank component. If you are running the frontend application locally, and not using the provided dockerized application, make sure to install the library by running:

npm install [email protected]

Don't forget that Node.js is a prerequisite for using npm.

import io from "socket.io-client"

var socket = io("http://localhost:5000/", {
    transports: ["websocket", "polling"]
})

Here the socket connects to the server running at http://localhost:5000/. The connection is attempted with websocket first. If websocket is not available, the connection to the server is established with HTTP long-polling, that is, successive HTTP requests (POST for writing, GET for reading). Next, we need to handle the different socket events. When the connection is established, the socket emits the consumer signal. This signal is also emitted on the server side whenever a new message is sent. This configuration allows the socket to receive all messages related to the consumer signal.

socket.on("connect", () => {
    socket.emit('consumer')
    console.log("Connected to socket ", socket.id)
});

socket.on("connect_error", (err) => {
    console.log(err)
    // try reconnecting
    socket.connect()
});

socket.on("disconnect", () => {
    console.log("Disconnected from socket.")
});

socket.on("consumer", (msg) => {
    console.log('Received a message from the WebSocket service: ', msg.data);
});

The next step is to place the socket.io code within a React component. First, I initialized the socket in the component's constructor. After that, I placed the socket events in the componentDidMount() lifecycle method. This part of the React.Component lifecycle is invoked once, immediately after a component is mounted. If you need to load data from a remote endpoint, this is a good place to instantiate the network request. This method is also a good place to set up any subscriptions. That's why I decided to place all socket events there. On each consumer emit, the state of the component is updated, which triggers a re-render. Before I set up the socket, at the beginning of componentDidMount(), I made a simple HTTP request that triggers the backend to start producing the needed data.

firstRequest() {
    fetch("http://localhost:5000/api/graph")
        .then((res) => res.json())
        .then((result) => console.log(result))
        // log network or parsing failures instead of leaving the promise unhandled
        .catch((err) => console.log(err))
}

Everything needed for drawing with D3.js is initialized in the initializeGraph() method. By setting a new state of nodes and links with setState() on each consumer emit, the componentDidUpdate() lifecycle method will be called. In that method, we update the graph by drawing the new incoming nodes and links. This lifecycle method is not called for the initial render, which is why we initialized everything in the initializeGraph() method.

When the component unmounts, the componentWillUnmount() lifecycle method is called and the client disconnects from the server.

componentWillUnmount() {
    // disconnect() is enough here; "disconnect" is a reserved socket.io event name
    // and should not be emitted manually
    this.socket.disconnect();
}

If you want to learn more about React.Component lifecycle methods, check out the React official docs.
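
To make the "new state on each consumer emit" step more concrete, here is a minimal sketch of how the next nodes and links could be computed from the previous state and one incoming message. The helper name mergeGraphUpdate and the message shape ({ nodes: [...], links: [...] }) are my assumptions for illustration, not something taken from the application code.

```javascript
// Hypothetical helper: builds the next { nodes, links } state from the
// previous state and one incoming "consumer" message.
// Assumed message shape (not confirmed by the app):
//   { nodes: [{ id, ... }], links: [{ source, target }] }

// A link endpoint may be a raw id or a node object with an `id` field.
function endpointId(endpoint) {
  return endpoint !== null && typeof endpoint === "object" ? endpoint.id : endpoint;
}

function linkKey(link) {
  return endpointId(link.source) + "-" + endpointId(link.target);
}

function mergeGraphUpdate(prevState, message) {
  const incomingNodes = message.nodes || [];
  const incomingLinks = message.links || [];

  // Index nodes by id so that a user seen again is updated, not duplicated.
  const nodesById = new Map(prevState.nodes.map((n) => [n.id, n]));
  for (const node of incomingNodes) {
    nodesById.set(node.id, { ...nodesById.get(node.id), ...node });
  }

  // Keep a single link per source-target pair.
  const linksByKey = new Map(prevState.links.map((l) => [linkKey(l), l]));
  for (const link of incomingLinks) {
    linksByKey.set(linkKey(link), link);
  }

  // Return new arrays; the previous state arrays are left untouched.
  return {
    nodes: Array.from(nodesById.values()),
    links: Array.from(linksByKey.values()),
  };
}
```

Inside the consumer handler, this could be used as this.setState((prev) => mergeGraphUpdate(prev, msg.data)), assuming the payload sits in msg.data as in the logging example. Keeping the merge in a plain function makes it easy to test without React or a running socket.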

To draw with D3.js within the class component, we need access to the svg. We are going to get it by creating a reference in the component constructor, which will be attached to the svg via the ref attribute. In the constructor, we have to use the createRef() method.

constructor(props) {
    super(props);
    this.myReference = React.createRef();
    this.state = {
        nodes: [],
        links: []
    }
    this.socket = io("http://localhost:5000/", { transports: ["websocket", "polling"] })
}

In the render() method, we add the ref attribute with the value this.myReference to the svg.

render() {
    return (<div>
        <h1>Community Detection</h1>
        <p>Number of users that retweeted so far: {this.state.nodes.length}</p>
        <svg ref={this.myReference}
            style={{
                height: 500, //width: "100%"
                width: 900,
                marginRight: "0px",
                marginLeft: "0px",
                background: "white"
            }}></svg></div>
    );
}

By selecting the current attribute of the reference, we get the svg on which we are going to draw our graph.

var svg = d3.select(this.myReference.current);

If you want to know how to use D3.js within a function component, check out one of my previous blog posts - Twitch Streaming Graph Analysis - Part 2.

In the updateGraph() method, we have to draw the nodes and relationships using D3.js, where nodes are colored depending on the community they belong to. We receive the community information through the cluster property of each node.

// Bind the new data to the nodes, keyed by id
node = node.data(nodes, (d) => d.id);
// Remove nodes that are no longer in the data
node.exit().remove();
// Add circles for the new nodes and merge them with the existing ones
node = node
    .enter()
    .append('circle')
    .attr("r", 7)
    .attr('fill', function (d) {
        if (!clusterColors.hasOwnProperty(d.cluster)) {
            // Pad to six hex digits so the result is always a valid color
            clusterColors[d.cluster] = "#" + Math.floor(Math.random() * 16777215).toString(16).padStart(6, "0")
        }
        return clusterColors[d.cluster]
    })
    .on("mouseover", function (event, d) {
        tooltip.text(d.username)
        tooltip.style("visibility", "visible")
    })
    .on("mousemove", function (event, d) {
        return tooltip.style("top", (event.y - 10) + "px").style("left", (event.x + 10) + "px");
    })
    .on("mouseout", function (event, d) { return tooltip.style("visibility", "hidden"); })
    .call(this.drag())
    .merge(node);
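
A note on generating the random colors: Math.floor(Math.random() * 16777215).toString(16) on its own can return fewer than six hex digits (for example "ff"), which is not a valid CSS color, so the string needs zero-padding. Below is a minimal sketch of a helper that factors the cluster color lookup out of the fill callback. The function names and the injectable random parameter are hypothetical and exist only to make the logic easy to test.

```javascript
// Returns a random color as a 7-character string such as "#03a9f4".
// `random` is injectable (defaults to Math.random) purely for testing.
function randomHexColor(random = Math.random) {
  const value = Math.floor(random() * 0x1000000); // 0 .. 0xffffff
  return "#" + value.toString(16).padStart(6, "0");
}

// Hypothetical wrapper around a clusterColors map: returns the stored color
// for a cluster, creating and remembering a new one on first sight.
function colorForCluster(clusterColors, cluster, random = Math.random) {
  if (!Object.prototype.hasOwnProperty.call(clusterColors, cluster)) {
    clusterColors[cluster] = randomHexColor(random);
  }
  return clusterColors[cluster];
}
```

With such a helper, the fill callback could shrink to .attr('fill', (d) => colorForCluster(clusterColors, d.cluster)).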

In the node update code, the old nodes are removed and the node value is set to the new nodes data. Next, we want each node to be a circle with radius 7 (an arbitrary value that seemed quite okay to me). After that, we want each node to be colored depending on the cluster it belongs to. We have previously created a map of colors called clusterColors. When a new cluster appears, a new key-value pair is created in the map, where the key is the cluster number and the value is a randomly generated color. If the cluster of the node already exists, the color of the node is the value stored under that cluster key in the clusterColors map. To show usernames on hover, we need the mouseover, mousemove and mouseout events. In the next line, we call the drag() method, which allows us to drag the nodes. At the end, the new nodes are merged with the old ones with the merge() method. We add the links between the nodes in a similar manner. All that's left to do is to create the simulation on the updated nodes and links.

try {
    simulation
        .nodes(nodes)
        .force('link', d3.forceLink(links).id(function (n) { return n.id; }))
        .force(
            'collide',
            d3
                .forceCollide()
                .radius(function (d) {
                    return 20;
                })
        )
        .force('charge', d3.forceManyBody())
        .force('center', d3.forceCenter(width / 2, height / 2));
} catch (err) {
    console.log('err', err);
}
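
One detail about d3.forceLink is useful when building identifiers for links: when the force is initialized, it replaces each link's source and target (given as ids) with references to the matching node objects. So an endpoint is a plain id before initialization and an object afterwards. The sketch below shows a small helper that produces the same "source-target" string in both cases. The names endpointId and linkId are hypothetical.

```javascript
// A link endpoint is either a raw node id (before d3.forceLink initializes
// the link) or a node object with an `id` field (after initialization).
function endpointId(endpoint) {
  return endpoint !== null && typeof endpoint === "object" ? endpoint.id : endpoint;
}

// Builds a "source-target" identifier that is stable across both forms.
function linkId(link) {
  return endpointId(link.source) + "-" + endpointId(link.target);
}
```

A helper like this could be passed directly as an attribute accessor, for example .attr('id', linkId), or used as the key function when binding link data.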

Each link gets an id, which we set by adding .attr('id', (d) => d.source.id + '-' + d.target.id) to each link. That id is created from the ids of the nodes that the link connects. The collide force is there so that the nodes do not overlap, considering the size of their radius. Here we have set the collision radius to 20, which is larger than 7, the radius of the nodes. The charge force causes the nodes in the graph to repel each other, which keeps them spread apart in the visualization. In the end, we have a center force, which pulls the nodes and links toward the middle of the svg.

Check out the GIF below for the preview, and if you want to start the app all by yourself, follow the instructions in the README in the repository.

In the PageRank component, the nodes are drawn like this:

node = node
    .enter()
    .append('circle')
    .attr("r", function (d) {
        return d.rank * 1000;
    })
    .attr('fill', 'url(#gradient)')
    .on("mouseover", function (event, d) {
        tooltip.text(d.username)
        tooltip.style("visibility", "visible")
    })
    .on("mousemove", function (event, d) { return tooltip.style("top", (event.y - 15) + "px").style("left", (event.x + 15) + "px"); })
    .on("mouseout", function (event, d) { return tooltip.style("visibility", "hidden"); })
    .call(this.drag())
    .merge(node);
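
Since the radius is computed as d.rank * 1000, users with a very small rank can end up as nearly invisible dots, and a user with a very large rank can cover much of the svg. As an optional variant, here is a minimal sketch that clamps the radius to a range. The function name rankToRadius and the default bounds (3 and 30) are my assumptions and should be tuned to the data.

```javascript
// Maps a PageRank value to a circle radius, clamped to [min, max].
// The scale factor 1000 matches the drawing code; min and max are
// hypothetical defaults.
function rankToRadius(rank, { scale = 1000, min = 3, max = 30 } = {}) {
  // Treat missing or non-numeric ranks as 0 so they fall back to `min`.
  const raw = Number.isFinite(rank) ? rank * scale : 0;
  return Math.min(max, Math.max(min, raw));
}
```

It could be plugged in as .attr("r", (d) => rankToRadius(d.rank)).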

In the PageRank component, the r attribute is proportional to rank (the calculated PageRank of each node). Also, the fill attribute is determined by the gradient created in the defineGradient() method.

There is still a lot for me to learn about React, D3.js and WebSocket, but creating this demo application gave me a pretty good insight into real-time visualization. It was fun playing with it, and I'm looking forward to learning more in the future. Also, I would like to emphasize that the Reddit network explorer application, developed by my colleagues Ivan, David and Antonio, helped me a lot. There, you can find real-time visualization with a frontend in Angular. For any feedback or questions, ping me or the Memgraph team on our Discord server.