Neo4j Lab (A)
Twitter Followers
Overview:
In this short introductory Neo4j lab you will create a small dataset that represents Twitter users as
nodes and the follower relationships between them. The main aim is to get used to creating your own
nodes and relationships and to carry out some basic queries in a graph database.
You can run this lab on Neo4j Aura, Neo4j Desktop, or the Linux VM (see the separate Neo4j setup document for details).
1a. Create some User Nodes and Relationships
Type all of the following in one go and run it as a single query: the relationship statements refer to
the node variables (a, b, c, etc.), which only exist within the query that defines them. To save time
you can copy it from the PDF or download the text from here: www.macs.hw.ac.uk/~pb56/neo4jtweets.txt
create (a:USER {name:"Greenpeace"})
create (b:USER {name:"YouTube"})
create (c:USER {name:"BBCNews"})
create (d:USER {name:"UN"})
create (e:USER {name:"WorldBank"})
create (f:USER {name:"Bob"})
create (g:USER {name:"G7"})
create (a)-[:FOLLOWS]->(b)
create (a)-[:FOLLOWS]->(c)
create (a)-[:FOLLOWS]->(d)
create (d)-[:FOLLOWS]->(e)
create (c)-[:FOLLOWS]->(d)
create (c)-[:FOLLOWS]->(a)
create (c)-[:FOLLOWS]->(e)
create (e)-[:FOLLOWS]->(g)
create (f)-[:FOLLOWS]->(a)
create (f)-[:FOLLOWS]->(b)
create (g)-[:FOLLOWS]->(b)
create (g)-[:FOLLOWS]->(c)
You should see a message like this:
Added 7 labels, created 7 nodes, set 7 properties, created 12 relationships, completed after 3 ms.
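The totals in that message can be double-checked afterwards with two count queries. This is a minimal sketch, assuming the create block above has been run exactly once on an empty database; because CREATE always adds new nodes, running the block a second time would double both counts.

```cypher
// Run each query separately.

// Number of USER nodes (7 after a single run of the create block)
MATCH (u:USER) RETURN count(u) AS users

// Number of FOLLOWS relationships (12 after a single run)
MATCH (:USER)-[r:FOLLOWS]->(:USER) RETURN count(r) AS follows
```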
Check the graph using this Cypher query:
MATCH (N:USER) RETURN N
It should look something like this:
[Figure: graph view of the USER nodes and their FOLLOWS relationships]
1b. Create some Tweets
create (a:TWEET {id: 123, msg:"Climate Change greatest threat says Attenborough"})
create (b:TWEET {id: 300, msg:"Cabinet to discuss the deal this afternoon"})
create (c:TWEET {id: 462, msg:"Crazy weather out there"})
You can check a tweet as follows:
MATCH (t:TWEET) WHERE t.id=123 RETURN t
1c. Link the Tweets to the User
Here you match an existing TWEET and an existing USER, then create a ‘sent_by’ relationship from the tweet to the user.
MATCH (t:TWEET{id: 123})
MATCH (u: USER {name: "Greenpeace"})
CREATE (t)-[:sent_by]->(u)
Repeat for the other tweets and users, running each three-line block as a separate query:
MATCH (t:TWEET{id: 300})
MATCH (u: USER {name: "BBCNews"})
CREATE (t)-[:sent_by]->(u)
MATCH (t:TWEET{id: 462})
MATCH (u: USER {name: "Bob"})
CREATE (t)-[:sent_by]->(u)
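As an alternative to repeating the block by hand, the pairs can be supplied as a list and linked in one query. The sketch below is not part of the original lab: it uses UNWIND to loop over (tweet id, user name) pairs and MERGE instead of CREATE, so running it after the queries above should not create duplicate relationships. The map keys tweetId and userName are illustrative names.

```cypher
// Each map pairs a tweet id with the name of the user who sent it
UNWIND [
  {tweetId: 300, userName: "BBCNews"},
  {tweetId: 462, userName: "Bob"}
] AS pair
MATCH (t:TWEET {id: pair.tweetId})
MATCH (u:USER {name: pair.userName})
// MERGE only creates the relationship if it does not exist yet
MERGE (t)-[:sent_by]->(u)
```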
Check that the tweets are correctly linked to the users who sent them. Note that you can write the pattern in either direction:
MATCH (t:TWEET)-[:sent_by]->(u:USER) RETURN u,t
OR
MATCH (u:USER)<-[:sent_by]-(t:TWEET) RETURN u,t
2. Query the data - examples
a) Show the graph of users that follow someone else
MATCH (u:USER) -[:FOLLOWS]->(g:USER) RETURN u
b) Show who 'Greenpeace' follows directly:
MATCH (u:USER {name:"Greenpeace"}) -[:FOLLOWS]->(g:USER) RETURN u,g
…and who ‘Greenpeace’ follows within 1 to 2 hops:
MATCH (u:USER {name:"Greenpeace"}) -[:FOLLOWS*1..2]->(g:USER) RETURN u,g
Compare this to the output for 1 to 3 hops (change *1..2 to *1..3). Which extra users are included? _____________
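To see at which distance each user is first reached, the path itself can be bound to a variable and its length returned. A minimal sketch (the aliases user and hops are illustrative); change the upper bound to compare ranges:

```cypher
// p holds each matching path; length(p) is its number of FOLLOWS hops
MATCH p = (u:USER {name:"Greenpeace"})-[:FOLLOWS*1..3]->(g:USER)
RETURN g.name AS user, min(length(p)) AS hops
ORDER BY hops, user
```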
c) Find which user(s) sent a tweet with the word 'deal':
MATCH (t:TWEET)-[:sent_by]->(u:USER)
WHERE t.msg CONTAINS 'deal'
RETURN u,t
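Note that CONTAINS is case-sensitive, so 'deal' would not match 'Deal'. One way to search regardless of case, sketched here with the built-in toLower() function:

```cypher
MATCH (t:TWEET)-[:sent_by]->(u:USER)
// Lower-case the message before comparing, so 'Deal' and 'deal' both match
WHERE toLower(t.msg) CONTAINS 'deal'
RETURN u.name AS user, t.msg AS tweet
```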
d) What is the length of the shortest route between "Greenpeace" and "G7"?
MATCH z= (u:USER {name:"Greenpeace"}) -[:FOLLOWS*]->(q:USER {name:'G7'})
return min(length(z))
e) Plot the shortest path from Greenpeace to G7 along FOLLOWS relationships
MATCH (u:USER {name:"Greenpeace"}), (v:USER{name:"G7"}),
p=shortestPath((u)-[:FOLLOWS*]->(v))
RETURN p
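In the table view it can be easier to read the route as a list of names. A small sketch that returns the hop count and the user names along the same shortest path, using the built-in length() and nodes() functions (the aliases hops and route are illustrative):

```cypher
MATCH (u:USER {name:"Greenpeace"}), (v:USER {name:"G7"}),
p = shortestPath((u)-[:FOLLOWS*]->(v))
// nodes(p) lists the nodes on the path in order; keep only their names
RETURN length(p) AS hops, [n IN nodes(p) | n.name] AS route
```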
f) Which users do not follow anyone?
MATCH (u:USER) WHERE NOT (u)-[:FOLLOWS]->() RETURN u.name
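The same pattern can be turned around to find users that nobody follows. A sketch, with only the direction of the pattern changed:

```cypher
// No incoming FOLLOWS relationship means the user has no followers
MATCH (u:USER) WHERE NOT ()-[:FOLLOWS]->(u) RETURN u.name
```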
g) Which users have 2 or more followers?
MATCH (u:USER)-[:FOLLOWS]-> (z:USER)
WITH z,count(u) as rels
WHERE rels>=2
RETURN z.name
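The query above only considers users with at least one follower. To list every user with a follower count, including those with none, OPTIONAL MATCH can be used; a minimal sketch (the aliases are illustrative):

```cypher
MATCH (u:USER)
// OPTIONAL MATCH keeps users without followers; f is null for them
OPTIONAL MATCH (f:USER)-[:FOLLOWS]->(u)
// count(f) ignores nulls, so those users get a count of 0
RETURN u.name AS user, count(f) AS followers
ORDER BY followers DESC, user
```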
3. Add some properties to a Node
Let's now add the year each user joined to a few nodes. Run each line as a separate query.
MATCH (u:USER {name:"BBCNews"}) set u.joined=2007 return u
MATCH (u:USER {name:"WorldBank"}) set u.joined=2009 return u
MATCH (u:USER {name:"Bob"}) set u.joined=2012 return u
a) Which users joined before 2010?
MATCH (u:USER) where u.joined < 2010 return u
b) Which users joined more than 10 years ago? (The query takes the current year to be 2019.)
MATCH (u:USER) where 2019- u.joined > 10 return u
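Hard-coding 2019 means the answer goes out of date. A sketch that uses the current year instead, assuming a Neo4j version that provides the temporal date() function:

```cypher
// date().year is the current calendar year
MATCH (u:USER) WHERE date().year - u.joined > 10 RETURN u
```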
c) Which user was first to join?
MATCH (u:USER)
RETURN u
ORDER BY u.joined LIMIT 1
Alternatively, you could do it like this:
MATCH (u:USER)
WITH (min(u.joined)) as p
MATCH (z:USER) where z.joined = p
RETURN z
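Only three of the seven users have a joined property. The first version still works because Cypher sorts null values last in ascending order, and min() ignores them in the second. A sketch that makes the filter explicit and returns just the name and year (the aliases are illustrative):

```cypher
MATCH (u:USER)
// Skip users that have no joined property
WHERE u.joined IS NOT NULL
RETURN u.name AS user, u.joined AS joined
ORDER BY joined LIMIT 1
```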
4. Questions to Try
a) Which users follow "YouTube"?
b) How many network steps are there between G7 and the UN?
c) Which user(s) follow both YouTube and Greenpeace?
d) Find all tweets that include the word 'Climate'.
5. To remove this dataset
You don't need to remove this dataset, but here is how to delete data in Neo4j in case you want
to know.
This may seem like the way to do it, but try it:
MATCH (N:USER) DELETE N
Now take a look at the dataset using:
MATCH (N:USER) RETURN N
Notice that the nodes weren't deleted: Neo4j refuses to delete a node that still has relationships,
so you need to remove those as well. Try this instead:
MATCH (N:USER) DETACH DELETE N
MATCH (T:TWEET) DETACH DELETE T
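If you only want to remove relationships and keep the nodes, match the relationship itself and delete it. A sketch for the sent_by links, to be run in place of the DETACH DELETE queries so that the tweets and users stay in the database:

```cypher
// Bind the relationship to r and delete only that, not the nodes
MATCH (:TWEET)-[r:sent_by]->(:USER) DELETE r
```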
Note: to delete a single node by its internal id, use this syntax (e.g. for a TWEET node):
MATCH (t:TWEET) where id(t)=368988 DETACH DELETE t
Replace the node label (e.g. with USER) and the integer with the appropriate values; the internal id can be found from the user interface.
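Instead of looking the value up in the user interface, the internal ids can also be listed with a query. A sketch using the same id() function as above; note that the internal id is different from the id property set on each tweet, and the aliases are illustrative:

```cypher
// id(t) is the internal id assigned by Neo4j; t.id is the property set in section 1b
MATCH (t:TWEET) RETURN id(t) AS internalId, t.id AS tweetId, t.msg AS msg
```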
------ END OF LAB ----