We'll use the apoc.load.json procedure from the APOC standard library to import a JSON file using Cypher. First, let's just pull this file in to make sure we can parse it. This Cypher statement will parse the newest Lobsters submissions and return an array of objects that we can work with in Cypher:

CALL apoc.load.json("https://lobste.rs/newest.json") YIELD value
RETURN value
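
As a quick sanity check outside the database, the same parsing step can be sketched in Python. This is only an illustration: the helper names fetch_newest and missing_fields are hypothetical, the sample object is made up, and the field list is simply the set of top-level properties that the import statement in this post reads.

```python
import json
import urllib.request

# Top-level fields the import statement in this post reads from each submission.
REQUIRED_FIELDS = ("short_id", "url", "score", "created_at", "title",
                   "comments_url", "tags", "submitter_user")


def fetch_newest(url="https://lobste.rs/newest.json"):
    """Download and decode the feed (not called here; requires network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def missing_fields(article):
    """Return the required fields that are absent from one submission object."""
    return [field for field in REQUIRED_FIELDS if field not in article]


# Made-up sample standing in for one element of the feed.
sample = {"short_id": "abc123", "url": "https://example.com", "score": 3,
          "created_at": "2021-06-01T12:00:00.000-05:00", "title": "Example",
          "comments_url": "https://lobste.rs/s/abc123", "tags": ["graphs"]}

print(missing_fields(sample))  # ['submitter_user']
```

Running missing_fields over every element returned by fetch_newest() would flag any submission that lacks a property the import expects.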
We'll use apoc.load.json to parse the JSON file, then use UNWIND to iterate over this array of objects. We'll then use the MERGE Cypher clause to add users, articles, and tags to the database. The MERGE statement allows us to avoid creating duplicates in the graph: with MERGE, only patterns that don't already exist in the graph are created.

CALL apoc.load.json("https://lobste.rs/newest.json") YIELD value
UNWIND value AS article
MERGE (s:User {username: article.submitter_user.username})
  ON CREATE SET s.about = article.submitter_user.about,
    s.created = DateTime(article.submitter_user.created_at),
    s.karma = article.submitter_user.karma,
    s.avatar_url = "https://lobste.rs" + article.submitter_user.avatar_url
MERGE (i:User {username: article.submitter_user.invited_by_user})
MERGE (i)<-[:INVITED_BY]-(s)
MERGE (a:Article {short_id: article.short_id})
  SET a.url = article.url,
    a.score = article.score,
    a.created = DateTime(article.created_at),
    a.title = article.title,
    a.comments = article.comments_url
MERGE (s)-[:SUBMITTED]->(a)
WITH article, a
UNWIND article.tags AS tag
MERGE (t:Tag {name: tag})
MERGE (a)-[:HAS_TAG]->(t)
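
To make the duplicate-avoiding behavior of MERGE concrete, here is a small in-memory model in Python. It is not Neo4j and not how MERGE is implemented; it only mimics the get-or-create idea, with ON CREATE SET applied once and plain SET applied on every run. All helper names and the sample data are made up for illustration.

```python
# In-memory model of MERGE semantics (illustration only, not Neo4j itself).
nodes = {}    # (label, key value) -> property dict
rels = set()  # (start node id, relationship type, end node id)


def merge_node(label, key, value, on_create=None):
    """Return the node identified by (label, value), creating it only if missing."""
    node_id = (label, value)
    if node_id not in nodes:
        # Mirrors ON CREATE SET: extra properties are written only on creation.
        nodes[node_id] = {key: value, **(on_create or {})}
    return node_id


def merge_rel(start, rel_type, end):
    """Add the relationship unless an identical one already exists."""
    rels.add((start, rel_type, end))


def import_article(article):
    user = article["submitter_user"]
    s = merge_node("User", "username", user["username"],
                   on_create={"karma": user["karma"]})
    a = merge_node("Article", "short_id", article["short_id"])
    nodes[a]["score"] = article["score"]  # plain SET: refreshed on every run
    merge_rel(s, "SUBMITTED", a)
    for tag in article["tags"]:
        t = merge_node("Tag", "name", tag)
        merge_rel(a, "HAS_TAG", t)


# Made-up submission; the second import simulates a later run with a new score.
sample = {"short_id": "abc123", "score": 5, "tags": ["graphs", "databases"],
          "submitter_user": {"username": "alice", "karma": 10}}

import_article(sample)
import_article({**sample, "score": 9})

print(len(nodes), len(rels))  # 4 3
```

Importing the same submission twice leaves four nodes and three relationships, while the article's score reflects the latest run. This is why the import statement can be executed repeatedly on a schedule.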
Our workflow is defined in .github/workflows/lobsters.yml. In this YAML file we will use the actions/checkout@v2 action to check out our repo, then use the githubocto/flat@v2 Action to fetch our Lobsters JSON file and check it into our repo as newest.json:

name: Lobsters Data Import
on:
  push:
    paths:
      - .github/workflows/lobsters.yml
  workflow_dispatch:
  schedule:
    - cron: '*/60 * * * *'
jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repo
        uses: actions/checkout@v2
      - name: Fetch newest
        uses: githubocto/flat@v2
        with:
          http_url: https://lobste.rs/newest.json
          downloaded_filename: newest.json

Once this workflow has run, our repo will contain a newest.json file with the data from Lobsters.

Next, we'll add GitHub secrets named NEO4J_USER, NEO4J_PASSWORD, and NEO4J_URI with the connection credentials specific to the Neo4j Aura instance we created earlier.

The contents of the JSON file are passed to the query as the Cypher parameter $value, so our Cypher import statement just needs to reference this Cypher parameter to work with the data fetched by Flat Data in the previous step of our Action. We'll reuse the Cypher statement we wrote for apoc.load.json, but adapt it to use this convention. We'll also reference the Neo4j Aura connection credentials we defined as GitHub secrets. Let's update our lobsters.yml file:

name: Lobsters Data Import
on:
  push:
    paths:
      - .github/workflows/lobsters.yml
  workflow_dispatch:
  schedule:
    - cron: '*/60 * * * *'
jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repo
        uses: actions/checkout@v2
      - name: Fetch newest
        uses: githubocto/flat@v2
        with:
          http_url: https://lobste.rs/newest.json
          downloaded_filename: newest.json
      - name: Neo4j import
        uses: johnymontana/flat-[email protected]
        with:
          neo4j-user: ${{secrets.NEO4J_USER}}
          neo4j-password: ${{secrets.NEO4J_PASSWORD}}
          neo4j-uri: ${{secrets.NEO4J_URI}}
          filename: newest.json
          cypher-query: >
            UNWIND $value AS article
            MERGE (s:User {username: article.submitter_user.username})
              ON CREATE SET s.about = article.submitter_user.about,
                s.created = DateTime(article.submitter_user.created_at),
                s.karma = article.submitter_user.karma,
                s.avatar_url = "https://lobste.rs" + article.submitter_user.avatar_url
            MERGE (i:User {username: article.submitter_user.invited_by_user})
            MERGE (i)<-[:INVITED_BY]-(s)
            MERGE (a:Article {short_id: article.short_id})
              SET a.url = article.url,
                a.score = article.score,
                a.created = DateTime(article.created_at),
                a.title = article.title,
                a.comments = article.comments_url
            MERGE (s)-[:SUBMITTED]->(a)
            WITH article, a
            UNWIND article.tags AS tag
            MERGE (t:Tag {name: tag})
            MERGE (a)-[:HAS_TAG]->(t)
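
To try the $value convention locally before relying on the Action, something like the following Python sketch could be used. It assumes the official neo4j Python driver is installed (pip install neo4j) and that newest.json sits in the working directory. The helper names load_value and run_import are hypothetical, the query is a shortened version of the one above, and this is not a description of how the import Action works internally.

```python
import json

# Shortened version of the import query from this post, kept brief for illustration.
IMPORT_QUERY = """
UNWIND $value AS article
MERGE (a:Article {short_id: article.short_id})
SET a.title = article.title, a.url = article.url
WITH article, a
UNWIND article.tags AS tag
MERGE (t:Tag {name: tag})
MERGE (a)-[:HAS_TAG]->(t)
"""


def load_value(path="newest.json"):
    """Read the JSON file that will be bound to the $value Cypher parameter."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def run_import(uri, user, password, value, query=IMPORT_QUERY):
    """Run the query with $value set to the parsed JSON; return nodes created."""
    # Imported lazily so the rest of the module works without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            # Keyword arguments to run() are passed as Cypher parameters.
            summary = session.run(query, value=value).consume()
            return summary.counters.nodes_created
```

Calling run_import with the values stored in NEO4J_URI, NEO4J_USER, and NEO4J_PASSWORD and the result of load_value() should report how many nodes were created. A second run over the same file would be expected to report zero, since MERGE only creates what is missing.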