@bahuma20
Last active July 7, 2023 22:19
Import Mastodon posts into Neo4j using Cypher and APOC
// Required parameters:
// - instance -> Hostname of the Mastodon instance, for example "mastodon.social"
// - account -> Mastodon username on that instance, for example "bahuma20"
CALL apoc.periodic.commit('
// Create the platform node for this instance
MERGE (platform:Platform {instance: $instance, type: "mastodon"})
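// Find the newest post already imported for this platform and keep its id as "minId",
// so that each run only requests posts newer than it (0 means nothing was imported yet).
// Note: the lookup is per platform, not per account, and ids are compared as strings.
WITH platform
OPTIONAL MATCH (existingPost:Post)-[:CREATED_ON_PLATFORM]->(platform)
WITH existingPost, platform ORDER BY existingPost.id DESC LIMIT 1
WITH coalesce(existingPost.id, 0) AS minId, platform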
// Get user ID by username
CALL apoc.load.json("https://"+ $instance +"/api/v1/accounts/lookup?acct=" + $account) YIELD value AS userInfo
// Load up to 40 of the posts newer than minId from the user (boosts and replies are excluded)
WITH "https://"+ $instance +"/api/v1/accounts/"+ userInfo.id +"/statuses?limit=40&exclude_reblogs=1&exclude_replies=1&min_id=" + minId AS uri, platform
CALL apoc.load.json(uri) YIELD value AS p
// Create the post itself.
// Note that ids are platform-specific, so a post must always be looked up together with its CREATED_ON_PLATFORM relationship
MERGE (post:Post {id: p.id})-[:CREATED_ON_PLATFORM]->(platform)
ON CREATE SET post.content = p.content,
  post.visibility = p.visibility,
  post.sourceUrl = p.uri,
  post.created = datetime(p.created_at),
  post.isSensitive = p.sensitive,
  post.contentWarning = p.spoiler_text
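// The API may report no language, and MERGE fails on a null property value
FOREACH (ignoreMe IN CASE WHEN p.language IS NOT NULL THEN [1] ELSE [] END |
  MERGE (language:Language {name: p.language})
  MERGE (post)-[:WRITTEN_IN_LANGUAGE]->(language)
)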
MERGE (account:Account {username: $account, instance: $instance, type:"mastodon"})
MERGE (post)-[:CREATED_BY_ACCOUNT]->(account)
MERGE (account)-[:IS_ON_PLATFORM]->(platform)
// Thread support: link a reply to the post it answers (the target may be a stub that only has an id)
FOREACH (ignoreMe IN CASE WHEN p.in_reply_to_id IS NOT NULL THEN [1] ELSE [] END |
  MERGE (repliedToPost:Post {id: p.in_reply_to_id})-[:CREATED_ON_PLATFORM]->(platform)
  MERGE (post)-[:IS_REPLY_TO]->(repliedToPost)
)
// Client application used to write the post, if the API reports one
FOREACH (ignoreMe IN CASE WHEN p.application IS NOT NULL THEN [1] ELSE [] END |
  MERGE (application:Application {name: p.application.name})
  ON CREATE SET application.website = p.application.website
  MERGE (post)-[:WAS_CREATED_WITH_APPLICATION]->(application)
)
FOREACH (t IN p.tags |
  MERGE (tag:Tag {name: toLower(t.name)})
  MERGE (post)-[:HAS_TAG]->(tag)
)
FOREACH (m IN p.media_attachments |
  MERGE (attachment:Attachment {id: m.id})-[:WAS_UPLOADED_TO_PLATFORM]->(platform)
  ON CREATE SET attachment.type = m.type,
    attachment.sourceUrl = m.url,
    attachment.sourcePreviewUrl = m.preview_url,
    attachment.description = m.description
  MERGE (post)-[:HAS_ATTACHMENT]->(attachment)
)
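// apoc.periodic.commit reruns this statement in a new transaction until it returns 0,
// which happens once the API has no posts newer than minId
RETURN count(p)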
', {
instance: $instance,
account: $account
});
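
// Usage sketch (not part of the original script). It assumes Neo4j Browser or cypher-shell,
// where parameters can be set with the :param command before the import above is run,
// and that the APOC plugin is installed. The index names are hypothetical, and the
// CREATE INDEX ... IF NOT EXISTS form assumes a reasonably recent Neo4j 4.x or 5.x server.
// The indexes are optional; they are only meant to speed up the MERGE lookups.
:param instance => "mastodon.social"
:param account => "bahuma20"

CREATE INDEX post_id IF NOT EXISTS FOR (p:Post) ON (p.id);
CREATE INDEX tag_name IF NOT EXISTS FOR (t:Tag) ON (t.name);

// Once the import has finished, list the ten most recent imported posts of the account.
// The relationship is matched without a direction, so the query does not depend on how it is stored.
MATCH (account:Account {username: $account, instance: $instance})-[:CREATED_BY_ACCOUNT]-(post:Post)
RETURN post.created AS created, post.sourceUrl AS url
ORDER BY created DESC
LIMIT 10;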