Last active
April 8, 2018 02:01
Get all movies in your Letterboxd watchlist
require "mechanize"

# Read the Letterboxd username from the environment.
# abort prints the message to stderr and exits with a non-zero status.
USERNAME = ENV.fetch("USERNAME") do
  abort "Letterboxd USERNAME environment variable must be supplied"
end

WATCHLIST = "http://letterboxd.com/%s/watchlist/" % USERNAME

agent = Mechanize.new
root_page = agent.get(WATCHLIST)

# Get all pages of the watchlist
page_links = root_page.search(".paginate-page a")
pages = page_links.each_with_object([root_page]) do |link, memo|
  url = "http://letterboxd.com%s" % link.attribute("href").value
  memo << agent.get(url)
end

# Scrape all movie names from the watchlist pages
movies = pages.each_with_object([]) do |page, memo|
  page.search("li.poster-container div img").each do |movie|
    memo << movie.attribute("alt").value
  end
end

# Sort case-insensitively
movies = movies.sort_by { |title| title.upcase }

# Output
movies.each do |movie|
  puts movie
end
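
The sort step upcases each title before comparing, so that lowercase titles are not pushed to the end of the list. A minimal sketch of the difference, using a hypothetical `sort_titles` helper and made-up sample titles (neither is part of the original script):

```ruby
# Hypothetical helper mirroring the script's sort step:
# compare titles by their upcased form so that case is ignored.
def sort_titles(titles)
  titles.sort_by { |title| title.upcase }
end

# Made-up sample data; the real script gets titles from the scraped pages.
sample = ["the Thing", "Alien", "brazil", "Zodiac"]

# A plain sort compares by byte value, so "Zodiac" lands before "brazil".
plain = sample.sort

# Sorting on the upcased title gives a case-insensitive ordering.
ignoring_case = sort_titles(sample)

puts plain.inspect
puts ignoring_case.inspect
```

The original strings are left unchanged; only the sort key is upcased.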