Created August 29, 2011 17:15
Python script with Beautiful soup to rip Tweets from twitter account, prints as HTML for inclusion in webpage with Date/Time stamp
#!/usr/bin/python
#Original from here: http://code.activestate.com/recipes/576594/
#rips Tweets from twitter account, prints as HTML for inclusion in webpage with Date/Time stamp
import time
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup
# Replace USERNAME with your twitter username
url = u'http://twitter.com/USERNAME?page=%s'
for x in range(10*10000):
    f = urlopen(url % x)
    soup = BeautifulSoup(f.read())
    f.close()
    # print soup #WHAT'S GOING ON?
    tweets = soup.findAll('span', {'class': 'entry-content'})
    dates = soup.findAll('span',{'class': 'published timestamp'})
    for index, tweet in enumerate(tweets):
        print "<h3>",dates[index].renderContents(),"</h3><p>",tweet.renderContents(),"</p>"
        # print tweet.renderContents()
    if len(tweets) == 0:
        break
    # being nice to twitter's servers
    time.sleep(5)
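The extraction step in the script (find every `span` with class `entry-content`, find every `span` with class `published timestamp`, then print them in pairs) can be tried without touching the network. The following is a minimal sketch in Python 3 that swaps BeautifulSoup for the standard library's `html.parser`, so it is not the script's own method. `SAMPLE_HTML` is a made-up fragment that only imitates the 2011-era class names, and `SpanTextCollector`, `extract`, and `tweets_to_html` are hypothetical helper names. Unlike `renderContents()`, this sketch keeps only the text of each span and drops inner markup such as links.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical sample imitating the class names the script searches for;
# this is not real Twitter output.
SAMPLE_HTML = """
<div>
  <span class="entry-content">First tweet with a <a href="http://example.com">link</a></span>
  <span class="published timestamp">10:15 AM Aug 29th</span>
  <span class="entry-content">Second tweet</span>
  <span class="published timestamp">9:02 AM Aug 28th</span>
</div>
"""


class SpanTextCollector(HTMLParser):
    """Collect the text inside every <span> whose class attribute equals target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.results = []
        self._depth = 0    # span nesting depth inside a matching span; 0 means outside
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            # A nested span inside a span we are already collecting.
            self._depth += 1
        elif dict(attrs).get("class") == self.target_class:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract(html, target_class):
    """Return the text of each matching span, in document order."""
    parser = SpanTextCollector(target_class)
    parser.feed(html)
    parser.close()
    return parser.results


def tweets_to_html(html):
    """Pair each timestamp with its tweet and format them as <h3>/<p> lines."""
    tweets = extract(html, "entry-content")
    dates = extract(html, "published timestamp")
    # zip() stops at the shorter list, so a missing timestamp cannot raise IndexError.
    return ["<h3>%s</h3><p>%s</p>" % (escape(d), escape(t))
            for d, t in zip(dates, tweets)]


if __name__ == "__main__":
    for line in tweets_to_html(SAMPLE_HTML):
        print(line)
```

If `extract` returns an empty list for a page you fetched yourself, the class names are no longer present in the markup, which would explain a run that prints nothing.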
No, you will not need authentication as you are just reading HTML that is publicly accessible. Twitter has just changed their site enough that the tags this script searches for no longer exist.
While not perfect, this Stack Overflow answer is more up to date. However, I would recommend just getting Twitter API access.
Update: I've mashed the two together into a fully working parser that can handle links (the Stack Overflow answer cannot). Here's the gist
I get no results printed out. Does this need authentication? I did use a user with public tweets for my test.