This code snippet takes a GitHub organization name as input, crawls the organization's public repository listing page by page, and prints a list of the Git clone URLs for those repositories.
import itertools
import re
import requests as rq

# Your GitHub organization, with a leading slash (e.g. "/github")
organization = "/<company_name>"

response = rq.get("https://github.com{0}".format(organization))
# Fall back to a single page if the pagination attribute is not found.
match = re.search(r"data-total-pages=\"(\d+)\"", response.text)
pages = int(match.group(1)) if match else 1

repositoryUrls = []
for page in range(1, pages + 1):
    response = rq.get("https://github.com{}?page={}".format(organization, page))
    # Capture only the repository name; stop at the closing quote of the href.
    pattern = r"href=\"" + re.escape(organization) + r"/([^\"/]+)\" itemprop"
    repositoryUrls.append(re.findall(pattern, response.text))

# Flatten the per-page lists into one list, then build the clone URLs.
repositoryUrls = list(itertools.chain.from_iterable(repositoryUrls))
repositoryUrls = ["https://github.com" + organization + "/{0}.git".format(repo) for repo in repositoryUrls]
print(repositoryUrls)
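
The two regular expressions the script relies on can be tried offline, without contacting GitHub. The sketch below is only an illustration: the helper names `total_pages` and `clone_urls`, the organization `/example-org`, and the sample markup are all hypothetical, and the markup is not guaranteed to match what GitHub currently serves.

```python
import re


def total_pages(html):
    """Return the value of data-total-pages, or 1 if the attribute is missing."""
    match = re.search(r'data-total-pages="(\d+)"', html)
    return int(match.group(1)) if match else 1


def clone_urls(html, organization):
    """Find repository links under `organization` and build their clone URLs."""
    pattern = r'href="' + re.escape(organization) + r'/([^"/]+)" itemprop'
    return [
        "https://github.com{}/{}.git".format(organization, name)
        for name in re.findall(pattern, html)
    ]


# Hypothetical markup, written only to exercise the patterns above.
sample_html = """
<a href="/example-org/alpha" itemprop="name codeRepository">alpha</a>
<a href="/example-org/beta" itemprop="name codeRepository">beta</a>
<div class="pagination" data-total-pages="3"></div>
"""

print(total_pages(sample_html))
print(clone_urls(sample_html, "/example-org"))
```

Keeping the parsing in small functions like these makes it easy to see when a change in the page markup stops the patterns from matching: an empty list or a page count of 1 on a large organization is a hint that the regular expressions need updating.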