1.3 - The Bash Environment
The GNU Bourne-Again SHell (Bash) provides a powerful environment to work in,
and a scripting engine that we can use to automate procedures using existing
Linux tools. Being able to quickly whip up a Bash script to automate a given task is an
essential skill for any security professional. In this module, we will gently
introduce you to Bash scripting with a theoretical scenario.
1.4 - Intro to Bash Scripting
1.4.1 - Practical Bash Usage - Example 1
Imagine you are tasked with finding all of the subdomains listed on the cisco.com index
page, and then finding their corresponding IP addresses. Doing this manually would be
frustrating and time-consuming. However, with some simple Bash commands, we can
turn this into an easy task. We start by downloading the cisco.com index page using the
wget command.
root@kali:~# wget www.cisco.com
--2013-04-02 16:02:56-- http://www.cisco.com/
Resolving www.cisco.com (www.cisco.com)... 23.66.240.170,
Connecting to www.cisco.com (www.cisco.com)|23.66.240.170|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 23419 (23K) [text/html]
Saving to: `index.html'
100%[=====================================>] 23,419 --.-K/s in 0.09s
2013-04-02 16:02:57 (267 KB/s) - `index.html' saved [23419/23419]
root@kali:~# ls -l index.html
-rw-r--r-- 1 root root 23419 Apr 2 16:02 index.html
Quickly looking over this file, we see entries that contain the information we need,
such as the one shown below:
<li><a href="http://newsroom.cisco.com/">Newsroom</a></li>
We start by using the grep command to extract all the lines in the file that contain the
string "href=", indicating that the line contains a link.
root@kali:~# grep "href=" index.html
The result is still a swamp of HTML, but notice that most of the lines have a similar
structure, and can be split conveniently using the "/" character as a delimiter. To
specifically extract domain names from the file, we can try using the cut command with
"/" as the delimiter, selecting the 3rd field.
root@kali:~# grep "href=" index.html | cut -d "/" -f 3
The output we get is far from optimal, and has probably missed quite a few links on the
way, but let’s continue. Our text now includes entries such as the following:
about
solutions
ordering
siteassets
secure.opinionlab.com
help
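To see where entries such as "about" come from, it can help to run cut on single lines. The sketch below uses the sample link shown earlier plus one relative link that is invented for illustration; it is not taken from the real index page.

```shell
# The sample line from the index page shown earlier.
line='<li><a href="http://newsroom.cisco.com/">Newsroom</a></li>'

# Splitting on "/" gives: 1='<li><a href="http:'  2=''  3='newsroom.cisco.com'
field3=$(printf '%s\n' "$line" | cut -d "/" -f 3)
echo "$field3"

# An invented relative link: there is no "//", so field 3 is a path component.
relative='<li><a href="/web/about/index.html">About</a></li>'
relative_field3=$(printf '%s\n' "$relative" | cut -d "/" -f 3)
echo "$relative_field3"
```

The second case is why the list still needs filtering: cut only counts delimiters and has no idea whether the third field is a host name.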
Next, we will clean up our list to include only domain names. Use grep to keep only
the lines that contain a period, to get cleaner output.
root@kali:~# grep "href=" index.html | cut -d "/" -f 3 | grep "\."
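The backslash in "\." matters, because an unescaped period in a grep pattern matches any single character. A small sketch on a few of the entries listed above shows the difference:

```shell
# A few entries from the field-3 output shown earlier.
entries='about
solutions
secure.opinionlab.com
help'

# "\." matches a literal period, so only the domain name survives.
with_dot=$(printf '%s\n' "$entries" | grep "\.")
echo "$with_dot"

# Unescaped, "." matches any character, so every non-empty line is counted.
unescaped_count=$(printf '%s\n' "$entries" | grep -c ".")
echo "$unescaped_count"
```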
Our output is almost clean; however, we now have entries that look like the following:
learningnetwork.cisco.com">Learning Network<
We can clean these up by using the cut command again, this time with the double quote
as the delimiter, keeping the first field.
root@kali:~# grep "href=" index.html | cut -d "/" -f 3 | grep "\." | cut -d '"' -f 1
Now we have a nice clean list, but lots of duplicates. We can clean these out by using
the sort command, with the unique (-u) option.
root@kali:~# grep "href=" index.html | cut -d "/" -f 3 | grep "\." | cut -d '"' -f 1 | sort -u
blogs.cisco.com
communities.cisco.com
csr.cisco.com
developer.cisco.com
grs.cisco.com
home.cisco.com
investor.cisco.com
learningnetwork.cisco.com
newsroom.cisco.com
secure.opinionlab.com
socialmedia.cisco.com
supportforums.cisco.com
tools.cisco.com
www.cisco.com
www.ciscolive.com
www.meraki.com
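The whole pipeline can be tried offline by wrapping it in a function and feeding it a few lines of HTML. This is only a sketch: extract_hosts is a hypothetical helper name, and the sample input is invented to include a duplicate, a relative link, and a link without a trailing slash.

```shell
# extract_hosts is a hypothetical helper; it reads HTML on standard input.
extract_hosts() {
    grep "href=" | cut -d "/" -f 3 | grep "\." | cut -d '"' -f 1 | sort -u
}

# Invented sample input (not the real cisco.com index page).
html='<li><a href="http://newsroom.cisco.com/">Newsroom</a></li>
<li><a href="http://blogs.cisco.com/">Blogs</a></li>
<li><a href="http://newsroom.cisco.com/news">News</a></li>
<li><a href="/web/about/index.html">About</a></li>
<li><a href="http://learningnetwork.cisco.com">Learning Network</a></li>'

hosts=$(printf '%s\n' "$html" | extract_hosts)
echo "$hosts"
```

Each stage does one job: the first grep keeps link lines, the first cut picks the host field, the second grep drops path components, the second cut trims trailing markup, and sort -u removes duplicates.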
An even cleaner way of doing this would be to bring a touch of regular expressions
into our command, redirecting the output into a text file, as shown below:
root@kali:~# grep -o 'http://[^"]*' index.html | cut -d "/" -f 3 | sort -u > list.txt
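One reason the regular expression version is cleaner is that grep -o prints every match, so several links on one line are all picked up. The sketch below is a variation, not the command above: it assumes you also want https:// links, uses -E so the "s" can be optional, and runs on invented sample input. extract_hosts_re is a hypothetical helper name.

```shell
# extract_hosts_re is a hypothetical helper; it reads HTML on standard input.
extract_hosts_re() {
    # -o prints only the matched text; -E enables "?" so the "s" is optional.
    grep -oE 'https?://[^"]*' | cut -d "/" -f 3 | sort -u
}

# Invented sample input: two links on one line, plus a relative link.
html='<a href="http://newsroom.cisco.com/">Newsroom</a> <a href="https://tools.cisco.com/x">Tools</a>
<a href="/web/about/index.html">About</a>'

hosts_re=$(printf '%s\n' "$html" | extract_hosts_re)
echo "$hosts_re"
```

Because the pattern requires a scheme, relative links never reach cut, so the separate grep "\." step is no longer needed.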
Now we have a nice, clean list of domain names linked from the front page of cisco.com.
Our next step will be to use the host command on each domain name in the text file
we created, in order to discover their corresponding IP addresses. We can use a Bash
one-liner loop to do this for us:
root@kali:~# for url in $(cat list.txt); do host $url; done
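The one-liner relies on word splitting of the file contents, which is fine for a list of host names. For use in a script, a while read loop is a common alternative. The following is a sketch under assumptions: resolve_list is a hypothetical helper, and its optional second argument (the lookup command) exists only so the loop can be tried without network access.

```shell
# resolve_list is a hypothetical helper: $1 is the list file, $2 the lookup
# command to run on each name (defaults to host, as in the one-liner).
resolve_list() {
    list_file=$1
    lookup=${2:-host}
    # read -r takes one line at a time, without word splitting or globbing.
    while IFS= read -r name; do
        [ -n "$name" ] || continue        # skip blank lines
        # </dev/null keeps the lookup from consuming the list on stdin.
        "$lookup" "$name" </dev/null
    done < "$list_file"
}

# Offline demo: use echo in place of host so no DNS queries are made.
list=$(mktemp)
printf 'www.cisco.com\n\nblogs.cisco.com\n' > "$list"
resolved=$(resolve_list "$list" echo)
echo "$resolved"
rm -f "$list"
```

Calling resolve_list list.txt with no second argument runs host on each name, matching the behaviour of the loop above.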
The host command gives us all sorts of output, not all of it relevant. We want to extract
just the IP addresses from all of this information, so we pipe the output into grep,
looking for the text "has address", then cut and sort the output.
root@kali:~# for url in $(cat list.txt); do host $url; done | grep "has address" | cut -d " " -f 4 | sort -u
128.30.52.37
136.179.0.2
141.101.112.4
…
206.200.251.19
23.63.101.114
23.63.101.80
23.66.240.170
23.66.251.95
50.56.191.136
64.148.82.50
66.187.208.213
67.192.93.178
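Note that sort -u compares the addresses as text, which is why 206.200.251.19 appears before 23.63.101.114 in the list above. If numeric order is preferred, sort can be told to compare each octet as a number. The sketch below runs the same filter on hand-written lines in the format host prints; the names and addresses are placeholders from documentation ranges, not real lookup results.

```shell
# Hand-written sample in the format of host output (placeholder data).
sample='www.example.net is an alias for web.example.net.
web.example.net has address 198.51.100.20
web.example.net has address 192.0.2.7
web.example.net has IPv6 address 2001:db8::1
mail.example.net has address 192.0.2.7
example.net mail is handled by 10 mail.example.net.'

# Keep the IPv4 lines, take field 4, then sort octet by octet as numbers.
ips=$(printf '%s\n' "$sample" \
    | grep "has address" \
    | cut -d " " -f 4 \
    | sort -u -t . -k1,1n -k2,2n -k3,3n -k4,4n)
echo "$ips"
```

The grep pattern "has address" skips the alias, IPv6, and mail lines, so field 4 is always an IPv4 address by the time cut runs.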
My script: https://github.com/tjt263/nsextract/blob/master/nsextract.sh