-
Introduction
As mentioned in the previous lecture, penetration testing follows a fixed methodology. In this course, we use the methodology defined in The Basics of Hacking and Penetration Testing by P. Engebretson [1], named the Zero-Entry Hacking Methodology. Others exist, but they all follow a similar structure.
The first phase of this methodology is called Reconnaissance and it involves identifying as much information as possible regarding the target system or network. This is also described as Information Gathering.
In general, reconnaissance is a non-intrusive, systematic method employed by hackers, ethical and non-ethical, to accumulate data about a specific target network passively (without the target's knowledge), usually with the goal of finding ways to intrude into the environment. Reconnaissance involves footprinting the target company (i.e. establishing a blueprint of its security profile).
This is without a doubt the most important phase of a penetration test. It is thought that an attacker spends 90% of the time in profiling an organization and the remaining 10% in launching the attack.
Why is it necessary? It is crucial to systematically and methodically ensure that all pieces of information related to the target are identified. The tester must harvest information to execute a focused attack. This information includes domain names, network services and applications, system architecture, intrusion detection systems, specific IP addresses, phone numbers, contact addresses, and authentication mechanisms.
Gathering information means collecting as much knowledge about the target's network as possible before any scanning tasks take place. The effectiveness of the information gathering process is directly related to the success of an attack.
The initial information is collected by compiling information from open sources, either through running utilities, or manually researching public information about the target (e.g., website, trade papers, Usenet, financial databases, or even from disgruntled employees).
Information gathering can be both passive and active. Passive information gathering is done by finding out details that are freely available over the Internet and by various other techniques without directly coming in contact with the organisation's servers.
Reviewing the target's own website and other informative websites is an exception: it does involve contact with the target's servers, but the activity looks like ordinary browsing and so does not raise suspicion.
Calling the help desk and attempting to social engineer them out of privileged information is an example of active information gathering. Also, the next phase of the methodology, Scanning, is a type of active reconnaissance. This will be covered in the next chapter. Here, focus will be given to passive techniques.
-
Open Source Intelligence
According to Michael Bazzell, author of the book Open Source Intelligence Techniques [2], Open Source Intelligence (OSINT) is defined as any evidence produced from publicly available information that is collected, exploited and disseminated in a timely manner to an appropriate audience for the purpose of addressing specific intelligence requirements. This definition is generic, but it can be applied in the context of penetration testing. During the reconnaissance phase, OSINT can be seen as any public information that can aid towards the completion of a penetration test. Open source intelligence is extracted by using any publicly available means such as Social Networks, Search Engines, Online Communities, Documents, Photographs, Maps, Videos and more. Have a look at the author’s web site [3] which provides an excellent reference for information on OSINT.
-
Social Networks
-
Google
Google searches provide a rich source of information for passive reconnaissance. Google's advanced search operators can filter results down to specific data and so expose sensitive information. Advanced search strings can be quite sophisticated, but they yield a range of interesting information. This practice is often called Google hacking. An excellent resource, the Google Hacking Database [4], is maintained by the Offensive Security team (the creators of Kali Linux). See the reference section for further information.
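As a rough illustration of what an advanced search string looks like, the sketch below composes queries from a few widely used Google operators (site:, filetype:, intitle:, inurl:). The function name build_dork and the domain example.com are placeholders chosen for this example, not something taken from the GHDB.

```python
def build_dork(domain, filetype=None, intitle=None, inurl=None, phrase=None):
    """Compose a Google advanced-search string from common operators."""
    parts = [f"site:{domain}"]          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")   # e.g. pdf, xls, doc
    if intitle:
        parts.append(f'intitle:"{intitle}"')   # words in the page title
    if inurl:
        parts.append(f"inurl:{inurl}")         # words in the URL
    if phrase:
        parts.append(f'"{phrase}"')            # exact phrase in the page
    return " ".join(parts)


# Hypothetical target domain, used only for illustration.
queries = [
    build_dork("example.com", filetype="pdf"),
    build_dork("example.com", intitle="index of"),
    build_dork("example.com", inurl="login"),
]
for query in queries:
    print(query)
```

The resulting strings are pasted into the search box by hand; the search itself is performed by Google, so nothing is sent to the target.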
-
Archive of a website
You can browse archived copies of a company's website, often dating back to the time it was launched, at www.archive.org. You can see updates made to the website and look for employee directories, past products, press releases, contact information, and more.
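The archive's Wayback Machine addresses its captures with URLs of the form web.archive.org/web/<timestamp>/<site>, and it redirects to the capture closest to the requested timestamp. A minimal sketch that builds such a URL is shown below; the helper name wayback_url and the site example.com are assumptions made for this example.

```python
from datetime import date


def wayback_url(site, when):
    """Return a Wayback Machine URL for the capture closest to `when`."""
    stamp = when.strftime("%Y%m%d")   # the timestamp is written as YYYYMMDD
    return f"https://web.archive.org/web/{stamp}/{site}"


# Hypothetical target site; open the printed URL in a browser.
print(wayback_url("example.com", date(2010, 1, 1)))
```

Stepping the date back year by year gives a quick view of how the site, and the information published on it, has changed.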
-
Job Sites
Job sites are an excellent source of information, and in many cases they immediately provide details of the target company's infrastructure. Look for infrastructure-related postings such as "looking for system administrator to manage Solaris network". Look for: job requirements, employee profiles, hardware information and software information.
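Once a number of postings have been collected, they can be scanned for technology names. The sketch below is one simple way to do that; the keyword list and the function name technologies_mentioned are illustrative assumptions and would need tailoring to each engagement.

```python
import re

# Hypothetical watch list of technology names; extend as needed.
TECH_KEYWORDS = [
    "Solaris", "Windows Server", "Cisco", "Oracle", "Apache", "Active Directory",
]


def technologies_mentioned(posting, keywords=TECH_KEYWORDS):
    """Return the keywords that appear as whole words in a job posting."""
    found = []
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, posting, flags=re.IGNORECASE):
            found.append(keyword)
    return found


posting = (
    "Looking for system administrator to manage Solaris network "
    "and Oracle databases."
)
print(technologies_mentioned(posting))
```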
-
DNS
The Domain Name System (DNS) is a hierarchical database that stores data about domain names and IP addresses. DNS enumeration is the process of locating all the DNS servers and their records. A company may have internal and external DNS servers that can yield target information such as computer names and IP addresses. Several tools can be used to extract information from DNS servers. An example is given next.
NSLookup
Once a target's DNS servers are known, an attacker can begin extracting information from them. Nslookup is a tool that queries DNS servers for records and displays information that can be used to diagnose DNS infrastructure.
Output can give useful information, such as system names and IP addresses of the systems.
To interrogate DNS servers, invoke nslookup by typing nslookup at the command prompt. When run, it displays the host name and IP address of the DNS server that is configured for the local system. Non-interactive mode is used to print the name and requested information for a single host or domain. Interactive mode can be used to query name servers for data about several hosts and domains. In a zone transfer, the nslookup program asks the DNS server to transmit all the information it has about a given domain. The example in Figure 1 shows a search for the gcu.ac.uk domain.
Another popular DNS query tool (for Linux) is Dig.
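Both nslookup and dig work by sending DNS query messages to a name server. To show what such a query contains, the sketch below builds the bytes of a single question in the RFC 1035 wire format. It only constructs the message and does not send it; the function name build_dns_query and the fixed transaction ID are assumptions made for this example.

```python
import struct


def build_dns_query(name, qtype=1, txid=0x1234):
    """Build a DNS query message (RFC 1035). qtype 1 asks for an A record."""
    # Header: ID, flags (0x0100 = recursion desired), one question,
    # and zero answer, authority and additional records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE, then QCLASS (1 = IN, the Internet class).
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_dns_query("gcu.ac.uk")
print(len(packet), packet.hex())
```

A real tool would send these bytes over UDP to port 53 of the name server and decode the reply; other record types (for example MX or NS) are requested by changing qtype.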
-
Whois
The Whois databases contain information about the assignment of Internet addresses, domain names, registrars, and individual contacts. The Internet Network Information Center (InterNIC) Whois database lists the registrars of websites, searchable by the organisation's name or domain name. Once you have the registrar's name, you can go to the registrar's site and get more information: contact details of the administrators, registration dates and the addresses of the domain's DNS servers.
Whois is usually the first step in reconnaissance, supplying the target's domain registrant, its administrative and technical contacts, and a list of its domain servers, which can then be used to perform DNS enumeration.
Whois searches also locate details of a network's autonomous system numbers, network-related handles, and other related points of contact. Whois is the primary tool used to navigate these registration databases. As domain allocation is deregulated, it is advisable to use different Whois tools to obtain a complete picture.
An example of a Whois lookup engine is given in the reference section [5].
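Whois replies are plain text, commonly laid out as "Key: value" lines, although the exact layout differs between registries. The sketch below pulls the registrar and name servers out of such a reply. The sample text is invented for illustration, and the function name parse_whois is an assumption.

```python
# Invented sample reply; real output varies from registry to registry.
SAMPLE_WHOIS = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 1995-08-14T04:00:00Z
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
"""


def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists (keys may repeat)."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")   # split at the first colon
        if sep and value.strip():
            record.setdefault(key.strip().lower(), []).append(value.strip())
    return record


record = parse_whois(SAMPLE_WHOIS)
print(record["registrar"])
print(record["name server"])
```

The name servers recovered this way are the natural starting point for the DNS enumeration described earlier.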
Locating network range with Whois
Entering the IP address of the target's web server that you discovered earlier into the Europe-based Whois database can help to identify the number and range of IP addresses assigned to the target (Figure 2).
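A Whois reply of this kind usually reports the range as a first and a last address. The sketch below, using Python's standard ipaddress module, counts the addresses in such a range and converts it to CIDR blocks. The addresses come from a range reserved for documentation and the function name describe_range is an assumption.

```python
import ipaddress


def describe_range(first, last):
    """Return the address count and the CIDR blocks covering first..last."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    count = int(end) - int(start) + 1
    blocks = list(ipaddress.summarize_address_range(start, end))
    return count, blocks


# Documentation-only addresses, standing in for a real Whois result.
count, blocks = describe_range("192.0.2.0", "192.0.2.255")
print(count, [str(block) for block in blocks])
```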
-
Network Path
Traceroute can be employed to determine the path taken by packets across an IP network from source to destination, and it also identifies the routers along that path. Traceroute operates by sending a series of probe packets (Internet Control Message Protocol (ICMP) echo requests on Windows, UDP datagrams by default on most Unix-like systems) towards the destination and recording which hop (router or gateway) answers each one, until the destination address is reached. It uses the time-to-live (TTL) field of the IP header (used to limit the lifetime of IP datagrams) to illustrate the path packets travel between two hosts, by sending out consecutive packets with ever-increasing TTLs. Traceroute can reveal routers, their geographic location and the target's DNS entries.
TTL functions as a counter to track each router hop as the packet travels to the target. Each hop that a packet passes through reduces the TTL field by one. If the TTL reaches 0, the packet is discarded and an ICMP "time exceeded in transit" message is sent to inform the source of the failure. By using this information, an attacker determines the layout of a network and the location of each device.
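The TTL mechanism can be illustrated with a small simulation. The sketch below sends no packets; it only models a path of routers, each of which decrements the TTL, to show why probes with TTL 1, 2, 3 and so on reveal one hop at a time. The router addresses and the function name simulate_traceroute are invented for this example.

```python
def simulate_traceroute(routers, destination, max_hops=30):
    """Model probes with increasing TTL along a fixed list of routers."""
    results = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        reply = None
        for router in routers:
            remaining -= 1               # every router decrements the TTL
            if remaining == 0:
                # The packet is discarded here and the router reports back.
                reply = (ttl, router, "time exceeded")
                break
        if reply is None:
            # The TTL survived every router, so the probe reached the target.
            results.append((ttl, destination, "destination reached"))
            break
        results.append(reply)
    return results


# Invented path using documentation-only addresses.
path = ["192.0.2.1", "198.51.100.1"]
for hop in simulate_traceroute(path, "203.0.113.10"):
    print(hop)
```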
An example of a GUI alternative to Traceroute is VisualRoute. This is a graphical tool that determines where and how traffic is flowing on the route between the source and destination, by providing a geographical map of the route. It has the ability to identify the geographical location of routers, servers, and other IP devices. Another similar and popular tool is NeoTrace.
-
Email Tracking
E-mail provides another tool in the information gathering toolbox. E-mail spiders collect e-mail addresses by searching the Internet: a web spidering tool picks up e-mail addresses and stores them in a database. Although primarily employed to track incoming e-mails in order to identify spam and limit fraud, e-mail tracking programs are also useful in reconnaissance.
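The core of an e-mail spider is a pattern that recognises addresses in page text. A minimal sketch is given below; the regular expression is a deliberately simple approximation rather than a full address grammar, and the page content and the function name harvest_emails are invented for this example.

```python
import re

# Simplified pattern; it will miss some valid but unusual addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_emails(page_text):
    """Return unique e-mail addresses in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(page_text):
        address = match.lower()          # treat addresses case-insensitively
        if address not in seen:
            seen.append(address)
    return seen


# Invented page fragment.
page = (
    '<a href="mailto:Help.Desk@example.com">Help desk</a> '
    "or sales@example.com, help.desk@example.com"
)
print(harvest_emails(page))
```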
In addition to locating the real origin of an e-mail, it may be possible to identify the network provider and the ISP's IP address. One method of gaining information about the target's e-mail server is to send an e-mail to an invalid recipient address, which will bounce if the server does not have a catch-all account; the returned message carries headers that describe the servers that handled it.
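The path an e-mail (or a bounce) has taken is recorded in its Received headers, with each server adding its own line at the top. The sketch below reads those headers with Python's standard email module and lists the relay addresses. The message is invented, the function name relay_addresses is an assumption, and Received headers can be forged, so the result is a lead rather than proof.

```python
import re
from email import message_from_string

# Invented message using documentation-only names and addresses.
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [203.0.113.25])
    by mx.example.com with ESMTP; Mon, 1 Jan 2024 10:00:02 +0000
Received: from workstation (unknown [198.51.100.7])
    by mail.example.org with ESMTP; Mon, 1 Jan 2024 10:00:00 +0000
From: sender@example.org
To: recipient@example.com
Subject: test

body
"""

IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def relay_addresses(raw):
    """Return bracketed IPv4 addresses from Received headers, newest first."""
    message = message_from_string(raw)
    hops = []
    for header in message.get_all("Received", []):
        hops.extend(IP_IN_BRACKETS.findall(header))
    return hops


hops = relay_addresses(RAW_MESSAGE)
print(hops)
print("closest to sender:", hops[-1])
```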
-
Defending against reconnaissance
Defending against the initial stages of an attack can be difficult, as much of the information is open source data. The main objective, from a defensive point of view, is to avoid, where possible, publishing information that can be used against the target. One example is the automatic messages generated by web servers, such as the "404 Not Found" error. If left at its default, the error page can reveal the type of server in use and sometimes even the port number it serves on. The message should be edited and restricted to "Page Not Found".
Another important line of defence that any target system can adopt is proper configuration and implementation of its DNS. Inappropriate queries must be refused by the system, thereby preventing leakage of crucial information.
If the organisation has high security requirements, it can opt to register its domain in the name of a third party, as long as that party agrees to accept responsibility.
-
References and Further Readings
[1] – P. Engebretson, The Basics of Hacking and Penetration Testing: Ethical Hacking and Penetration Testing Made Easy, Syngress 2nd Ed., 2013.
[2] – Michael Bazzell, Open Source Intelligence Techniques, Resources for Searching and Analysing Online Information, 4th Ed., CCI Publishing, 2015.
[3] – IntelTechniques by Michael Bazzell. Online. Available at: https://inteltechniques.com/index.html
[4] – Offensive Security, Google Hacking Database (GHDB). Available at: https://www.offensive-security.com/community-projects/google-hacking-database/
[5] – Whois Lookup. Website. Available at: https://www.whois.com/whois/
-
Quiz
1. What is the difference between active and passive reconnaissance?
Answer:
Passive reconnaissance makes use of the vast amount of information available on the web. This type of reconnaissance does not interact directly with the target and, as such, the target has no way of knowing, recording, or logging our activity. In contrast, active reconnaissance involves interacting directly with the target, so the target may record our IP address and log our activity.
2. What is meant by Google hacking in the context of reconnaissance?
Answer:
Google hacking is a term that refers to the art of creating complex search engine queries in order to filter through large amounts of search results for information related to computer security. Information that can be obtained through Google Hacking includes: server vulnerabilities; error messages that contain too much information; files containing passwords; sensitive directories; pages containing logon portals; and pages containing network or vulnerability data such as firewall logs.
3. What role could job posting sites have within information gathering?
Answer:
Job postings often reveal very detailed information about the technology being used by an organization. Often it will define specific hardware and software.