Commit f2c7e4e

Bumped Python lib versions
1 parent 47bdd38 commit f2c7e4e

4 files changed

Lines changed: 39 additions & 39 deletions

README.md

Lines changed: 35 additions & 35 deletions
@@ -2,46 +2,46 @@

 ## Introduction

-`pagodo` automates Google searching for potentially vulnerable web pages and applications on the Internet. It replaces
+`pagodo` automates Google searching for potentially vulnerable web pages and applications on the Internet. It replaces
 manually performing Google dork searches with a web GUI browser.

-There are 2 parts. The first is `ghdb_scraper.py` that retrieves the latest Google dorks and the second portion is
+There are 2 parts. The first is `ghdb_scraper.py` that retrieves the latest Google dorks and the second portion is
 `pagodo.py` that leverages the information gathered by `ghdb_scraper.py`.

 The core Google search library now uses the more flexible [yagooglesearch](https://github.com/opsdisk/yagooglesearch)
-instead of [googlesearch](https://github.com/MarioVilas/googlesearch). Check out the [yagooglesearch
+instead of [googlesearch](https://github.com/MarioVilas/googlesearch). Check out the [yagooglesearch
 README](https://github.com/opsdisk/yagooglesearch/blob/master/README.md) for a more in-depth explanation of the library
 differences and capabilities.

 This version of `pagodo` also supports native HTTP(S) and SOCKS5 application support, so no more wrapping it in a tool
-like `proxychains4` if you need proxy support. You can specify multiple proxies to use in a round-robin fashion by
+like `proxychains4` if you need proxy support. You can specify multiple proxies to use in a round-robin fashion by
 providing a comma separated string of proxies using the `-p` switch.

 ## What are Google dorks?

 Offensive Security maintains the Google Hacking Database (GHDB) found here:
-<https://www.exploit-db.com/google-hacking-database>. It is a collection of Google searches, called dorks, that can be
+<https://www.exploit-db.com/google-hacking-database>. It is a collection of Google searches, called dorks, that can be
 used to find potentially vulnerable boxes or other juicy info that is picked up by Google's search bots.

 ## Terms and Conditions

 The terms and conditions for `pagodo` are the same terms and conditions found in
 [yagooglesearch](https://github.com/opsdisk/yagooglesearch#terms-and-conditions).

-This code is supplied as-is and you are fully responsible for how it is used. Scraping Google Search results may
-violate their [Terms of Service](https://policies.google.com/terms). Another Python Google search library had some
+This code is supplied as-is and you are fully responsible for how it is used. Scraping Google Search results may
+violate their [Terms of Service](https://policies.google.com/terms). Another Python Google search library had some
 interesting information/discussion on it:

-* [Original issue](https://github.com/aviaryan/python-gsearch/issues/1)
-* [A response](https://github.com/aviaryan/python-gsearch/issues/1#issuecomment-365581431>)
-* Author created a separate [Terms and Conditions](https://github.com/aviaryan/python-gsearch/blob/master/T_AND_C.md)
-* ...that contained link to this [blog](https://benbernardblog.com/web-scraping-and-crawling-are-perfectly-legal-right/)
+- [Original issue](https://github.com/aviaryan/python-gsearch/issues/1)
+- [A response](https://github.com/aviaryan/python-gsearch/issues/1#issuecomment-365581431>)
+- Author created a separate [Terms and Conditions](https://github.com/aviaryan/python-gsearch/blob/master/T_AND_C.md)
+- ...that contained link to this [blog](https://benbernardblog.com/web-scraping-and-crawling-are-perfectly-legal-right/)

 Google's preferred method is to use their [API](https://developers.google.com/custom-search/v1/overview).

 ## Installation

-Scripts are written for Python 3.6+. Clone the git repository and install the requirements.
+Scripts are written for Python 3.6+. Clone the git repository and install the requirements.

 ```bash
 git clone https://github.com/opsdisk/pagodo.git
@@ -53,13 +53,13 @@ pip install -r requirements.txt

 ## ghdb_scraper.py

-To start off, `pagodo.py` needs a list of all the current Google dorks. The repo contains a `dorks/` directory with the
+To start off, `pagodo.py` needs a list of all the current Google dorks. The repo contains a `dorks/` directory with the
 current dorks when the `ghdb_scraper.py` was last run. It's advised to run `ghdb_scraper.py` to get the freshest data
-before running `pagodo.py`. The `dorks/` directory contains:
+before running `pagodo.py`. The `dorks/` directory contains:

-* the `all_google_dorks.txt` file which contains all the Google dorks, one per line
-* the `all_google_dorks.json` file which is the JSON response from GHDB
-* Individual category dorks
+- the `all_google_dorks.txt` file which contains all the Google dorks, one per line
+- the `all_google_dorks.json` file which is the JSON response from GHDB
+- Individual category dorks

 Dork categories:

@@ -124,7 +124,7 @@ dorks["category_dict"][1]["category_name"]
 ### Using <span>pagodo.py</span> as a script

 ```bash
-python pagodo.py -d example.com -g dorks.txt
+python pagodo.py -d example.com -g dorks.txt
 ```

 ### Using pagodo as a module
@@ -195,37 +195,37 @@ site:github.com

 ### Wait time between Google dork searchers

-* `-i` - Specify the **minimum** delay between dork searches, in seconds. Don't make this too small, or your IP will
-get HTTP 429'd quickly.
-* `-x` - Specify the **maximum** delay between dork searches, in seconds. Don't make this too big or the searches will
-take a long time.
+- `-i` - Specify the **minimum** delay between dork searches, in seconds. Don't make this too small, or your IP will
+get HTTP 429'd quickly.
+- `-x` - Specify the **maximum** delay between dork searches, in seconds. Don't make this too big or the searches will
+take a long time.

 The values provided by `-i` and `-x` are used to generate a list of 20 randomly wait times, that are randomly selected
 between each different Google dork search.

 ### Number of results to return

-`-m` - The total max search results to return per Google dork. Each Google search request can pull back at most 100
+`-m` - The total max search results to return per Google dork. Each Google search request can pull back at most 100
 results at a time, so if you pick `-m 500`, 5 separate search queries will have to be made for each Google dork search,
 which will increase the amount of time to complete.

 ### Save Output

-`-o [optional/path/to/results.json]` - Save output to a JSON file. If you do not specify a filename, a datetimestamped
+`-o [optional/path/to/results.json]` - Save output to a JSON file. If you do not specify a filename, a datetimestamped
 one will be generated.

-`-s [optional/path/to/results.txt]` - Save URLs to a text file. If you do not specify a filename, a datetimestamped one
+`-s [optional/path/to/results.txt]` - Save URLs to a text file. If you do not specify a filename, a datetimestamped one
 will be generated.

 ### Save logs

-`--log [optional/path/to/file.log]` - Save logs to the specified file. If you do not specify a filename, the default
+`--log [optional/path/to/file.log]` - Save logs to the specified file. If you do not specify a filename, the default
 file `pagodo.py.log` at the root of pagodo directory will be used.

 ## Google is blocking me!

-Performing 7300+ search requests to Google as fast as possible will simply not work. Google will rightfully detect it
-as a bot and block your IP for a set period of time. One solution is to use a bank of HTTP(S)/SOCKS proxies and pass
+Performing 7300+ search requests to Google as fast as possible will simply not work. Google will rightfully detect it
+as a bot and block your IP for a set period of time. One solution is to use a bank of HTTP(S)/SOCKS proxies and pass
 them to `pagodo`

 ### Native proxy support
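
The wait-time and result-count behaviour described in the README hunk above can be sketched roughly as follows. This is only an illustration under stated assumptions, not `pagodo`'s actual implementation, which this diff does not show: the function names `build_wait_times` and `requests_per_dork` are hypothetical, and the 37 and 60 second bounds are arbitrary example values for `-i` and `-x`.

```python
import math
import random


def build_wait_times(minimum, maximum, count=20, seed=None):
    """Return `count` random delays in seconds between minimum and maximum.

    Hypothetical helper: mirrors the README's "list of 20" random wait times.
    """
    if minimum > maximum:
        raise ValueError("minimum delay must not exceed maximum delay")
    rng = random.Random(seed)
    return sorted(round(rng.uniform(minimum, maximum), 1) for _ in range(count))


def requests_per_dork(max_results, page_size=100):
    """Search requests needed per dork when each request returns at most page_size results."""
    return math.ceil(max_results / page_size)


# Example values standing in for `-i 37 -x 60` (assumed, not defaults from the source).
wait_times = build_wait_times(37.0, 60.0, seed=1)

# One delay is picked at random before each new dork search.
next_delay = random.choice(wait_times)

# `-m 500` with 100 results per request means 5 requests per dork.
queries_needed = requests_per_dork(500)
```

Pre-computing a small pool of delays and sampling from it keeps the pacing irregular while bounding it between the two switches; the request count shows why a large `-m` lengthens a run.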
@@ -236,7 +236,7 @@ Pass a comma separated string of proxies to `pagodo` using the `-p` switch.
 python pagodo.py -g dorks.txt -p http://myproxy:8080,socks5h://127.0.0.1:9050,socks5h://127.0.0.1:9051
 ```

-You could even decrease the `-i` and `-x` values because you will be leveraging different proxy IPs. The proxies passed
+You could even decrease the `-i` and `-x` values because you will be leveraging different proxy IPs. The proxies passed
 to `pagodo` are selected by round robin.

 ### proxychains4 support
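
The round-robin selection mentioned in the hunk above can be illustrated with a short sketch. It assumes one proxy is chosen per dork search and uses `itertools.cycle` to rotate; the `parse_proxies` helper and the sample dork list are hypothetical, and the real rotation logic lives in `pagodo`/`yagooglesearch`, which this diff does not show.

```python
from itertools import cycle


def parse_proxies(proxy_string):
    """Split a comma separated `-p` style value into a clean list of proxy URLs."""
    return [proxy.strip() for proxy in proxy_string.split(",") if proxy.strip()]


# Proxy string taken from the README example above.
proxies = parse_proxies(
    "http://myproxy:8080,socks5h://127.0.0.1:9050,socks5h://127.0.0.1:9051"
)

# cycle() yields the proxies in order and wraps around indefinitely.
proxy_pool = cycle(proxies)

# Illustrative dorks only; a real run would read them from a dorks file.
dorks = ["intitle:index.of", "inurl:admin", "filetype:env", "site:github.com"]

# Each dork search gets the next proxy in the rotation.
assignments = [(dork, next(proxy_pool)) for dork in dorks]
```

With three proxies and four dorks, the fourth search wraps back to the first proxy.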
@@ -249,7 +249,7 @@ Install `proxychains4`
 apt install proxychains4 -y
 ```

-Edit the `/etc/proxychains4.conf` configuration file to round robin the look ups through different proxy servers. In
+Edit the `/etc/proxychains4.conf` configuration file to round robin the look ups through different proxy servers. In
 the example below, 2 different dynamic socks proxies have been set up with different local listening ports (9050 and
 9051).

@@ -269,7 +269,7 @@ socks4 127.0.0.1 9050
 socks4 127.0.0.1 9051
 ```

-Throw `proxychains4` in front of the `pagodo.py` script and each *request* lookup will go through a different proxy (and
+Throw `proxychains4` in front of the `pagodo.py` script and each _request_ lookup will go through a different proxy (and
 thus source from a different IP).

 ```bash
@@ -278,10 +278,10 @@ proxychains4 python pagodo.py -g dorks/all_google_dorks.txt -o [optional/path/to

 Note that this may not appear natural to Google if you:

-1) Simulate "browsing" to `google.com` from IP #1
-2) Make the first search query from IP #2
-3) Simulate clicking "Next" to make the second search query from IP #3
-4) Simulate clicking "Next to make the third search query from IP #1
+1. Simulate "browsing" to `google.com` from IP #1
+2. Make the first search query from IP #2
+3. Simulate clicking "Next" to make the second search query from IP #3
+4. Simulate clicking "Next to make the third search query from IP #1

 For that reason, using the built in `-p` proxy support is preferred because, as stated in the `yagooglesearch`
 documentation, the "provided proxy is used for the entire life cycle of the search to make it look more human, instead

ghdb_scraper.py

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
 # Custom Python libraries.


-__version__ = "1.2.1"
+__version__ = "1.3.0"


 """

pagodo.py

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@
 # Custom Python libraries.


-__version__ = "2.6.4"
+__version__ = "2.7.0"


 class Pagodo:

requirements.txt

Lines changed: 2 additions & 2 deletions
@@ -1,3 +1,3 @@
-beautifulsoup4==4.13.4
-requests==2.32.3
+beautifulsoup4==4.13.5
+requests==2.32.5
 yagooglesearch==1.10.0
