unja 0.0.7



Fetch Known Urls

What's Unja?
Unja is a fast and light tool for fetching known URLs from the Wayback Machine, Common Crawl, VirusTotal, UrlScan.io and AlienVault's Otx. It uses a separate thread for each provider to optimize speed, and it uses the Wayback resumption key to split a large scan into multiple parts. It also applies filters directly on the provider APIs, so only filtered data is returned and less work is done on your system.
Why Unja?

Supports Wayback/Common-Crawl/Virus-Total/Otx/UrlScan.io
Automatically handles rate limits and timeouts
Export results: plain text, or detailed JSON output with status, mime and length
MultiThreading: separate thread for each provider to fetch data simultaneously
Filters: apply filters directly on the provider to avoid unnecessary data

Installing Unja
You can install Unja with pip as follows:
pip3 install unja

or by downloading this repository and running:
python3 setup.py install

Updating Unja
You can update Unja with pip as follows:
pip3 install unja -U

unja -h

This will display help for the tool.


unja -d ninjhacks.com

File with a list of domains, separated by newlines
unja -f domains.txt
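
A domain list collected from several sources often contains schemes, paths, blank lines and duplicates. The sketch below is one way to clean such a list before passing it to `-f`. It assumes Unja expects one bare hostname per line, as in the `-d` example above; the file name `raw_domains.txt` and its contents are hypothetical, and whether Unja normalizes entries itself is not stated here.

```shell
# Hypothetical raw list: mixed case, a scheme, a path, a blank line, a duplicate.
printf '%s\n' 'Example.com' 'https://example.org/login' '' 'example.com' > raw_domains.txt

# Lowercase, strip scheme and path, drop empty lines, then deduplicate.
tr '[:upper:]' '[:lower:]' < raw_domains.txt \
  | sed -e 's#^[a-z]*://##' -e 's#/.*$##' \
  | grep -v '^$' \
  | sort -u > domains.txt

cat domains.txt
# Then: unja -f domains.txt
```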

Include subdomains
unja --sub

Providers (wayback,commoncrawl,otx,virustotal,urlscan)
unja -p wayback

Wayback filters (default : statuscode:200 ~mimetype:html)
unja --wbf statuscode:200

Commoncrawl filters (default : =status:200 ~mime:.*html)
unja --ccf =status:200

Wayback results per request (default : 10000)
unja --wbl 1000

Otx results per request (default : 500)
unja --otxl 500

Number of retries for the HTTP client (default : 3)
unja -r 3

Enable verbose mode to show errors
unja -v

Enable json mode for detailed output in json format
unja -j

Silent mode: don't print the header
unja -s

Update CommonCrawl Index
unja --ucci

Change the VirusTotal API key in config
unja --vtkey

Change the UrlScan API key in config
unja --uskey

Output Methods
text = ( default ) Output URLs only.
json = ( -j ) Output url, status, mime and length in JSON format, which can help you filter results later based on those fields.
Filters are applied directly on the providers, so only useful, filtered data is fetched from each provider.


Return only URLs whose status code is 200

Return only URLs whose status code is not 200

Return only URLs whose response type is text/html

Return only URLs whose response type is not text/html

Return all URLs that have the word html in the response type

Return all URLs that have the word unja in the URL

Get only urls with parameters & status code 200
unja -s -d target.com --sub -p wayback,commoncrawl --wbf 'statuscode:200 ~original:=' --ccf '=status:200 ~url:.*=' | anew | tee output
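
Once parameterized URLs are saved, a common next step is listing the distinct parameter names. The sketch below uses only standard Unix tools and is not part of Unja. It assumes the `output` file holds one URL per line (the text output mode), and the sample URLs are hypothetical stand-ins for real results.

```shell
# Hypothetical sample standing in for the `output` file written by the one-liner.
printf '%s\n' \
  'https://target.com/search?q=test&page=2' \
  'https://target.com/item?id=7' \
  'https://target.com/list?page=1#top' > output

# Keep URLs with a query string, drop fragments, isolate the query,
# split it on '&', and keep the name in front of each '='.
grep -F '?' output \
  | sed -e 's/#.*$//' -e 's/^[^?]*?//' \
  | tr '&' '\n' \
  | sed -n 's/=.*$//p' \
  | sort -u > params.txt

cat params.txt
```

For the sample above, `params.txt` contains `id`, `page` and `q`, one per line.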

Looking for open redirects
unja -s -d target.com --sub -p wayback,commoncrawl --wbf '~statuscode:30 ~original:=http' --ccf '~status:30 ~url:.*=http' | anew | tee output

Clean result ( Exclude images,css,javascripts,woff & 404)
unja -s -d target.com --sub -p wayback,commoncrawl --wbf '!statuscode:404 ~!mimetype:image ~!mimetype:javascript ~!mimetype:css ~!mimetype:woff' --ccf '!=status:404 !~mime:.*image !~mime:.*javascript !~mime:.*css !~mime:.*woff' | anew | tee output
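
The one-liners above pipe results through `anew` to drop duplicate lines. If `anew` is not installed, an order-preserving dedupe with `awk` should give a similar result for this kind of pipeline. This is a substitute using standard tools, not something Unja provides; the file `raw_urls.txt` and its contents are hypothetical stand-ins for Unja's text output.

```shell
# Hypothetical sample standing in for Unja's text output (one URL per line).
printf '%s\n' \
  'https://target.com/a?id=1' \
  'https://target.com/b' \
  'https://target.com/a?id=1' > raw_urls.txt

# Keep the first occurrence of every line, preserving order, and save a copy.
awk '!seen[$0]++' raw_urls.txt | tee output
```

In a real run, the `awk` stage would take the place of `anew` in the pipe, for example `unja -s -d target.com | awk '!seen[$0]++' | tee output`.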

Let me know if you have any other good one-liners.


