Ibou.io operates a crawler service named IbouBot, which builds and updates our graph representation of the World Wide Web. This database and the metrics derived from it power our search engine. We do not train AI models with the data.
IbouBot crawls URLs found on public pages, so it may visit any page that has been publicly linked somewhere.
Yes, we keep retrying those pages to make sure a missing page doesn't just reflect a temporary state or a faulty web server.
Google introduced nofollow links (rel="nofollow") to let a site indicate that certain links must not be taken into account when computing web metrics. But this doesn't prevent a bot from crawling the linked pages.
Yes. We respect the robots.txt file and its Disallow directives. If you believe we are not respecting your directives, please contact us.
We have a politeness policy of X seconds between two requests to the same host, and Y seconds between two requests to the same IP address, even when they target different domains. You can extend the crawl delay using the robots.txt file:
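    # The 30-second value below is only an example; choose a delay that suits your server.
    User-agent: IbouBot
    Crawl-delay: 30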
Note that the Crawl-delay directive only applies to a given host. If the same web server hosts websites under different domains, the per-IP rule above still applies.
The robots.txt file allows you to disallow IbouBot from crawling part or all of your website using the Disallow directive. For example, to prevent IbouBot from accessing the WordPress admin section:
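    User-agent: IbouBot
    Disallow: /wp-admin/

To block IbouBot from your whole site instead, use "Disallow: /".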
Even though IbouBot crawls web pages with a reasonable delay (X or Y seconds between requests), it is sometimes mistaken for a DDoS or brute-force attack. If we crawl a URL containing session parameters, it may also be interpreted as a login attempt. For these reasons, IbouBot can end up temporarily blacklisted. In that case, you can try whitelisting IbouBot directly in your security plugin, or contact us if you can't.