V. 1.0.2
Beaty. A web scraper that grabs content from your website.
Beaty grabs content from your website and returns the data as a JavaScript object, JSON or CSV.
You must have a legal right to the content Beaty returns: it should be your own content, or content released under an open source, Creative Commons or similar license.
Enter the URL of your website.

Getting started

Beaty is for small to medium-sized websites, and there's a limit of 70 pages. Currently, English is the only language supported.

Beaty can't hope to crawl every website, but at present we're running at around a 95% success rate. We've noticed a few common issues that you should avoid where possible.

Show your URLs

Sometimes Beaty can't find the links on your website because they're hidden, for example when navigation is handled by JavaScript rather than written into the HTML. Beaty looks for anchor tags (<a> elements), and Google does the same.
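
As a rough illustration of why hidden links get missed, here is a minimal Python sketch of link discovery based on anchor tags. It is not Beaty's actual implementation; the AnchorCollector class and the sample HTML are hypothetical. It assumes the crawler only reads href attributes on anchor tags, so a link that exists only as a JavaScript click handler is never seen.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper, not Beaty's code)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags count as links in this sketch.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.links.append(href)


# Hypothetical page: one real anchor tag, one link hidden behind JavaScript.
sample_html = """
<a href="https://example.com/about">About</a>
<span onclick="location.href='/contact'">Contact</span>
"""

collector = AnchorCollector()
collector.feed(sample_html)
print(collector.links)  # only the About link is discovered
```

The Contact page in this sample is reachable for a visitor who clicks, but it is invisible to anything that only reads anchor tags.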

Use absolute rather than relative URLs

A relative URL is a URL that doesn't explicitly contain the protocol and domain, for example /about rather than https://example.com/about.
Relative URLs are perfectly valid, but can be confusing for search engines and Beaty. Google recommends using absolute rather than relative URLs where possible. You can use relative URLs, but they should have an obvious, unambiguous path.
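
To show where the ambiguity comes from, the sketch below resolves relative URLs with Python's standard urllib.parse module. The page URLs and the is_absolute helper are made up for illustration, and this is not necessarily how Beaty resolves links. The point is that the same relative link can resolve to different addresses depending on the page it appears on, while an absolute URL never changes.

```python
from urllib.parse import urljoin, urlparse


def is_absolute(url):
    """True when the URL carries both a protocol (scheme) and a domain."""
    parts = urlparse(url)
    return bool(parts.scheme and parts.netloc)


# Hypothetical pages that both contain the relative link "post-2".
page_without_slash = "https://example.com/blog/post-1"
page_with_slash = "https://example.com/blog/post-1/"

resolved_a = urljoin(page_without_slash, "post-2")
resolved_b = urljoin(page_with_slash, "post-2")

print(resolved_a)  # https://example.com/blog/post-2
print(resolved_b)  # https://example.com/blog/post-1/post-2

# A root-relative path is unambiguous: it resolves the same way from both pages.
print(urljoin(page_without_slash, "/contact"))  # https://example.com/contact
print(urljoin(page_with_slash, "/contact"))     # https://example.com/contact

print(is_absolute("https://example.com/contact"))  # True
print(is_absolute("/contact"))                     # False
```

A relative path that starts from the site root, such as /contact, is one way to keep a relative URL obvious and unambiguous.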


Beaty supports the Open Graph protocol and takes the image URL from the og:image property.
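
For reference, an Open Graph image is declared with a meta tag in the page's head. The following is a minimal sketch of reading it with Python's standard library; the OpenGraphImageFinder class and the sample markup are hypothetical and only illustrate the idea, not Beaty's internals.

```python
from html.parser import HTMLParser


class OpenGraphImageFinder(HTMLParser):
    """Finds the first <meta property="og:image"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.image_url = None

    def handle_starttag(self, tag, attrs):
        # Skip anything that is not a meta tag, or stop once an image is found.
        if tag != "meta" or self.image_url is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:image":
            self.image_url = attributes.get("content")


# Hypothetical page head with an Open Graph image declared.
sample_head = """
<head>
  <title>Example page</title>
  <meta property="og:title" content="Example page">
  <meta property="og:image" content="https://example.com/images/cover.jpg">
</head>
"""

finder = OpenGraphImageFinder()
finder.feed(sample_head)
print(finder.image_url)  # https://example.com/images/cover.jpg
```

Note that the content value in this sample is an absolute URL, in line with the advice on absolute URLs above.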