SEO & Robots
PicPeak hosts a private gallery, so you usually do not want it indexed by Google, scraped by AI training crawlers, or pulled into social-media link previews outside your control. The Settings → SEO & Robots tab gives you fine-grained control over robots.txt and the per-page <meta name="robots"> tags.
Indexing
| Setting | Default | What it does |
|---|---|---|
| Allow indexing | on | When off, robots.txt serves User-agent: * followed by Disallow: /, which applies to every crawler. Well-behaved crawlers will then skip your site entirely. |
| Sitemap URL | (empty) | If set, gets appended as Sitemap: <url> at the end of robots.txt. Only meaningful if you have a sitemap to point to. |
robots.txt is advisory. It blocks well-behaved crawlers (Google, Bing, modern archive bots) but does not stop a scraper that ignores it. If you have content that must not leave your control, also use the Image Security tab to set protection level enhanced or maximum on the gallery.
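As a rough illustration of how the two Indexing settings could map onto robots.txt output, here is a minimal sketch. The function name `render_base_robots` is hypothetical, and the empty `Disallow:` line emitted when indexing is allowed is an assumption (the text only specifies the output for the "off" case).

```python
def render_base_robots(allow_indexing: bool, sitemap_url: str = "") -> str:
    """Render the part of robots.txt controlled by the Indexing settings."""
    lines = ["User-agent: *"]
    if allow_indexing:
        # Assumption: an empty Disallow is used to mean "nothing is blocked".
        lines.append("Disallow:")
    else:
        # Documented behavior: every crawler is asked to stay out.
        lines.append("Disallow: /")
    if sitemap_url.strip():
        # The sitemap line is appended at the end, only when a URL is set.
        lines.append("")
        lines.append(f"Sitemap: {sitemap_url.strip()}")
    return "\n".join(lines) + "\n"


print(render_base_robots(False))
print(render_base_robots(True, "https://example.com/sitemap.xml"))
```

The sitemap URL above is a placeholder; leave the setting empty if you have no sitemap.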
AI crawler blocking
The “Block AI crawlers” toggle blocks the user-agents commonly used for LLM training data ingestion (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc.).
| Setting | Default | What it does |
|---|---|---|
| Block AI crawlers | off | Adds Disallow: / rules for ~15 known AI training bots in robots.txt. |
| Custom AI agents | (empty) | Newline-separated list of additional UAs to block. Useful when a new training crawler appears between PicPeak releases. |
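The following sketch shows one way a newline-separated "Custom AI agents" value could be merged with a built-in list and turned into Disallow blocks. It is illustrative only: the built-in list here contains just the four agents named above (not the full ~15), the helper names are hypothetical, and it assumes custom agents are only emitted when the toggle is on.

```python
# Partial list, taken from the agents named in the text above.
KNOWN_AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def parse_custom_agents(raw: str) -> list[str]:
    """Split a newline-separated setting into clean, de-duplicated agent names."""
    agents: list[str] = []
    for line in raw.splitlines():
        name = line.strip()
        if name and name not in agents:
            agents.append(name)
    return agents


def render_ai_block(block_ai: bool, custom_raw: str = "") -> str:
    """Return one 'User-agent / Disallow: /' block per AI agent."""
    if not block_ai:
        # Assumption: custom agents have no effect while the toggle is off.
        return ""
    agents = list(KNOWN_AI_AGENTS)
    for name in parse_custom_agents(custom_raw):
        if name not in agents:
            agents.append(name)
    # Blocks are separated by a blank line.
    return "\n".join(f"User-agent: {name}\nDisallow: /\n" for name in agents)


print(render_ai_block(True, "NewTrainingBot\n\n  GPTBot  \n"))
```

Blank lines and surrounding whitespace in the custom list are ignored, and an agent already in the built-in list is not emitted twice.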
Social bot blocking
Blocks the bots that fetch URLs to generate link-preview cards (the OG/Twitter-card image you see in WhatsApp, Slack, etc.).
| Setting | Default | What it does |
|---|---|---|
| Block social bots | off | Adds Disallow: / for Twitterbot, facebookexternalhit, LinkedInBot, Slackbot, WhatsApp, TelegramBot, Discordbot. |
Blocking social bots also disables PicPeak’s gallery share-link previews — the /og/gallery/:slug handler relies on those user-agents to know it should serve the OG metadata page. Leave this off unless you specifically don’t want share previews.
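To see why share previews stop working, you can evaluate a robots.txt with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt text is a hand-written stand-in for what the toggle produces, and the host and slug in `preview_url` are made up.

```python
from urllib.robotparser import RobotFileParser

# Hand-written stand-in for the output with "Block social bots" enabled
# (only one of the listed bots is shown).
robots_txt = """\
User-agent: Twitterbot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Hypothetical gallery preview URL following the /og/gallery/:slug pattern.
preview_url = "https://gallery.example.com/og/gallery/summer-wedding"

twitter_allowed = parser.can_fetch("Twitterbot", preview_url)
other_allowed = parser.can_fetch("SomeOtherBot", preview_url)

print("Twitterbot may fetch preview:", twitter_allowed)
print("Other bots may fetch preview:", other_allowed)
```

A preview bot that honors robots.txt never requests the OG page, so no link-preview card can be built for it.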
Custom rules
For everything not covered by the toggles above, write your own:
| Setting | Default | What it does |
|---|---|---|
| Custom rules | [] | List of {userAgent, disallow: [...]} objects. Each becomes a User-agent: <ua>\nDisallow: <path> block in robots.txt. |
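The mapping from rule objects to robots.txt blocks can be sketched as follows. The function name `render_custom_rules` is hypothetical, and the blank line between blocks is an assumption about formatting rather than PicPeak's documented output.

```python
def render_custom_rules(rules: list[dict]) -> str:
    """Turn {userAgent, disallow: [...]} objects into robots.txt blocks."""
    blocks = []
    for rule in rules:
        lines = [f"User-agent: {rule['userAgent']}"]
        # One Disallow line per path in the rule.
        lines += [f"Disallow: {path}" for path in rule.get("disallow", [])]
        blocks.append("\n".join(lines))
    # Assumption: blocks are separated by a blank line.
    return "\n\n".join(blocks) + ("\n" if blocks else "")


rules = [{"userAgent": "BadBot", "disallow": ["/gallery/"]}]
output = render_custom_rules(rules)
print(output)
```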
Example: block a specific scraper from a path
User-agent: BadBot
Disallow: /gallery/
Per-page meta tags
In addition to robots.txt, PicPeak can inject <meta name="robots"> into every page so even crawlers that ignore robots.txt see the directive in the HTML they fetched.
| Setting | Default | What it does |
|---|---|---|
| noindex | off | Adds <meta name="robots" content="noindex">. Tells crawlers not to add the page to their index. |
| nofollow | off | Adds <meta name="robots" content="nofollow">. Tells crawlers not to follow links from the page. |
| noai / noimageai | off | Adds <meta name="robots" content="noai, noimageai">. Newer convention; honored by some AI training pipelines. |
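If you want to confirm which directives a page actually carries, a small checker built on Python's standard `html.parser` is enough. This is a sketch for verification, not part of PicPeak: the class and function names are hypothetical, and the sample HTML is hand-written to mimic a page with all three toggles on.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect directives from every <meta name="robots"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # A content attribute may hold several comma-separated directives.
        for token in (attr.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set[str]:
    collector = RobotsMetaCollector()
    collector.feed(html)
    return collector.directives


# Hand-written sample mimicking a page with the toggles enabled.
sample = (
    '<html><head>'
    '<meta name="robots" content="noindex">'
    '<meta name="robots" content="nofollow">'
    '<meta name="robots" content="noai, noimageai">'
    '</head><body></body></html>'
)
found = robots_directives(sample)
print(sorted(found))
```

To check a live page, pass its fetched HTML to `robots_directives` instead of the sample string.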
Live preview
The tab includes a Preview panel that shows you the rendered robots.txt exactly as a crawler will receive it. Verify your config there before saving.
Where it’s served
- robots.txt is served by the backend at /robots.txt (proxied through nginx). Cached for 1 hour; saving in the admin invalidates the cache immediately.
- The meta tags are injected into the <head> of the public landing page and the gallery pages.