# OpenGraph.io

OpenGraph.io provides a simple API to retrieve Open Graph data from websites, even those without properly defined Open Graph tags.

- **Category:** ai web scraping
- **Auth:** API_KEY
- **Composio Managed App Available?** N/A
- **Tools:** 4
- **Triggers:** 0
- **Slug:** `OPENGRAPH_IO`
- **Version:** 20260312_00

## Tools

### Capture Screenshot

**Slug:** `OPENGRAPH_IO_CAPTURE_SCREENSHOT`

Tool to capture high-quality screenshots of any webpage programmatically. Supports full-page captures, custom dimensions, device presets, element-specific screenshots, and quality settings. Screenshots are available for 24 hours after generation. Use when you need to capture visual snapshots of websites with specific rendering requirements or device simulations.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | The target webpage URL to capture. Must be a valid HTTP or HTTPS URL. |
| `format` | string ("jpeg" \| "png" \| "webp") | No | Image format options for the screenshot. |
| `quality` | integer | No | Image quality from 10 to 80 (rounded to nearest 10). Higher values mean better quality but larger file size. Defaults to 80. |
| `cache_ok` | boolean | No | Allow returning a cached screenshot if available. Set to false to force a fresh capture. Defaults to true. |
| `selector` | string | No | CSS selector to capture a specific element instead of the full page. Example: '.main-content' or '#header'. |
| `dark_mode` | boolean | No | Set browser preference to dark mode before capturing. Useful for capturing dark-themed versions of websites. Defaults to false. |
| `full_page` | boolean | No | Capture the entire scrollable page instead of just the viewport. Set to true for full-page screenshots. Defaults to false. |
| `use_proxy` | boolean | No | Use a proxy for accessing protected or restricted sites. Defaults to false. |
| `dimensions` | string ("xs" \| "sm" \| "md" \| "lg") | No | Viewport size presets for different device types. |
| `capture_delay` | integer | No | Delay in milliseconds before capturing the screenshot. Useful for waiting for dynamic content to load. Range: 0-10000ms. |
| `exclude_selectors` | string | No | Comma-separated CSS selectors of elements to hide from the screenshot. Useful for removing ads, popups, or unwanted elements. Example: '.header,.footer,.ads'. |
| `navigationTimeout` | integer | No | Navigation timeout in milliseconds. Maximum time to wait for the page to load. Range: 1000-60000ms. Defaults to 30000ms (30 seconds). |
| `block_cookie_banner` | boolean | No | Automatically block known cookie consent banners from appearing in the screenshot. Defaults to true. |
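
The constraints in the table above can be checked on the client before the tool is invoked. The following is a minimal sketch, assuming the arguments are assembled as a plain dictionary; the helper name `build_screenshot_args` is hypothetical, and only the parameter names, allowed values, and ranges are taken from the table. How the resulting dictionary is submitted to `OPENGRAPH_IO_CAPTURE_SCREENSHOT` depends on your client and is not shown.

```python
from urllib.parse import urlparse

# Allowed values and ranges copied from the parameter table above.
ALLOWED_FORMATS = {"jpeg", "png", "webp"}
ALLOWED_DIMENSIONS = {"xs", "sm", "md", "lg"}
RANGES = {
    "quality": (10, 80),
    "capture_delay": (0, 10000),
    "navigationTimeout": (1000, 60000),
}


def build_screenshot_args(url, **options):
    """Validate and assemble arguments (hypothetical client-side helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be a valid HTTP or HTTPS URL")

    if "format" in options and options["format"] not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if "dimensions" in options and options["dimensions"] not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimensions must be one of {sorted(ALLOWED_DIMENSIONS)}")

    # Reject numeric values outside the documented ranges.
    for name, (low, high) in RANGES.items():
        if name in options and not low <= options[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    return {"url": url, **options}


# Full-page PNG capture that waits 2 seconds for dynamic content.
args = build_screenshot_args(
    "https://example.com",
    format="png",
    full_page=True,
    capture_delay=2000,
)
print(args)
```

The sketch only rejects out-of-range values; it does not round `quality` itself, since the table states that rounding to the nearest 10 is applied to the value.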

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error message, if one occurred during execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |
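
All four tools return the same three output fields, so one small helper can handle every result. This is a sketch, assuming the result reaches your code as a dictionary with the keys `data`, `error`, and `successful`; the names `ToolExecutionError` and `unwrap_result` are hypothetical, and the sample values are made up for illustration.

```python
class ToolExecutionError(RuntimeError):
    """Raised when a result reports failure (hypothetical name)."""


def unwrap_result(result):
    """Return `data` from a result dictionary, or raise with its `error`."""
    if not result.get("successful", False):
        raise ToolExecutionError(result.get("error") or "unknown error")
    return result["data"]


# Illustrative results; real values come from the tool execution.
ok = {"data": "<payload>", "error": None, "successful": True}
failed = {"data": "", "error": "navigation timed out", "successful": False}

print(unwrap_result(ok))
try:
    unwrap_result(failed)
except ToolExecutionError as exc:
    print(f"tool failed: {exc}")
```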

### Extract Site Metadata

**Slug:** `OPENGRAPH_IO_EXTRACT_SITE`

Tool to extract site metadata. Use when you need to retrieve Open Graph and other meta signals from a website.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `site` | string | Yes | URL of the site to extract metadata from |
| `scrape` | boolean | No | Scrape all data instead of just Open Graph and meta tags |
| `use_proxy` | boolean | No | Use a proxy during extraction |
| `accept_lang` | string | No | Preferred language for extracted data (e.g., 'en-US') |
| `full_render` | boolean | No | Render the page in a browser before extraction |
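
Because only `site` is required, one way to assemble arguments is to include optional parameters only when the caller sets them, leaving everything else to the service. The sketch below assumes that approach; `build_extract_site_args` is a hypothetical helper, and only the parameter names come from the table above.

```python
def build_extract_site_args(site, scrape=None, use_proxy=None,
                            accept_lang=None, full_render=None):
    """Assemble arguments for OPENGRAPH_IO_EXTRACT_SITE (hypothetical helper)."""
    optional = {
        "scrape": scrape,
        "use_proxy": use_proxy,
        "accept_lang": accept_lang,
        "full_render": full_render,
    }
    args = {"site": site}
    # Keep only the options the caller set explicitly.
    args.update({name: value for name, value in optional.items() if value is not None})
    return args


# Render the page in a browser first and prefer US English content.
extract_args = build_extract_site_args(
    "https://example.com",
    accept_lang="en-US",
    full_render=True,
)
print(extract_args)
```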

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error message, if one occurred during execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Scrape Site

**Slug:** `OPENGRAPH_IO_SCRAPE_SITE`

Tool to scrape a site for its raw HTML and social/OpenGraph metadata. Use when you need the full page content and metadata, after confirming that the URL is correct.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `site` | string | Yes | The URL of the site to scrape. This will be URL-encoded by the action. E.g., 'https://example.com'. |
| `scrape` | boolean | No | If true, forces scraping regardless of cache. Otherwise may return cached result. |
| `cache_ok` | boolean | No | If false, forces a fresh scrape instead of returning cached data. Defaults to using cache if not set. |
| `full_render` | boolean | No | If true, performs a full page render (runs JavaScript). Defaults to a lightweight scrape if not set. |
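
Since the action URL-encodes `site` itself, the value should be passed as a plain URL. The sketch below uses Python's standard `urllib.parse.quote` to illustrate what percent-encoding does to a URL and why encoding it yourself beforehand would encode it twice. The exact encoding the action applies is an assumption here; the snippet only demonstrates the general technique.

```python
from urllib.parse import quote

raw_site = "https://example.com/articles?id=42&lang=en"

# Percent-encode every reserved character, as is typical when a URL is
# embedded inside another URL.
encoded_once = quote(raw_site, safe="")
print(encoded_once)

# Encoding an already-encoded value escapes the '%' signs again, so the
# target would no longer decode to the original URL in a single pass.
encoded_twice = quote(encoded_once, safe="")
print(encoded_twice)

# Pass the plain URL and leave the encoding to the action.
scrape_site_args = {"site": raw_site, "full_render": True}
```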

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error message, if one occurred during execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Scrape URL for HTML

**Slug:** `OPENGRAPH_IO_SCRAPE_URL`

Tool to scrape raw HTML content from a website with anti-bot protection and optional JavaScript rendering. Use when you need the full HTML source code of a page, especially for sites with bot detection or dynamic content.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | The target website URL to scrape. Must start with http:// or https://. This will be URL-encoded automatically. |
| `cache_ok` | boolean | No | Allow cached results. Set to false to force a fresh scrape. Defaults to true to improve performance. |
| `use_proxy` | boolean | No | Use standard datacenter proxy for the request. Helps bypass basic bot detection. Defaults to false. |
| `accept_lang` | string | No | Language header for localized content (e.g., 'en-US', 'fr-FR', 'es-ES'). Defaults to 'en-US'. |
| `full_render` | boolean | No | Enable JavaScript rendering. Required for Single Page Applications (SPAs) and dynamic content. Defaults to false for faster lightweight scraping. |
| `use_premium` | boolean | No | Use residential proxy for the request. Better for scraping protected sites with stronger bot detection. Defaults to false. |
| `use_superior` | boolean | No | Use mobile proxy for the request. Best for heavily protected sites with advanced bot detection. Defaults to false. |
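
The three proxy flags correspond to increasingly strong proxy types: datacenter (`use_proxy`), residential (`use_premium`), and mobile (`use_superior`). A minimal sketch of selecting one of them by name follows. The tier names and the helper `build_scrape_url_args` are hypothetical, and setting at most one flag per request is an assumption, since the table does not say whether the flags can be combined.

```python
# Hypothetical tier names mapped to the documented proxy flags.
PROXY_TIERS = {
    "none": {},
    "datacenter": {"use_proxy": True},
    "residential": {"use_premium": True},
    "mobile": {"use_superior": True},
}


def build_scrape_url_args(url, tier="none", full_render=False,
                          cache_ok=True, accept_lang="en-US"):
    """Assemble arguments for OPENGRAPH_IO_SCRAPE_URL (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    if tier not in PROXY_TIERS:
        raise ValueError(f"tier must be one of {sorted(PROXY_TIERS)}")

    # Keyword defaults mirror the defaults documented in the table.
    args = {
        "url": url,
        "cache_ok": cache_ok,
        "accept_lang": accept_lang,
        "full_render": full_render,
    }
    # Set at most one proxy flag (an assumption, see the note above).
    args.update(PROXY_TIERS[tier])
    return args


# Fresh, JavaScript-rendered scrape of a protected single-page application.
spa_args = build_scrape_url_args(
    "https://example.com/app",
    tier="residential",
    full_render=True,
    cache_ok=False,
)
print(spa_args)
```

Starting with `tier="none"` and moving to a stronger tier only after a failed scrape is one possible strategy; the source does not prescribe one.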

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error message, if one occurred during execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |
