
# Firecrawl Extract

Blocks for extracting structured data from web pages using Firecrawl's AI extraction.

## Firecrawl Extract

### What it is

A block that uses Firecrawl to crawl websites and extract structured data from their pages, including pages that sit behind common scraping blockers.

### How it works

This block uses Firecrawl's extraction API to pull structured data from web pages based on a prompt or schema. It crawls the specified URLs and uses AI to extract information matching your requirements.

Define the data structure you want using a JSON schema for precise extraction, or use natural language prompts for flexible extraction. Wildcards in URLs allow extracting data from multiple pages matching a pattern.
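
As an illustration of the two ways to describe the target data, the sketch below builds one input set with a JSON schema and one with only a prompt. The field names `urls`, `prompt`, and `output_schema` come from this block's inputs; the example URL, the schema contents, and the variable names are hypothetical, and how the inputs are passed to the block depends on your setup.

```python
import json

# Hypothetical target: product pages under one path, matched with a wildcard.
urls = ["https://example.com/products/*"]

# Option 1: a JSON Schema for a rigid output structure (contents are illustrative).
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}
schema_inputs = {"urls": urls, "output_schema": product_schema}

# Option 2: a natural-language prompt for flexible extraction.
prompt_inputs = {
    "urls": urls,
    "prompt": "Extract the product name, price, and availability.",
}

# Both input sets are plain JSON-serializable data.
print(json.dumps(schema_inputs, indent=2))
print(json.dumps(prompt_inputs, indent=2))
```

A schema is the better fit when downstream code expects fixed field names and types; a prompt alone is easier to write when the page layout varies.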

### Inputs

| Input               | Description                                                                       | Type            | Required |
| ------------------- | --------------------------------------------------------------------------------- | --------------- | -------- |
| urls                | The URLs to crawl. At least one is required. Wildcards are supported (for example, a trailing `/*`). | List\[str]      | Yes      |
| prompt              | A natural-language prompt describing the data to extract                          | str             | No       |
| output\_schema      | A JSON Schema describing the output structure, if a more rigid structure is desired. | Dict\[str, Any] | No       |
| enable\_web\_search | When true, extraction can follow links outside the specified domain.              | bool            | No       |
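
The input rules in the table above can be sketched as a small validation helper. This is only an illustration: `build_extract_inputs` is a hypothetical function, not part of the block or of Firecrawl, and it assumes that optional inputs can simply be omitted so that the block's own defaults apply.

```python
from typing import Any, Dict, List, Optional


def build_extract_inputs(
    urls: List[str],
    prompt: Optional[str] = None,
    output_schema: Optional[Dict[str, Any]] = None,
    enable_web_search: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble block inputs and enforce the required field."""
    # `urls` is the only required input and needs at least one entry.
    if not urls:
        raise ValueError("urls must contain at least one URL")

    inputs: Dict[str, Any] = {"urls": list(urls)}

    # Optional inputs are included only when the caller sets them
    # (assumption: omitted inputs fall back to the block's defaults).
    if prompt is not None:
        inputs["prompt"] = prompt
    if output_schema is not None:
        inputs["output_schema"] = output_schema
    if enable_web_search is not None:
        inputs["enable_web_search"] = enable_web_search
    return inputs


if __name__ == "__main__":
    print(build_extract_inputs(["https://example.com/blog/*"], prompt="List post titles."))
```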

### Outputs

| Output | Description                            | Type            |
| ------ | -------------------------------------- | --------------- |
| error  | Error message if the extraction failed | str             |
| data   | The extracted data, structured according to the prompt or schema | Dict\[str, Any] |
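
A downstream step usually needs to tell a failed extraction from a successful one. The sketch below assumes the two outputs arrive as a dictionary keyed by output name, which may not match how your workflow exposes them; `unwrap_extract_output` is a hypothetical helper, not part of the block.

```python
from typing import Any, Dict


def unwrap_extract_output(output: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return `data`, or raise if `error` is set."""
    error = output.get("error")
    if error:
        # Surface the block's error message instead of continuing with no data.
        raise RuntimeError(f"Extraction failed: {error}")
    # Fall back to an empty dict if neither output is present.
    return output.get("data", {})


if __name__ == "__main__":
    print(unwrap_extract_output({"data": {"name": "Widget", "price": 9.99}}))
```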

### Possible use case

**Product Data Extraction**: Extract structured product information (prices, specs, reviews) from e-commerce sites.

**Contact Scraping**: Pull business contact information from company websites in a structured format.

**Data Pipeline Input**: Automatically extract and structure web data for analysis or database population.
