Website Scrape
Configure a workflow step that scrapes a website URL and returns selected content types for later steps
Use Website Scrape when you want a workflow step to fetch a page from the web and return the parts of that page your workflow needs.
Configuration
| Field | Required | Description |
| --- | --- | --- |
| Name | No | Label for the step in the workflow canvas. |
| URL | Yes | Address of the page to scrape. This field supports workflow variables through Insert Variable. |
| HTML Output | No | Controls whether HTML output is returned. |
| Markdown Output | No | Controls whether Markdown output is returned. |
| Links Output | No | Controls whether extracted links are returned. |
| Subpages | No | Controls whether subpage crawling is enabled. |
| Crawl Mode | No | Controls whether the page comes from a live crawl or from cache: Preferred, Always, Fallback, or Never. |
| Max Characters | No | Maximum size of the returned output, in characters. |
| Max Retries | No | Number of retries for the scrape. |
| Timeout (ms) | No | Timeout for the scrape, in milliseconds. |
| Screenshot | No | Controls whether a page screenshot is captured. |
| Screenshot Type | No | Screenshot mode (Viewport or Full Page) used when screenshots are enabled. |
| When the step fails | No | Controls whether the workflow stops (Terminate Workflow) or carries on (Continue) if this step fails. |
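To make the interaction between Max Retries and When the step fails concrete, here is a minimal Python sketch. It is not Fetch Hive's implementation: the function name `run_with_retries`, the `scrape` callable, and the reading of Max Retries as "extra attempts after the first one" are all assumptions made for illustration.

```python
from typing import Callable, Optional


def run_with_retries(
    scrape: Callable[[], str],
    max_retries: int,
    on_failure: str = "Terminate Workflow",
) -> Optional[str]:
    """Try the scrape, retry on error, then apply the failure policy.

    Assumption: max_retries counts the attempts made after the first one.
    """
    last_error: Optional[Exception] = None
    for _attempt in range(max_retries + 1):
        try:
            return scrape()
        except Exception as error:  # any failure counts as a failed attempt
            last_error = error

    if on_failure == "Continue":
        # The step produced no result, but later steps still run.
        return None
    # "Terminate Workflow": surface the failure and stop.
    raise RuntimeError("Website Scrape step failed") from last_error
```

With Continue, later steps would need to handle a missing result; with Terminate Workflow, nothing after this step runs.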
Add this step from the Research group in Search steps....
The URL field supports Insert Variable. Below that, the settings sheet lets you choose which output types to return: HTML, Markdown, Links, and Subpages.
Use Crawl Mode to control how Fetch Hive retrieves the page:
Preferred: tries a live crawl first, then falls back to cache.
Always: uses a live crawl every time.
Fallback: uses cache first, then crawls if needed.
Never: uses cache only.
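The four modes differ only in which source is tried and in what order. The Python sketch below illustrates that ordering; it is an assumption-based model rather than Fetch Hive's code, and `fetch_page`, `live_crawl`, and `read_cache` are hypothetical names.

```python
from typing import Callable, Optional

Source = Callable[[], Optional[str]]


def first_available(*sources: Source) -> Optional[str]:
    """Return the first non-None result, trying sources in order."""
    for source in sources:
        result = source()
        if result is not None:
            return result
    return None


def fetch_page(mode: str, live_crawl: Source, read_cache: Source) -> Optional[str]:
    """Pick the page source(s) according to the Crawl Mode setting."""
    if mode == "Preferred":   # live crawl first, then cache
        return first_available(live_crawl, read_cache)
    if mode == "Always":      # live crawl only
        return first_available(live_crawl)
    if mode == "Fallback":    # cache first, then live crawl
        return first_available(read_cache, live_crawl)
    if mode == "Never":       # cache only
        return first_available(read_cache)
    raise ValueError(f"Unknown crawl mode: {mode}")
```

In this model, Preferred and Fallback return the same content only when one of the two sources is unavailable; when both are available, Preferred returns the fresh page and Fallback returns the cached one.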
If you turn Screenshot on, Screenshot Type appears with Viewport and Full Page options.
Output
Click Run in the step header to test the step. Fetch Hive shows the scrape result in Output after the run completes.
Use the variable picker in a later step to insert the exact output path available for that run. The base reference is:
The exact fields depend on which outputs you enabled. For example, the HTML, Markdown, Links, Subpages, and screenshot-related fields appear only when those outputs are turned on. Use the variable picker after a test run to inspect the returned fields.
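The relationship between toggles and returned fields can be pictured with a short Python sketch. The key names used here (`html`, `markdown`, and so on) are placeholders, not Fetch Hive's actual field names; the variable picker remains the source of truth for those.

```python
from typing import Dict, List


def enabled_outputs(toggles: Dict[str, bool]) -> List[str]:
    """Return the names of the outputs whose toggle is on, in the order given."""
    return [name for name, is_on in toggles.items() if is_on]


# Placeholder names for the step's output toggles (hypothetical).
step_toggles = {
    "html": False,
    "markdown": True,
    "links": True,
    "subpages": False,
    "screenshot": False,
}

print(enabled_outputs(step_toggles))  # ['markdown', 'links']
```

A later step that references an output that was not enabled would find no such field, which is why a test run before wiring up variables is worthwhile.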
Example
1. Add Website Scrape from the Research group in Search steps....
2. Set Name to something like Scrape product page.
3. Paste the page address into URL. If the URL comes from an earlier workflow step, click Insert Variable and add that reference.
4. Turn on the outputs you need. For example, enable Markdown for clean content, Links for extracted links, and Subpages if you want one subpage crawled from the main page.
5. Choose a Crawl Mode, then set Max Characters, Max Retries, and Timeout (ms) for the run.
6. If you need a visual capture, turn Screenshot on and choose Viewport or Full Page in Screenshot Type.
7. Click Run and review the scraped result in Output before sending it to later workflow steps.
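The rules implied by the steps above (URL is required, Crawl Mode has four values, Screenshot Type applies only when Screenshot is on) can be written as a small Python check. This is a sketch for reasoning about the settings, not an export format or API that Fetch Hive provides; the dictionary keys and the example values are hypothetical.

```python
from typing import Any, Dict, List

CRAWL_MODES = {"Preferred", "Always", "Fallback", "Never"}
SCREENSHOT_TYPES = {"Viewport", "Full Page"}


def validate_settings(settings: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a Website Scrape configuration."""
    problems: List[str] = []

    # URL is the only required field.
    if not settings.get("url"):
        problems.append("URL is required")

    # Crawl Mode is optional, but must be one of the four documented values.
    mode = settings.get("crawl_mode")
    if mode is not None and mode not in CRAWL_MODES:
        problems.append(f"Unknown Crawl Mode: {mode}")

    # Screenshot Type only matters when Screenshot is turned on.
    if settings.get("screenshot"):
        if settings.get("screenshot_type") not in SCREENSHOT_TYPES:
            problems.append("Screenshot Type must be Viewport or Full Page")

    return problems


# Illustrative values only; the URL is a placeholder.
example = {
    "name": "Scrape product page",
    "url": "https://example.com/product",
    "markdown": True,
    "links": True,
    "crawl_mode": "Preferred",
    "screenshot": True,
    "screenshot_type": "Full Page",
}

print(validate_settings(example))  # []
```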
Notes
The returned result depends on which output toggles you enable, so inspect the variable picker after a run if you need exact field names.
The editor shows a warning for direct LinkedIn URLs. LinkedIn URLs are not supported for scraping.
Use Markdown when you want cleaner page content, HTML when you need raw markup, and Links when you only need extracted URLs.
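If URLs reach this step from an earlier step, it can help to filter out LinkedIn addresses before they get here. The Python sketch below shows one way to do that with the standard library; how the editor itself detects LinkedIn URLs is not documented here, so the matching rule and the name `is_linkedin_url` are assumptions.

```python
from urllib.parse import urlparse


def is_linkedin_url(url: str) -> bool:
    """Return True if the URL's host is linkedin.com or one of its subdomains.

    Assumption: the URL includes a scheme (for example https://); without
    one, urlparse finds no hostname and this returns False.
    """
    host = (urlparse(url).hostname or "").lower()
    return host == "linkedin.com" or host.endswith(".linkedin.com")


print(is_linkedin_url("https://www.linkedin.com/in/someone"))    # True
print(is_linkedin_url("https://example.com/about/linkedin.com"))  # False
```

Matching on the hostname rather than on the whole string avoids false positives for pages that merely mention LinkedIn in their path.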
See also: Creating and Editing, Testing and Iteration, and Error Handling