Web scraper

Extract data from websites and analyze them with the Web scraper.

Description

The Web scraper component collects and extracts information from a web page at a given URL. By combining the Web scraper with an AI model, you can filter a site for the content that interests you or have specific tasks carried out on the page.
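
To make the idea of "extracting information from a URL" concrete, here is a minimal Python sketch of what a scraper does conceptually: fetch a page and keep only its visible text. It is not Diaflow's implementation; the names `TextExtractor`, `extract_text`, and `scrape` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def scrape(url):
    """Fetch a page and return its visible text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

In a Flow, text like this would typically be passed on to an AI model component, which does the filtering or task execution described above.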

The Web scraper component has the identifier scraper-X, where X is the instance number of the component.

Component settings

| Parameter Name | Description |
| --- | --- |
| URL | This parameter specifies the URL or collection of URLs to source data from. |

Advanced configurations

| Option | Description |
| --- | --- |
| Enable caching | This option determines whether the component's results are cached. When it is enabled, the next run of the Flow reuses the previously computed component output, as long as the inputs have not changed. |
| Caching time | Only applicable if the "Enable caching" option has been enabled. This parameter controls how long Diaflow waits before automatically clearing the cache. |
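
The two caching options can be pictured as a cache keyed by the component's inputs with a time limit on each entry. The sketch below is an assumption about the general technique, not Diaflow's actual implementation: `ComponentCache`, `ttl_seconds`, and `run` are hypothetical names, and it assumes the caching time is measured from the moment an output is stored.

```python
import time


class ComponentCache:
    """Reuses a component's output while inputs are unchanged and not expired."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # stands in for "Caching time"
        self._clock = clock             # injectable so expiry can be tested
        self._entries = {}              # inputs -> (stored_at, output)

    def run(self, inputs, compute):
        """Return the cached output for `inputs`, or compute and store it.

        `inputs` must be hashable, for example a tuple of URLs.
        """
        now = self._clock()
        entry = self._entries.get(inputs)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # same inputs, still fresh: reuse previous output
        output = compute(inputs)
        self._entries[inputs] = (now, output)
        return output
```

With this model, changing the URL produces a different key and therefore a fresh computation, while an unchanged URL is served from the cache until the caching time has elapsed.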

Use case

Here is a simple use case in which the Web scraper component extracts functions from a page about HTML/CSS widgets.
