The Web Scraper browser extension supports data export in CSV format, while Web Scraper Cloud supports data export in CSV, XLSX and JSON formats. XLSX and JSON formats will be added to the Web Scraper extension in a future release.
Download scraped data via the Export data as CSV menu selection under the Sitemap menu. Data can also be downloaded while the scraper is running.
Download scraped data via website from Sitemaps sections. Data can also be downloaded while the scraper is running.
Set up automated data export to
Data Export section. Currently exported data will be in CSV format. Data will be exported to
Apps/Web Scraper in your
Additionally, you can download data via the Web Scraper Cloud API in CSV or JSON format.
In XLSX files, each cell is limited to 32767 characters. Additional characters will be cut off. Use another export format if large text contents are expected in a single cell. Row count is limited to 1 million rows per sheet. If a data set contains more than 1 million rows, the data will be split into multiple sub sheets.
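Before choosing a format, it can help to check whether any value would exceed the cell limit. The following is a minimal sketch, not part of Web Scraper itself: the function name `oversized_fields` and the sample records are hypothetical, and it assumes characters are counted the way Python's `len` counts them, which may differ slightly from how a spreadsheet counts.

```python
XLSX_CELL_LIMIT = 32767  # per-cell character limit stated above


def oversized_fields(records, limit=XLSX_CELL_LIMIT):
    """Return (record index, field name, length) for every text value longer than limit."""
    found = []
    for index, record in enumerate(records):
        for name, value in record.items():
            if isinstance(value, str) and len(value) > limit:
                found.append((index, name, len(value)))
    return found


# Hypothetical records; the second one holds a value that would be cut off.
records = [
    {"title": "short", "text": "a" * 100},
    {"title": "long", "text": "b" * 40000},
]
too_long = oversized_fields(records)
```

If `too_long` is not empty, a format without a per-cell limit is the safer choice for that data set.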
JSON file format contains one JSON record per line.
New line characters found in data will be escaped as \n (a backslash followed by the letter n), so an actual new line character can be safely used as a record separator.
Note! Parsing the entire file as a single JSON string will not work because the records are not wrapped in a JSON array. This was a design decision to make it easier to parse large files.
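Reading the file one line at a time could look like the sketch below. It uses only Python's standard `json` module; the helper name `iter_records` and the sample data are hypothetical and stand in for a real export.

```python
import io
import json


def iter_records(lines):
    """Yield one parsed record per non-empty line of a line-delimited JSON export."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical two-record export; the second value contains an escaped new line.
sample = io.StringIO(
    '{"title": "First", "text": "one line"}\n'
    '{"title": "Second", "text": "two\\nlines"}\n'
)
records = list(iter_records(sample))
```

With a real export, pass an open file object instead of the sample, for example `open("export.json", encoding="utf-8")` (the file name is an assumption). Because records are yielded one at a time, a large file does not have to fit in memory.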
Comma-Separated Values (CSV) files are formatted as described in the RFC 4180 standard.
Values are quoted in double quotes ("). When a double quote character appears in the text, it is escaped with another double quote character.
Lines are separated with \r\n.
Additionally, CSV files include a byte order mark (BOM), the U+FEFF character, at the beginning of the file to hint that the file is in UTF-8 format.
New line characters inside values are not escaped, which means that splitting the file on \r\n as a record separator can result in errors.
We recommend using a CSV reader library when reading CSV files programmatically.
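As one illustration of that recommendation, the sketch below reads such a file with Python's standard `csv` module. The helper name `read_rows` and the sample bytes are hypothetical. The `utf-8-sig` encoding strips the BOM, and `newline=""` leaves line endings untouched so the `csv` module can handle new lines inside quoted values.

```python
import csv
import io


def read_rows(stream):
    """Parse RFC 4180 style CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(stream))


# Hypothetical export: BOM, \r\n record separators, a doubled quote and a
# new line embedded in a quoted value.
raw = (
    '\ufeff"title","text"\r\n'
    '"First","say ""hi"""\r\n'
    '"Second","two\nlines"\r\n'
).encode("utf-8")

# "utf-8-sig" removes the BOM; newline="" disables new line translation.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8-sig", newline="")
rows = read_rows(stream)
```

With a real export, the same settings apply when opening the file directly, for example `open("export.csv", encoding="utf-8-sig", newline="")` (the file name is an assumption).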
We recommend using LibreOffice Calc to open CSV files. Microsoft Office often interprets CSV files formatted according to the RFC 4180 standard incorrectly. This mostly happens when text includes new line characters.
If Microsoft Excel opens a CSV file incorrectly, try using the data import feature:
Choose From Text/CSV
Set up import settings - UTF-8 encoding, Comma delimiter, Do not detect data types