# Retrieve Data from Spreadsheet

### What does it do?

The **Retrieve Data from Spreadsheet** node lets you ask questions about a spreadsheet and get back the matching data. Think of it like handing a spreadsheet to an analyst and saying, “Show me the top products,” without sorting or filtering the file yourself.

#### Smart Features

* **Reads common spreadsheet files**: Works with `.csv`, `.xls`, and `.xlsx` files.
* **Understands plain-English questions**: In **Use AI** mode, you can ask for totals, top items, or filtered results in everyday language.
* **Prepares the sheet for you**: If you do not define the column structure yourself, Diaflow figures it out automatically.
* **Cleans extra spaces**: It removes accidental spaces in text cells before running your question.
* **Avoids re-importing the same file**: If the same file is already prepared, the node reuses it instead of loading it again.
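As a rough illustration of the space-cleaning step, the sketch below trims leading and trailing spaces from text cells before any question is run. It is not Diaflow's actual implementation: the helper name `strip_text_cells` and the sample CSV are made up for this example, and it leaves the column headers untouched.

```python
import csv
import io


def strip_text_cells(rows):
    """Return copies of the rows with surrounding whitespace removed from text cells.

    Hypothetical helper for illustration only; header names are left as-is.
    """
    return [
        {key: value.strip() if isinstance(value, str) else value
         for key, value in row.items()}
        for row in rows
    ]


# Made-up sample with accidental spaces around some values.
raw_csv = "Region,Product\n APAC ,Widget A \nEMEA,  Widget B\n"

rows = list(csv.DictReader(io.StringIO(raw_csv)))
cleaned = strip_text_cells(rows)
print(cleaned)
```

Without this kind of cleanup, `" APAC "` and `"APAC"` would count as two different regions in a grouped or filtered result.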

<figure><img src="/files/0zLT7f9d6zGY6g1k0G2f" alt=""><figcaption></figcaption></figure>

### Real-World Business Value

* Check which products, regions, or teams generated the highest revenue in a sales file.
* Pull only overdue invoices or pending orders from an uploaded report.
* Let operations teams ask spreadsheet questions in plain English instead of editing filters by hand.

### Step-by-Step Setup

* In the **Data** field, select the spreadsheet file from an earlier step using **@**.
* Make sure the file is a `.csv`, `.xls`, or `.xlsx` file.
* In **Action**, keep **Query data from table** selected.
* In **SQL Generate Method**, choose **Use AI** for the easiest setup.
* In **Provider**, leave the default option unless your team has told you to use a different one.
* In the **Input** field, type a plain-English question such as `Show the top 5 products by total revenue`.
* In **Schema**, leave it empty unless your team already prepared a custom table structure for this file.
* Only use **Manually enter SQL** if you already know exactly which custom query you want to run.
* Run the workflow, then open the output to review the returned rows and totals.

<figure><img src="/files/zkwZqsXuJYgewnEWr4zv" alt=""><figcaption></figcaption></figure>

### The Transformation: Before & After

**Before**

```
Spreadsheet file: sales_q1_2026.xlsx

Rows inside the file:
Date       | Region | Product   | Revenue
2026-01-04 | APAC   | Widget A  | 12000
2026-01-09 | EMEA   | Widget B  | 8700
2026-01-12 | APAC   | Widget A  | 14300

Question:
Show total revenue by product
```

**After**

```json
[
  {
    "Product": "Widget A",
    "total_revenue": 26300
  },
  {
    "Product": "Widget B",
    "total_revenue": 8700
  }
]
```
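To make the transformation concrete, here is a minimal sketch that reproduces the result above with Python's built-in `sqlite3` module. The table name `sales`, the column types, and the exact SQL are assumptions for illustration. The query Diaflow generates in **Use AI** mode, and the table name it assigns to your file, may differ.

```python
import json
import sqlite3

# Sample rows copied from the "Before" block.
rows = [
    ("2026-01-04", "APAC", "Widget A", 12000),
    ("2026-01-09", "EMEA", "Widget B", 8700),
    ("2026-01-12", "APAC", "Widget A", 14300),
]

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each result row convert to a dict

# "sales" is a hypothetical table name chosen for this example.
conn.execute(
    "CREATE TABLE sales (Date TEXT, Region TEXT, Product TEXT, Revenue INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

# One possible SQL form of "Show total revenue by product".
query = """
    SELECT Product, SUM(Revenue) AS total_revenue
    FROM sales
    GROUP BY Product
    ORDER BY total_revenue DESC
"""
result = [dict(row) for row in conn.execute(query)]
conn.close()

print(json.dumps(result, indent=2))
```

A query of this shape is also the kind of statement you would type yourself when using **Manually enter SQL**, adjusted to the real table and column names of your file.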

<figure><img src="/files/Cq5xSBTdCyLqfnyrTifg" alt=""><figcaption></figcaption></figure>

### Tips & Warnings for First-Timers

* Use a real spreadsheet file, not a web link pasted into the question field.
* If you send multiple files, only the first file is used. Split them into separate steps if needed.
* Keep filenames clear and word-based, such as `regional-sales-europe.xlsx`. Avoid filenames that only differ by numbers or symbols.
* Very large spreadsheets can load slowly or fail. Start with a smaller file when testing.
* If your spreadsheet changes, re-upload it with a new file name or path before running again. Otherwise the node may reuse the copy it already prepared and return outdated results.
* Date columns may behave like plain text. Ask simple date questions first, then confirm the output.
* In **Use AI** mode, the result size is kept short. If you need a longer list, narrow your question or use a more specific filter.
* If a column name in your file is unusual or misspelled, the query may fail. Check the spreadsheet headers first.
* Use **Manually enter SQL** only for advanced use. Most teams should stay with **Use AI**.
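The date tip is easier to see with a small example. The sketch below shows general text-sorting behavior in plain Python, not anything specific to Diaflow. The sample dates are made up. When dates are compared as text, `YYYY-MM-DD` values still sort in calendar order, while `MM/DD/YYYY` values do not.

```python
from datetime import datetime

# Made-up sample dates in two common spreadsheet formats.
iso_dates = ["2026-01-12", "2026-01-04", "2026-01-09"]
us_dates = ["01/12/2026", "12/04/2025", "01/09/2026"]

# ISO-style text happens to sort chronologically.
text_sorted_iso = sorted(iso_dates)

# MM/DD/YYYY text sorts by month first, so December 2025 lands last.
text_sorted_us = sorted(us_dates)

# Parsing the text into real dates restores calendar order.
date_sorted_us = sorted(us_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(text_sorted_iso)
print(text_sorted_us)
print(date_sorted_us)
```

If your file uses a format other than `YYYY-MM-DD`, check "before" or "after" style questions against a few rows you know before relying on the result.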

### Need help?

* Learn the basics in [How a node works](/getting-started/lets-start-with-the-basics/how-a-node-works.md)
* Build the full flow in [Create a workflow](/workflow-builder/create-a-workflow.md)
* Browse related nodes in [Component List](/workflow-builder/component-list.md)

