# Uploading Product Data

## Full Feeds

Once the product files are in the correct format, upload them to Fredhopper via the REST API. We recommend running a full product data update on your production environment using this method at least once every 24 hours on an automated schedule. The files must be packaged in an archive file named `data.zip`, and an accompanying [md5 checksum](http://linux.die.net/man/1/md5sum) must also be generated. Once the archive file has been uploaded successfully to the data repository, send a trigger to instruct the system to load the new data into the relevant FAS instance. If a full data update is sent and triggered while an incremental data update is already running, the load-data job is added to a queue and processed when the other job has completed.

Note that while the Indexer node is re-indexing the product data, the Merchandising Studio and the pre-published query endpoint cannot be accessed, because these services are hosted on that node. Access is restored as soon as the job has completed on the Indexer node and starts to re-index the next node in the solution. The published query endpoint never becomes inaccessible during this process because redundancy is built into the solution.

{% hint style="info" %}
File Name: `data.zip`

Example URL: `https://my.eu1.fredhopperservices.com/fas`
{% endhint %}

## Incremental Feeds

Once the incremental product files are in the correct format, upload them to Fredhopper via the REST API. A typical incremental update frequency is every 15 to 60 minutes. The files must be packaged in an archive file named `data-incremental.zip`, and an [md5 checksum](http://linux.die.net/man/1/md5sum) must also be generated. Once the archive file has been uploaded successfully to the data repository, send a trigger to instruct the system to load the new data into the relevant FAS instance. If an incremental update is sent and triggered while a full data update is already running, the update job is added to a queue and processed when the other job has completed.

Note that all services remain online and accessible during the incremental update re-index process.

{% hint style="info" %}
File Name: `data-incremental.zip`

Example URL: `https://my.eu1.fredhopperservices.com/fas`
{% endhint %}

## Upload Steps <a href="#fulldata-csv-stepone-createziparchive" id="fulldata-csv-stepone-createziparchive"></a>

Follow these steps to upload data to Fredhopper. The process is the same for full and incremental feeds.

The file names and the URLs to upload to differ and are covered in the examples.

### 1. Create a ZIP archive <a href="#fulldata-csv-stepone-createziparchive" id="fulldata-csv-stepone-createziparchive"></a>

Create a zip file containing the data you need to upload to Fredhopper.

Full feeds should be named `data.zip` and incremental feeds `data-incremental.zip`.

{% tabs %}
{% tab title="Full Feed" %}

```
zip data.zip *.csv
```

{% endtab %}

{% tab title="Incremental Feed" %}

```
zip data-incremental.zip *.csv
```

{% endtab %}
{% endtabs %}

{% hint style="info" %}
The example above is for CSV files. The process is the same for JSON files; just specify the correct file type.
{% endhint %}

### 2. Create MD5 checksum <a href="#fulldata-csv-steptwo-createmd5checksum" id="fulldata-csv-steptwo-createmd5checksum"></a>

Generate an [md5 checksum](http://linux.die.net/man/1/md5sum) value for the zip file.

{% tabs %}
{% tab title="Full Feed" %}

```
md5sum data.zip > data.zip.md5
```

{% endtab %}

{% tab title="Incremental Feed" %}

```
md5sum data-incremental.zip > data-incremental.zip.md5
```

{% endtab %}
{% endtabs %}

This checksum is used to validate the upload.
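The `.md5` file written by `md5sum` holds both the hash and the file name, while the upload URL in the next step carries only the 32-character hash. The following is a minimal sketch of pulling out just the hash, assuming GNU `md5sum` is installed; `md5_of` is a hypothetical helper name and not part of any Fredhopper tooling.

```shell
# Hypothetical helper: print only the hex digest of a file.
# md5sum prints "<hash>  <filename>", so keep the first space-separated field.
md5_of() {
  md5sum "$1" | cut -d ' ' -f 1
}

# Usage (assumes data.zip exists in the current directory):
#   checksum=$(md5_of data.zip)
```

The resulting value can then be appended to the upload URL as the `checksum` query parameter.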

### 3. Upload ZIP archive to Fredhopper <a href="#fulldata-csv-stepthree-uploadziparchivetofredhopper" id="fulldata-csv-stepthree-uploadziparchivetofredhopper"></a>

Upload the zip file to the Fredhopper Managed Services environment provided to you by your Technical Consultant, using the 'fas' service interface.

Include the checksum value in the request, as in the example below, which is directed to the test1 instance.

{% tabs %}
{% tab title="Full Feed" %}
{% code overflow="wrap" %}

```
curl -D - -k -u username:password -X PUT -H "Content-Type: application/zip" --data-binary @data.zip "https://my.eu1.fredhopperservices.com/fas:test1/data/input/data.zip?checksum=bbb0ecb6182d6da0fde740d14e8ed9f7"
```

{% endcode %}
{% endtab %}

{% tab title="Incremental Feed" %}
{% code overflow="wrap" %}

```
curl -D - -k -u username:password -X PUT -H "Content-Type: application/zip" --data-binary @data-incremental.zip "https://my.eu1.fredhopperservices.com/fas:test1/data/input/data-incremental.zip?checksum=c4127cf7d913ecbe9fe4949706384f28"
```

{% endcode %}
{% endtab %}
{% endtabs %}

Once the file has been uploaded, the system will send a simple HTTP response back with some important information contained within the header and body sections:

<pre data-title="Example" data-overflow="wrap" data-line-numbers><code><strong>HTTP/1.1 100 Continue
</strong>Via: 1.1 my.fredhopperservices.com
HTTP/1.1 201 Created
Date: Wed, 21 Feb 2018 15:14:25 GMT
Server: Apache-Coyote/1.1
Location: https://my.eu1.fredhopperservices.com/fas:test1/data/input/2018-02-21_15-14-20/
Content-Type: text/plain
Via: 1.1 my.fredhopperservices.com
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked
 
data-id=2018-02-21_15-14-20
</code></pre>

{% hint style="info" %}
There is no difference in the response between full and incremental feeds.
{% endhint %}

{% hint style="warning" %}
You should capture the 'data-id' value from the HTTP response body that you receive from the API, as it will be used in subsequent steps. The 'data-id' value is unique to every request.
{% endhint %}
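Because the upload command uses `-D -`, curl writes the response headers and the body to standard output together. The sketch below shows one way to pull the 'data-id' value out of that combined output. `extract_data_id` and `response.txt` are hypothetical names used for illustration only.

```shell
# Hypothetical helper: read a saved curl response (headers and body) on stdin
# and print the value of the "data-id=" line from the body.
extract_data_id() {
  # Strip carriage returns from the HTTP line endings, then keep the text
  # after "data-id=" on the last line that starts with it.
  tr -d '\r' | sed -n 's/^data-id=//p' | tail -n 1
}

# Usage (assumes the output of the upload command was saved to response.txt):
#   data_id=$(extract_data_id < response.txt)
```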

### 4. Trigger for data to be loaded <a href="#fulldata-csv-stepfour-triggerfordatatobeloaded" id="fulldata-csv-stepfour-triggerfordatatobeloaded"></a>

So far, we have only uploaded the data to the Fredhopper Managed Services environment. Next, send a trigger instructing Fredhopper to re-index with the new data, using the 'data-id' value that you captured in the previous step.

{% code title="Example" overflow="wrap" %}

```
curl -D - -k -u username:password -X PUT -H "Content-Type: text/plain" --data-binary "data-id=2018-02-21_15-14-20" https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data
```

{% endcode %}

The HTTP response header that you receive back from our system at this stage contains a new 'Location' value, which we can use to monitor the status of the re-index.

{% code title="Example" overflow="wrap" lineNumbers="true" %}

```
HTTP/1.1 201 Created
Date: Wed, 21 Feb 2018 15:15:29 GMT
Server: Apache-Coyote/1.1
Location: https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data/2018-02-21_15-15-28
Content-Length: 0
Via: 1.1 my.fredhopperservices.com
Vary: Accept-Encoding
Connection: close
Content-Type: text/plain
```

{% endcode %}
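In an automated job, the 'Location' value usually needs to be read from the response headers rather than copied by hand. A minimal sketch follows, assuming the trigger response was captured with `-D -` as in the example; `extract_location` and `trigger-response.txt` are hypothetical names.

```shell
# Hypothetical helper: print the value of the last Location header on stdin.
# Header names are case-insensitive, so the name is lower-cased before matching.
extract_location() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2 }' | tail -n 1
}

# Usage (assumes the trigger response was saved to trigger-response.txt):
#   location=$(extract_location < trigger-response.txt)
```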

### 5. Monitor the status of the re-index <a href="#fulldata-csv-stepfive-monitorthestatusofthere-index" id="fulldata-csv-stepfive-monitorthestatusofthere-index"></a>

The status value for the re-index job can be checked by sending the following command using the 'Location' value that you captured in the previous step.

{% code title="Example" overflow="wrap" %}

```
curl -D - -k -u username:password -X GET https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data/2018-02-21_15-15-28/status
```

{% endcode %}

Possible status codes returned are:

<table><thead><tr><th width="204">Status</th><th>Description</th></tr></thead><tbody><tr><td>Unknown</td><td>No known state yet: trigger has not yet been picked up</td></tr><tr><td>Scheduled</td><td>Trigger has been picked up, and will start execution soon</td></tr><tr><td>Running</td><td>Triggered job is currently running</td></tr><tr><td>Delayed</td><td>Triggered job is ready to run, but delayed (e.g. due to insufficient capacity)</td></tr><tr><td>Success</td><td>Triggered job has finished successfully</td></tr><tr><td>Failure</td><td>Triggered job has failed</td></tr></tbody></table>

{% hint style="info" %}
When deciding how often to poll the progress of the load-data process from your side, we recommend a minimum interval of 60 seconds between checks.
{% endhint %}
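The polling described above can be sketched as a small loop that stops on a terminal status. This is an illustration only: `wait_for_job` and `fetch_status_from_api` are hypothetical names, the credentials and URL are placeholders, and the sketch assumes that the first word of the status response body is one of the values in the table (compared case-insensitively).

```shell
# Hypothetical helper: poll until the job reaches a terminal status.
#   $1 = name of a command or function that prints the current status
#   $2 = seconds to wait between checks (default 60)
#   $3 = maximum number of checks before giving up (default 120)
# Returns 0 on Success, 1 on Failure, 2 if the check limit is reached.
wait_for_job() {
  fetch_status=$1
  interval=${2:-60}
  max_checks=${3:-120}
  checks=0
  while [ "$checks" -lt "$max_checks" ]; do
    # ASSUMPTION: the first word of the body is the status value.
    status=$("$fetch_status" | tr -d '\r' | awk 'NR == 1 { print tolower($1) }')
    case $status in
      success) return 0 ;;
      failure) return 1 ;;
    esac
    checks=$((checks + 1))
    sleep "$interval"
  done
  return 2
}

# Example fetcher: the URL is the 'Location' value from step 4 plus "/status".
fetch_status_from_api() {
  curl -s -u username:password \
    "https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data/2018-02-21_15-15-28/status"
}

# Usage:
#   wait_for_job fetch_status_from_api 60 && echo "load-data finished"
```

Passing the fetcher in by name keeps the loop independent of how the status is retrieved, and the check limit prevents the script from waiting indefinitely if the job never reaches a terminal status.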

### 6. Check the Data Quality Report <a href="#fulldata-csv-stepsix-checkthedataqualityreport" id="fulldata-csv-stepsix-checkthedataqualityreport"></a>

Fredhopper compiles a data quality summary report for each full load and incremental data job, which may be viewed once the process has completed successfully. We recommend that these reports are reviewed regularly over time, e.g. monthly or quarterly, to ensure that the product data continues to be processed fully by Fredhopper and there are no errors.

{% code title="Example" overflow="wrap" %}

```
curl -D - -k -u username:password -X GET https://my.eu1.fredhopperservices.com/dq:test1/trigger/analyze/2018-02-21_15-15-28-fas_load-data/logs/data-quality-summary-report.txt
```

{% endcode %}

{% hint style="info" %}
For each check listed in the summary report, a more detailed report is available as a gzip (.gz) file that describes the cause of any issues. These detailed reports can be accessed from the 'data-quality' sub-folder. Using the above example, they are available at the following location:

<https://my.eu1.fredhopperservices.com/dq:test1/trigger/analyze/2018-02-21_15-15-28-fas_load-data/logs/data-quality/>
{% endhint %}
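For scripted checks, the report URL can be derived from the 'Location' value returned by the load-data trigger. The sketch below is based on an assumption inferred from the example URLs on this page, namely that the report path is the trigger identifier followed by `-fas_load-data` under the `dq:` service of the same instance. Confirm the pattern for your environment before relying on it. `dq_report_url` is a hypothetical helper name.

```shell
# Hypothetical helper: build the summary report URL from a load-data Location.
# ASSUMPTION: the path pattern is inferred from the example URLs on this page.
dq_report_url() {
  location=$1
  trigger_id=${location##*/}     # text after the last "/"
  base=${location%%/fas:*}       # scheme and host
  instance=${location#*/fas:}    # e.g. "test1/trigger/load-data/..."
  instance=${instance%%/*}       # e.g. "test1"
  printf '%s/dq:%s/trigger/analyze/%s-fas_load-data/logs/data-quality-summary-report.txt\n' \
    "$base" "$instance" "$trigger_id"
}

# Usage (placeholder credentials; $location is the value captured in step 4):
#   curl -s -u username:password "$(dq_report_url "$location")"
```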

## Process Diagram <a href="#fulldata-csv-processdiagram" id="fulldata-csv-processdiagram"></a>

