Full and Incremental Feed Uploads

This page describes the steps to follow when uploading a full or incremental item data feed to Fredhopper.

Full Feeds

Once the format of the product files is complete, they need to be uploaded to Fredhopper via the REST API. We recommend running a full product data update on your production environment using this method at least once every 24 hours on an automated schedule. The files must be included in an archive file named 'data.zip', and an accompanying MD5 checksum must also be generated. When the archive file has been uploaded successfully to the data repository, a trigger should be sent to instruct the system to load the new data into the relevant FAS instance. If a full data update is sent and triggered while an incremental data update is already running, the load-data job is added to a queue and the system processes it when the other job has completed.

It is important to note that, while the Indexer node is re-indexing the product data, it is not possible to access the Merchandising Studio or the pre-published query endpoint, because these services are hosted on that node. Access is restored as soon as the job has completed on the Indexer node and the system moves on to re-index the next node in the solution. The published query endpoint never becomes inaccessible during this process because redundancy is built into the solution.

File Name: data.zip

Example URL: https://my.eu1.fredhopperservices.com/fas

Incremental Feeds

Once the format of the incremental product files is complete, they need to be uploaded to Fredhopper via the REST API. A typical incremental update frequency is every 15 to 60 minutes. The files must be included in an archive file named 'data-incremental.zip', and an MD5 checksum must also be generated. When the archive file has been uploaded successfully to the data repository, a trigger should be sent to instruct the system to load the new data into the relevant FAS instance. If an incremental update is sent and triggered while a full data update is already running, the update job is added to a queue and the system processes it when the other job has completed.

It is important to note that all services remain online and accessible during the incremental update re-index process.

File Name: data-incremental.zip

Example URL: https://my.eu1.fredhopperservices.com/fas

Upload Steps

These are the steps to follow when uploading data to Fredhopper. The process is the same for full and incremental feeds.

The file names and the URLs to upload to differ, and are covered in the examples.

1. Create a ZIP archive

Create a zip file containing the data you need to upload to Fredhopper.

Full feeds should be named data.zip and incremental feeds data-incremental.zip.

zip data.zip *.csv

zip data-incremental.zip *.csv

The example above is for CSV files. The process is the same for JSON files; just specify the correct file extension.
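As a minimal sketch of this step, the helper below maps a feed type to the archive name required above and builds the archive from a directory of feed files. The function names and the `feed_dir` and `ext` parameters are hypothetical and only serve the illustration; they are not part of any Fredhopper tooling.

```shell
# Map a feed type to the archive name required for the upload.
archive_name() {
  case "$1" in
    full)        echo "data.zip" ;;
    incremental) echo "data-incremental.zip" ;;
    *)           echo "unknown feed type: $1" >&2; return 1 ;;
  esac
}

# Build the archive from every file with the given extension in feed_dir.
# feed_dir and ext are hypothetical parameters for this sketch.
build_archive() {
  feed_type=$1
  feed_dir=$2
  ext=${3:-csv}
  name=$(archive_name "$feed_type") || return 1
  # zip adds to an existing archive, so remove any archive left by a previous run.
  rm -f "$name"
  # -j stores the files without their directory path, as 'zip data.zip *.csv' does.
  zip -q -j "$name" "$feed_dir"/*."$ext"
}
```

For example, `build_archive incremental ./export json` would write data-incremental.zip from the JSON files in ./export. Removing the old archive first avoids shipping files from a previous run, because zip updates an existing archive rather than replacing it.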

2. Create MD5 checksum

Generate an MD5 checksum value for the zip file.

md5sum data.zip > data.zip.md5

md5sum data-incremental.zip > data-incremental.zip.md5

We'll use this checksum to validate the upload.
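Note that md5sum writes the digest followed by the file name, whereas the upload request only needs the 32-character hex digest. A small sketch of extracting just the digest follows; `md5_hex` is a hypothetical helper name.

```shell
# Print only the hex digest of a file.
# Reading from stdin keeps the file name out of md5sum's output.
md5_hex() {
  md5sum < "$1" | cut -d ' ' -f 1
}
```

For example, `checksum=$(md5_hex data.zip)` stores the value that is later passed as the checksum query parameter.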

3. Upload ZIP archive to Fredhopper

Upload the zip file to the Fredhopper Managed Services environment that is given to you by your Technical Consultant using the 'fas' service interface.

Be sure to include the checksum value in the request, as in the examples below, which are directed at the test1 instance.

curl -D - -k -u username:password -X PUT -H "Content-Type: application/zip" --data-binary @data.zip "https://my.eu1.fredhopperservices.com/fas:test1/data/input/data.zip?checksum=bbb0ecb6182d6da0fde740d14e8ed9f7"

curl -D - -k -u username:password -X PUT -H "Content-Type: application/zip" --data-binary @data-incremental.zip "https://my.eu1.fredhopperservices.com/fas:test1/data/input/data-incremental.zip?checksum=c4127cf7d913ecbe9fe4949706384f28"
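The upload URL in these examples is made up of the service URL, the instance name, the archive name and the checksum. The sketch below assembles it from those parts and wraps the upload in a function. The environment variables FH_BASE, FH_INSTANCE, FH_USER and FH_PASS are hypothetical names chosen for this illustration.

```shell
# Assemble the upload URL.
# $1 service URL, $2 instance, $3 archive name, $4 MD5 hex digest
upload_url() {
  printf '%s:%s/data/input/%s?checksum=%s\n' "$1" "$2" "$3" "$4"
}

# Upload an archive and print the response body.
# FH_BASE, FH_INSTANCE, FH_USER and FH_PASS are hypothetical variables.
upload_feed() {
  archive=$1
  checksum=$(md5sum < "$archive" | cut -d ' ' -f 1)
  url=$(upload_url "$FH_BASE" "$FH_INSTANCE" "$(basename "$archive")" "$checksum")
  # --fail makes curl exit non-zero on an HTTP error status.
  curl -sS --fail -u "$FH_USER:$FH_PASS" -X PUT \
    -H "Content-Type: application/zip" \
    --data-binary "@$archive" "$url"
}
```

With FH_BASE set to https://my.eu1.fredhopperservices.com/fas and FH_INSTANCE set to test1, `upload_feed data.zip` sends the same request as the first example above, except that the -k option is left out so that the server certificate is verified.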

Once the file has been uploaded, the system will send a simple HTTP response back with some important information contained within the header and body sections:

Example

HTTP/1.1 100 Continue
Via: 1.1 my.fredhopperservices.com
HTTP/1.1 201 Created
Date: Wed, 21 Feb 2018 15:14:25 GMT
Server: Apache-Coyote/1.1
Location: https://my.eu1.fredhopperservices.com/fas:test1/data/input/2018-02-21_15-14-20/
Content-Type: text/plain
Via: 1.1 my.fredhopperservices.com
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked
 
data-id=2018-02-21_15-14-20

There is no difference in the response between full and incremental feeds.

You should capture the 'data-id' section in the HTTP response body that you receive from the API as this will be used in subsequent steps. The 'data-id' value is unique to every request.
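One way to capture the value in a script is to filter the response for the line that starts with data-id=. This is a sketch only; `extract_data_id` is a hypothetical helper, and it assumes the body has the form shown in the example response.

```shell
# Read an upload response on stdin and print the data-id value.
# Works on the body alone or on headers plus body (curl -D -).
extract_data_id() {
  tr -d '\r' | sed -n 's/^data-id=//p' | head -n 1
}
```

For example, piping the output of the upload command through `extract_data_id` yields a value such as 2018-02-21_15-14-20 that can be stored in a variable for the trigger step.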

4. Trigger for data to be loaded

So far, we have only uploaded the data to the Fredhopper Managed Services environment. Next, send a trigger instructing Fredhopper to re-index with the new data, using the 'data-id' value that you captured in the previous step.

Example

curl -D - -k -u username:password -X PUT -H "Content-Type: text/plain" --data-binary "data-id=2018-02-21_15-14-20" https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data

The HTTP response header that you receive back from our system at this stage contains a new 'Location' value, which we can use to monitor the status of the re-index.

Example

HTTP/1.1 201 Created
Date: Wed, 21 Feb 2018 15:15:29 GMT
Server: Apache-Coyote/1.1
Location: https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data/2018-02-21_15-15-28
Content-Length: 0
Via: 1.1 my.fredhopperservices.com
Vary: Accept-Encoding
Connection: close
Content-Type: text/plain
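In a script, the 'Location' value can be read from the response headers. The sketch below sends the trigger and prints that value; the helper names and the FH_BASE, FH_INSTANCE, FH_USER and FH_PASS variables are hypothetical.

```shell
# Read HTTP response headers on stdin and print the Location value.
# Header names are case-insensitive, so a lower-case initial is accepted too.
extract_location() {
  tr -d '\r' | sed -n 's/^[Ll]ocation: *//p' | head -n 1
}

# Send the load-data trigger for a data-id and print the Location value.
# FH_BASE, FH_INSTANCE, FH_USER and FH_PASS are hypothetical variables.
trigger_load() {
  data_id=$1
  # -D - writes the headers to stdout; -o /dev/null discards the body.
  curl -sS --fail -D - -o /dev/null -u "$FH_USER:$FH_PASS" -X PUT \
    -H "Content-Type: text/plain" \
    --data-binary "data-id=$data_id" \
    "${FH_BASE}:${FH_INSTANCE}/trigger/load-data" | extract_location
}
```

For example, `location=$(trigger_load "$data_id")` keeps the monitoring URL for the next step.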

5. Monitor the status of the re-index

The status of the re-index job can be checked by sending a GET request to the 'Location' value that you captured in the previous step, with /status appended.

Example

curl -D - -k -u username:password -X GET https://my.eu1.fredhopperservices.com/fas:test1/trigger/load-data/2018-02-21_15-15-28/status

Possible status codes returned are:

| Status | Description |
| --- | --- |
| Unknown | No known state yet: trigger has not yet been picked up |
| Scheduled | Trigger has been picked up, and will start execution soon |
| Running | Triggered job is running currently |
| Delayed | Triggered job is ready to run, but delayed (e.g. due to insufficient capacity) |
| Success | Triggered job has finished successfully |
| Failure | Triggered job has failed |

When deciding how often to poll the progress of the load-data jobs from your side, we recommend leaving at least 60 seconds between checks.
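A polling loop along these lines could look as follows. It is a sketch under assumptions: it assumes the status endpoint returns the status word as the response body, it matches the word case-insensitively as a precaution, and FH_USER, FH_PASS and the `max_checks` cap are hypothetical.

```shell
# Classify a status string as ok, failed, or pending.
# The exact response format of the status endpoint is an assumption.
status_outcome() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    success*) echo ok ;;
    failure*) echo failed ;;
    *)        echo pending ;;   # Unknown, Scheduled, Running, Delayed
  esac
}

# Poll "<location>/status" until the job reaches a final status.
# FH_USER and FH_PASS are hypothetical environment variables.
wait_for_load() {
  location=$1            # Location value from the trigger response
  interval=${2:-60}      # seconds between checks (60 is the recommended minimum)
  max_checks=${3:-120}   # hypothetical safety cap so the loop always ends
  checks=0
  while [ "$checks" -lt "$max_checks" ]; do
    status=$(curl -sS --fail -u "$FH_USER:$FH_PASS" "$location/status" | tr -d '\r\n')
    case "$(status_outcome "$status")" in
      ok)     return 0 ;;
      failed) echo "load-data failed: $status" >&2; return 1 ;;
    esac
    checks=$((checks + 1))
    sleep "$interval"
  done
  echo "no final status after $max_checks checks" >&2
  return 2
}
```

Treating every status other than Success and Failure as pending keeps the loop running through the Unknown, Scheduled, Running and Delayed states, and the cap prevents a stuck job from blocking the calling script indefinitely.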

6. Check the Data Quality Report

Fredhopper compiles a data quality summary report for each full load and incremental data job, which may be viewed once the process has completed successfully. We recommend that these reports are reviewed regularly over time, e.g. monthly or quarterly, to ensure that the product data continues to be processed fully by Fredhopper and there are no errors.

Example

curl -D - -k -u username:password -X GET https://my.eu1.fredhopperservices.com/dq:test1/trigger/analyze/2018-02-21_15-15-28-fas_load-data/logs/data-quality-summary-report.txt

For each check listed in the summary report, a more detailed report is available as a gzip (.gz) file that describes the cause of any issues. These files can be accessed from the 'data-quality' sub-folder. Using the above example, the detailed reports can be accessed from the following location:

https://my.eu1.fredhopperservices.com/dq:test1/trigger/analyze/2018-02-21_15-15-28-fas_load-data/logs/data-quality/
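In the examples on this page, the report URL follows from the 'Location' value of the load-data trigger: the service changes from fas to dq, and the trigger identifier gains the suffix -fas_load-data. The sketch below derives the summary report URL on that basis. The pattern is inferred from these examples only, so treat it as an assumption and confirm it for your environment; `dq_report_url` is a hypothetical helper name.

```shell
# Derive the data quality summary report URL from a load-data Location value.
# The URL pattern is inferred from the examples on this page (assumption).
dq_report_url() {
  loc=${1%/}                  # drop a trailing slash, if any
  id=${loc##*/}               # last path segment: the trigger identifier
  host=${loc%%/fas:*}         # scheme and host, before "/fas:"
  instance=${loc#*/fas:}      # text after "/fas:"
  instance=${instance%%/*}    # keep only the instance name
  printf '%s/dq:%s/trigger/analyze/%s-fas_load-data/logs/data-quality-summary-report.txt\n' \
    "$host" "$instance" "$id"
}
```

The resulting URL can then be fetched with the same curl GET request as in the example above.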

Process Diagram

