Reconciliation

Reconciliation is the process used to resolve document _id and _index issues caused by data reingestion. Each document stored in Elasticsearch has an _id and _index, which Investigate uses to store document references internally. When data is reingested into Elasticsearch, new _id and _index values are assigned to the documents. This causes the internal references in Investigate to become obsolete, as they point to an _id or _index that no longer exists. This issue is more likely if regular reingestion is part of your data pipeline.

For example, saved graph objects store document _id and _index references. After reingestion, any graphs containing obsolete references can still be opened in the Graph Browser, but the affected nodes will be disconnected. Their associated documents will be inaccessible, so some operations, such as opening nodes in the Record Viewer, will not be possible.

Reconciliation identifies obsolete _id and _index references and updates them to the corresponding values of the newly ingested documents. The old and new documents are matched by primary key (PK) fields whose values, when combined, form a unique identifier for a record. You can also select just one primary key field if it has a unique value for each record. Once the primary key fields are set, running the Reconciliation API updates the obsolete references.
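The idea of matching old and new documents by primary key can be illustrated with a minimal Python sketch. This is not how Investigate implements reconciliation internally. The function names, the sample documents, and the assumption that the primary key values of the old documents are still available are all hypothetical and serve only to show why the combined primary key values must be unique.

```python
from typing import Dict, List, Tuple

# A document reference as described above: (_index, _id).
Ref = Tuple[str, str]


def pk_value(doc: dict, pk_fields: List[str]) -> tuple:
    """Combine the primary key field values into one hashable identifier."""
    return tuple(doc["_source"][field] for field in pk_fields)


def build_ref_mapping(
    old_docs: List[dict], new_docs: List[dict], pk_fields: List[str]
) -> Dict[Ref, Ref]:
    """Map each obsolete (_index, _id) to the reference of the new document
    that has the same primary key. Old documents without a match are skipped."""
    new_by_pk = {pk_value(d, pk_fields): (d["_index"], d["_id"]) for d in new_docs}
    mapping: Dict[Ref, Ref] = {}
    for d in old_docs:
        new_ref = new_by_pk.get(pk_value(d, pk_fields))
        if new_ref is not None:
            mapping[(d["_index"], d["_id"])] = new_ref
    return mapping


# Hypothetical sample data: "name" alone is not unique, so the
# primary key combines "name" and "country".
old_docs = [
    {"_index": "companies-v1", "_id": "a1", "_source": {"name": "Acme", "country": "IE"}},
    {"_index": "companies-v1", "_id": "b2", "_source": {"name": "Acme", "country": "US"}},
    {"_index": "companies-v1", "_id": "c3", "_source": {"name": "Globex", "country": "DE"}},
]
new_docs = [
    {"_index": "companies-v2", "_id": "z9", "_source": {"name": "Acme", "country": "IE"}},
    {"_index": "companies-v2", "_id": "y8", "_source": {"name": "Acme", "country": "US"}},
]

mapping = build_ref_mapping(old_docs, new_docs, ["name", "country"])
```

In this sketch, the two "Acme" records are told apart only because the country field is part of the key. The "Globex" record has no counterpart in the new data, so its reference cannot be mapped.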

Only administrators can run reconciliation.

Performing data reconciliation

  1. Ensure the primary key fields are set in the Info tab of the data model. See Creating entity tables and Primary key fields.

  2. Use the Reconciliation API to start the reconciliation process.

    Example:

    curl -H "Content-Type: application/json" -H 'kbn-xsrf: any' -u sirenadmin:password -L -XPOST 'http://localhost:5606/api/collections/reconciliation' -d '{}'

If you reingest data through a pipeline, add the Reconciliation API call to that pipeline, after the reingestion step, so that references are updated automatically each time.
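For pipelines written in Python, the call can be sketched with the standard library as follows. The endpoint, headers, and empty JSON body are taken from the curl example above. The function names, the timeout, and the decision to return the HTTP status code are assumptions made for illustration, and the credentials must be replaced with those of an administrator account in your deployment.

```python
import base64
import urllib.request


def build_reconciliation_request(
    base_url: str, username: str, password: str
) -> urllib.request.Request:
    """Build the POST request equivalent to the curl example (hypothetical helper)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/collections/reconciliation",
        data=b"{}",  # same empty JSON body as in the curl example
        method="POST",
        headers={
            "Content-Type": "application/json",
            "kbn-xsrf": "any",
            "Authorization": f"Basic {token}",
        },
    )


def run_reconciliation(
    base_url: str, username: str, password: str, timeout: float = 60.0
) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses,
    so a failed call stops the pipeline unless the caller catches it.
    """
    request = build_reconciliation_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call at the end of a reingestion pipeline (values are placeholders):
# run_reconciliation("http://localhost:5606", "sirenadmin", "password")
```

Redirect handling here is left to the defaults of urllib, which may differ from the behavior of the -L option in the curl example.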