Motivating Example
Zet Universe

[This is preliminary documentation and is subject to change.]

This example walks through the typical lifecycle of an Entity that is added to the Semantic Pipeline and then enriched by the Zet Universe built-in and third-party plug-ins.

Dissecting the Semantic Pipeline Process

Although the following example highlights entity extraction from text documents, the system, as envisaged, can extract arbitrary features from any binary stream.

Below is a somewhat artificial example of the set of information processors that may run when a new document is added to the system:

  1. The File System Watcher in the Local Folders ZApp plugin writes an entry containing a file reference to the "/Incoming/Files/Generic/" topic.

  2. The File Kind Extraction processor is notified about the new file, maps the document to a known Kind by its file extension, and publishes the entity to the "/Classifications/Kinds/" topic. At the same time, the Full-Text Extraction processor is also notified about the new file, extracts its full text, and publishes the entity to the "/Content/FullText/" topic.

  3. The Local Folders ZApp plugin checks the contents of the newly added file and sends the corresponding entity to the "/Incoming/Files/Documents/" topic for full-text extraction.

  4. The Full-Text Extraction processor is notified about the newly added entity with an associated Document file, extracts its full text, and publishes the entity to the "/Properties/FullText/" topic.

  5. The Keyphrase Extraction processor is notified about the full text newly posted to the "/Properties/FullText/" topic in the Semantic Pipeline. It identifies the high-frequency keyphrases in the document and publishes the Entity to the "/Properties/Keyphrases/" topic so that other processors can analyze the updated keyphrases.

  6. The Entity Extraction processor is notified about the full text newly posted to the "/Properties/FullText/" topic in the Semantic Pipeline. It extracts the Entities mentioned in the document and publishes the Entity to the "/Relationships/" topic so that other processors can analyze the updated relationships.
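
The steps above amount to a chain of processors that subscribe to topics and publish to other topics. The following Python sketch illustrates that flow only; it is not the Zet Universe API. The Pipeline class, the processor functions, the KINDS mapping, and the dictionary-based entity are all hypothetical names introduced for illustration, the extraction logic is deliberately naive, and the topic names follow the table in the next section.

```python
import re
from collections import Counter, defaultdict
from os.path import splitext

# Hypothetical extension-to-Kind table; the real mapping is not given in the text.
KINDS = {".txt": "Document", ".md": "Document", ".png": "Picture"}


class Pipeline:
    """Toy in-memory topic bus (an assumption, not the Zet Universe API)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of processors
        self.log = []                         # topics, in publication order

    def subscribe(self, topic, processor):
        self.subscribers[topic].append(processor)

    def publish(self, topic, entity):
        self.log.append(topic)
        for processor in list(self.subscribers[topic]):
            processor(self, entity)


def file_kind_extractor(bus, entity):
    # Step 2: map the file extension to a known Kind.
    ext = splitext(entity["path"])[1].lower()
    entity["kind"] = KINDS.get(ext, "Unknown")
    bus.publish("/Classification/Kinds/", entity)


def document_check(bus, entity):
    # Step 3: crude content check; files without NUL bytes are treated as text.
    if b"\x00" not in entity["content"]:
        bus.publish("/Incoming/Files/Documents/", entity)


def full_text_extractor(bus, entity):
    # Step 4: decode the raw bytes into full text.
    entity["full_text"] = entity["content"].decode("utf-8", errors="replace")
    bus.publish("/Properties/FullText/", entity)


def keyphrase_extractor(bus, entity):
    # Step 5: use the most frequent longer words as stand-in "keyphrases".
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", entity["full_text"])]
    counts = Counter(w for w in words if len(w) > 3)
    entity["keyphrases"] = [w for w, _ in counts.most_common(2)]
    bus.publish("/Properties/Keyphrases/", entity)


def entity_extractor(bus, entity):
    # Step 6: treat capitalized words as mentioned entities (very naive).
    mentions = re.findall(r"\b[A-Z][a-z]+\b", entity["full_text"])
    entity["relationships"] = sorted(set(mentions))
    bus.publish("/Relationships/", entity)


def build_pipeline():
    # Wire each processor to the topic it listens on.
    bus = Pipeline()
    bus.subscribe("/Incoming/Files/Generic/", file_kind_extractor)
    bus.subscribe("/Incoming/Files/Generic/", document_check)
    bus.subscribe("/Incoming/Files/Documents/", full_text_extractor)
    bus.subscribe("/Properties/FullText/", keyphrase_extractor)
    bus.subscribe("/Properties/FullText/", entity_extractor)
    return bus


def add_file(bus, path, content):
    # Step 1: the file watcher writes an entry with a file reference.
    entity = {"path": path, "content": content}
    bus.publish("/Incoming/Files/Generic/", entity)
    return entity
```

Calling add_file(build_pipeline(), "notes.txt", b"...") returns the entity after every subscribed processor has run, and the bus log lists the topics in the order they were published to. The sketch dispatches synchronously for simplicity; a real pipeline would more likely notify processors asynchronously, so the order across independent branches should not be relied upon.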

Semantic Pipeline Process Visualized as a Table

| Incoming Topic | Description | Plugin | Outgoing Topic |
|---|---|---|---|
| | New file found, but its kind is unknown | Local Folders ZApp | /Incoming/Files/Generic/ |
| /Incoming/Files/Generic/ | File extension should be mapped to a known Kind | File Kind Extractor | /Classification/Kinds/ |
| /Incoming/Files/Generic/ | Entity links to a file; its contents should be checked for full text | Local Folders ZApp | /Incoming/Files/Documents/ |
| /Incoming/Files/Documents/ | Entity links to a document; text should be extracted | Full-Text Extractor | /Properties/FullText/ |
| /Properties/FullText/ | Keyphrases should be extracted from the full text | Keyphrase Extractor | /Properties/Keyphrases/ |
| /Properties/FullText/ | Entities should be extracted from the full text | Entity Extractor | /Relationships/ |

See Also

Concepts