[This is preliminary documentation and is subject to change.]
This example shows the typical lifecycle of an Entity that is added to the Semantic Pipeline and then enriched by Zet Universe built-in and third-party plug-ins.
Although the following example highlights entity extraction from text documents, the system, as envisaged, can extract arbitrary features from any binary stream.
Below is a somewhat artificial example of a set of information processors that may run when a new document is added to the system:
The File System Watcher in the Local Folders ZApp plugin writes an entry containing a file reference to the "/Incoming/Files/Generic/" topic.
The File Kind Extraction processor is notified about the new file, maps the document to a known Kind by its file extension, and publishes the entity to the "/Classifications/Kinds/" topic. At the same time, the Full-Text Extraction processor is also notified about the new file, extracts its full text, and publishes the entity to the "/Content/FullText/" topic.
The Local Folders ZApp plugin checks the contents of the newly added file and sends the corresponding entity to the "/Incoming/Files/Documents/" topic for full-text extraction.
The Full-Text Extraction processor is notified about the newly added entity with an associated Document file, extracts its full text, and publishes the entity to the "/Properties/FullText" topic.
The Keyphrase Extraction processor is notified about the newly extracted full text posted to the "/Properties/FullText" topic in the Semantic Pipeline. It extracts high-frequency keyphrases from the document and publishes the Entity to the "/Properties/Keyphrases/" topic so that other processors can analyze the updated keyphrases.
The Entity Extraction processor is notified about the newly extracted full text posted to the "/Properties/FullText" topic in the Semantic Pipeline. It extracts the Entities mentioned in the document and publishes the Entity to the "/Relationships/" topic so that other processors can analyze the updated relationships.
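The chain of notifications described above can be sketched as a small in-memory publish/subscribe bus. This is only an illustration under stated assumptions, not the Zet Universe API: the `SemanticPipeline` class, the dict-based entity, the `KNOWN_KINDS` table, and the toy keyphrase and entity heuristics are all hypothetical. The topic names are taken from the steps above. The sketch assumes that the Local Folders check is triggered by the "/Classifications/Kinds/" topic, which the text does not state, and it omits the parallel full-text extraction on the generic topic.

```python
import re
from collections import Counter, deque
from pathlib import PurePath

# Hypothetical extension-to-Kind table; the real mapping is not given in the text.
KNOWN_KINDS = {".txt": "Document", ".docx": "Document", ".jpg": "Picture"}


class SemanticPipeline:
    """Hypothetical in-memory topic bus (not the actual Zet Universe API)."""

    def __init__(self):
        self._subscribers = {}
        self._queue = deque()
        self.log = []  # topics in the order they were delivered

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, entity):
        self._queue.append((topic, entity))

    def run(self):
        # Deliver queued entries until no processor publishes anything new.
        while self._queue:
            topic, entity = self._queue.popleft()
            self.log.append(topic)
            for handler in self._subscribers.get(topic, []):
                handler(entity)


def build_pipeline():
    bus = SemanticPipeline()

    def file_kind_extraction(entity):
        # Map the file to a Kind by its extension.
        ext = PurePath(entity["path"]).suffix.lower()
        entity["kind"] = KNOWN_KINDS.get(ext, "Unknown")
        bus.publish("/Classifications/Kinds/", entity)

    def local_folders_check(entity):
        # Assumption: only documents with some content go on to text extraction.
        if entity.get("kind") == "Document" and entity.get("text"):
            bus.publish("/Incoming/Files/Documents/", entity)

    def full_text_extraction(entity):
        # Stand-in for real extraction: the entity already carries its text.
        entity["full_text"] = entity["text"]
        bus.publish("/Properties/FullText", entity)

    def keyphrase_extraction(entity):
        # Toy heuristic: the most frequent words longer than three letters.
        words = re.findall(r"[a-z]+", entity["full_text"].lower())
        counts = Counter(w for w in words if len(w) > 3)
        entity["keyphrases"] = [w for w, _ in counts.most_common(2)]
        bus.publish("/Properties/Keyphrases/", entity)

    def entity_extraction(entity):
        # Toy heuristic: treat capitalized words as mentioned entities.
        found = re.findall(r"\b[A-Z][a-z]+\b", entity["full_text"])
        entity["mentions"] = sorted(set(found))
        bus.publish("/Relationships/", entity)

    bus.subscribe("/Incoming/Files/Generic/", file_kind_extraction)
    bus.subscribe("/Classifications/Kinds/", local_folders_check)
    bus.subscribe("/Incoming/Files/Documents/", full_text_extraction)
    bus.subscribe("/Properties/FullText", keyphrase_extraction)
    bus.subscribe("/Properties/FullText", entity_extraction)
    return bus


if __name__ == "__main__":
    bus = build_pipeline()
    doc = {"path": "notes/trip.txt", "text": "Alice met Bob in Paris."}
    bus.publish("/Incoming/Files/Generic/", doc)  # the File System Watcher step
    bus.run()
    print(bus.log)
    print(doc["kind"], doc["keyphrases"], doc["mentions"])
```

Because each processor only knows the topics it subscribes and publishes to, a third-party plug-in could add another processor on "/Properties/FullText" without changing any of the existing ones.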
New file found, but its kind is unknown
Local Folders ZApp
File extension should be mapped to a known Kind
File Kind Extractor
Entity links to a file; its contents should be checked for full text
Local Folders ZApp
Entity links to a document, text should be extracted
Keyphrases should be extracted from the full text
Entities should be extracted from the full text