Entities are distinct, independent things about which Cybersyn provides data, such as websites, companies, or geographies. For a given entity type, this table contains all possible instances of that entity type, along with a unique identifier. The unique identifier, _id, should be used to join across datasets. This table contains permanent or long-lived characteristics describing an entity.
Each row represents a distinct entity. The table is wide, in that immutable characteristics are expressed in their own fields.
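As a rough illustration of joining on the unique identifier, here is a minimal Python sketch. The column names (geo_id, geo_name, level), the row values, and the join_on_id helper are all hypothetical and are not taken from the actual Cybersyn schema.

```python
# Hypothetical index table for a "geography" entity type: one row per entity,
# with long-lived characteristics in their own (wide) fields.
geography_index = [
    {"geo_id": "geo_001", "geo_name": "Region A", "level": "State"},
    {"geo_id": "geo_002", "geo_name": "Region B", "level": "State"},
]

# Hypothetical rows from another dataset that carry the same identifier.
observations = [
    {"geo_id": "geo_001", "value": 100},
    {"geo_id": "geo_002", "value": 80},
    {"geo_id": "geo_999", "value": 5},  # no matching index row
]


def join_on_id(rows, index_rows, key):
    """Inner-join rows to the index table on the shared identifier column."""
    lookup = {row[key]: row for row in index_rows}
    return [{**row, **lookup[row[key]]} for row in rows if row[key] in lookup]


joined = join_on_id(observations, geography_index, "geo_id")
for row in joined:
    print(row["geo_id"], row["geo_name"], row["value"])
```

Because the index table holds one row per entity, a join like this does not multiply the rows of the other dataset.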
Links between two entities. These links can be hierarchical (e.g., a geography contained within another geography) or not (e.g., a geography overlapping with another geography). Relationships can also be temporal, that is, valid for an interval defined by specific start and end dates.
Each row represents a relationship or characteristic of an entity. The table is long, as every distinct attribute or characteristic related to an entity has its own row along with its associated metadata (e.g., the start and end date of the relationship). A distinct entity therefore appears multiple times in this table, once for every characteristic and relationship it has.
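The long layout and the temporal validity of relationships can be sketched as follows. This is only an illustration: the column names (geo_id, related_geo_id, relationship_type, start_date, end_date), the sample rows, and the convention that a missing date means an open-ended interval are assumptions, not documented behavior.

```python
from datetime import date

# Hypothetical long-format relationships table: the same entity (geo_010)
# appears once per relationship, each with its own validity interval.
relationships = [
    {"geo_id": "geo_010", "related_geo_id": "geo_001",
     "relationship_type": "Contained In",
     "start_date": date(2000, 1, 1), "end_date": date(2009, 12, 31)},
    {"geo_id": "geo_010", "related_geo_id": "geo_002",
     "relationship_type": "Contained In",
     "start_date": date(2010, 1, 1), "end_date": None},
    {"geo_id": "geo_010", "related_geo_id": "geo_003",
     "relationship_type": "Overlaps",
     "start_date": None, "end_date": None},
]


def valid_on(rows, entity_id, as_of):
    """Return the relationships of one entity that are valid on a given date.

    Assumption: a start_date or end_date of None means the interval is
    open-ended on that side.
    """
    result = []
    for row in rows:
        if row["geo_id"] != entity_id:
            continue
        starts_ok = row["start_date"] is None or row["start_date"] <= as_of
        ends_ok = row["end_date"] is None or as_of <= row["end_date"]
        if starts_ok and ends_ok:
            result.append(row)
    return result


current = valid_on(relationships, "geo_010", date(2015, 6, 1))
for row in current:
    print(row["relationship_type"], row["related_geo_id"])
```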
Descriptors of an entity that are temporal: they have a start date and an end date. For convenience, characteristics are included in the relationships table, with the difference that no explicit entity ID is referenced. If a characteristic is immutable, it can be included in wide form in the index table.
Timeseries are temporal statistics or measures centered around an entity and timestamp. Timeseries are abstract concepts (i.e., a measure) rather than a concrete thing. Each timeseries is identified by an ID that can be used to join to its attributes table, which describes the timeseries in a structured form. A timeseries may have more than one entity_id (e.g., geography + company).
Each row represents a distinct timeseries, date, and value. So, every timeseries id will have multiple rows, one for each value in the timeseries.
Attributes are descriptors of a timeseries. This table can be used to find the desired timeseries ID by filtering on structured, wide fields. An attribute is the equivalent of a characteristic, except that it describes the abstract timeseries rather than the concrete entity.
Each row represents a distinct timeseries along with attributes that describe that timeseries in a wide format. There is a single row for each distinct timeseries.
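One way to picture how the two tables work together is to filter the attributes table first and then use the resulting IDs to select rows from the timeseries table. The sketch below assumes hypothetical column names (timeseries_id, measure, frequency, geo_id, date, value) and made-up values; the real schema may differ.

```python
from datetime import date

# Hypothetical attributes table: exactly one wide row per timeseries.
attributes = [
    {"timeseries_id": "ts_pop", "measure": "Population", "frequency": "Annual"},
    {"timeseries_id": "ts_inc", "measure": "Income", "frequency": "Annual"},
]

# Hypothetical timeseries table: one row per timeseries, date, and value.
timeseries = [
    {"timeseries_id": "ts_pop", "geo_id": "geo_001", "date": date(2020, 12, 31), "value": 100.0},
    {"timeseries_id": "ts_pop", "geo_id": "geo_001", "date": date(2021, 12, 31), "value": 102.0},
    {"timeseries_id": "ts_inc", "geo_id": "geo_001", "date": date(2021, 12, 31), "value": 55.0},
]

# Step 1: filter on the structured attribute fields to find the wanted IDs.
wanted_ids = {
    row["timeseries_id"] for row in attributes if row["measure"] == "Population"
}

# Step 2: keep only the timeseries rows whose ID passed the filter.
selected = [row for row in timeseries if row["timeseries_id"] in wanted_ids]

for row in selected:
    print(row["timeseries_id"], row["date"].isoformat(), row["value"])
```

Filtering on the attributes table first keeps the selection logic in one place, since that table has a single row per timeseries while the timeseries table has one row per value.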