T-Cypher is a temporal query language that extends Cypher with temporal constructs. The extension consists mainly of adding temporal predicates to conventional pattern matching queries with a user-friendly syntax.
Besides the graph variables defined in conventional graph matching queries, T-Cypher includes temporal variables, on which a wide range of temporal predicates can be applied as constraints. A temporal variable refers to the validity time interval of a node, relationship or property value. To express temporal predicates on these variables, T-Cypher supports a list of temporal operators (BEFORE, OVERLAPS, etc.) and functions (add, subtract, etc.). Different types of temporal paths are also supported in T-Cypher. Each type has its own semantics and covers a range of real-world scenarios. A T-Cypher query starts with a time slice clause that prunes the search space of the query to a given set of time intervals or instants.
Let’s consider the toy graph below, which is used throughout the documentation to clarify the key query constructs.
In the following, each temporal query construct will be detailed.
Time slice clause
A T-Cypher query starts with an optional time slice clause (also called the trim clause). It specifies time intervals or instants that are, by default, applied to all the variables of the query, filtering out those that were not valid during any of the intervals or instants defined in the clause.
The RANGE_SLICE token is followed by a list of time intervals such that the query variables must be valid during at least one of these intervals. The LEFT_SLICE and RIGHT_SLICE tokens are followed by a time instant to indicate that variables should exist before or after that time instant, respectively. The SNAPSHOT token is followed by a set of time instants to indicate that all variables of the query must be valid at one or more of these time instants. Note that if the trim clause is not specified, the query is evaluated by default on the most recent snapshot of the graph, which is equivalent to the trim clause ‘SNAPSHOT NOW’.
Besides, the tokens R (relationships), N (nodes) and P (properties) can optionally be used to specify the type of graph variables to which the constraint of the trim clause applies. By default, the trim clause is applied to all the variables of the query.
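The filtering rules of the trim clause can be sketched in Python. This is only an illustration under stated assumptions, not the T-Cypher implementation: the validity of a variable is modelled as a list of closed integer intervals, the helper names are hypothetical, and "before" and "after" are read as strict comparisons.

```python
from typing import List, Tuple

# Assumption: a validity interval is a closed pair (start, end) of chronons.
Interval = Tuple[int, int]


def intersects(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one instant."""
    return a[0] <= b[1] and b[0] <= a[1]


def range_slice(validity: List[Interval], slices: List[Interval]) -> bool:
    """RANGE_SLICE: valid during at least one of the given intervals."""
    return any(intersects(v, s) for v in validity for s in slices)


def left_slice(validity: List[Interval], instant: int) -> bool:
    """LEFT_SLICE: the variable exists before the instant (strict, by assumption)."""
    return any(start < instant for start, _ in validity)


def right_slice(validity: List[Interval], instant: int) -> bool:
    """RIGHT_SLICE: the variable exists after the instant (strict, by assumption)."""
    return any(end > instant for _, end in validity)


def snapshot(validity: List[Interval], instants: List[int]) -> bool:
    """SNAPSHOT: valid at one or more of the given instants."""
    return any(start <= t <= end for start, end in validity for t in instants)
```

For a variable valid during [3, 8], `range_slice` with the intervals [0, 2] and [7, 10] keeps the variable, whereas `snapshot` with the instants 1 and 9 filters it out.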
Three types of temporal relationship patterns, whose illustration is provided above, are included in T-Cypher: continuous, sequential and pairwise-continuous. In non-temporal graphs, we consider two nodes as connected if there exists a path between these nodes. This statement, however, does not hold for temporal graphs, since relationships are assigned a set of time intervals. Hence, the semantics of a temporal path differs depending on the temporal relation between the relationships of the path.
◊ In a Continuous path, the time intervals of all the relationships of the path must intersect, and the time interval of the path is equal to the intersection of the intervals of its relationships.
In the toy graph example, the path composed of r3 valid during [t3, t8], r6 valid during [t4, t5] and r11 valid during [t3, t8] represents a continuous path valid during the time interval [t4, t5], which is the intersection of the time intervals of its relationships.
◊ A Sequential path (also known as a time-respecting path) is defined as a sequence of events (relationships) such that every outgoing relationship of a node occurs after the incoming relationship to the same node. This type of temporal path is applicable to logistics chains. For example, a product starting from a given station can reach another station if there exists a sequential path representing product transfers between the stations.
In the toy graph example, the path composed of r4 valid during [t2, t3], r7 valid during [t4, t5], r9 valid during [t6, t7] and r12 valid during [t9, t11] represents a sequential path valid during [t2, t11] representing the range of the time intervals of the relationships of the path.
◊ Pairwise-continuous paths are a more relaxed variant of continuous paths: only consecutive relationships (the incoming and outgoing relationships of the same node) must overlap.
In the toy graph example, the path composed of r3 valid during [t10, t15], r6 valid during [t12, t15] and r11 valid during [t14, t15] represents a pairwise-continuous path valid during the time interval [t10, t15], which is the range of the time intervals of its relationships.
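The three path semantics can be compared with a short Python sketch. It is an illustration only, under these assumptions: each relationship contributes a single closed interval (start, end), the instant ti is encoded as the integer i, a path is a non-empty list of such intervals in traversal order, and "occurred after" is read as "starts strictly after the previous relationship ends". The function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: one closed interval (start, end) per relationship of the path.
Interval = Tuple[int, int]


def continuous(path: List[Interval]) -> Optional[Interval]:
    """Interval of a continuous path: the intersection of all intervals."""
    start = max(s for s, _ in path)
    end = min(e for _, e in path)
    return (start, end) if start <= end else None


def sequential(path: List[Interval]) -> Optional[Interval]:
    """Interval of a sequential path: each relationship follows the previous one."""
    for (_, prev_end), (next_start, _) in zip(path, path[1:]):
        if next_start <= prev_end:  # strict ordering is an assumption
            return None
    return (path[0][0], path[-1][1])


def pairwise_continuous(path: List[Interval]) -> Optional[Interval]:
    """Interval of a pairwise-continuous path: consecutive intervals intersect."""
    for a, b in zip(path, path[1:]):
        if not (a[0] <= b[1] and b[0] <= a[1]):
            return None
    return (min(s for s, _ in path), max(e for _, e in path))


# The three toy graph paths described above, with ti encoded as i.
print(continuous([(3, 8), (4, 5), (3, 8)]))               # (4, 5)
print(sequential([(2, 3), (4, 5), (6, 7), (9, 11)]))      # (2, 11)
print(pairwise_continuous([(10, 15), (12, 15), (14, 15)]))  # (10, 15)
```

A path such as [(1, 4), (3, 6), (5, 8)] shows why the pairwise variant is more relaxed: consecutive intervals intersect, so it is pairwise-continuous over (1, 8), but the three intervals share no common instant, so it is not continuous.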
Besides, a delta type and a duration can be optionally added. In continuous paths, the duration refers to the duration of the path. In sequential paths, the duration refers to the duration between the ending and starting time instants of every pair of consecutive relationships. In pairwise-continuous paths, the duration refers to that of the intersection between consecutive relationships. The delta type indicates whether the specified duration refers to the minimum or maximum delta duration. By default, the actual duration is considered.
Temporal operators and functions
A list of temporal functions and operators is also provided, permitting the expression of a wide range of temporal predicates on the query variables. For instance, the expression a@T BEFORE b@T AND ELAPSED_TIME(a@T, b@T) > 2H uses the operator BEFORE and the function ELAPSED_TIME to indicate that a ended more than two hours before b started.
The temporal operators defined in T-Cypher include Allen’s temporal relations.
- i BEFORE j evaluates to true if i ends before j starts.
- i AFTER j evaluates to true if i starts after j ends.
- i OVERLAPS j evaluates to true if i starts before j starts, j starts before i ends, and i ends before j ends.
- i STARTS j evaluates to true if i and j start at the same time instant and i ends before j.
- i DURING j evaluates to true if i starts after j starts and ends before j ends.
- i FINISHES j evaluates to true if i and j end at the same time instant and i starts after the start time instant of j.
- i EQUALS j evaluates to true if i and j start at the same time instant and end at the same time instant.
- i MEETS j evaluates to true if j starts at the finishing time instant of i.
- i MET BY j evaluates to true if i starts when j finishes.
- i OVERLAPPED BY j evaluates to true if i starts after j starts, i starts before j ends, and i ends after j ends.
- i STARTED BY j evaluates to true if i and j start at the same time instant and j ends before i.
- i CONTAINS j evaluates to true if i starts before the start of j and finishes after the end of j.
- i FINISHED BY j evaluates to true if i and j end at the same time instant and j starts after i.
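The thirteen relations can be written as predicates over interval endpoints. The Python sketch below is an illustration only, assuming intervals are pairs (start, end) with start strictly before end. The names `RELATIONS` and `relation` are hypothetical and not part of T-Cypher.

```python
from typing import Callable, Dict, Tuple

# Assumption: an interval is a pair (start, end) with start < end.
Interval = Tuple[int, int]

RELATIONS: Dict[str, Callable[[Interval, Interval], bool]] = {
    "BEFORE": lambda i, j: i[1] < j[0],
    "AFTER": lambda i, j: j[1] < i[0],
    "MEETS": lambda i, j: i[1] == j[0],
    "MET BY": lambda i, j: j[1] == i[0],
    "OVERLAPS": lambda i, j: i[0] < j[0] < i[1] < j[1],
    "OVERLAPPED BY": lambda i, j: j[0] < i[0] < j[1] < i[1],
    "STARTS": lambda i, j: i[0] == j[0] and i[1] < j[1],
    "STARTED BY": lambda i, j: i[0] == j[0] and j[1] < i[1],
    "DURING": lambda i, j: j[0] < i[0] and i[1] < j[1],
    "CONTAINS": lambda i, j: i[0] < j[0] and j[1] < i[1],
    "FINISHES": lambda i, j: i[1] == j[1] and j[0] < i[0],
    "FINISHED BY": lambda i, j: i[1] == j[1] and i[0] < j[0],
    "EQUALS": lambda i, j: i == j,
}


def relation(i: Interval, j: Interval) -> str:
    """Return the name of the single relation that holds between i and j."""
    matches = [name for name, holds in RELATIONS.items() if holds(i, j)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one relation, got {matches}")
    return matches[0]


print(relation((1, 4), (2, 6)))  # OVERLAPS
print(relation((2, 3), (1, 5)))  # DURING
```

With start strictly before end, exactly one of the thirteen predicates holds for any pair of intervals, which is why `relation` can return a single name.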
In the following, the semantics of each temporal function is provided.
♦ ELAPSED_TIME: Returns the duration between the ending instant of the first input time interval and the starting instant of the second one. For example:
ELAPSED_TIME([2019-01-01T07:00:00Z, 2019-01-01T08:00:00Z], [2019-01-01T10:00:00Z, 2019-01-01T11:00:00Z]) returns 7200000 ms.
Note that we consider a granularity of milliseconds.
♦ DURATION: Returns the difference between the starting and ending time instants of the input time interval. For example:
DURATION([2019-01-01T07:00:00Z, 2019-01-01T09:00:00Z]) returns 7200000 ms.
♦ START: Returns the starting time of a time interval.
♦ END: Returns the ending time of a time interval.
♦ ADD: Takes a time instant and a duration and returns the time interval that starts at that time instant and ends at the time instant plus the input duration. In the example below the input is a time interval, whose ending instant is extended by the duration:
ADD([2019-01-01T07:00:00Z, 2019-01-01T09:00:00Z], 2H) returns [2019-01-01T07:00:00Z, 2019-01-01T11:00:00Z]
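♦ SUB: Takes a time instant and a duration and returns the time interval that ends at that time instant and starts at the time instant minus the input duration. In the example below the input is a time interval, whose starting instant is moved back by the duration: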
SUB([2019-01-01T07:00:00Z, 2019-01-01T09:00:00Z], 2H) returns [2019-01-01T05:00:00Z, 2019-01-01T09:00:00Z]
♦ RANGE: Returns a time interval that covers the entire time range of input time intervals.
RANGE([2019-01-01T04:00:00Z, 2019-01-01T07:00:00Z], [2019-01-01T05:00:00Z, 2019-01-01T08:00:00Z], [2019-01-01T06:00:00Z, 2019-01-01T09:00:00Z]) returns [2019-01-01T04:00:00Z, 2019-01-01T09:00:00Z].
♦ INTERSECTION: Returns a time interval representing the intersection between input time intervals.
INTERSECTION([2019-01-01T04:00:00Z, 2019-01-01T07:00:00Z], [2019-01-01T05:00:00Z, 2019-01-01T08:00:00Z], [2019-01-01T06:00:00Z, 2019-01-01T09:00:00Z]) returns [2019-01-01T06:00:00Z, 2019-01-01T07:00:00Z].
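The worked examples above can be reproduced with a small Python sketch. It is an illustration only, under these assumptions: intervals are pairs of timezone-aware `datetime` values, durations are `timedelta` values, results are reported in milliseconds, and `add` and `sub` take an interval as in the worked examples. All function names are hypothetical stand-ins for the T-Cypher functions.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

Interval = Tuple[datetime, datetime]
MS = timedelta(milliseconds=1)  # assumed granularity: milliseconds


def interval(start: str, end: str) -> Interval:
    """Build an interval from two ISO-8601 strings ending in 'Z' (UTC)."""
    parse = lambda text: datetime.fromisoformat(text.replace("Z", "+00:00"))
    return (parse(start), parse(end))


def elapsed_time(a: Interval, b: Interval) -> int:
    """Milliseconds between the end of a and the start of b."""
    return (b[0] - a[1]) // MS


def duration(a: Interval) -> int:
    """Milliseconds between the start and the end of a."""
    return (a[1] - a[0]) // MS


def add(a: Interval, d: timedelta) -> Interval:
    """Extend the end of a by d (interval input, as in the worked example)."""
    return (a[0], a[1] + d)


def sub(a: Interval, d: timedelta) -> Interval:
    """Move the start of a back by d (interval input, as in the worked example)."""
    return (a[0] - d, a[1])


def range_of(*intervals: Interval) -> Interval:
    """Smallest interval covering all the input intervals."""
    return (min(s for s, _ in intervals), max(e for _, e in intervals))


def intersection(*intervals: Interval) -> Optional[Interval]:
    """Common part of all the input intervals, or None if there is none."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


a = interval("2019-01-01T07:00:00Z", "2019-01-01T08:00:00Z")
b = interval("2019-01-01T10:00:00Z", "2019-01-01T11:00:00Z")
print(elapsed_time(a, b))  # 7200000
```

Returning `None` for an empty intersection is a design choice of this sketch. The source does not state what INTERSECTION returns when the input intervals do not intersect.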
Temporal values and variables
Temporal values represent either a time interval, a time instant or a duration.
♦ Time instants are expressed either as calendar date times or as the total number of chronons since epoch using the token EPOCH. The time instant ‘NOW’ refers to the system’s current time. Assuming that we define a time domain with a granularity of milliseconds and a default time zone GMT, then ‘2021-03-08T08:00:00Z’ is interpreted as a time instant whose value is equal to 1615190400000. Note that for calendar date times we use the ISO-8601 format.
♦ Time intervals are defined by the starting and ending time instants.
♦ A duration is expressed as a total number of chronons at the system’s granularity. For example, ‘2 H’ refers to a duration of two hours, which is internally converted to the system’s granularity (7200000 if the system’s granularity is milliseconds).
Temporal variables, which refer to the time interval of a node, a relationship or a property value, are expressed with the token @T. For instance, a@T refers to the time interval of the variable a.
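The conversions described above can be checked with a short Python sketch. It is an illustration only, assuming a granularity of milliseconds and UTC as the time zone. The helper names and the unit table are hypothetical: only the unit H appears in this documentation, and S and D are added here purely as examples.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
CHRONON = timedelta(milliseconds=1)  # assumed granularity: milliseconds

# Only "H" comes from the documentation; "S" and "D" are hypothetical units.
UNIT_TO_MS = {"H": 3_600_000, "S": 1_000, "D": 86_400_000}


def to_chronons(iso_text: str) -> int:
    """Convert an ISO-8601 UTC date time to chronons since epoch."""
    moment = datetime.fromisoformat(iso_text.replace("Z", "+00:00"))
    return (moment - EPOCH) // CHRONON


def parse_duration(text: str) -> int:
    """Convert a duration such as '2 H' or '2H' to chronons."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2).upper() not in UNIT_TO_MS:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * UNIT_TO_MS[match.group(2).upper()]


print(to_chronons("2021-03-08T08:00:00Z"))  # 1615190400000
print(parse_duration("2 H"))                # 7200000
```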