An examination of the strengths and weaknesses of SQL, and why query planners exist, by way of comparing TimescaleDB and InfluxDB.

Generally in the world of database query languages, there have been two extremes: full SQL support on one end, and completely custom languages (sometimes known as “NoSQL”) on the other. With time-series databases, the differences between these languages can be seen by comparing TimescaleDB and InfluxDB.

From the beginning, TimescaleDB has firmly existed at the SQL end of the spectrum, fully embracing the language from day one, and later extending it to simplify time-series analysis. This has meant a zero learning curve for new users, and has allowed TimescaleDB to inherit the entire SQL ecosystem of third-party tools, connectors, and visualization options, which is larger than that of any other time-series database.

In contrast, InfluxDB began with a “SQL-like” query language (called InfluxQL), placing it in the middle of the spectrum, and has recently made a marked move towards the “custom” end with its new Flux query language. This has allowed InfluxDB to create a new query language that its creators would argue overcomes some SQL shortcomings that they experienced. (Read the Flux announcement and the Hacker News reaction.)

So how do these two query languages compare?

To analyze the differences between SQL and Flux, let’s first dig into why the InfluxDB team created Flux, as opposed to embracing SQL.

According to InfluxDB's post, Flux was created primarily for two reasons:

- Query support: They concluded SQL could not easily handle the types of queries they wanted to run on time-series data.
- Flow-based functional processing: They believed that a time-series query language needed a flow-based functional model, as opposed to the relational algebra model in SQL.

(There’s also a third reason stated in the post: that SQL was invented in the 1970s. This point is odd, especially considering that other foundational technologies were also born in that same era: the relational database, the microprocessor, and the Internet. These technologies, like SQL, have evolved significantly since their initial invention, while other technologies have not. Also, common modern languages such as English were “invented” hundreds of years ago, and have also evolved over time, while other languages have fallen out of use. So we believe the decade in which something was invented is irrelevant to this discussion. With that in mind, we’ll return to the main discussion.)

Here’s the canonical example InfluxDB uses in their post, which computes an exponential moving average (EMA), to support their claim:

    ...
      |> filter(fn: (r) => r._measurement == "foo")
    ...

The post authors then struggle to write this query in SQL, and instead produce a fairly complex query that computes a simple moving average:

    ...
           avg(temp) over (partition by group_nr order by time_read) as rolling_avg
    ...
             id - row_number() over (partition by interval_group order by time_read) as group_nr
    ...
        'epoch'::timestamp + '900 seconds'::interval * (extract(epoch from time_read)::int4 / 900) as interval_group,
    ...

Yes, this SQL example is quite verbose, and only computes part of the desired output (i.e., it only computes a moving average, not an exponential moving average). As a result, their argument is that common time-series related queries (such as computing an EMA) are quite complex in SQL.

We find this argument incorrect for a key reason. The only way that Flux could support this function is if its creators wrote their own exponentialMovingAverage() function. One can similarly write the same function in modern SQL, by creating a custom aggregate.

Creating a custom aggregate is supported in most modern SQL implementations, including PostgreSQL. The syntax in PostgreSQL is the following (via StackOverflow):

SQL: Creating the custom aggregate

    -- State-transition function: called once per row with the running state.
    CREATE OR REPLACE FUNCTION exponential_moving_average_sfunc
    (state numeric, next_value numeric, alpha numeric)
    RETURNS numeric LANGUAGE SQL AS $$
    SELECT CASE WHEN state IS NULL THEN next_value
                ELSE alpha * next_value + (1-alpha) * state
           END
    $$;

    CREATE AGGREGATE exponential_moving_average(numeric, numeric)
    (sfunc = exponential_moving_average_sfunc, stype = numeric);

(This is a simple SQL aggregate window function that computes the exponential moving average. It is an aggregate function because it takes multiple input rows, and is a window function because it computes a value for each input row. By comparison, computing a regular SUM would be an aggregate but not a window function, while comparing the delta between two rows would be a window function but not an aggregate.)

Then querying for the exponential moving average of the entire dataset can be rather simple:

    ...
           exponential_moving_average(value, 0.5) OVER (ORDER BY time)
    ...
    WHERE measurement = 'foo' and time > now() - interval '1 hour'
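
To make the aggregate-versus-window distinction concrete, the state-transition logic of the custom aggregate discussed in this post can be traced outside the database. The following is a minimal Python sketch, not code from the post: the names `ema_sfunc`, `ema_window`, and `ema_aggregate` and the sample readings are illustrative, and seeding the state with the first value is an assumption (a common EMA convention).

```python
from functools import reduce
from typing import Iterable, List, Optional


def ema_sfunc(state: Optional[float], next_value: float, alpha: float) -> float:
    """State transition: the first row seeds the state (assumed convention),
    every later row blends the new value into the running state."""
    if state is None:
        return next_value
    return alpha * next_value + (1 - alpha) * state


def ema_window(values: Iterable[float], alpha: float) -> List[float]:
    """Window-style use: emit the running state once per input row,
    analogous to applying the aggregate OVER (ORDER BY time)."""
    out: List[float] = []
    state: Optional[float] = None
    for value in values:
        state = ema_sfunc(state, value, alpha)
        out.append(state)
    return out


def ema_aggregate(values: Iterable[float], alpha: float) -> Optional[float]:
    """Plain-aggregate use: fold all rows and keep only the final state.
    Returns None for an empty input, like an aggregate over no rows."""
    return reduce(lambda state, v: ema_sfunc(state, v, alpha), values, None)


if __name__ == "__main__":
    readings = [10.0, 20.0, 30.0]  # hypothetical sample values, ordered by time
    print(ema_window(readings, 0.5))     # [10.0, 15.0, 22.5]
    print(ema_aggregate(readings, 0.5))  # 22.5
```

The same transition function serves both uses: folding it over all rows gives a single aggregate value, while recording the state after each row gives one output per input row, which is what the window form returns.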