- 28 Aug, 2019 25 commits
-
-
Ante Kresic authored
Tags used to be string values only, but with recent changes their types are correctly inserted into the schema. This commit does the same for jsonb tagsets, using JSON numbers for numeric tag values and strings for everything else.
-
Ante Kresic authored
This change should put the query on equal footing with the TimescaleDB version of the query: time interval handling was previously pushed to the query engine, but it is now handled during query generation.
-
Ante Kresic authored
This is a second pass at optimizing the queries; it mainly consists of filtering out trucks with NULL-valued names for better performance.
-
Ante Kresic authored
The time format used for Influx was the same as RFC3339, so rather than redefining that format ourselves we just use the provided constant from the `time` package.
-
Blagoj Atanasovski authored
The GET method allows only read-only queries to be issued; for the IoT use case we need to issue a SELECT INTO query. The POST method of the InfluxDB API accepts both write and read-only queries.
-
Ante Kresic authored
Added first pass of Influx queries to query generator with accompanying unit tests.
-
Blagoj Atanasovski authored
-
Blagoj Atanasovski authored
-
Blagoj Atanasovski authored
Tests are written to cover the new code and the file format for ClickHouse and Postgres/TimescaleDB is updated to have the tag types in the header
-
Blagoj Atanasovski authored
Changes are done to the Point interface so that tags are no longer just strings, and serializers for the different databases are modified to serialize the tags depending on their type. For TimescaleDB and Postgres it doesn't matter which type it is; for InfluxDB the tag is serialized as a field.
-
Ante Kresic authored
NormFloat64 returns a normally distributed value in the range [-math.MaxFloat64, +math.MaxFloat64], which means the current load value could be negative when it should not be. We change that call to Float64, which returns a value in the range [0.0, 1.0).
-
Ante Kresic authored
This change fixes the aforementioned flags, whose implementations were commented out and not working properly.
-
Ante Kresic authored
The previous description referred to hosts, which are relevant only for the devops use case. With new use cases added, the description had to be changed to cover them as well.
-
Blagoj Atanasovski authored
A loading worker can be configured to have a minimum interval between two batches being inserted. Configuration is optional; if not configured, batches are inserted as soon as possible. Intervals are expressed in seconds, and can also be configured as a range of sleeping intervals. The sleep regulator is in charge of putting the calling goroutine to sleep.
-
Ante Kresic authored
Previous implementation used the value `NULL` when inserting missing values into jsonb column. This commit fixes that to the correct value `null`.
-
Ante Kresic authored
Adding the necessary boilerplate for supporting the queries of the new IoT use case and implementing the first versions of them. A single pass of optimizations has been done but more optimization passes are needed by a TimescaleDB query specialist.
-
Rob Kiefer authored
Since strings.ReplaceAll was only added in Go 1.12, it is not usable on all the Go versions we currently support (1.9+). Even if we were to drop some of the older versions we support, like 1.9 or 1.10, it still would not compile on versions less than a year old. So for now, we'll use the old way.
-
Blagoj Atanasovski authored
With the introduction of the possibility of missing values in the IoT use case, the serializer for InfluxDB needs to be made aware of them and skip those tags and fields in order to generate valid Line Protocol inserts.
-
Ante Kresic authored
The fuel state range changed from 0.0 - 100.0 to 0.0 - 1.0 to better reflect the real world (car gauges report from Empty to Full). We also add refueling of the trucks when the state drops below a minimum value.
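The refueling rule can be sketched in a few lines (the threshold constant and function name are illustrative, not the actual simulator code): fuel is a fraction in [0.0, 1.0], and dropping below the minimum refills the tank to Full.

```go
package main

import "fmt"

// minFuel is an illustrative threshold below which a truck refuels.
const minFuel = 0.1

// nextFuel advances the fuel state by the consumed amount, refilling to
// Full (1.0) once the gauge would drop below the minimum.
func nextFuel(current, consumed float64) float64 {
	f := current - consumed
	if f < minFuel {
		return 1.0
	}
	return f
}

func main() {
	fmt.Println(nextFuel(0.5, 0.2))  // normal consumption
	fmt.Println(nextFuel(0.15, 0.1)) // below minimum: refilled to 1
}
```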
-
Ante Kresic authored
The previous implementation assumed the devops use case by default. Here we refactor to support multiple use cases and add the initial IoT use case query generator for the TimescaleDB database.
-
Ante Kresic authored
-
Ante Kresic authored
IoT data can contain empty field and tag values. We need to support that in the data loaders to be able to load the data correctly into the database, in this case TimescaleDB. We also add some tests to verify that empty field and tag values are stored correctly.
-
Ante Kresic authored
IoT data sets contain a lot of irregularities like lots of gaps, out of order entries, missing entries, zero values etc. This change updates the data generator so it can create data sets which contain these features in a deterministic way.
-
Ante Kresic authored
This first version of the data generator behaves similarly as the devops use case and does not contain any data irregularity features which will be added in a future commit.
-
Ante Kresic authored
This improves code quality by extracting the common parts of the logic which can be reused for multiple use cases. First step in to creating data generators for the next use case.
-
- 26 Jul, 2019 1 commit
-
-
Ante Kresic authored
When using the `pgx` sql driver, running the query does not wait for a response from the server. In order to verify that the query has returned complete results, we must call `Rows.Next()` until it returns false, meaning we have fetched all the rows from the query result. Note that this behavior is different from the current implementation of the `pq` driver.
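The draining pattern is simple; this sketch abstracts the relevant slice of the Rows API behind an interface and uses a fake so it runs without a database (the names here are illustrative, not pgx's own types):

```go
package main

import "fmt"

// rowIterator captures the part of a Rows result set the pattern needs.
// With pgx, executing a query does not itself wait for the full result, so
// the caller must call Next() until it returns false to be certain every
// row has been received from the server.
type rowIterator interface {
	Next() bool
	Err() error
}

// drain pulls all rows from the iterator and reports how many were seen.
func drain(rows rowIterator) (int, error) {
	n := 0
	for rows.Next() {
		n++
	}
	return n, rows.Err()
}

// fakeRows stands in for a real result set so the sketch is runnable.
type fakeRows struct{ left int }

func (f *fakeRows) Next() bool { f.left--; return f.left >= 0 }
func (f *fakeRows) Err() error { return nil }

func main() {
	n, _ := drain(&fakeRows{left: 3})
	fmt.Println(n)
}
```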
-
- 15 Jul, 2019 1 commit
-
-
Stephen Polcyn authored
Previously, the -n flag sent its data to the "max-queries" variable, which resulted in an unknown-variable error when running the script, because the Python variable used to generate the run script is 'limit' (see line 163). "max-queries" is only applicable as a flag for the tsbs_run_queries script, i.e., "--max-queries=###".
-
- 17 Jun, 2019 1 commit
-
-
Ruslan Kovalov authored
This includes support for data generation and querying for the devops use case.
-
- 28 May, 2019 1 commit
-
-
Blagoj Atanasovski authored
The statProcessor responsible for gathering statistics when executing queries was built as a struct. This commit changes it to an interface to make the BenchmarkRunner code easier to test. This commit also adds some unit tests for the benchmark runner that check whether proper argument checks are done, and whether proper init happens when the Run method is called.
-
- 24 May, 2019 1 commit
-
-
Ante Kresic authored
Covering query generation functions for Influx, ClickHouse and SiriDB databases. Tests are covering basic pre-generated outputs and provide visual sanity checks. More robust tests are left as a future task.
-
- 22 May, 2019 3 commits
-
-
Rob Kiefer authored
This interface is not tied to the devops use case in any way, so its naming was a misnomer. It is actually generic and can be used for any use case, so this renaming reflects that.
-
Rob Kiefer authored
Previously the devops use case generation code used a call to log.Fatalf when something went wrong. This makes it awkward to test error conditions when generating queries from other packages, since we need a way to (a) replace the unexported call to log.Fatalf and (b) prevent the runtime from actually quitting. It is better for the library to actually return errors on calls that can fail, rather than either fataling or panicking. Now other packages can handle the errors themselves and also test error conditions in their packages as well. This refactor was pruned a bit to bubble the 'panic' up one level for now. When the actual generation code encounters the error during normal execution, it will panic. But these are easier to test for and don't require adding hooks to replace the 'fatal' path in the original package.
-
Rob Kiefer authored
Query generation and Cassandra's query running both used a type called TimeInterval that did roughly the same thing. This change combines the two into one type that can be used from the utils package in internal/. This improves code reuse and keeps the two representations in sync, and also increases the testability of the code.
-
- 20 May, 2019 1 commit
-
-
Ante Kresic authored
For now the tests are mainly matching the output against pre-generated/known outputs to ensure we have some coverage. A more robust checking, e.g., making sure the semantics of the query are actually correct, is a future task.
-
- 25 Apr, 2019 1 commit
-
-
Lee Hampton authored
This fixes a bug where the PostCreateDB function would exit early when the user set --do-create-db=false and/or --create-metrics-table=False. This early exit caused TSBS to skip the updating of some global caches, which broke assumptions in other parts of the codebase. This commit also refactors the PostCreateDB function to split the parsing of columns and the potential creation of tables and indexes into separate functions. This makes it easier to test the functions in isolation and cleaner to create the conditional create-table logic that is at the heart of this bug. While this does add tests to the parsing function, the create tables/index function remains untested. This is left for a later PR that will hopefully clean up global state and provide a more comprehensive framework for testing IO.
-
- 18 Apr, 2019 1 commit
-
-
Rob Kiefer authored
Unlike libpq/sqlx, pgx expects JSON/B fields in the copy command to be in the 'native' format, which is a map[string]interface{}, not a string in valid JSON format. Without this change, the copy would fail with "ERROR: unsupported jsonb version number 123".
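In practice the fix means decoding the JSON string into a Go map before handing it to the copy; a minimal sketch (the helper name is illustrative):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toNativeJSON converts a JSON-encoded string into the decoded Go value
// (map[string]interface{}) that pgx expects for a JSONB column in a copy;
// passing the raw string is what triggered the
// "unsupported jsonb version number" error.
func toNativeJSON(s string) (map[string]interface{}, error) {
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(s), &m); err != nil {
		return nil, err
	}
	return m, nil
}

func main() {
	m, err := toNativeJSON(`{"hostname":"host_0","region":"us-west-1"}`)
	fmt.Println(m["hostname"], err)
}
```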
-
- 09 Apr, 2019 1 commit
-
-
Rob Kiefer authored
This PR continues on the work in the previous one that changed tsbs_generate_data to use a new internal/inputs package. This PR adds a new Generator type for query generation called QueryGenerator. Now that these two generators share some common code, they both become much more robust and easier to test and manage. Previously tsbs_generate_queries had no test coverage, but with this change it will actually have quite high coverage. There are still some rough spots with this refactor. In particular, how the useCaseMatrix is handled needs some more thought, especially if we are going to add more use cases going forward. Additionally, the database specific flags like TimescaleUseJSON could probably be handled in a cleaner way as well.
-
- 04 Apr, 2019 1 commit
-
-
Rob Kiefer authored
For a long time, our two generation binaries -- tsbs_generate_data and tsbs_generate_queries -- have shared (roughly) a fair bit of code, especially when it comes to flags and validation. However, they were never truly in sync, and combining them has been a long-wanted to-do. Similarly, to enable better tooling around TSBS, it would be beneficial if more of its functionality were exposed as a library instead of a CLI that needs to be called.

To those ends, this PR is a first step in addressing both. It introduces the internal/inputs package, which can eventually be moved to pkg/inputs when we are ready for other things to consume its API. This package will contain the structs, interfaces, and functions for generating 'input' to other TSBS tools. For now, that only includes generating data files (for tsbs_load_* binaries) and query files (for tsbs_run_queries_* binaries). This PR starts by introducing these building blocks and converting tsbs_generate_data to use it.

The idea is that each type of input (e.g., data, queries) is handled by a Generator, which is customized by a GeneratorConfig. The config contains fields such as the PRNG seed, number of items to generate, etc., which are used by the Generator to control the output. These GeneratorConfigs come with a means of easily adding their fields to a flag.FlagSet, making them work well with CLIs while also not restricting their use to only CLIs. Once configured, a GeneratorConfig is passed to a Generator, which then produces the output.

This design has a few other nice features to help clean up TSBS. One, it uses an approach of bubbling up errors and passing them back to the caller, allowing for more graceful error handling. CLIs can output them to the console, while other programs using the library can pass them to another error handling system if they desire. Two, Generators should be designed with an Out field that allows the caller to point to any io.Writer it wants -- not just the console or a file.

The next step will be to convert tsbs_generate_queries to use this as well, which will be done in a follow-up PR.
-
- 28 Mar, 2019 1 commit
-
-
niksa authored
Using the binary format when talking to TimescaleDB means less data being sent back and forth. A config option is added to force the TEXT format if needed (binary is the default). The PGX driver is used for binary and the PQ driver for TEXT. Based on some benchmarks, binary should increase write throughput by 5-10% and result in about 35% faster queries.
-
- 20 Mar, 2019 1 commit
-
-
Ante Kresic authored
on the longer side but it makes a clear point not to confuse it with a similar flag used for specifying the benchmark database name.
-