ETL end-to-end testing with Docker, NoSQL, RDBMS and BigQuery.

End-to-end testing

ETL

Docker

ETL application

curl -d "@transfer.json" -X POST http://localhost:8080/v1/api/transfer
{
  "Mode": "insert",
  "Source": {
    "DriverName": "mysql", ... other connection details
    "Query": "SELECT * FROM source_table"
  },
  "Dest": {
    "DriverName": "bigquery", ... other connection details
    "Table": "target_table"
  }
}
curl http://127.0.0.1:8080/v1/api/tasks
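In a test you would typically assemble the transfer request programmatically rather than hand-edit transfer.json. A minimal Python sketch, assuming only the payload shape shown above (the helper name is mine, and the elided connection details would come from your own configuration):

```python
import json

def build_transfer_request(source_query: str, dest_table: str) -> dict:
    """Builds a request body matching the /v1/api/transfer payload above.
    Connection details (hosts, credentials) are intentionally omitted."""
    return {
        "Mode": "insert",
        "Source": {
            "DriverName": "mysql",
            "Query": source_query,
        },
        "Dest": {
            "DriverName": "bigquery",
            "Table": dest_table,
        },
    }

request = build_transfer_request("SELECT * FROM source_table", "target_table")
# Serialized form, suitable for: curl -d "@transfer.json" -X POST ...
payload = json.dumps(request, indent=2)
```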

System and application environment setup.

Data store schema and static data setup.

Application build and deployment.

  • Building app
cd /tmp/ && git clone https://github.com/adrianwit/dstransfer 
cd dstransfer
docker build -t adrianwit/dstransfer:0.1.0 .
  • Starting app
cd /tmp/dstransfer/config/ && docker-compose up -d

Test plan

End-to-end testing implementation

Testing project

https://endly-external.appspot.com/
https://github.com/adrianwit/dstransfer-e2e/tree/master/initial

The first run

cd ~/Download/dstransfer/e2e
endly -t='init,test'
docker ps
  • Aerospike
docker exec -it endly_db1  aql
show sets
SELECT * FROM dummy_type
SELECT * FROM db2.dummy_type
  • MySQL
docker exec -it endly_db3 mysql -uroot -p
USE db3;
SELECT * FROM dummy_type;
  • Application status
curl http://127.0.0.1:8080/status

Initial regression workflow

  • Each use case stores its setup, expected data, and test instructions in a dedicated folder under regression/use_cases/001_xxxxx
https://github.com/adrianwit/dstransfer-e2e/tree/master/initial/dstransfer/e2e/regression
  • regression.csv defines the regression test workflow using the neatly format.
  1. Registers a data store driver, sets the database IP by inspecting the corresponding docker container, and optionally sets the initial data state: data.yaml
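The "set database IP by inspecting the corresponding docker container" step boils down to reading the container's network settings, which endly's workflow does for you. As a hedged illustration of the underlying mechanism, a Python sketch that parses `docker inspect` output (the function name and sample JSON are mine; the `NetworkSettings.Networks.*.IPAddress` structure is standard docker inspect output):

```python
import json
import subprocess

def container_ip(name: str, inspect_json: str = None) -> str:
    """Returns the first network IP of a docker container.
    When inspect_json is None, shells out to `docker inspect <name>`."""
    if inspect_json is None:
        inspect_json = subprocess.check_output(["docker", "inspect", name], text=True)
    info = json.loads(inspect_json)[0]
    networks = info["NetworkSettings"]["Networks"]
    # Take the IP from whichever network the container is attached to.
    return next(iter(networks.values()))["IPAddress"]

# Exercised against a captured inspect snippet instead of a live daemon:
sample = json.dumps(
    [{"NetworkSettings": {"Networks": {"bridge": {"IPAddress": "172.17.0.3"}}}}]
)
ip = container_ip("endly_db3", sample)
```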

Test design

{
  "id": 1,
  "name": "user 1",
  "visited": "2018-01-15 08:02:23 UTC",
  "perf": {
    "rank": 100,
    "score": 6.5
  },
  "quiz": {
    "1": {
      "id": 1,
      "score": 10,
      "taken": "2018-01-10 16:02:01 UTC"
    },
    "2": {
      "id": 2,
      "score": 3,
      "taken": "2018-01-15 08:02:23 UTC"
    }
  }
}
CREATE OR REPLACE TABLE users (
  id INT64 NOT NULL,
  name STRING NOT NULL,
  visited TIMESTAMP,
  perf STRUCT<
    rank INT64,
    score FLOAT64
  >,
  quiz ARRAY<STRUCT<
    key STRING,
    value STRUCT<
      id INT64,
      score INT64,
      taken TIMESTAMP
    >
  >>
);
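BigQuery has no native map type, so the document's quiz object is modeled above as a repeated key/value STRUCT. The conversion is mechanical; a small Python sketch (the helper name is mine, not part of dstransfer):

```python
def map_to_kv_array(quiz: dict) -> list:
    """Converts a JSON map such as {"1": {...}, "2": {...}} into the
    ARRAY<STRUCT<key, value>> shape the BigQuery schema expects."""
    return [{"key": k, "value": v} for k, v in sorted(quiz.items())]

quiz = {
    "1": {"id": 1, "score": 10, "taken": "2018-01-10 16:02:01 UTC"},
    "2": {"id": 2, "score": 3, "taken": "2018-01-15 08:02:23 UTC"},
}
rows = map_to_kv_array(quiz)
```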
CREATE TABLE users (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  perf_rank INT,
  perf_score DECIMAL(7, 2),
  quiz TEXT,
  visited TIMESTAMP
);
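The relational target has no nested types, so the perf struct is flattened into perf_rank/perf_score columns and quiz is serialized as JSON text. A minimal sketch of that document-to-row flattening, assuming the document shape shown earlier (the function name is illustrative):

```python
import json

def flatten_user(doc: dict) -> dict:
    """Flattens the nested user document into the MySQL users row:
    perf.* becomes perf_rank/perf_score, quiz is stored as JSON text."""
    perf = doc.get("perf", {})
    return {
        "id": doc["id"],
        "name": doc["name"],
        "visited": doc.get("visited"),
        "perf_rank": perf.get("rank"),
        "perf_score": perf.get("score"),
        "quiz": json.dumps(doc.get("quiz", {})),
    }

row = flatten_user({
    "id": 1, "name": "user 1", "visited": "2018-01-15 08:02:23 UTC",
    "perf": {"rank": 100, "score": 6.5},
    "quiz": {"1": {"id": 1, "score": 10, "taken": "2018-01-10 16:02:01 UTC"}},
})
```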
{
  "Method": "post",
  "Request": "$request",
  "URL": "http://localhost:8080/v1/api/transfer",
  "Expect": {
    "Status": "ok"
  }
}
{
  "Mode": "insert",
  "Source": {
    "DriverName": "bigquery",
    "Query": "SELECT * FROM users"
  },
  "Dest": {
    "DriverName": "mysql",
    "Table": "users"
  }
}
{
  "Method": "get",
  "URL": "http://localhost:8080/v1/api/task/${taskId}",
  "Expect": {
    "Status": "done"
  },
  "Variables": [
    {
      "Name": "status",
      "From": "Status"
    }
  ],
  "Repeat": 10,
  "SleepTimeMs": 3000,
  "Exit": "$status:!/running/"
}
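The Repeat/SleepTimeMs/Exit trio amounts to a poll-until-done loop over the task endpoint. An equivalent sketch in Python, with the HTTP fetch injected so the logic can be exercised without a running service (the function and its parameters are mine, mirroring the semantics above):

```python
import re
import time

def poll_task(fetch, repeat=10, sleep_ms=3000, running_pattern=r"running"):
    """Calls fetch() up to `repeat` times, sleeping sleep_ms between calls,
    and stops early once the returned status no longer matches
    running_pattern (mirroring the Exit expression "$status:!/running/")."""
    status = None
    for _ in range(repeat):
        status = fetch()["Status"]
        if not re.search(running_pattern, status):
            break
        time.sleep(sleep_ms / 1000.0)
    return status

# Stubbed task endpoint: reports "running" twice, then "done".
responses = iter([{"Status": "running"}, {"Status": "running"}, {"Status": "done"}])
final = poll_task(lambda: next(responses), sleep_ms=0)
```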

Use cases

https://github.com/adrianwit/dstransfer-e2e/tree/master/final/dstransfer/e2e/regression/use_cases
  1. Test a basic transfer within the same data store type
  2. Stress test the transfer with 40,000 records
  3. Transfer data from NoSQL to BigQuery with the following SQL:
SELECT id, name, visited, perf, ARRAY(quiz) AS quiz FROM users
SELECT id, name, TIMESTAMP(visited) AS visited, perf.rank AS perf_rank, perf.score AS perf_score, TO_JSON_STRING(quiz) AS quiz FROM users
SELECT id, name, FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', visited) AS visited, perf.score AS perf_score, perf.rank AS perf_rank, TO_JSON_STRING(quiz) AS quiz FROM users

Working with endly

  • Running all tasks
endly 
  • System and data store initialization, application build and deployment
endly -t=init
  • Running all tests
endly -t=test
  • Running individual test
endly -t=test -i=rdbms_basic
  • Running individual test with debug logs
endly -t=test -d=true -i=nosql_to_bigquery
Endly runner output

End-to-end testing made easy

End-to-end testing: next steps

Adrian Witas
