Skyminer Time Series Python Connector

Skyminer Time Series Python Connector is a Python 3 library used to access the Skyminer API.
It lets you retrieve Skyminer data from your Python applications.
Please refer to the Python library documentation for more details.

Install

  • pip install httplib2 pandas matplotlib

  • Copy the SkyminerTS and SkyminerTSPlot directories into your project.

  • Import the modules in your Python script with from SkyminerTS import <Modules>

Optional (Update aggregators)

./Scripts/Update.py -u <api_url>

Import

from SkyminerTS import MODULE1, MODULE2 ...
from SkyminerTSPlot import MODULE3 ...

Example:

from SkyminerTS import MetricBuilder, QueryBuilder, TimeUnit, STSAPI, QueriesToDataframe
from SkyminerTSPlot import plot_skyminer_dataframe

Quickstart

Get the data points of the last hour

from SkyminerTS import MetricBuilder, QueryBuilder, TimeUnit, STSAPI, QueriesToDataframe

# Init the API
API = STSAPI.init("http://url-to-skyminer/api/v1/")
# Build our metric
MB = MetricBuilder("kairosdb.datastore.cassandra.client.requests_timer.min")
# Build our query
QB = QueryBuilder()
# Define the time range
QB.with_start_relative(1, TimeUnit.HOURS)
# Put the metric in the query
QB.with_metric(MB)
# Retrieve the result
result = API.get_data_points(QB.build())
# Convert the result to DataFrames
DF = QueriesToDataframe(result)
# Print the DataFrames
print(DF)

Plot the result of a Skyminer query

from SkyminerTS import STSAPI, QueriesToDataframe
from SkyminerTSPlot import plot_skyminer_dataframe

# Init the API (SERVER_URL is a placeholder for your Skyminer server URL)
skyminerAPI = STSAPI.init(SERVER_URL + "/api/v1")

# Get the result of the query (WEB_ARGS["query"] is a placeholder for your query)
queryResult = skyminerAPI.get_data_points(WEB_ARGS["query"])

# Convert the result of the query to a pandas DataFrame
dataframe = QueriesToDataframe(queryResult)

# Plot the dataframe
plot_skyminer_dataframe(dataframe=dataframe, one_plot_per_group=False, plot_width=14, plot_height=9)

Modules

STSAPI

STSAPI is the interface to the Skyminer server. You pass queries built with QueryBuilder to STSAPI.

STS = STSAPI.init("http://ip_skyminer/api/v1")
MB = MetricBuilder("my_metric")
QB = QueryBuilder()
QB.with_start_relative(1, TimeUnit.HOURS)
QB.with_metric(MB)
result = STS.get_data_points(QB.build())

init(url, charset='utf-8', timeout_ms=100, disable_ssl_certificate_validation=False, ca_certs=None)

Change the timeout of requests:

STS = STSAPI.init("http://ip_skyminer/api/v1", timeout_ms=1000)

You can disable SSL certificate validation:

STS = STSAPI.init("http://ip_skyminer/api/v1", disable_ssl_certificate_validation=True)

You can use your own self-signed certificate:

STS = STSAPI.init("http://ip_skyminer/api/v1", ca_certs="/certs/mycert.crt")
Use get_data_points to get data points from the Skyminer server.

STS.get_data_points(query)

QueryBuilder

Use QueryBuilder to build a query and send it to the Skyminer server with STSAPI. You can pass MetricBuilders to QueryBuilder to create your query.

MB = MetricBuilder("my_metric")
QB = QueryBuilder()
QB.with_metric(MB)
  • with_start_absolute(start_absolute)

    QB = QueryBuilder()
    QB.with_start_absolute(1563800009514)
    
  • with_start_relative(value, unit)

    QB = QueryBuilder()
    QB.with_start_relative(1, TimeUnit.HOURS)
    
  • with_end_absolute(end_absolute)

    QB = QueryBuilder()
    QB.with_end_absolute(1563810009514)
    
  • with_end_relative(value, unit)

    QB = QueryBuilder()
    QB.with_end_relative(1, TimeUnit.HOURS)
    
  • with_time_zone(timezone)

    The time zone for the time range of the query. If not specified, UTC is used.

    QB = QueryBuilder()
    QB.with_time_zone("Asia/Kabul")
    
  • with_metrics(metrics)

    QB = QueryBuilder()
    MB1 = MetricBuilder("my_metric1")
    MB2 = MetricBuilder("my_metric2")
    QB.with_metrics([MB1, MB2, ...])
    
  • with_metric(metricBuilder)

    QB = QueryBuilder()
    MB = MetricBuilder("my_metric")
    QB.with_metric(MB)
    
  • build() Builds the query in JSON format for STSAPI, as in the sketch below.
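
    A minimal sketch (assuming STS was initialized as in the STSAPI section above):

    QB = QueryBuilder()
    QB.with_start_relative(1, TimeUnit.HOURS)
    QB.with_metric(MetricBuilder("my_metric"))
    query = QB.build()  # the JSON query sent to the server
    result = STS.get_data_points(query)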

MetricBuilder

Use MetricBuilder to build your metric.

MB = MetricBuilder("my_metric")
  • with_aggregator(aggregator)

    AvgAgg = AvgAggregator()
    AvgAgg.with_sampling(TimeRelative(1, TimeUnit.MINUTES))
    MB = MetricBuilder("my_metric")
    MB.with_aggregator(AvgAgg)
    
  • with_aggregators(aggregators)

    Aggregators are processed in the order specified. The output of an aggregator is passed to the input of the next until all have been processed.

    AvgAgg = AvgAggregator()
    AvgAgg.with_sampling(TimeRelative(1, TimeUnit.HOURS))
    CountAgg = CountAggregator()
    CountAgg.with_sampling(TimeRelative(5, TimeUnit.MINUTES))
    MB = MetricBuilder("my_metric")
    MB.with_aggregators([AvgAgg, CountAgg])
    
  • with_tag_filter(key, values)

    MB = MetricBuilder("my_metric")
    MB.with_tag_filter("city", ["Toulouse", "Paris"])
    
  • with_tags_filter(tags)

    MB = MetricBuilder("my_metric")
    MB.with_tags_filter({"city": ["Toulouse", "Paris"], "type" : "road"})
    
  • exclude_tags(boolean=True)

    By default, the result of the query includes tags and tag values associated with the data points. If exclude_tags is set to true, the tags will be excluded from the response.
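
    For example, to drop tags from the response:

    MB = MetricBuilder("my_metric")
    MB.exclude_tags()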

  • with_limit(limit)

    Limits the number of data points returned from the data store. The limit is applied before any aggregator is executed.

    MB = MetricBuilder("my_metric")
    MB.with_limit(10000)
    
  • with_order(order)

    Orders the returned data points. Values for order are QueryDataOrder.ASC for ascending or QueryDataOrder.DESC for descending. Defaults to ascending. This sorting is done before any aggregators are executed.

    MB = MetricBuilder("my_metric")
    MB.with_order(QueryDataOrder.ASC)
    
  • with_group_by(group_by)

    The resulting data points can be grouped by one or more tags, by a time range, by value, or by a combination of the three.

    GB = GroupByTag(["source"])
    MB = MetricBuilder("my_metric")
    MB.with_group_by(GB)
    
  • with_group_bys(group_bys)

    GB1 = GroupByTag(["source"])
    GB2 = GroupByValue(10)
    MB = MetricBuilder("my\_metric")
    MB.with\_group\_bys([GB1, GB2])
    

TimeRelative(value, TimeUnit)

TR = TimeRelative(1, TimeUnit.MINUTES)

TimeUnit

TimeUnit is an enum of time units.

  • MILLISECONDS

  • SECONDS

  • MINUTES

  • HOURS

  • DAYS

  • WEEKS

  • MONTHS

  • YEARS

QueryDataOrder

QueryDataOrder is an enum for the sort order of the data points.

  • ASC

  • DESC
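
For example, to return the newest data points first:

MB = MetricBuilder("my_metric")
MB.with_order(QueryDataOrder.DESC)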

QueriesToDataframe

Use QueriesToDataframe to convert the results of your queries into DataFrames.

QB = QueryBuilder()
QB.with_start_relative(1, TimeUnit.HOURS)
QB.with_metric(MetricBuilder("my_metric"))
STS = STSAPI.init("http://ip_skyminer/api/v1")
queries = STS.get_data_points(QB.build())

# Dataframes
DF = QueriesToDataframe(queries)

Example of result:

[          timestamp  value             cluster            host                                                 retry_type
0     1563801780000      0  [skyminer_cluster]  [de331f397f5d]  [read_timeout, request_error, unavailable, write_timeout]
1     1563801840000      0  [skyminer_cluster]  [de331f397f5d]  [read_timeout, request_error, unavailable, write_timeout]
2     1563801900000      0  [skyminer_cluster]  [de331f397f5d]  [read_timeout, request_error, unavailable, write_timeout]
3     1563801960000      0  [skyminer_cluster]  [de331f397f5d]  [read_timeout, request_error, unavailable, write_timeout]
....
[1068 rows x 5 columns]]

DataFrameToDataPointBuilder(dataframe, metric_name, value_field='value', tags={})

/!\ Your dataframe should have a timestamp (ms) index, like the output of QueriesToDataframe /!\

Use DataFrameToDataPointBuilder to convert a dataframe into a DataPointBuilder.

# DF is your DataFrame with a millisecond-timestamp index
DF = ...
Builder = DataFrameToDataPointBuilder(DF, 'EIRP', tags={"from" : "SAT1"})

By default, DataFrameToDataPointBuilder uses the value column as values, but you can change this with the value_field parameter.
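
A minimal sketch, assuming pandas is installed; the power column name is purely illustrative:

import pandas as pd

# Small DataFrame with an epoch-milliseconds index,
# matching the shape produced by QueriesToDataframe
DF = pd.DataFrame(
    {"power": [42.0, 43.5]},
    index=[1563801780000, 1563801840000],
)
# Read values from the illustrative 'power' column instead of 'value'
Builder = DataFrameToDataPointBuilder(DF, 'EIRP', value_field='power', tags={"from": "SAT1"})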

GroupByBin

The Bin grouper groups data point values into bins or buckets. Values are placed into groups based on a list of bin values. For example, if the list of bins is 10, 20, 30, then values less than 10 are placed in the first group, values between 10-19 into the second group, and so forth.

# Init with a bin
GB = GroupByBin([500000])
# Add another bin
GB.with_bin(600000)
# Add multiple bins
GB.with_bins([800000, 1000000])
MB = MetricBuilder("my_metric")
MB.with_group_by(GB)

GroupByTag

You can group results by specifying one or more tag names. For example, if you have a customer tag, grouping by customer would create a resulting object for each customer.

Multiple tag names can be used to further group the data points.

# Init with a tag
GB = GroupByTag(["source"])
# Add another tag
GB.with_tag("host")
# Add multiple tags
GB.with_tags(["flag", "packet_type"])
MB = MetricBuilder("my_metric")
MB.with_group_by(GB)

GroupByTime

The time grouper groups results by time ranges. For example, you could group data by day of week.

Note that the grouper calculates ranges based on the start time of the query. So if you want to group by day of week with the first group being Sunday, you need to set the query's start time to a Sunday.

# GroupByTime(count, value, TimeUnit)
# - count = The number of groups.
# This would typically be 7 to group by day of week.
# - value = The number of units for the aggregation buckets.
# - TimeUnit = The time unit for the sampling rate.

# Group by day of week: 7 groups of 1-day buckets
GB = GroupByTime(7, 1, TimeUnit.DAYS)
MB = MetricBuilder("my_metric")
MB.with_group_by(GB)
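
To make the first group start on a Sunday, anchor the query start on a Sunday. A sketch; the timestamp below (2019-07-21 00:00:00 UTC, a Sunday) is illustrative:

QB = QueryBuilder()
QB.with_start_absolute(1563667200000)  # Sunday 2019-07-21 00:00:00 UTC
QB.with_metric(MB)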

GroupByValue

The value grouper groups by data point values. Values are placed into groups based on a range size. For example, if the range size is 10, then values between 0-9 are placed in the first group, values between 10-19 into the second group, and so forth.

# Init with a value
GB = GroupByValue(10)
MB = MetricBuilder("my_metric")
MB.with_group_by(GB)

Examples

Get a dataframe of a metric grouped by tags with an average aggregator, and plot it

# Import
from SkyminerTS import MetricBuilder, QueryBuilder, TimeUnit, STSAPI, QueriesToDataframe, TimeRelative, GroupByTag
from SkyminerTS.Aggregators.AvgAggregator import AvgAggregator

# Plot library
import matplotlib.pyplot as plt
plt.interactive(False)

# (Optional) Better visualization of Dataframes
import pandas as pd
pd.options.display.max_columns = 1000
pd.options.display.max_rows = 1000
pd.options.display.max_colwidth = 199
pd.options.display.width = None

# Init the API
API = STSAPI.init("http://url-to-skyminer/api/v1/")

# Build our metric
MB = MetricBuilder("kairosdb.datastore.cassandra.client.requests_timer.min")

# Add the group by tags
MB.with_group_by(GroupByTag(["cluster", "host"]))

# Add the average aggregator
AVG = AvgAggregator()
AVG.with_sampling(TimeRelative(1, TimeUnit.HOURS))
MB.with_aggregator(AVG)

# Build our query
QB = QueryBuilder()
# Define the time range
QB.with_start_relative(10, TimeUnit.DAYS)
# Put the metric in the query
QB.with_metric(MB)

# Retrieve the result
result = API.get_data_points(QB.build())

# Convert the result to DataFrames
DF = QueriesToDataframe(result)

# Print the DataFrames
print(DF)

# Graph
DF.plot()
plt.show()

Result example:

[        timestamp          value             cluster            host                                                          group_by
0   1563800400000  408573.162162  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
1   1563804000000  549544.783333  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
2   1563807600000  390316.450000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
3   1563811200000  361768.000000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
4   1563814800000  346524.700000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
5   1563818400000  332155.350000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
6   1563822000000  330671.000000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
7   1563825600000  342598.533333  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
8   1563829200000  319996.700000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
9   1563832800000  321140.000000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
10  1563836400000  336191.600000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
11  1563840000000  326488.200000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
12  1563843600000  324164.666667  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
13  1563847200000  331242.350000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
14  1563850800000  305792.500000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
15  1563854400000  313840.666667  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
16  1563858000000  336570.750000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
17  1563861600000  324286.833333  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
18  1563865200000  376431.183333  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
19  1563868800000  373827.616667  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}
20  1563872400000  351905.000000  [skyminer_cluster]  [de331f397f5d]  {'tag': {'cluster': 'skyminer_cluster', 'host': 'de331f397f5d'}}]