pycast.common.timeseries

Normalization Levels

A TimeSeries instance can be normalized by different time granularity levels. Valid values for normalization levels required by pycast.common.TimeSeries.normalize() are stored in pycast.common.timeseries.NormalizationLevels.

Those levels include:

  • “second”
  • “minute”
  • “hour”
  • “day”
  • “week”
  • “2week”
  • “4week”

Fusion Methods

Fusion methods are used to fuse multiple data points within the same time bucket. Applying one might sort the list it is used on. Valid values for fusion methods required by pycast.common.TimeSeries.normalize() are stored in pycast.common.timeseries.FusionMethods.

Valid fusion methods are:

  • “sum”: Sums up all valid values stored in the specific time bucket
  • “mean”: Calculates the mean value within the time bucket
  • “median”: Calculates the median of the given time bucket values. If the number of entries within that bucket is even, the larger of the two middle values is chosen as the median.
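
The three fusion methods can be sketched in plain Python as follows. This is only an illustration of the behaviour described above, not pycast's own code; the function name fuse is hypothetical, and unlike the library (which might sort the list it is used on) this sketch sorts a copy.

```python
def fuse(values, method="mean"):
    """Collapse all values of one time bucket into a single value.

    Hypothetical helper for illustration; not part of pycast.
    """
    if not values:
        raise ValueError("cannot fuse an empty bucket")
    if method == "sum":
        return sum(values)
    if method == "mean":
        return sum(values) / float(len(values))
    if method == "median":
        ordered = sorted(values)  # sort a copy, leave the input untouched
        # For an even count, index len // 2 is the larger of the two
        # middle values, matching the rule stated above.
        return ordered[len(ordered) // 2]
    raise ValueError("unknown fusion method: %r" % method)
```

For the bucket [1, 2, 3, 4], the median under this rule is 3 rather than the conventional 2.5.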

Interpolation Methods

Interpolation methods are used to interpolate missing time buckets. Valid values for interpolation methods required by pycast.common.TimeSeries.normalize() are stored in pycast.common.timeseries.InterpolationMethods.

Valid values for interpolation methods are:

  • “linear”: Use linear interpolation to calculate the missing values
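
As a rough sketch of what linear interpolation of missing buckets means, the following fills a gap between two known bucket values with evenly spaced values. The helper name interpolate_linear and its signature are assumptions made for this example and are not taken from pycast.

```python
def interpolate_linear(left, right, missing_count):
    """Return missing_count evenly spaced values strictly between
    the known values left and right.

    Hypothetical helper for illustration; not part of pycast.
    """
    step = (right - left) / float(missing_count + 1)
    return [left + step * i for i in range(1, missing_count + 1)]
```

With known values 10.0 and 40.0 and two missing buckets in between, the sketch yields 20.0 and 30.0.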

TimeSeries

class pycast.common.timeseries.TimeSeries(isNormalized=False, isSorted=False)[source]

Bases: pycast.common.pycastobject.PyCastObject

A TimeSeries instance stores all relevant data for a real world time series.

Warning:TimeSeries instances are NOT thread-safe.
__add__(otherTimeSeries)[source]

Creates a new TimeSeries instance containing the data of self and otherTimeSeries.

Parameters:otherTimeSeries (TimeSeries) – TimeSeries instance that will be merged with self.
Returns:Returns a new TimeSeries instance containing the data entries of self and otherTimeSeries.
Return type:TimeSeries
__copy__()[source]

Returns a new clone of the TimeSeries.

Returns:Returns a TimeSeries containing the same data and configuration as self.
Return type:TimeSeries
__eq__(otherTimeSeries)[source]

Returns whether self and the other TimeSeries are equal.

TimeSeries are equal to each other if:
  • they contain the same number of entries
  • each data entry in one TimeSeries is also a member of the other one.
Parameters:otherTimeSeries (TimeSeries) – TimeSeries instance that is compared with self.
Returns:True if the TimeSeries objects are equal, False otherwise.
Return type:boolean
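
The two equality conditions can be expressed over plain lists of [timestamp, data] entries as shown below. This is a sketch of the stated rule only; the function name is hypothetical and the actual implementation may differ, for example in how duplicate entries are treated.

```python
def timeseries_entries_equal(entries_a, entries_b):
    """Compare two lists of [timestamp, data] entries.

    Hypothetical helper mirroring the documented rule:
    same number of entries, and each entry of one list
    is also a member of the other. Order does not matter.
    """
    if len(entries_a) != len(entries_b):
        return False
    return all(entry in entries_b for entry in entries_a)
```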
__getitem__(index)[source]

Returns the item stored at the TimeSeries index-th position.

Parameters:index (integer) – Position of the element that should be returned. Starts at 0
Returns:Returns a list containing [timestamp, data] lists.
Return type:list
Raise:Raises an IndexError if the index is out of range.
__init__(isNormalized=False, isSorted=False)[source]

Initializes the TimeSeries.

Parameters:
  • isNormalized (boolean) – Within a normalized TimeSeries, all data points have the same temporal distance to each other. When this is True, the memory consumption of the TimeSeries might be reduced, and algorithms will probably run faster. This should only be set to True if the TimeSeries really is normalized! TimeSeries normalization can be forced by executing TimeSeries.normalize().
  • isSorted (boolean) – If all data points added to the time series are added in ascending temporal order, this should be set to True.
__iter__()[source]

Returns an iterator that can be used to iterate over the data stored within the TimeSeries.

Returns:Returns an iterator for the TimeSeries.
Return type:Iterator
__len__()[source]

Returns the number of data entries stored in the TimeSeries.

Returns:Returns an integer representing the number of data entries stored within the TimeSeries.
Return type:integer
__ne__(otherTimeSeries)[source]

Returns whether self and the other TimeSeries are not equal.

__setitem__(index, value)[source]

Sets the item at the index-th position of the TimeSeries.

Parameters:
  • index (integer) – Index of the element that should be set.
  • value (list) – A list of the form [timestamp, data]
Raise:

Raises an IndexError if the index is out of range.

__str__()[source]

Returns a string representation of the TimeSeries.

Returns:Returns a string representing the TimeSeries in the format:

“TimeSeries([timestamp, data], [timestamp, data], [timestamp, data])”.

Return type:string
_check_normalization()[source]

Checks whether the TimeSeries is normalized.

Returns:Returns True if all data entries of the TimeSeries have an equal temporal distance, False otherwise.
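
A check for equal temporal distance might look like the sketch below. It assumes the entries are already sorted by timestamp and uses a small tolerance for float comparison; both the function name and the tolerance parameter are assumptions for this example, not part of pycast.

```python
def is_evenly_spaced(entries, tolerance=1e-9):
    """Return True if consecutive [timestamp, data] entries are
    separated by the same time distance.

    Hypothetical helper; assumes entries are sorted by timestamp.
    """
    timestamps = [entry[0] for entry in entries]
    if len(timestamps) < 3:
        # Fewer than two gaps: nothing can differ.
        return True
    expected = timestamps[1] - timestamps[0]
    return all(
        abs((later - earlier) - expected) <= tolerance
        for earlier, later in zip(timestamps, timestamps[1:])
    )
```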
add_entry(timestamp, data)[source]

Adds a new data entry to the TimeSeries.

Parameters:
  • timestamp – Time stamp of the data. This has to be either a float representing the UNIX epochs or a string containing a timestamp in the given format.
  • data (numeric) – Actual data value.
apply(method)[source]

Applies the given ForecastingAlgorithm or SmoothingMethod from the pycast.methods module to the TimeSeries.

Parameters:method (BaseMethod) – Method that should be used with the TimeSeries. For more information about the methods take a look into their corresponding documentation.
Raise:Raises a StandardError when the TimeSeries was not normalized and the method requires a normalized TimeSeries.
classmethod convert_epoch_to_timestamp(timestamp, format)[source]

Converts the given float representing UNIX-epochs into an actual timestamp.

Parameters:
  • timestamp (float) – Timestamp as UNIX-epochs.
  • format (string) – Format of the given timestamp. This is used to convert the timestamp from UNIX epochs. For valid examples take a look into the time.strptime() documentation.
Returns:

Returns the timestamp as defined in format.

Return type:

string

classmethod convert_timestamp_to_epoch(timestamp, format)[source]

Converts the given timestamp into a float representing UNIX-epochs.

Parameters:
  • timestamp (string) – Timestamp in the defined format.
  • format (string) – Format of the given timestamp. This is used to convert the timestamp into UNIX epochs. For valid examples take a look into the time.strptime() documentation.
Returns:

Returns a float representing the UNIX-epochs for the given timestamp.

Return type:

float
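
The two conversions can be illustrated with the standard library alone. The sketch below interprets timestamps as UTC so that results do not depend on the local time zone; this is an assumption of the example, and pycast itself may use local time. The function names are hypothetical.

```python
import calendar
import time


def timestamp_to_epoch(timestamp, fmt):
    """Parse a formatted timestamp string into UNIX epoch seconds (UTC)."""
    return float(calendar.timegm(time.strptime(timestamp, fmt)))


def epoch_to_timestamp(epoch, fmt):
    """Format UNIX epoch seconds as a timestamp string (UTC)."""
    return time.strftime(fmt, time.gmtime(epoch))
```

For instance, with the format "%Y-%m-%d %H:%M:%S", the string "1970-01-02 00:00:00" corresponds to 86400.0 seconds, and converting 86400.0 back gives the same string.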

classmethod from_twodim_list(datalist, format=None)[source]

Creates a new TimeSeries instance from the data stored inside a two dimensional list.

Parameters:
  • datalist (list) – List containing multiple iterables with at least two values. The first item will always be used as timestamp in the predefined format, the second represents the value. All other items in those sublists will be ignored.
  • format (string) – Format of the given timestamp. This is used to convert the timestamp into UNIX epochs, if necessary. For valid examples take a look into the time.strptime() documentation.
Returns:

Returns a TimeSeries instance containing the data from datalist.

Return type:

TimeSeries

initialize_from_sql_cursor(sqlcursor)[source]

Initializes the TimeSeries’s data from the given SQL cursor.

You need to set the time stamp format using TimeSeries.set_timeformat().

Parameters:sqlcursor (SQLCursor) – Cursor that holds the SQL result for any given “SELECT timestamp, value, ... FROM ...” SQL query. Only the first two attributes of the SQL result will be used.
Returns:Returns the number of entries added to the TimeSeries.
Return type:integer
is_normalized()[source]

Returns whether the TimeSeries is normalized.

Returns:Returns True if the TimeSeries is normalized, False otherwise.
Return type:boolean
is_sorted()[source]

Returns whether the TimeSeries is sorted.

Returns:Returns True if the TimeSeries is sorted ascending, False in all other cases.
Return type:boolean
normalize(normalizationLevel='minute', fusionMethod='mean', interpolationMethod='linear')[source]

Normalizes the TimeSeries data points.

If this function is called, the TimeSeries gets ordered ascending automatically. The new timestamps will represent the center of each time bucket. Within a normalized TimeSeries, the temporal distance between two consecutive data points is constant.

Parameters:
  • normalizationLevel (string) – Level of normalization that has to be applied. The available normalization levels are defined in timeseries.NormalizationLevels.
  • fusionMethod (string) – Normalization method that has to be used if multiple data entries exist within the same normalization bucket. The available methods are defined in timeseries.FusionMethods.
  • interpolationMethod (string) – Interpolation method that is used if a data entry at a specific time is missing. The available interpolation methods are defined in timeseries.InterpolationMethods.
Raise:

Raises a ValueError if normalizationLevel, fusionMethod or interpolationMethod has an unknown value.
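
To make the bucketing idea concrete, here is a minimal sketch that groups entries into fixed-width buckets, fuses each bucket with the mean, and stamps the result with the bucket center. It is not pycast's implementation: the function name is hypothetical, buckets are assumed to be aligned to multiples of the bucket width, and empty buckets are simply skipped instead of interpolated.

```python
def bucket_means(entries, bucket_seconds):
    """Group [timestamp, value] pairs into fixed-width buckets and
    return one [center_timestamp, mean_value] pair per non-empty bucket.

    Hypothetical helper; bucket alignment at multiples of
    bucket_seconds is an assumption of this sketch.
    """
    buckets = {}
    for timestamp, value in entries:
        index = int(timestamp // bucket_seconds)
        buckets.setdefault(index, []).append(value)

    result = []
    for index in sorted(buckets):  # ascending order, as after normalize()
        center = index * bucket_seconds + bucket_seconds / 2.0
        values = buckets[index]
        result.append([center, sum(values) / float(len(values))])
    return result
```

With 60-second buckets, the entries [0, 1.0], [30, 3.0] and [70, 5.0] collapse to [30.0, 2.0] and [90.0, 5.0].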

sample(percentage)[source]

Samples with replacement from the TimeSeries. Returns the sample and the remaining timeseries. The original timeseries is not changed.

Parameters:percentage (float) – Fraction of the original timeseries that should be in the sample, given as a value between 0.0 and 1.0.
Returns:A tuple containing (sample, rest) as two TimeSeries.
Return type:tuple(TimeSeries,TimeSeries)
Raise:Raises a ValueError if percentage is not in (0.0, 1.0).
set_timeformat(format=None)[source]

Sets the TimeSeries global time format.

Parameters:format (string) – Format of the timestamp. This is used to convert the timestamp from UNIX epochs when the TimeSeries gets serialized by TimeSeries.to_json() and TimeSeries.to_gnuplot_datafile(). For valid examples take a look into the time.strptime() documentation.
sort_timeseries(ascending=True)[source]

Sorts the data points within the TimeSeries in place, according to their occurrence in time.

Parameters:ascending (boolean) – Determines if the TimeSeries will be ordered ascending or descending. If this is set to descending once, the isSorted parameter defined in TimeSeries.__init__() will be set to False FOREVER.
Returns:Returns self for convenience.
Return type:TimeSeries
sorted_timeseries(ascending=True)[source]

Returns a sorted copy of the TimeSeries, preserving the original one.

The new TimeSeries is assumed to be no longer ordered as soon as a new value is added.

Parameters:ascending (boolean) – Determines if the TimeSeries will be ordered ascending or descending.
Returns:Returns a new TimeSeries instance sorted in the requested order.
Return type:TimeSeries
to_gnuplot_datafile(datafilepath)[source]

Dumps the TimeSeries into a gnuplot compatible data file.

Parameters:datafilepath (string) – Path used to create the file. If that file already exists, it will be overwritten!
Returns:Returns True if the data could be written, False otherwise.
Return type:boolean
to_twodim_list()[source]

Serializes the TimeSeries data into a two dimensional list of [timestamp, value] pairs.

Returns:Returns a two dimensional list containing [timestamp, value] pairs.
Return type:list
