windML

The importance of wind in smart grids with a large number of renewable energy resources is increasing. With the growing infrastructure of wind turbines and the availability of time-series data with high spatial and temporal resolution, data mining techniques become applicable. The windML framework provides easy-to-use access to wind data sources from Python, building upon numpy [1], scipy [1], sklearn [3], and matplotlib [2]. As a machine learning module, it provides versatile tools for learning tasks such as time-series prediction, classification, clustering, and dimensionality reduction.

For an installation guide, an overview of the architecture, and the functionalities of windML, please visit the Getting Started page. For a formal description of the applied techniques, see the Techniques section. The examples gallery illustrates the main functionalities.

[Example gallery thumbnails: KNN regression for a turbine, SVR regression for a turbine, forecast horizon]

Brief Example

The following brief example shows wind time-series forecasting based on K-nearest neighbors (KNN) regression. For further examples with plots, see the examples page.

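import math

from sklearn.neighbors import KNeighborsRegressor
from windml.datasets.nrel import NREL
from windml.mapping.power_mapping import PowerMapping

# Load the wind park around the 'tehachapi' target turbine (radius 3, data from 2004 to 2005).
windpark = NREL().get_windpark(NREL.park_id['tehachapi'], 3, 2004, 2005)
target = windpark.get_target()

# Build patterns from the turbines in the park and labels from the target turbine.
feature_window, horizon = 3, 3
mapping = PowerMapping()
X = mapping.get_features_park(windpark, feature_window, horizon)
Y = mapping.get_labels_turbine(target, feature_window, horizon)

# KNN regression with 10 neighbors; recent scikit-learn versions require `weights` as a keyword.
reg = KNeighborsRegressor(n_neighbors=10, weights='uniform')

# Train on every 5th pattern of the first half, then predict every 5th pattern of the second half.
train_to, test_to = int(math.floor(len(X) * 0.5)), len(X)
train_step, test_step = 5, 5
reg = reg.fit(X[0:train_to:train_step], Y[0:train_to:train_step])
y_hat = reg.predict(X[train_to:test_to:test_step])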
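
The example above ends with the prediction itself. One plausible next step is to score the forecast against the held-out labels, which in the example would be Y[train_to:test_to:test_step]. The following is a minimal sketch using the mean squared error. The function name and the numbers are illustrative assumptions, not part of windML.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average squared deviation between measured and predicted power."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up stand-ins for Y[train_to:test_to:test_step] and y_hat.
y_test = [10.0, 12.0, 9.0, 11.0]
y_hat_demo = [11.0, 12.0, 7.0, 11.0]
mse = mean_squared_error(y_test, y_hat_demo)
print("MSE:", mse)
```

A lower value indicates a closer fit. Comparing it with the error of a simple baseline, such as repeating the last measured value, helps to judge whether the regression adds anything.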

Wind Power Prediction

The model predicts wind power exclusively based on past wind power measurements. For this task, the prediction can be formulated as a regression problem, shown here by way of example for a single turbine. The wind power measurement \mathbf{x} = p(t) (pattern) is mapped to the power production at the target time y = p(t+\lambda) (label), where \lambda is the forecast horizon. For the regression model, we assume N such pattern-label pairs (\mathbf{x}_i, y_i), which form the training set T=\{(\mathbf{x}_1,y_1),\ldots,(\mathbf{x}_N,y_N)\} and allow a regression model to predict the label of unknown patterns. The model can be expected to yield better predictions if more information from the time series is used. For this reason, we extend the patterns with \mu \in \mathbb{N}^+ past measurements to \mathbf{x} = \big(p(t), p(t-1), \ldots, p(t-\mu)\big). The implementation of this approach is called Power Mapping.
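
The construction of pattern-label pairs described above can be sketched for a single turbine in a few lines of numpy. This is an illustration of the formulas only, not windML's PowerMapping implementation. The function name and the synthetic series are assumptions, and windML's feature_window parameter may be defined differently from \mu here.

```python
import numpy as np


def power_patterns(p, mu, horizon):
    """Build patterns (p(t), p(t-1), ..., p(t-mu)) and labels p(t+horizon)."""
    p = np.asarray(p, dtype=float)
    # A reference time t needs mu past values and a label `horizon` steps ahead.
    ts = np.arange(mu, len(p) - horizon)
    # Column k holds p(t - k), so each row is one pattern of dimension mu + 1.
    X = np.stack([p[ts - k] for k in range(mu + 1)], axis=1)
    y = p[ts + horizon]
    return X, y


# Synthetic series 0, 1, ..., 9 standing in for measured wind power.
series = np.arange(10.0)
X, y = power_patterns(series, mu=2, horizon=3)
print(X.shape, y.shape)
```

Each row of X can then be passed to a regressor together with the corresponding entry of y.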

General Time Series Model

Furthermore, we test whether taking into account the differences of measurements p(t)-p(t-1), \ldots, p\big(t-(\mu-1)\big) - p(t-\mu) further improves the results. The absolute values and their differences result in patterns of dimension d_{st} = 2\mu+1; see Power Diff Mapping. Most prediction tasks require the construction of a pattern that consists of the wind power time series of turbines in the neighborhood of the target turbine; see the corresponding figure below. A wind park is defined by a target wind turbine and a certain radius r. The wind power values can be aggregated to a single value or used separately in the pattern vector.
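
To make the dimension 2\mu+1 concrete, the following sketch appends the \mu successive differences to the \mu+1 absolute values for a single turbine. It illustrates the formula under stated assumptions and is not windML's Power Diff Mapping code. The function name and the input series are hypothetical.

```python
import numpy as np


def power_diff_patterns(p, mu, horizon):
    """Patterns of mu + 1 absolute values followed by mu successive differences."""
    p = np.asarray(p, dtype=float)
    ts = np.arange(mu, len(p) - horizon)
    # absolute[:, k] = p(t - k) for k = 0, ..., mu
    absolute = np.stack([p[ts - k] for k in range(mu + 1)], axis=1)
    # diffs[:, k] = p(t - k) - p(t - k - 1) for k = 0, ..., mu - 1
    diffs = absolute[:, :-1] - absolute[:, 1:]
    X = np.hstack([absolute, diffs])
    y = p[ts + horizon]
    return X, y


# Made-up series (squares) so that the differences are easy to check by hand.
series = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0])
mu = 2
X_diff, y_diff = power_diff_patterns(series, mu=mu, horizon=1)
print(X_diff.shape[1] == 2 * mu + 1)
```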

Neighborhood of a turbine
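
Selecting the turbines within radius r of a target can be sketched as follows. The sketch assumes that turbine positions are given as latitude and longitude in degrees and that the great-circle (haversine) distance is an acceptable measure. How windML computes distances is not stated here, and the turbine identifiers and coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighborhood(target, turbines, radius_km):
    """Identifiers of all turbines within radius_km of the target position."""
    return [
        tid
        for tid, (lat, lon) in turbines.items()
        if haversine_km(target[0], target[1], lat, lon) <= radius_km
    ]


# Hypothetical turbine positions (latitude, longitude).
turbines = {"a": (35.0, -118.0), "b": (35.01, -118.0), "c": (36.0, -118.0)}
park = neighborhood(turbines["a"], turbines, radius_km=3.0)
print(park)
```

In this sketch the target turbine is part of its own neighborhood because its distance to itself is zero. The power series of the selected turbines could then be averaged into one value per time step or concatenated into a longer pattern vector.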

Contributors

The windML framework was initially developed by the Computational Intelligence Group of the University of Oldenburg. The contributors are Nils André Treiber, Jendrik Poloczek, Oliver Kramer, Justin Philipp Heinermann, and Fabian Gieseke. For questions and feedback, contact us via email.

License

The windML framework is released under the open source BSD 3-clause license. The LICENSE file is available here.

References

[1] Oliphant, T. E. (2007). Python for Scientific Computing. Computing in Science & Engineering 9, IEEE Soc., pp. 10-20.
[2] Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering 9, IEEE Soc., pp. 90-95.
[3] Pedregosa et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research (JMLR) 12, pp. 2825-2830.
