LUT University Energy Consumption/Production Dataset

Description

The data was collected at LUT University, Lappeenranta, Finland. Save for measurement or API errors, all variables were sampled hourly and logged with UTC timestamps. The dataset comprises:

- The aggregated energy consumed by an entire building [kW].
- The aggregated electricity generated by the PV panel array [kW].
- Day-ahead (ELSPOT) prices for the Finnish market [€/MWh].
  - This information is made publicly available by [ENTSO-e](https://newtransparency.entsoe.eu/market/energyPrices). The version found in `raw/elspot.parquet` comprises only the relevant years and uses UTC timestamps instead of local time.
- Meteorological variables measured 6 km away from campus at the Lappeenranta airport by the [Finnish Meteorological Institute](https://en.ilmatieteenlaitos.fi/) (see table below).

| Variable | Unit |
| --- | --- |
| Air Temperature | °C |
| Cloud Amount | Okta |
| Dew Point Temperature | °C |
| Global/Diffuse Radiation | W/m² |
| Gust Speed | m/s |
| Horizontal Visibility | m |
| Pressure | hPa |
| Relative Humidity | % |
| Sunshine | % |
| Wind Direction | ° |
| Wind Speed | m/s |

The production and consumption columns are stored in two separate files (`raw/{consumption/production}.parquet`); for the sake of consistency, the two datasets were clipped to their overlapping period and joined into a single table. Discrepancies between duplicated columns arise from missing values in one of the two sources; a robust average (the mean of the non-null values) was set as the consensus value for the redundant measurements. Before joining, the hourly timestamps were enforced via upsampling without interpolation.

A graphical analysis of the raw data revealed that the measurements are naturally split by missing values into three segments:

1. From 30.09.2017 to 30.12.2017 (2208 samples, or 11.2% of the dataset), found in `partitioned/dataset_0.parquet`.
2. From 05.02.2018 to 06.10.2018 (5856 samples after previous-day interpolation of the run of missing data in the middle of the segment, or 29.7% of the dataset), found in `partitioned/dataset_1.parquet`.
3. From 16.11.2018 to 14.03.2020 (11640 samples, or 59.1% of the dataset), found in `partitioned/dataset_2.parquet`.

Finally, the script that transforms the raw data into the partitioned tables is provided as a Jupyter Notebook (`dataset_integration.ipynb`).
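The integration steps described above (clip to the overlapping period, enforce an hourly grid without interpolation, join, and take a robust average of duplicated columns) can be illustrated with a minimal standard-library sketch. It is not the code from `dataset_integration.ipynb`: the column names (`consumption_kw`, `production_kw`, `temperature`), the toy values, and the helper names are assumptions made for illustration only. The last few lines check the reported segment sizes, assuming both the start and end dates of each segment are inclusive.

```python
from datetime import date, datetime, timedelta, timezone

HOUR = timedelta(hours=1)


def robust_mean(a, b):
    """Average two readings, ignoring whichever one is missing (None)."""
    present = [v for v in (a, b) if v is not None]
    return sum(present) / len(present) if present else None


def integrate(consumption, production, shared="temperature"):
    """Clip two tables to their overlap and join them on an hourly grid.

    Each table maps a UTC datetime to a dict of column values. Hours that a
    source lacks stay None (no interpolation); the duplicated `shared`
    column is merged with a robust mean.
    """
    start = max(min(consumption), min(production))
    end = min(max(consumption), max(production))
    joined = {}
    t = start
    while t <= end:
        c = consumption.get(t, {})
        p = production.get(t, {})
        joined[t] = {
            "consumption_kw": c.get("consumption_kw"),
            "production_kw": p.get("production_kw"),
            shared: robust_mean(c.get(shared), p.get(shared)),
        }
        t += HOUR
    return joined


# Toy data (hypothetical values, not taken from the dataset).
t0 = datetime(2017, 9, 30, 0, tzinfo=timezone.utc)
consumption = {
    t0: {"consumption_kw": 120.0, "temperature": 8.0},
    t0 + HOUR: {"consumption_kw": 118.0, "temperature": None},
    # t0 + 2 h is absent from this source altogether.
    t0 + 3 * HOUR: {"consumption_kw": 121.0, "temperature": 7.0},
}
production = {
    t0 + HOUR: {"production_kw": 0.0, "temperature": 7.6},
    t0 + 2 * HOUR: {"production_kw": 0.5, "temperature": 7.4},
    t0 + 3 * HOUR: {"production_kw": 1.5, "temperature": 8.0},
    t0 + 4 * HOUR: {"production_kw": 2.0, "temperature": 7.1},
}
table = integrate(consumption, production)


def hourly_samples(first_day, last_day):
    """Number of hourly samples between two dates, both days inclusive."""
    return ((last_day - first_day).days + 1) * 24


# Segment boundaries as listed in the description.
SEGMENTS = [
    (date(2017, 9, 30), date(2017, 12, 30)),
    (date(2018, 2, 5), date(2018, 10, 6)),
    (date(2018, 11, 16), date(2020, 3, 14)),
]
counts = [hourly_samples(a, b) for a, b in SEGMENTS]
shares = [round(100 * n / sum(counts), 1) for n in counts]
```

In this toy run, `table` covers only the three hours present in both sources' time ranges, the hour missing from the consumption source keeps a null consumption value, and `counts` and `shares` reproduce the sample counts and percentages listed for the three segments.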

Publication year

2025

Dataset type

Authors

Computational Engineering

Sergio Mauricio Vanegas Arias - Curator, Other contributor, Publisher

Lasse Lensu - Other contributor

Samuli Honkapuro - Other contributor

Kimmo Huoman - Author

Ville Tikka - Author

Project

Other information

Fields of science

Computer and information sciences; Environmental sciences; Electrical, automation and telecommunications engineering, electronics

Language

English

Availability

Open

License

Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Keywords

Weather observations, weather, forecasting, time series, Solar Panels, Building, Electricity Production, PV Solar, Electricity Consumption

Subject headings

Temporal coverage


Related datasets