Energy Consumption Calculation Based on Differnt Time and Price (Tarrif) Using Python

Python Calculation and Time Manipulation

Posted by Jiashu Miao on August 22, 2021

Project Overview

Objective

Your assignment is to clean data obtained from an electrical meter and calculate the total of a utility bill.

Tariff Description

There are three components for this specified tariff: demand charges, energy charges, and other charges. Demand charges are dependent on the highest energy usage in any 15 min interval (12:00 - 12:15, 12:15 - 12:30, 12:30 - 12:45, etc.). This value is in kW (power). Energy charges are based on energy usage during different time of use periods defined below, using the unit kWh. Other charges include a customer charge, which is charged for every day in the billing period, and a tax on total electricity consumed.

Data

Stored in data.csv is data received from an electrical meter. Columns: epochMilliseconds = UTC time in milliseconds. cumulativeEnergy = cumulative energy consumption in kWh Notes: Convert times into Pacific time. cumulativeEnergy may need to undergo cleaning processes.

###Tariff Details

Demand Charges
$14.31 /kW
Energy Charges
Peak : $0.19472 /kWh
Hours : 16,17,18,19,20
Months : 1,2,3,4,5,6,7,8,9,10,11,12
Off Peak : $0.15924 /kWh Hours : 9,10,11,12,13 Months : 3,4,5
Super Off Peak : $0.1229 /kWh
Hours : all other energy usage not included in peak and off peak usage
Other Charges
Customer Charge : $5.47664 /day Tax : $0.003 /kWh

Project Report

Data Analysis Answer

Jiashu Miao 8/22/2021

DATA PREPROCESSING AND EXPLORATION

Import Packages and Change Setting

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import numpy as np
import pandas as pd
import seaborn as sns
import datetime

Load and Inspect the Data


energy = pd.read_csv('data.csv')
energy['date'] = energy['epochMilliseconds']
energy.ndim
energy.shape
energy.size
energy.head(10)
energy.describe()

2






(73756, 3)






221268
epochMilliseconds cumulativeEnergy date
0 1612166412948 162.05 1612166412948
1 1612166444265 162.05 1612166444265
2 1612166472972 162.05 1612166472972
3 1612166502960 162.05 1612166502960
4 1612166534365 162.05 1612166534365
5 1612166564315 162.05 1612166564315
6 1612166592978 162.05 1612166592978
7 1612166624265 162.05 1612166624265
8 1612166654255 162.05 1612166654255
9 1612166682983 162.05 1612166682983
epochMilliseconds cumulativeEnergy date
count 7.375600e+04 73756.000000 7.375600e+04
mean 1.614570e+12 132.536504 1.614570e+12
std 1.355367e+09 48.314213 1.355367e+09
min 1.612166e+12 0.150000 1.612166e+12
25% 1.613307e+12 120.650000 1.613307e+12
50% 1.614599e+12 158.850000 1.614599e+12
75% 1.615731e+12 161.850000 1.615731e+12
max 1.617260e+12 190.450000 1.617260e+12

Check Special Values, Missings, Nulls

# check null
energy.isnull().sum()
# check na
energy.isna().sum()
# check 0 and None
energy.isin([0,None]).sum()
epochMilliseconds    0
cumulativeEnergy     0
date                 0
dtype: int64






epochMilliseconds    0
cumulativeEnergy     0
date                 0
dtype: int64






epochMilliseconds    0
cumulativeEnergy     0
date                 0
dtype: int64

Data Exploration on distribution, frequency, etc.

sns.distplot(energy.cumulativeEnergy, hist=True, kde=False, 
             color = 'blue',
             hist_kws={'edgecolor':'black'})

<AxesSubplot:xlabel='cumulativeEnergy'>

png

energy[['cumulativeEnergy']].value_counts().iloc[0:10]
cumulativeEnergy
158.85              12874
161.85               9206
120.65               6385
164.85               6332
159.05               6295
151.65               1211
10.85                 810
156.65                793
177.65                730
155.05                694
dtype: int64

Data Operation and Manipulation

Convert epoch time to UTC first of all

energy['newdate'] = energy['epochMilliseconds']/1000
for i in range(0,len(energy)):
    energy['newdate'][i] = datetime.datetime.utcfromtimestamp(energy['newdate'][i]).strftime('%Y-%m-%d %H:%M:%S')


/Users/MichaelMiao/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  This is separate from the ipykernel package so we can avoid doing imports until
/Users/MichaelMiao/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py:670: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value)
energy.head(3)
energy.tail(3)
epochMilliseconds cumulativeEnergy date newdate
0 1612166412948 162.05 1612166412948 2021-02-01 08:00:12
1 1612166444265 162.05 1612166444265 2021-02-01 08:00:44
2 1612166472972 162.05 1612166472972 2021-02-01 08:01:12
epochMilliseconds cumulativeEnergy date newdate
73753 1617260336317 70.45 1617260336317 2021-04-01 06:58:56
73754 1617260366397 70.45 1617260366397 2021-04-01 06:59:26
73755 1617260396158 70.45 1617260396158 2021-04-01 06:59:56

Convert Time Zone to Pacific

#Modules needed
import datetime
import pytz

import pytz
energy.index = pd.to_datetime(energy.newdate)

pacific = pytz.timezone('US/Pacific')
energy.index = energy.index.tz_localize(pytz.utc).tz_convert(pacific)

Find Total Energy Consumption

  • find max cumulative energy if there is energy use in that day

energy_max_day=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('8:00','23:59').resample("1440T").max().fillna(0)
energy_min_day=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].resample("1440T").min().fillna(0)
energy_new_day = (energy_max_day - energy_min_day)
energy_new_day = energy_new_day.where(energy_new_day['cumulativeEnergy']==0,energy_max_day)
energy_new_day = energy_new_day.where(energy_max_day > energy_min_day,0)
pd.set_option('display.max_rows', None)
energy_new_day.sum()
cumulativeEnergy    2827.95
dtype: float64
ee = pd.DataFrame()
ee[['max']]= energy_max_day
ee[['min']]= energy_min_day
ee[['day']]= energy_new_day
ee
pd.set_option('display.max_rows', 10)
max min day
2021-02-15 00:00:00-08:00 161.85 161.85 0.00
2021-02-16 00:00:00-08:00 171.25 0.65 171.25
2021-02-17 00:00:00-08:00 150.65 0.15 150.65
2021-02-18 00:00:00-08:00 145.25 0.45 145.25
2021-02-19 00:00:00-08:00 0.00 145.25 0.00
2021-02-20 00:00:00-08:00 0.00 0.00 0.00
2021-02-21 00:00:00-08:00 0.00 0.00 0.00
2021-02-22 00:00:00-08:00 183.25 0.35 183.25
2021-02-23 00:00:00-08:00 125.85 0.45 125.85
2021-02-24 00:00:00-08:00 140.45 0.65 140.45
2021-02-25 00:00:00-08:00 150.85 0.45 150.85
2021-02-26 00:00:00-08:00 164.85 0.65 164.85
2021-02-27 00:00:00-08:00 164.85 164.85 0.00
2021-02-28 00:00:00-08:00 164.85 164.85 0.00
2021-03-01 00:00:00-08:00 135.25 0.25 135.25
2021-03-02 00:00:00-08:00 136.45 0.35 136.45
2021-03-03 00:00:00-08:00 144.45 0.45 144.45
2021-03-04 00:00:00-08:00 155.25 0.45 155.25
2021-03-05 00:00:00-08:00 120.65 0.65 120.65
2021-03-06 00:00:00-08:00 120.65 120.65 0.00
2021-03-07 00:00:00-08:00 120.65 120.65 0.00
2021-03-08 00:00:00-08:00 166.45 0.65 166.45
2021-03-09 00:00:00-08:00 150.05 0.15 150.05
2021-03-10 00:00:00-08:00 153.25 0.45 153.25
2021-03-11 00:00:00-08:00 123.25 0.15 123.25
2021-03-12 00:00:00-08:00 158.85 0.45 158.85
2021-03-13 00:00:00-08:00 158.85 158.85 0.00
2021-03-14 00:00:00-08:00 158.85 158.85 0.00
2021-03-15 01:00:00-07:00 151.65 0.25 151.65

Find total peak hour usage

energy_max_peak=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('16:00','20:59').resample("1440T").max().fillna(0)
energy_min_peak=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('16:00','20:59').resample("1440T").min().fillna(0)
energy_peak = (energy_max_peak - energy_min_peak)
pd.set_option('display.max_rows', 10)
energy_peak.sum()
cumulativeEnergy    19.5
dtype: float64

Find total offpeak usage

energy_max_offpeak=energy[['cumulativeEnergy']].loc['2021-03-01':'2021-03-15'].between_time('09:00','13:59').resample("1440T").max().fillna(0)
energy_min_offpeak=energy[['cumulativeEnergy']].loc['2021-03-01':'2021-03-15'].between_time('09:00','13:59').resample("1440T").min().fillna(0)
energy_offpeak = (energy_max_offpeak - energy_min_offpeak)
pd.set_option('display.max_rows', 10)
energy_offpeak.sum()
cumulativeEnergy    0.0
dtype: float64

Find the other super offpeak usage

energy_max_spoffpeak=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-02-28'].between_time('09:00','13:59').resample("1440T").max().fillna(0)
energy_min_spoffpeak=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-02-28'].between_time('09:00','13:59').resample("1440T").min().fillna(0)
energy_spoffpeak = (energy_max_spoffpeak - energy_min_spoffpeak)
pd.set_option('display.max_rows', 10)
energy_spoffpeak.sum()
cumulativeEnergy    0.0
dtype: float64
energy_max_spoffpeak2=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('00:00','08:59').resample("1440T").max().fillna(0)
energy_min_spoffpeak2=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('00:00','08:59').resample("1440T").min().fillna(0)
energy_spoffpeak2 = (energy_max_spoffpeak2 - energy_min_spoffpeak2)
pd.set_option('display.max_rows', 10)
energy_spoffpeak2.sum()
cumulativeEnergy    1.2
dtype: float64
energy_max_spoffpeak3=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('21:00','23:59').resample("1440T").max().fillna(0)
energy_min_spoffpeak3=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('21:00','23:59').resample("1440T").min().fillna(0)
energy_spoffpeak3 = (energy_max_spoffpeak3 - energy_min_spoffpeak3)
pd.set_option('display.max_rows', 10)
energy_spoffpeak3.sum()

cumulativeEnergy    2805.8
dtype: float64
energy_max_spoffpeak4=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('14:00','15:59').resample("1440T").max().fillna(0)
energy_min_spoffpeak4=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].between_time('14:00','15:59').resample("1440T").min().fillna(0)
energy_spoffpeak4 = (energy_max_spoffpeak4 - energy_min_spoffpeak4)
pd.set_option('display.max_rows', 10)

energy_spoffpeak4.sum()
cumulativeEnergy    0.0
dtype: float64

Find the number of days

energy_count_day=energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].resample("1440T").max().fillna(0).count()
energy_count_day
cumulativeEnergy    29
dtype: int64

Find the highest usage for each 15-min

  • highest usage in each 15 min each day as the demand
  • count the total demand
demand = energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].resample("15T").max().fillna(0) - energy[['cumulativeEnergy']].loc['2021-02-15':'2021-03-15'].resample("15T").min().fillna(0)
total_demand= demand.groupby([demand.index.date]).max().sum()
total_demand
cumulativeEnergy    444.3
dtype: float64

Calculate the Results

Find total cost

total_cost =total_demand * 14.31 + energy_peak.sum() * 0.19472 + (energy_spoffpeak2.sum() + energy_spoffpeak3.sum()) * 0.1229  + energy_count_day * 5.47664 + energy_new_day.sum() * 0.003
print("The total cost for the energy would be: ", "$",round(total_cost[0],2),"","\n")
The total cost for the energy would be:  $ 6874.02  

edited by Jiashu miao edited by Jiashu Miao :) 08/22/2021