
DAT264x: Identifying Appliances from Energy Use Spectrograms
Hosted By Microsoft


Problem Description

About the Data

Your goal is to predict types of appliances from spectrograms of current and voltage measurements. A spectrogram is a visual representation of the frequencies in a signal as they vary with time. These spectrograms were generated from current and voltage measurements sampled at 30 kHz from 11 different appliance types present in more than 60 households in Pittsburgh, Pennsylvania, USA. Data collection took place during the summer of 2013 and the winter of 2014. Each appliance type is represented by dozens of different instances of varying makes and models. For more information on spectrograms, see the home page.

This is what the data directory looks like for the data you are given:

├── train
│   ├── 1000_c.png
│   ├── 1000_v.png
│   ├── 1001_c.png
│   ├── 1001_v.png
│   ├── ...
│   ├── 1575_c.png
│   └── 1575_v.png
├── test
│   ├── 1576_c.png
│   ├── 1576_v.png
│   ├── 1577_c.png
│   ├── 1577_v.png
│   ├── ...
│   ├── 1959_c.png
│   └── 1959_v.png
├── submission_format.csv
└── train_labels.csv
  • One folder of images is labeled train. You have the true labels for these images in train_labels.csv.
  • One folder of images is labeled test. You don't know the true labels for these.

Your job is to:

  1. Train a model using the images in train and the labels in train_labels.csv
  2. Predict appliance labels for the images in test for which you don't know the true appliances.
  3. Output your predictions in a format that matches submission_format.csv exactly.
  4. Upload your predictions to this competition in order to get an accuracy score.
  5. Export your grading token (click the "Export Score for EdX" tab) and paste it into the assignment grader on edX to get your course grade.

For each observation, you are given a spectrogram of current and a spectrogram of voltage. The files are therefore named with the conventions {id}_c.png and {id}_v.png, where _c corresponds to current and _v corresponds to voltage. The {id} in the filename matches the id column in train_labels.csv for the training data and in submission_format.csv for the test data.
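The naming convention above can be sketched in code. This is a minimal illustration only; the helper names `spectrogram_paths` and `id_from_filename` are hypothetical and not part of the competition materials.

```python
from pathlib import Path


def spectrogram_paths(image_id, folder):
    """Return (current_path, voltage_path) for one observation id."""
    folder = Path(folder)
    return folder / f"{image_id}_c.png", folder / f"{image_id}_v.png"


def id_from_filename(path):
    """Recover the integer id from a name such as '1000_c.png'."""
    return int(Path(path).stem.split("_")[0])


current_path, voltage_path = spectrogram_paths(1000, "train")
print(current_path.name, voltage_path.name)  # 1000_c.png 1000_v.png
print(id_from_filename(voltage_path))        # 1000
```

The id recovered this way is the value to look up in the id column of train_labels.csv or submission_format.csv.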

The spectrograms have been scaled so that they are 128x176. Each image only has one channel so if it is properly loaded, the shape should be (128, 176). Using scikit-image, loading a file looks like:

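from skimage import io
import matplotlib.pyplot as plt

# load one current spectrogram as a single-channel image
my_image = io.imread('1000_c.png', as_gray=True)
print(my_image.shape)  # should be (128, 176) if loaded properly

# look at the image
plt.imshow(my_image, cmap='gray')
plt.show()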

Note that you are given two images for each observation. To start, you might want to first work with just the current OR the voltage files. Then, once you have a baseline model working, consider training a multi-channel network that can handle stacked inputs.
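One possible way to build a stacked input is to place the current and voltage images along a new last axis. The sketch below uses placeholder arrays instead of real spectrograms, and the channels-last layout (128, 176, 2) is an assumption; some frameworks expect channels first.

```python
import numpy as np


def stack_channels(current, voltage):
    """Stack two (H, W) images into one (H, W, 2) array."""
    if current.shape != voltage.shape:
        raise ValueError("current and voltage images must have the same shape")
    return np.stack([current, voltage], axis=-1)


# Placeholder arrays standing in for two loaded spectrograms
current = np.zeros((128, 176), dtype=np.float32)
voltage = np.ones((128, 176), dtype=np.float32)

stacked = stack_channels(current, voltage)
print(stacked.shape)  # (128, 176, 2)
```

Channel 0 then holds the current spectrogram and channel 1 the voltage spectrogram for the same observation.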

Target Variable

The appliance labels correspond to the following appliances:

  • 0: Heater
  • 1: Fridge
  • 2: Hairdryer
  • 3: Microwave
  • 4: Air Conditioner
  • 5: Vacuum
  • 6: Incandescent Light Bulb
  • 7: Laptop
  • 8: Compact Fluorescent Lamp
  • 9: Fan
  • 10: Washing Machine
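
For inspecting predictions, it can help to keep this mapping in code. The dictionary below copies the list above; the names `APPLIANCE_NAMES` and `label_to_name` are hypothetical.

```python
# Integer label -> appliance name, copied from the list above
APPLIANCE_NAMES = {
    0: "Heater",
    1: "Fridge",
    2: "Hairdryer",
    3: "Microwave",
    4: "Air Conditioner",
    5: "Vacuum",
    6: "Incandescent Light Bulb",
    7: "Laptop",
    8: "Compact Fluorescent Lamp",
    9: "Fan",
    10: "Washing Machine",
}


def label_to_name(label):
    """Translate an integer label (0..10) into its appliance name."""
    return APPLIANCE_NAMES[int(label)]


print(label_to_name(5))  # Vacuum
```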

Submission Format

The format for the submission file is CSV with a header row (id,appliance). Each row contains an image id (an integer) and the appliance value (an integer) separated by a comma. The image id corresponds to the ids of the images in the test/ folder, i.e. the number before the _c or _v suffix in the filename. The appliance value is the target label, one of {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Note that your appliance labels must be integers.

Note: Except for the actual prediction values in the appliance column, your submission must exactly match the submission_format.csv file provided, including the order of the rows.
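
One way to keep the row order intact is to copy submission_format.csv row by row and replace only the appliance value. The sketch below uses Python's standard csv module with an in-memory stand-in for the format file; the placeholder values in that stand-in and the helper name `build_submission` are assumptions.

```python
import csv
import io


def build_submission(format_file, predictions, out_file):
    """Copy the format file row by row, replacing only the appliance column.

    predictions: dict mapping image id (int) -> predicted label (int, 0..10).
    """
    reader = csv.DictReader(format_file)
    writer = csv.DictWriter(
        out_file, fieldnames=["id", "appliance"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        image_id = int(row["id"])
        label = int(predictions[image_id])  # labels must be integers
        if not 0 <= label <= 10:
            raise ValueError(f"label {label} for id {image_id} is out of range")
        writer.writerow({"id": image_id, "appliance": label})


# In-memory stand-in for submission_format.csv (placeholder values assumed)
format_text = "id,appliance\n1576,0\n1577,0\n1578,0\n"
predictions = {1578: 3, 1576: 5, 1577: 9}  # order here does not matter

out = io.StringIO()
build_submission(io.StringIO(format_text), predictions, out)
submission_text = out.getvalue()
print(submission_text)
```

With real files, the same function could be called with file objects opened using `open(..., newline="")`.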

For example, you could simply guess a "Vacuum" appliance for all of the images by predicting a value of 5 in every row:

id      appliance
1576    5
1577    5
1578    5
1579    5
1580    5

Your .csv file that you submit would look like:

id,appliance
1576,5
1577,5
1578,5
1579,5
1580,5


Performance Metric

To measure your model's performance, we'll use the classification rate (often referred to simply as "accuracy"). This is a quantity in $[0,1]$ where a higher value is better, and is given by the following formula:

$$ \mathrm{classification\ rate}( \hat y, y) = \frac{ N_\mathrm{correct} }{ N_\mathrm{predictions} } $$

In this case, $N_\mathrm{predictions}=384$ because you are being asked to classify 384 rows in the test set.
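
To estimate this score locally, for example on a held-out part of the training data, the formula can be computed directly. This is a minimal sketch using NumPy with made-up labels; the function name `classification_rate` mirrors the formula and is not taken from any library.

```python
import numpy as np


def classification_rate(y_pred, y_true):
    """Fraction of predictions that exactly match the true labels."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    if y_pred.shape != y_true.shape:
        raise ValueError("y_pred and y_true must have the same shape")
    return float(np.mean(y_pred == y_true))


# Made-up labels for illustration: 3 of 4 predictions are correct
y_true = np.array([5, 1, 0, 10])
y_pred = np.array([5, 1, 3, 10])

score = classification_rate(y_pred, y_true)
print(score)  # 0.75
```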

Hint for How to Approach this Problem

One of the most common example problems for teaching introductory AI is the MNIST handwritten digits classification task. It is widely used as the "hello world" of deep learning problems for beginners. See, for example, this one from Keras.

In the MNIST problem you are trying to classify a set of 28x28 black and white images of handwritten digits into 10 label classes (values 0 through 9 which represent their respective numerals).

In this capstone challenge you are trying to classify a set of 128x176 grayscale images into 11 label classes (values 0 through 10 which represent the different appliances).
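
The input preparation used in many MNIST tutorials carries over with only the shapes changed. The sketch below uses random placeholder images and assumes pixel values are stored as 8-bit integers in 0..255, which may not match how your loader returns them; the helper names are hypothetical.

```python
import numpy as np

NUM_CLASSES = 11  # appliance labels 0 through 10


def prepare_images(images):
    """Scale 8-bit images to [0, 1] and add a trailing channel axis."""
    images = np.asarray(images, dtype=np.float32) / 255.0
    return images[..., np.newaxis]


def one_hot(labels, num_classes=NUM_CLASSES):
    """Turn integer labels into one-hot rows of length num_classes."""
    labels = np.asarray(labels, dtype=np.int64)
    return np.eye(num_classes, dtype=np.float32)[labels]


# Random placeholder data standing in for four 128x176 spectrograms
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 128, 176), dtype=np.uint8)
labels = [0, 5, 10, 3]

x = prepare_images(images)
y = one_hot(labels)
print(x.shape, y.shape)  # (4, 128, 176, 1) (4, 11)
```

Compared with MNIST, only the image size (128x176 instead of 28x28) and the number of classes (11 instead of 10) differ.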

Sample current spectrograms by appliance label

See the similarity? This problem is a bit different but the concepts carry over beautifully, so if you're not sure where to start, we recommend finding a good MNIST tutorial explained with the same modeling package that you will be using. Once you gain a deep understanding of that problem, it should be clear how to approach this modeling task.


The Plug Load Appliance Identification Dataset (PLAID), on which this challenge data is based, was originally prepared and released in the following paper:

Jingkun Gao, Suman Giri, Emre Can Kara, and Mario Bergés. 2014. PLAID: a public dataset of high-resoultion electrical appliance measurements for load identification research: demo abstract. In Proceedings of the 1st ACM Conference on Embedded Systems for Energy-Efficient Buildings (BuildSys '14). ACM, New York, NY, USA, 198-199. DOI=http://dx.doi.org/10.1145/2674061.2675032