CIFAR10 with fastai¶

A few years ago I spent some time playing around with CIFAR10, a dataset of 32x32 RGB images in 10 classes, with 50,000 images for training and 10,000 for testing. Back then I had an old CUDA card that didn't support any of the popular machine learning frameworks, so I ended up writing my own CNN code in CUDA from scratch. That took a bit of time, but it was fun. My results were okay: a bit over 80% accuracy on validation and test. In this post I'm revisiting CIFAR10, but this time with fastai. I'm expecting much better results with far less effort!

  1. Load the CIFAR10 dataset
  2. Training and evaluation
  3. Some thoughts

1. Load the CIFAR10 dataset¶

We can use fastai's built-in `untar_data` function to download CIFAR10. It's a convenient one-liner.

In [1]:
from fastai.vision.all import *
from fastai.callback.fp16 import *

path = untar_data(URLs.CIFAR)

2. Training and evaluation¶

I'm going to use a pre-trained resnet18 and do transfer learning. In an attempt to speed things up (since I'm paying by the second on Paperspace!) I'll train in float16, which is done by calling `.to_fp16()` on the learner. ResNet was originally trained on 224x224 images, but CIFAR10 images are much smaller at 32x32. This isn't a blocker, since we can train on any image size, but it may hurt the results. As an experiment I'll train on images resized to 32x32, 64x64, 128x128 and 224x224.

The training parameters are pretty simple. There's no data augmentation or any other tricks. The head is trained for 3 epochs, then the whole network is unfrozen and trained for another 10 epochs.

In [21]:
xs = []
ys = []
best_s = None
best_acc = 0
best_learn = None

for s in [32, 64, 128, 224]:
    dblock = DataBlock(blocks=(ImageBlock(), CategoryBlock()),
                       get_items=get_image_files,
                       get_y=parent_label,
                       item_tfms=Resize(s))

    dls = dblock.dataloaders(path / "train", bs=64)

    learn = vision_learner(dls, models.resnet18, metrics=accuracy).to_fp16()
    learn.fine_tune(10, freeze_epochs=3)
    learn.save(f"cifar10_{s}")
    
    # run on test set
    test_files = get_image_files(path / "test")
    label = TensorCategory([dls.vocab.o2i[parent_label(f)] for f in test_files])

    pred = learn.get_preds(dl=dls.test_dl(test_files))
    acc = accuracy(pred[0], label).item()
    print(f"{s}x{s}, test accuracy={acc}")    
    
    if acc > best_acc:
        best_s = s
        best_acc = acc
        best_learn = learn
        
    xs.append(s)
    ys.append(acc)
    
plt.figure(figsize=(5,5))
plt.plot(xs, ys, 'o-', markersize=10)
plt.xlabel("image size NxN")
plt.ylabel("accuracy")
plt.title("CIFAR10 accuracy on test set vs image size");    

interp = ClassificationInterpretation.from_learner(best_learn)
interp.plot_confusion_matrix(figsize=(5,5))
interp.plot_top_losses(49, figsize=(30,30))
epoch train_loss valid_loss accuracy time
0 3.602131 2.830318 0.108400 00:10
1 2.896018 2.366538 0.178700 00:09
2 2.258265 1.974556 0.308800 00:09
epoch train_loss valid_loss accuracy time
0 2.181820 1.926084 0.323500 00:10
1 2.036602 1.842784 0.355300 00:10
2 1.918048 1.735874 0.394600 00:14
3 1.802964 1.636276 0.430300 00:16
4 1.711634 1.541966 0.454900 00:15
5 1.631888 1.488092 0.474200 00:14
6 1.600144 1.448889 0.488200 00:14
7 1.554854 1.436399 0.491600 00:14
8 1.589388 1.427464 0.495800 00:15
9 1.551987 1.418042 0.499800 00:13
32x32, test accuracy=0.49950000643730164
epoch train_loss valid_loss accuracy time
0 3.375597 2.665712 0.144100 00:12
1 2.577458 1.987854 0.322000 00:13
2 1.713933 1.359496 0.553300 00:14
epoch train_loss valid_loss accuracy time
0 1.576168 1.293082 0.580300 00:16
1 1.454389 1.169415 0.615600 00:12
2 1.313278 1.026540 0.658300 00:13
3 1.170163 0.933572 0.686100 00:16
4 1.099742 0.862677 0.705800 00:14
5 1.028216 0.829689 0.716400 00:12
6 0.999216 0.788660 0.731500 00:14
7 0.960109 0.776576 0.733300 00:15
8 0.981314 0.772111 0.735600 00:16
9 0.975562 0.772309 0.737500 00:11
64x64, test accuracy=0.7325000166893005
epoch train_loss valid_loss accuracy time
0 3.292679 2.478759 0.185500 00:13
1 2.117097 1.446210 0.522800 00:16
2 1.152057 0.796655 0.744800 00:13
epoch train_loss valid_loss accuracy time
0 1.077487 0.745378 0.760000 00:20
1 0.950641 0.663845 0.788200 00:17
2 0.849631 0.575981 0.814800 00:17
3 0.744194 0.514916 0.833000 00:19
4 0.688671 0.478813 0.843700 00:16
5 0.658266 0.460058 0.848100 00:17
6 0.628235 0.445286 0.853700 00:17
7 0.617357 0.435487 0.857300 00:17
8 0.629916 0.430896 0.857900 00:17
9 0.611062 0.438232 0.855300 00:19
128x128, test accuracy=0.8483999967575073
epoch train_loss valid_loss accuracy time
0 3.428684 2.559396 0.155800 00:29
1 2.198105 1.578945 0.462100 00:31
2 1.360745 0.948656 0.692500 00:31
epoch train_loss valid_loss accuracy time
0 1.221270 0.898740 0.709200 00:39
1 1.125785 0.805118 0.740400 00:37
2 0.994369 0.705828 0.769800 00:39
3 0.866615 0.631008 0.794900 00:37
4 0.803173 0.573332 0.813500 00:37
5 0.773740 0.544131 0.822400 00:37
6 0.744626 0.523706 0.831600 00:37
7 0.728357 0.512607 0.832600 00:37
8 0.740440 0.512368 0.832800 00:37
9 0.725927 0.512318 0.834100 00:37
224x224, test accuracy=0.8312000036239624

As the image size increases, so does the test accuracy, which plateaus after 128x128 at around 85%. For comparison, the best published accuracy on CIFAR10 is over 99% according to benchmarks.ai.

3. Some thoughts¶

I got about 85% accuracy without much effort. Adding data augmentation and training for longer should boost it a bit more.