Computer Vision Interpret
vision.interpret is the module that implements custom Interpretation classes for different vision tasks; each class inherits from Interpretation.
class SegmentationInterpretation [source][test]

SegmentationInterpretation(learn:Learner, preds:Tensor, y_true:Tensor, losses:Tensor, ds_type:DatasetType=<DatasetType.Valid: 2>) :: Interpretation

No tests found for SegmentationInterpretation. To contribute a test please refer to this guide and this discussion.

Interpretation methods for segmentation models.
top_losses [source][test]

top_losses(sizes:Tuple, k:int=None, largest=True)

No tests found for top_losses. To contribute a test please refer to this guide and this discussion.

Reduce the flattened per-pixel losses to a single loss value for each image.
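Under the hood the reduction is straightforward; here is a minimal sketch of the idea, assuming the losses arrive flattened with one value per pixel (reduce_top_losses is an illustrative name, not the library internals):

```python
import torch

def reduce_top_losses(flat_losses, sizes, k=None, largest=True):
    # Illustrative sketch: reshape per-pixel losses to (n_images, h*w),
    # average over pixels, then rank images by their mean loss.
    h, w = sizes
    per_image = flat_losses.view(-1, h * w).mean(dim=1)
    k = k if k is not None else per_image.numel()
    return per_image.topk(k, largest=largest)
```

With the small validation set used below, this kind of reduction yields one mean loss per image, which is what the (20,)-shaped tensors further down show.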
_interp_show [source][test]

_interp_show(ims:ImageSegment, classes:Collection[T_co]=None, sz:int=20, cmap='tab20', title_suffix:str=None)

No tests found for _interp_show. To contribute a test please refer to this guide and this discussion.

Show an ImageSegment with color-mapped labels.
show_xyz [source][test]

show_xyz(i, classes:list=None, sz=10)

No tests found for show_xyz. To contribute a test please refer to this guide and this discussion.

Show (image, ground truth, prediction) for item i from self.ds with color mappings, optionally plotting only the given classes.
_generate_confusion [source][test]

_generate_confusion()

No tests found for _generate_confusion. To contribute a test please refer to this guide and this discussion.

Compute average and per-image confusion matrices: for each true label, the intersection with each predicted label, normalized so that every true-label row sums to 1.
_plot_intersect_cm [source][test]

_plot_intersect_cm(cm, title='Intersection with Predict given True')

No tests found for _plot_intersect_cm. To contribute a test please refer to this guide and this discussion.

Plot a confusion matrix (self.mean_cm or self.single_img_cm) generated by _generate_confusion.
Let's show how SegmentationInterpretation can be used once we have trained a segmentation model.
Train
from fastai.vision import *

# Download a small subset of CamVid and set up image/label paths
camvid = untar_data(URLs.CAMVID_TINY)
path_lbl = camvid/'labels'
path_img = camvid/'images'

# Class names, one per segmentation code
codes = np.loadtxt(camvid/'codes.txt', dtype=str)

# Map an image filename to its label mask filename
get_y_fn = lambda x: path_lbl/f'{x.stem}_P{x.suffix}'

data = (SegmentationItemList.from_folder(path_img)
        .split_by_rand_pct()
        .label_from_func(get_y_fn, classes=codes)
        .transform(get_transforms(), tfm_y=True, size=128)
        .databunch(bs=16, path=camvid)
        .normalize(imagenet_stats))
data.show_batch(rows=2, figsize=(7,5))

learn = unet_learner(data, models.resnet18)
learn.fit_one_cycle(3, 1e-2)
learn.save('mini_train')
epoch | train_loss | valid_loss | time |
---|---|---|---|
0 | 10.024513 | 3.442348 | 00:15 |
1 | 6.325253 | 2.343699 | 00:03 |
2 | 4.759998 | 2.108100 | 00:02 |
Warning: The following results will not make much sense with this underperforming model, but the functionality will be easy to explain.
Interpret
interp = SegmentationInterpretation.from_learner(learn)
Since FlattenedLoss of CrossEntropyLoss() is used, we reshape the losses and then take the mean of the pixel losses per image. In order to do so, we need to pass sizes:tuple to top_losses().
top_losses, top_idxs = interp.top_losses(sizes=(128,128))
top_losses, top_idxs
(tensor([3.3195, 3.1692, 2.6574, 2.5976, 2.4910, 2.3759, 2.3710, 2.2064, 2.0871,
2.0834, 2.0479, 1.8645, 1.8412, 1.7956, 1.7013, 1.6126, 1.6015, 1.5470,
1.4495, 1.3423]),
tensor([12, 4, 17, 13, 19, 18, 7, 8, 10, 1, 15, 0, 2, 9, 16, 11, 14, 5,
6, 3]))
Next, we can generate a confusion matrix similar to the one we usually have for classification. Two confusion matrices are generated: mean_cm, which represents the global label performance, and single_img_cm, which represents the same thing for each individual image in the dataset.

Values in the matrix are calculated as:

\begin{align} CM_{ij} &= IOU(\text{Predicted}_j, \text{True}_i \mid \text{True}_i) \end{align}

Or in plain English: the ratio of pixels predicted as label j among the pixels whose true label is i.
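As a rough sketch of that definition (intersect_cm is an illustrative helper, not fastai's internal code), for flattened prediction and target arrays:

```python
import numpy as np

def intersect_cm(pred, true, n_classes):
    # Illustrative: cm[i, j] is the fraction of pixels with true label i
    # that were predicted as label j, so each present row sums to 1.
    cm = np.full((n_classes, n_classes), np.nan)
    for i in np.unique(true):          # only labels present in the targets
        mask = (true == i)
        for j in range(n_classes):
            cm[i, j] = ((pred == j) & mask).sum() / mask.sum()
    return cm
```

Rows for labels that never appear stay NaN, which is exactly the behavior visible in the score tables below.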
learn.data.classes
array(['Animal', 'Archway', 'Bicyclist', 'Bridge', 'Building', 'Car', 'CartLuggagePram', 'Child', 'Column_Pole',
'Fence', 'LaneMkgsDriv', 'LaneMkgsNonDriv', 'Misc_Text', 'MotorcycleScooter', 'OtherMoving', 'ParkingBlock',
'Pedestrian', 'Road', 'RoadShoulder', 'Sidewalk', 'SignSymbol', 'Sky', 'SUVPickupTruck', 'TrafficCone',
'TrafficLight', 'Train', 'Tree', 'Truck_Bus', 'Tunnel', 'VegetationMisc', 'Void', 'Wall'], dtype='<U17')
mean_cm, single_img_cm = interp._generate_confusion()
mean_cm.shape, single_img_cm.shape
((32, 32), (20, 32, 32))
_plot_intersect_cm first displays a dataframe showing the per-class score using the IOU definition we made earlier. These are the diagonal values from the confusion matrix, which is plotted afterwards. NaN values indicate that these labels were not present in our dataset, in this case the validation set. As you can imagine, this can also help you construct a more representative validation set.
df = interp._plot_intersect_cm(mean_cm, "Mean of Ratio of Intersection given True Label")
label | score |
---|---|
Sky | 0.851616 |
Road | 0.793361 |
Building | 0.274023 |
Tree | 0.00469498 |
Void | 6.70092e-05 |
Animal | 0 |
Pedestrian | 0 |
VegetationMisc | 0 |
Truck_Bus | 0 |
TrafficLight | 0 |
SUVPickupTruck | 0 |
SignSymbol | 0 |
Sidewalk | 0 |
ParkingBlock | 0 |
Archway | 0 |
OtherMoving | 0 |
Misc_Text | 0 |
LaneMkgsDriv | 0 |
Fence | 0 |
Column_Pole | 0 |
Child | 0 |
CartLuggagePram | 0 |
Car | 0 |
Bicyclist | 0 |
Wall | 0 |
Bridge | NaN |
LaneMkgsNonDriv | NaN |
MotorcycleScooter | NaN |
RoadShoulder | NaN |
TrafficCone | NaN |
Train | NaN |
Tunnel | NaN |
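The scores in this dataframe are just the diagonal of mean_cm; a rough equivalent of what _plot_intersect_cm tabulates (assuming mean_cm is a numpy array aligned with learn.data.classes) would be:

```python
import numpy as np
import pandas as pd

# Per-class score = diagonal of the mean confusion matrix
scores = (pd.DataFrame({'label': learn.data.classes, 'score': np.diag(mean_cm)})
            .sort_values('score', ascending=False))
```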
Next, let's look at the single worst prediction in our dataset. It looks like this dummy model just predicts everything as Road :)
i = top_idxs[0]
df = interp._plot_intersect_cm(single_img_cm[i], f"Ratio of Intersection given True Label, Image:{i}")
label | score |
---|---|
Road | 0.999367 |
Sky | 0.405882 |
Building | 0.0479275 |
Tree | 0.00365813 |
Bicyclist | 0 |
Void | 0 |
TrafficLight | 0 |
SUVPickupTruck | 0 |
Sidewalk | 0 |
Pedestrian | 0 |
OtherMoving | 0 |
Misc_Text | 0 |
LaneMkgsDriv | 0 |
Column_Pole | 0 |
CartLuggagePram | 0 |
Car | 0 |
Wall | 0 |
Animal | NaN |
Archway | NaN |
Bridge | NaN |
Child | NaN |
Fence | NaN |
LaneMkgsNonDriv | NaN |
MotorcycleScooter | NaN |
ParkingBlock | NaN |
RoadShoulder | NaN |
SignSymbol | NaN |
TrafficCone | NaN |
Train | NaN |
Truck_Bus | NaN |
Tunnel | NaN |
VegetationMisc | NaN |
Finally, we will visually inspect this single prediction:
interp.show_xyz(i, sz=15)
Warning: With matplotlib colormaps, the maximum number of unique qualitative colors is 20. So if len(classes) > 20, nearby class indexes may be plotted with the same color. Let's fix this together :)
{'Animal': 0,
'Archway': 1,
'Bicyclist': 2,
'Bridge': 3,
'Building': 4,
'Car': 5,
'CartLuggagePram': 6,
'Child': 7,
'Column_Pole': 8,
'Fence': 9,
'LaneMkgsDriv': 10,
'LaneMkgsNonDriv': 11,
'Misc_Text': 12,
'MotorcycleScooter': 13,
'OtherMoving': 14,
'ParkingBlock': 15,
'Pedestrian': 16,
'Road': 17,
'RoadShoulder': 18,
'Sidewalk': 19,
'SignSymbol': 20,
'Sky': 21,
'SUVPickupTruck': 22,
'TrafficCone': 23,
'TrafficLight': 24,
'Train': 25,
'Tree': 26,
'Truck_Bus': 27,
'Tunnel': 28,
'VegetationMisc': 29,
'Void': 30,
'Wall': 31}
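One possible fix for the 20-color limit flagged in the warning above is to stack several qualitative colormaps into a single ListedColormap (a sketch, not part of the library):

```python
import numpy as np
from matplotlib import cm
from matplotlib.colors import ListedColormap

# tab20 + tab20b + tab20c gives up to 60 distinct qualitative colors
colors = np.vstack([cm.tab20.colors, cm.tab20b.colors, cm.tab20c.colors])
big_cmap = ListedColormap(colors[:len(learn.data.classes)])
```

Matplotlib accepts a Colormap object wherever a cmap name is accepted, so big_cmap should be usable in place of the default 'tab20'.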
class ObjectDetectionInterpretation [source][test]

ObjectDetectionInterpretation(learn:Learner, preds:Tensor, y_true:Tensor, losses:Tensor, ds_type:DatasetType=<DatasetType.Valid: 2>) :: Interpretation

No tests found for ObjectDetectionInterpretation. To contribute a test please refer to this guide and this discussion.

Interpretation methods for object detection models.
Warning: ObjectDetectionInterpretation is not implemented yet. Feel free to implement it :)
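If you would like to take that up, a bare-bones starting point might look like the sketch below; the method set mirrors SegmentationInterpretation and is purely an assumption, not fastai's design:

```python
from fastai.vision import *

class MyObjectDetectionInterpretation(Interpretation):
    # Hypothetical skeleton, not fastai's implementation
    def top_losses(self, k=None, largest=True):
        # Rank images by their total detection loss (box + class terms)
        raise NotImplementedError

    def show_xyz(self, i, sz=10):
        # Show image i with ground-truth and predicted boxes overlaid
        raise NotImplementedError
```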