What does backbone mean in a neural network?

Question

I am getting confused with the meaning of "backbone" in neural networks, especially in the DeepLabv3+ paper. I did some research and found out that backbone could mean

the feature extraction part of a network

DeepLabv3+ uses Xception and ResNet-101 as its backbone. However, I am not familiar with the entire structure of DeepLabv3+. Which part does the backbone refer to, and which parts remain the same?

A generalized description or definition of backbone would also be appreciated.

I think it is just a concept used in the paper https://arxiv.org/pdf/1703.06870.pdf. It is nothing special, just the first block in their figure. – Peyman Jan 22 '20 at 21:57

score 39 · Accepted Answer · edited Feb 09 '21 at 14:25

In my understanding, the "backbone" refers to the feature extracting network which is used within the DeepLab architecture. This feature extractor is used to encode the network's input into a certain feature representation. The DeepLab framework "wraps" functionalities around this feature extractor. By doing so, the feature extractor can be exchanged and a model can be chosen to fit the task at hand in terms of accuracy, efficiency, etc.

In the case of DeepLab, the term backbone might refer to models like ResNet, Xception, MobileNet, etc.
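To make the "exchangeable feature extractor" idea concrete, here is a minimal plain-Python sketch. It is not DeepLab code: `tiny_backbone`, `big_backbone`, and `SegmentationModel` are hypothetical names invented for illustration, and lists of floats stand in for real image tensors. The only point is that the wrapper stays the same while the backbone is swapped.

```python
from typing import Callable, List

# A "feature map" here is just a list of numbers; real backbones produce tensors.
Features = List[float]
Backbone = Callable[[List[float]], Features]


def tiny_backbone(image: List[float]) -> Features:
    """Hypothetical lightweight extractor: averages neighbouring pixel pairs."""
    return [(image[i] + image[i + 1]) / 2 for i in range(0, len(image) - 1, 2)]


def big_backbone(image: List[float]) -> Features:
    """Hypothetical heavier extractor: keeps full resolution, squares each pixel."""
    return [p * p for p in image]


class SegmentationModel:
    """Wraps task-specific logic (the 'head') around an exchangeable backbone."""

    def __init__(self, backbone: Backbone, threshold: float = 0.5) -> None:
        self.backbone = backbone
        self.threshold = threshold

    def predict(self, image: List[float]) -> List[int]:
        features = self.backbone(image)  # encode the input into features
        # Head: turn each feature into a binary label.
        return [int(f > self.threshold) for f in features]


image = [0.2, 0.6, 0.9, 0.9]
fast_model = SegmentationModel(tiny_backbone)      # cheaper, coarser output
accurate_model = SegmentationModel(big_backbone)   # costlier, finer output
print(fast_model.predict(image))
print(accurate_model.predict(image))
```

Both models share the same `predict` logic; only the backbone passed to the constructor differs, which is the trade-off between accuracy and efficiency described above.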

score 28 · Answer 2 · edited Jun 26 '20 at 07:46

TL;DR Backbone is not a universal technical term in deep learning.

(Disclaimer: yes, there may be a specific kind of method, layer, tool etc. that is called "backbone", but there is no "backbone of a neural network" in general.)

If authors use the word "backbone" when describing a neural network architecture, they mean one of two things:

feature extraction (the part of the network that "sees" the input). This interpretation is not universal in the field: in my opinion, computer vision researchers would use the term to mean feature extraction, whereas natural language processing researchers would not.
informally, that the part in question is crucial to the overall method.

edited Jun 26 '20 at 07:46

answered Jan 28 '20 at 09:57

Mathias Müller


https://journals.sagepub.com/doi/full/10.1177/00368504211011343, https://arxiv.org/pdf/1904.01169.pdf – ladofa Dec 09 '21 at 03:08

@ladofa Why did you add references to these papers? – Mathias Müller Dec 09 '21 at 08:47
I misunderstood your post, and I added only the references without a comment because of my poor English. Sorry, just forget it ^^ – ladofa Dec 17 '21 at 09:09

score 18 · Answer 3 · answered Mar 31 '20 at 10:31

Backbone is a term used in DeepLab models/papers to refer to the feature extractor network. The feature extractor computes features from the input image, and a simple decoder module in the DeepLab model then upsamples these features to generate segmentation masks. The authors of the DeepLab models have reported results with different feature extractors (backbones) such as MobileNet, ResNet, and Xception.
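The flow described above (backbone reduces resolution, decoder upsamples back to a mask) can be sketched in plain Python. This is only an illustration under strong assumptions: average pooling stands in for a real backbone, nearest-neighbour upsampling plus a threshold stands in for the decoder, and the function names and the `stride` and `threshold` parameters are made up for this example. The input height and width are assumed to be divisible by `stride`.

```python
from typing import List

Grid = List[List[float]]


def backbone(image: Grid, stride: int = 2) -> Grid:
    """Stand-in feature extractor: average-pools stride x stride blocks."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[r + dr][c + dc] for dr in range(stride) for dc in range(stride))
            / (stride * stride)
            for c in range(0, w, stride)
        ]
        for r in range(0, h, stride)
    ]


def decoder(features: Grid, stride: int = 2, threshold: float = 0.5) -> List[List[int]]:
    """Stand-in decoder: nearest-neighbour upsampling, then a per-pixel threshold."""
    mask: List[List[int]] = []
    for row in features:
        # Repeat each feature `stride` times horizontally...
        up_row = [int(v > threshold) for v in row for _ in range(stride)]
        # ...and repeat the resulting row `stride` times vertically.
        mask.extend([list(up_row) for _ in range(stride)])
    return mask


image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
features = backbone(image)  # 2 x 2 feature map (lower resolution than the input)
mask = decoder(features)    # 4 x 4 mask (same resolution as the input)
print(len(features), len(features[0]))
print(len(mask), len(mask[0]))
```

Swapping in a different `backbone` function would change the features but leave the decoder untouched, which mirrors how the DeepLab papers compare several backbones.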

score 6 · Answer 4 · answered Jun 23 '21 at 10:06

CNNs are used for extracting features. Several CNNs are available, for instance AlexNet, VGGNet, and ResNet, and these are the networks referred to as backbones. They are mainly used for object classification tasks and have been evaluated on widely used benchmarks and datasets such as ImageNet. In image classification or image recognition, the classifier classifies a single object in the image, outputs a single category per image, and gives the probability of matching a class. In object detection, by contrast, the model must recognize several objects in a single image and provide the coordinates that identify the location of each object. This shows that detecting objects can be more difficult than classifying images.

source and more info: https://link.springer.com/chapter/10.1007/978-3-030-51935-3_30
