
「Strengthening the Foundations」A Tour of Common Convolutional Neural Networks (Paper Walkthroughs + Code Implementations)

toyiye 2024-06-06 22:12

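"Strengthen the foundations to go steady and far." Scientific research depends on solid theory, and artificial intelligence in particular relies on mathematics, physics, neuroscience, and other basic disciplines. To keep pace with the field, we have launched the "Strengthening the Foundations" (强基固本) column, which explains the fundamentals of AI to support your research and study, consolidate your theoretical grounding, and build your capacity for original innovation. Stay tuned.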


Past articles: 人工智能前沿学生论坛 (AI Frontier Student Forum)

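This article introduces three classic convolutional neural networks (CNNs): ResNet, MobileNet, and EfficientNet. Each is a milestone in the development and application of CNNs.

ResNet introduces shortcut connections so that the layers optimize a residual rather than the full mapping, which makes it possible to build much deeper networks. In the CNN world it is like the battleship of the First World War era, which greatly increased naval firepower.

MobileNet uses depthwise separable convolution (written "Deepwise Separable Convolution" in this article) to convolve each input channel independently, cutting the computation by a factor of 8 to 9 at the cost of a small drop in accuracy. Its value to CNNs is like that of the cruiser in early modern naval warfare, or the modern destroyer.

EfficientNet searches for a compound scaling strategy with shared parameters, so that the three dimensions of model scaling (depth, width, resolution) are optimized jointly rather than one at a time. Its standing among CNNs is like that of the aircraft carrier since the Second World War.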

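Two notes:

1. The depth in EfficientNet and the "depth" in MobileNet's depthwise separable convolution are different concepts. The former is the number of layers in the network; the latter refers to the channels of the input feature map, which is what EfficientNet calls width. Keep the two apart! (Attentive readers may notice that I write "Deepwise" for MobileNet's convolution instead of the standard term "depthwise". This is deliberate, to keep the two notions of depth distinct.)

2. Because EfficientNet inherits the strengths of earlier models and adds a powerful compound scaling strategy, some practitioners call it the strongest model to date. Setting aside comparisons of strengths and weaknesses, my view is one of model neutrality: in a sense there is no strongest model, only the model that suits your research problem.

I have commissioned 维权骑士 (rightknights.com) to protect the copyright of this original article.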


01

ResNet

ResNet comes from Kaiming He's team at Microsoft Research and won ILSVRC (ImageNet Large Scale Visual Recognition Challenge) 2015. The paper is:

https://arxiv.org/pdf/1512.03385.pdf

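1. Highlights

1.1 It demonstrates the degradation problem of conventional (plain) CNNs: as the network gets deeper, training accuracy saturates and then degrades. The paper notes that this is not caused by overfitting, since the training error itself rises, and argues that vanishing gradients are unlikely to be the cause, since the plain networks are trained with batch normalization.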

Degradation

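1.2 It introduces shortcut connections and the notion of a residual: with a shortcut, the layers fit the residual instead of the desired output. The residual is easier to learn, and the shortcut gives gradients a direct path during backpropagation, so much deeper networks with greater representational power can be built.

As shown on the left of the figure below, let H(x) be the mapping that a stack of convolution layers is expected to produce for input x. A plain CNN asks the layers to fit H(x) directly; when the best mapping is close to the identity, the layers must learn an output close to x through a stack of nonlinear layers, which is hard to optimize. ResNet adds a shortcut that sums the input x and the layer output F(x), so the block output is H(x) = F(x) + x and the layers only need to fit the residual F(x) = H(x) - x. Making H(x) close to x then just means driving F(x) toward 0, i.e. the optimization target changes from the desired output to the residual. This gives the basic ResNet building block shown on the right, the residual block (its three-layer form is called the bottleneck).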

shortcut and residual (left) + Residual/Bottleneck (right)
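To make the residual formulation concrete, here is a minimal pure-Python sketch (not the article's PyTorch code). The helper names `linear` and `residual_block` are hypothetical, and a single linear map stands in for the stacked convolution layers. It only illustrates that when the residual branch F outputs zero, the block reduces to the identity mapping H(x) = x.

```python
def linear(weights, x):
    """Toy stand-in for the stacked conv layers: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def residual_block(weights, x):
    """Block output H(x) = F(x) + x, where F is the residual branch."""
    f = linear(weights, x)
    return [fi + xi for fi, xi in zip(f, x)]


x = [1.0, -2.0, 3.0]
zero_weights = [[0.0] * 3 for _ in range(3)]

# With F(x) = 0 the block is exactly the identity mapping.
identity_out = residual_block(zero_weights, x)

# The same weights without the shortcut output all zeros instead.
plain_out = linear(zero_weights, x)
```

In other words, a residual block can represent "do nothing" simply by pushing its weights toward zero, whereas a plain stack would have to learn the identity explicitly.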

1.3 Network architecture:

Each { } in the figure is one Bottleneck. The architecture is: Conv -> MaxPool -> Bottleneck x n -> AvgPool -> Fc -> softmax.

The structure of ResNet
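The pipeline above can be traced with plain arithmetic. The sketch below is a hypothetical helper, not part of the article's code; the kernel sizes, strides, paddings, and channel counts are copied from the ResNet50 code in section 3, and a 224x224 input is assumed.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a conv/pool layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet50_trace(size=224):
    """Return (stage name, channels, spatial size) after each stage."""
    trace = []
    size = conv_out(size, kernel=7, stride=2, padding=3)   # 7x7 stem conv
    trace.append(("conv", 64, size))
    size = conv_out(size, kernel=3, stride=2, padding=1)   # 3x3 max pool
    trace.append(("maxpool", 64, size))
    # (places, stride of the first bottleneck) for layer1..layer4
    stages = [(64, 1), (128, 2), (256, 2), (512, 2)]
    expansion = 4
    for i, (places, stride) in enumerate(stages, start=1):
        # Only the first bottleneck of a stage is strided (on its 3x3 conv).
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        trace.append((f"layer{i}", places * expansion, size))
    return trace


trace = resnet50_trace(224)
```

The last entry is 2048 channels at 7x7, which is consistent with the `AvgPool2d(7)` and `Linear(2048, num_classes)` layers in the code.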

2. Reflection

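2.1 The residual block is 1x1 Conv (reduce channels) -> 3x3 Conv -> 1x1 Conv (expand channels), and the whole network is a stack of this block. Because of the expansion factor (`expansion` in the code below), the channel count of the feature map keeps growing from stage to stage. This makes better use of the convolutions, and the wider features lose less information to the ReLU activation.

2.2 The shortcut itself is cheap and does not change dimensions, but stacking many residual blocks with ever more channels demands a great deal of compute. This is ResNet's main drawback.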
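To put rough numbers on the 1x1 -> 3x3 -> 1x1 structure, the sketch below counts convolution weights only (BatchNorm parameters are ignored). The function names and the plain two-layer 3x3 baseline are assumptions made for illustration.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weight count of a bias-free 2D convolution."""
    return in_ch * out_ch * kernel * kernel


def bottleneck_params(in_ch, places, expansion=4):
    """1x1 reduce -> 3x3 -> 1x1 expand, as in the Bottleneck class."""
    return (conv_params(in_ch, places, 1)
            + conv_params(places, places, 3)
            + conv_params(places, places * expansion, 1))


# A bottleneck operating on 256 channels with places=64 ...
bottleneck = bottleneck_params(256, 64)
# ... versus two plain 3x3 convs kept at 256 channels (hypothetical baseline).
plain = 2 * conv_params(256, 256, 3)
```

Each bottleneck is far cheaper than the plain baseline, so the heavy cost noted in 2.2 comes from how many blocks are stacked and how wide the later stages become.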

3. PyTorch code of ResNet50

import torch as t
import torch.nn as nn


def Conv1(in_planes, places, stride=2):
  return nn.Sequential(
    nn.Conv2d(in_channels=in_planes,out_channels=places,kernel_size=7,stride=stride,padding=3, bias=False),
    nn.BatchNorm2d(places),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
  )


class Bottleneck(nn.Module):
  def __init__(self,in_places,places, stride=1,downsampling=False, expansion = 4):
    super(Bottleneck,self).__init__()
    self.expansion = expansion
    self.downsampling = downsampling


    self.bottleneck = nn.Sequential(
      nn.Conv2d(in_channels=in_places,out_channels=places,kernel_size=1,stride=1, bias=False),
      nn.BatchNorm2d(places),
      nn.ReLU(inplace=True),
      nn.Conv2d(in_channels=places, out_channels=places, kernel_size=3, stride=stride, padding=1, bias=False),
      nn.BatchNorm2d(places),
      nn.ReLU(inplace=True),
      nn.Conv2d(in_channels=places, out_channels=places*self.expansion, kernel_size=1, stride=1, bias=False),
      nn.BatchNorm2d(places*self.expansion),
    )


    if self.downsampling:
      self.downsample = nn.Sequential(
        nn.Conv2d(in_channels=in_places, out_channels=places*self.expansion, kernel_size=1, stride=stride, bias=False),
        nn.BatchNorm2d(places*self.expansion)
      )
    self.relu = nn.ReLU(inplace=True)
  def forward(self, x):
    residual = x
    out = self.bottleneck(x)


    if self.downsampling:
      residual = self.downsample(x)


    out += residual
    out = self.relu(out)
    return out


class ResNet(nn.Module):
  def __init__(self,blocks, num_classes=1000, expansion = 4):
    super(ResNet,self).__init__()
    self.expansion = expansion


    self.conv1 = Conv1(in_planes = 3, places= 64)


    self.layer1 = self.make_layer(in_places = 64, places= 64, block=blocks[0], stride=1)
    self.layer2 = self.make_layer(in_places = 256,places=128, block=blocks[1], stride=2)
    self.layer3 = self.make_layer(in_places=512,places=256, block=blocks[2], stride=2)
    self.layer4 = self.make_layer(in_places=1024,places=512, block=blocks[3], stride=2)


    self.avgpool = nn.AvgPool2d(7, stride=1)
    self.fc = nn.Linear(2048,num_classes)


    for m in self.modules():
      if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
      elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)


  def make_layer(self, in_places, places, block, stride):
    layers = []
    layers.append(Bottleneck(in_places, places,stride, downsampling =True))
    for i in range(1, block):
      layers.append(Bottleneck(places*self.expansion, places))


    return nn.Sequential(*layers)


  def forward(self, x):
    x = self.conv1(x)


    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)


    x = self.avgpool(x)
    x = x.view(x.size(0), -1)
    x = self.fc(x)
    return x


def ResNet50():
  return ResNet([3, 4, 6, 3])  # number of Bottleneck blocks in each layer of ResNet
net = ResNet50()


02

MobileNet

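MobileNet was proposed by Google in 2017 as a lightweight CNN for mobile and embedded devices. The inverted-residual design described here is that of MobileNetV2. The paper is arxiv.org/pdf/1801.0438.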

1. Highlights

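1.1 It also uses shortcut connections and residual learning.

1.2 It introduces depthwise separable convolution ("Deepwise Separable Convolution" in this article). Note that depth here means the channels of the input feature map, not the number of layers. The number of kernels equals the number of input channels and each kernel has a single channel, so every channel of the feature map is convolved independently. Convolving the channels independently cuts the computation by a factor of 8 to 9 at the cost of a small drop in accuracy (see the Reflection section for the calculation).

Deepwise separable convolution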
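The "one single-channel kernel per input channel" idea can be shown with a naive pure-Python implementation. This is an illustrative sketch with hypothetical function names, using no padding and stride 1; real code would use `nn.Conv2d` with `groups` equal to the channel count, as the MobileNetV2 code below does.

```python
def conv2d_single(channel, kernel):
    """Valid (no padding), stride-1 2D correlation of one channel with one kernel."""
    k = len(kernel)
    h, w = len(channel), len(channel[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            row.append(sum(channel[i + a][j + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out


def depthwise_conv2d(x, kernels):
    """Apply kernels[c] to channel x[c] only; channels are never mixed."""
    assert len(x) == len(kernels), "one single-channel kernel per input channel"
    return [conv2d_single(ch, k) for ch, k in zip(x, kernels)]


ones = [[1] * 4 for _ in range(4)]
twos = [[2] * 4 for _ in range(4)]
x = [ones, twos]                            # 2 channels, 4x4 each
kernels = [[[1] * 3 for _ in range(3)],     # sums a 3x3 window
           [[0] * 3 for _ in range(3)]]     # zeroes its own channel
y = depthwise_conv2d(x, kernels)
```

The second kernel wipes out channel 1 without affecting channel 0, which is exactly the independence the text describes; mixing across channels is left to the pointwise (1x1) convolution.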

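1.3 It introduces the inverted residual (InvertedResidual) as the basic building block: 1x1 Conv (expand channels) -> 3x3 Conv -> 1x1 Conv (reduce channels). The final 1x1 Conv is followed by a linear activation, i.e. no nonlinearity, to reduce information loss.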

InvertedResidual

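As shown in the figure below, this structure is the reverse of ResNet's residual block. Expanding the channels first lets the depthwise convolution work on a wider feature map. The 3x3 Conv in the middle is the DeepwiseConv, and the 1x1 Convs at both ends are PointwiseConvs, i.e. regular 1x1 convolutions in which every kernel spans all input channels.

Comparison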
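The contrast between the two blocks is easiest to see as a sequence of channel counts. The sketch below uses hypothetical helper names; the example sizes (256/64 for ResNet, 24 channels with expansion 6 for MobileNetV2) are taken from the configurations in this article's code.

```python
def bottleneck_channels(in_ch, places, expansion=4):
    """ResNet bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand."""
    return [in_ch, places, places, places * expansion]


def inverted_residual_channels(inp, oup, expansion):
    """MobileNetV2 block: 1x1 expand -> 3x3 depthwise -> 1x1 linear project."""
    hidden = inp * expansion
    return [inp, hidden, hidden, oup]


def uses_shortcut(inp, oup, stride):
    """The shortcut is only added when input and output shapes match."""
    return stride == 1 and inp == oup


resnet = bottleneck_channels(256, 64)            # wide -> narrow -> wide
mobile = inverted_residual_channels(24, 24, 6)   # narrow -> wide -> narrow
```

`uses_shortcut` mirrors the `use_res_connect` condition in the `InvertedResidual` class: blocks that change resolution or channel count have no skip connection.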

1.4 Network architecture:

Each bottleneck in the figure is one InvertedResidual. The architecture is: Conv -> InvertedResidual x n -> Conv -> AvgPool -> Conv.

The structure of MobileNet
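As a sanity check on this architecture, the sketch below traces the spatial size of a 224x224 input through the stages. The stage table is copied from the `MobileNet` class and `MobileNetV2()` in the code section; the helper itself is hypothetical, and it assumes that only the first block of each stage applies the stage stride, as in the MobileNetV2 paper.

```python
import math

# (repeats, out_channels, stride of the first block) per stage
STAGES = [(1, 16, 1), (2, 24, 2), (3, 32, 2), (4, 64, 2),
          (3, 96, 1), (3, 160, 2), (1, 320, 1)]


def mobilenetv2_trace(size=224):
    """Spatial size after the head conv and after each stage."""
    size = math.ceil(size / 2)          # 3x3 head conv, stride 2, padding 1
    sizes = [size]
    for repeats, _, stride in STAGES:
        for block in range(repeats):
            # Assumption: only the first block of a stage uses the stage stride.
            s = stride if block == 0 else 1
            size = math.ceil(size / s)  # 3x3 conv with padding 1
        sizes.append(size)
    return sizes


sizes = mobilenetv2_trace(224)
```

Under that assumption the feature map reaches 7x7 before the final pooling, which is the size the `AvgPool2d(7)` in the code expects.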

2. Reflection

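Computation of DeepwiseConv + PointwiseConv compared with a standard convolution:

Let $M$ be the number of input channels, $N$ the number of kernels in the second (pointwise) convolution, $D_F$ the width and height of the input feature map (assumed square), and $D_K$ the side length of the first (depthwise) kernel. Then:

  • $T_{DC+PC} = D_F \times D_F \times M \times D_K \times D_K + D_F \times D_F \times M \times N$
  • $T_{conv} = D_F \times D_F \times M \times N \times D_K \times D_K$

$T_{DC+PC} / T_{conv} = 1/N + 1/D_K^2 = 1/N + 1/9$ for $D_K = 3$

So for a 3x3 kernel and a large $N$, a standard convolution costs 8 to 9 times as much as DeepwiseConv + PointwiseConv.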

Deepwise Conv + Pointwise Conv
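The ratio can be checked numerically. The sketch below simply evaluates the two cost formulas from this section; the function names and the example layer (14x14 feature map, 512 input and output channels, 3x3 kernel) are hypothetical choices for illustration.

```python
def standard_conv_cost(df, m, n, dk):
    """Multiply-adds of a standard DK x DK conv: DF*DF*M*N*DK*DK."""
    return df * df * m * n * dk * dk


def separable_conv_cost(df, m, n, dk):
    """Depthwise (DF*DF*M*DK*DK) plus pointwise (DF*DF*M*N) multiply-adds."""
    return df * df * m * dk * dk + df * df * m * n


def cost_ratio(df, m, n, dk):
    """Separable cost divided by standard cost; equals 1/N + 1/DK**2."""
    return separable_conv_cost(df, m, n, dk) / standard_conv_cost(df, m, n, dk)


# Hypothetical layer: 14x14 feature map, 512 -> 512 channels, 3x3 kernel.
ratio = cost_ratio(df=14, m=512, n=512, dk=3)
speedup = 1 / ratio
```

For this layer the speedup falls between 8 and 9, and it approaches 9 as N grows, since the 1/N term vanishes.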

3. PyTorch code of MobileNetV2

import torch as t
import torch.nn as nn


class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, stride, expansion, downsample=None):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]
        self.use_res_connect = self.stride == 1 and inp == oup
        self.conv = nn.Sequential(
            # pw
            nn.Conv2d(inp, inp * expansion, 1, 1, 0, bias=False),
            nn.BatchNorm2d(inp * expansion),
            nn.ReLU6(inplace=True),
            # dw
            nn.Conv2d(inp * expansion, inp * expansion, 3, stride, 1, groups=inp * expansion, bias=False),
            nn.BatchNorm2d(inp * expansion),
            nn.ReLU6(inplace=True),
            # pw-linear
            nn.Conv2d(inp * expansion, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        )


    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


class MobileNet(nn.Module):
    def __init__(self, blocks, num_classes=1000,inchannels=32):
        super(MobileNet, self).__init__()
        self.inchannels = inchannels
        self.head_conv = nn.Sequential(nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False),
                                        nn.BatchNorm2d(32),
                                        nn.ReLU())


        self.layer1 = self.make_layer(blocks[0],outchannels=16,stride=1,expansion=1)
        self.layer2 = self.make_layer(blocks[1],outchannels=24,stride=2,expansion=6)
        self.layer3 = self.make_layer(blocks[2],outchannels=32,stride=2,expansion=6)
        self.layer4 = self.make_layer(blocks[3],outchannels=64,stride=2,expansion=6)
        self.layer5 = self.make_layer(blocks[4],outchannels=96,stride=1,expansion=6)
        self.layer6 = self.make_layer(blocks[5],outchannels=160,stride=2,expansion=6)
        self.layer7 =self.make_layer(blocks[6],outchannels=320,stride=1,expansion=6)


        self.end_conv_avgpool_conv = nn.Sequential(nn.Conv2d(320, 1280, kernel_size=1, stride=1, bias=False),
                                                    nn.AvgPool2d(7,stride=1),
                                                    nn.Conv2d(1280,num_classes,kernel_size=1,stride=1))


    def make_layer(self, block, outchannels, stride, expansion):
        layers = []
        # Only the first block of a stage applies the stage stride and changes the channel count.
        layers.append(InvertedResidual(self.inchannels, outchannels, stride=stride, expansion=expansion))
        self.inchannels = outchannels
        for i in range(1, block):
            # The remaining blocks use stride=1 so the resolution is kept and the shortcut applies;
            # repeating the stage stride would shrink a 224x224 input below the 7x7 the final AvgPool2d needs.
            layers.append(InvertedResidual(self.inchannels, outchannels, stride=1, expansion=expansion))
        return nn.Sequential(*layers)


    def forward(self, x):
        x = self.head_conv(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.layer5(x)
        x = self.layer6(x)
        x = self.layer7(x)
        x = self.end_conv_avgpool_conv(x)
        x = x.view(x.size(0), -1)
        return x


def MobileNetV2():
    return MobileNet([1,2,3,4,3,3,1])
net=MobileNetV2()


03

EfficientNet

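EfficientNet was proposed by Google in 2019. It is a CNN that searches for a compound scaling strategy with shared parameters so that the three dimensions of model scaling (depth, width, resolution) are optimized jointly. The paper is arxiv.org/pdf/1905.1194.

Depth: the depth here is not the same concept as the "deepwise" in MobileNet's InvertedResidual. The former is the number of layers in the network; the latter refers to the channels of the input feature map, i.e. the width here. Keep them apart.

Width: the number of channels of the input feature map

Resolution: the resolution of the input feature map, i.e. height x width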

model scaling
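A compound scaling rule of this kind can be sketched in a few lines. The constants below (1.2 for depth, 1.1 for width, 1.15 for resolution) are the base values reported in the EfficientNet paper for its baseline network; they do not appear in this article, so treat them, and the helper names, as assumptions.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases (assumed)


def compound_scale(phi):
    """Multipliers for depth, width and resolution at compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """FLOPs grow roughly with depth * width^2 * resolution^2."""
    d, w, r = compound_scale(phi)
    return d * w ** 2 * r ** 2


d1, w1, r1 = compound_scale(1)
growth = flops_multiplier(1)   # close to 2: FLOPs roughly double per unit of phi
```

A single coefficient phi therefore controls all three dimensions at once, instead of tuning depth, width, and resolution separately.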

1. Highlights

1. It also uses shortcut connections and residual learning.

2. The basic building block is MBConv, a combination of MobileNetV2's InvertedResidual with DeepwiseConv and SENet's Squeeze-and-Excitation. The detailed structure is shown below:

MBConv

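  • Deepwise Conv: H', W', C are the height, width, and channel count of the input feature map; after the DeepwiseConv the shape becomes H, W, C
  • Fsq is the squeeze step: a global average pooling layer turns the H x W x C input into a 1 x 1 x C output
  • Fex is the excitation step: 1x1 Conv -> swish -> 1x1 Conv -> sigmoid; the shape is still 1 x 1 x C
  • Fscale multiplies the squeeze-and-excitation input by its output, giving an MBConv output of shape H x W x C, which is followed by Conv -> Batch Normalization

Note: compared with the SE block in SENet, the SE step in MBConv fixes the global pooling to global average pooling and replaces Fc -> ReLU -> Fc with 1x1 Conv -> swish -> 1x1 Conv to fit the architecture.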
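The three SE steps can be written out in pure Python on nested lists. This is a simplified sketch with hypothetical names: biases are omitted, and each 1x1 convolution on a 1 x 1 x C tensor is written as a matrix-vector product.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def swish(v):
    return v * sigmoid(v)


def matvec(weights, vec):
    """A 1x1 conv applied to a 1 x 1 x C tensor is a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def squeeze_excite(x, w_reduce, w_expand):
    """x: list of C channels, each an H x W nested list."""
    # Fsq: global average pooling, H x W x C -> 1 x 1 x C
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Fex: 1x1 conv -> swish -> 1x1 conv -> sigmoid
    hidden = [swish(v) for v in matvec(w_reduce, squeezed)]
    gates = [sigmoid(v) for v in matvec(w_expand, hidden)]
    # Fscale: multiply every channel by its gate
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, x)]


x = [[[1.0, 2.0], [3.0, 4.0]],      # channel 0
     [[10.0, 10.0], [10.0, 10.0]]]  # channel 1
w_reduce = [[0.0, 0.0]]             # C=2 -> 1 squeezed channel
w_expand = [[0.0], [0.0]]           # 1 -> C=2
y = squeeze_excite(x, w_reduce, w_expand)
```

With all-zero weights every gate is sigmoid(0) = 0.5, so each channel is simply halved; trained weights would instead produce a different gate per channel.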

3. Network architecture:

EfficientNet is deep and structurally complex, but its main modules are all variants of the basic MBConv block. The main modules are:

Stem: Input Layer -> Rescaling -> Normalization -> Zero Padding -> Conv -> Batch Normalization -> Activation

Stem

MBConv-Start: same as the MBConv diagram above and as in the figure below; its DeepwiseConv part is: DeepwiseConv -> Batch Normalization -> Activation

MBConv-start

MBConv: same as the MBConv diagram above and as in the figure below; its Deepwise Conv part is: DeepwiseConv -> Batch Normalization -> Activation -> Zero Padding -> DeepwiseConv -> Batch Normalization -> Activation

MBConv

Dropconnect: randomly drops inputs to reduce overfitting, whereas Dropout randomly drops nodes, i.e. outputs

Dropout
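The behaviour of the `drop_connect` function in the code section can be mimicked without PyTorch. The sketch below is a toy stand-in that uses flat lists instead of N x C x H x W tensors and a hypothetical `rng` parameter for reproducibility; like the original, it makes one keep/drop decision per sample and rescales the survivors by 1 / keep_prob.

```python
import random


def drop_connect(batch, p, training, rng=random):
    """Zero whole samples with probability p and rescale the survivors.

    batch: list of samples, each a flat list of floats.
    """
    if not training:
        return batch
    keep_prob = 1.0 - p
    out = []
    for sample in batch:
        keep = rng.random() < keep_prob      # one decision per sample
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([v * scale for v in sample])
    return out


batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
eval_out = drop_connect(batch, p=0.2, training=False)
train_out = drop_connect(batch, p=0.2, training=True, rng=random.Random(0))
```

In MBConv this is applied to the block output just before the shortcut addition, so a dropped sample passes through that block as the identity.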

Final Layer: Conv2D -> Batch Normalization -> Activation -> AvgPool -> dropout -> Fc

Final Layer

Taking the EfficientNetB7 architecture as an example, where each { } is one shortcut, B7 is structured as:

Stem -> layer1: { MBConv-start -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 2

-> layer2: { MBConv -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 5

-> layer3: { MBConv -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 5

-> layer4: { MBConv -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 8

-> layer5: { MBConv -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 8

-> layer6: { MBConv -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 11

-> layer7: { MBConv -> MBConv -> Dropconnect } -> { MBConv -> Dropconnect } x 2

-> Final Layer

The structure of EfficientNetB7

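  • Layer2 to Layer7 in the figure run in sequence: Layer2 feeds Layer3, Layer3 feeds Layer4, and so on; only after Layer7 does the network enter the Final Layer
  • x2 means the connection in the arrow direction is repeated 2 times
  • The black-and-white + marks a shortcut connection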
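The per-layer block counts used in the code section (4, 7, 7, 10, 10, 13, 4) are consistent with scaling a base configuration by a depth coefficient and rounding up. In the sketch below the base repeats come from the comments in the configuration list of the code; the depth coefficient 3.1 is the value commonly cited for B7 and is an assumption here, as is the helper name.

```python
import math

BASE_REPEATS = [1, 2, 2, 3, 3, 4, 1]   # per-layer block counts before scaling


def scale_repeats(base_repeats, depth_coefficient):
    """Scale each layer's block count and round up."""
    return [int(math.ceil(depth_coefficient * r)) for r in base_repeats]


b7_repeats = scale_repeats(BASE_REPEATS, depth_coefficient=3.1)  # 3.1 is assumed
total_blocks = sum(b7_repeats)
```

This gives 55 MBConv blocks in total, matching the number of rows in the configuration list.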

2. Reflection

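In effect, EfficientNet uses InvertedResidual between layers to expand the channels together with separable convolution and Squeeze-and-Excitation (SE); inside a layer it uses InvertedResidual with separable convolution and SE without changing the channel count, and applies dropconnect to randomly drop inputs and reduce overfitting.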

3. PyTorch code of EfficientNetB7

import math
import torch as t
from torch import nn
from torch.nn import functional as F
from functools import partial




class SwishImplementation(t.autograd.Function):
    # Custom autograd function: saves only the input and recomputes sigmoid in backward
    @staticmethod
    def forward(ctx, i):
        result = i * t.sigmoid(i)
        ctx.save_for_backward(i)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        i = ctx.saved_tensors[0]
        sigmoid_i = t.sigmoid(i)
        return grad_output * (sigmoid_i * (1 + i * (1 - sigmoid_i)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishImplementation.apply(x)




def drop_connect(inputs, p, training):
    # Drop connect: zero whole samples with probability p and rescale the rest
    if not training: return inputs
    batch_size = inputs.shape[0]
    keep_prob = 1 - p
    # floor(keep_prob + U[0, 1)) is 1 with probability keep_prob, else 0
    random_tensor = keep_prob + t.rand([batch_size, 1, 1, 1], dtype=inputs.dtype, device=inputs.device)
    binary_tensor = t.floor(random_tensor)
    output = inputs / keep_prob * binary_tensor
    return output




def get_same_padding_conv2d(image_size=None):
    return partial(Conv2dStaticSamePadding, image_size=image_size)




def get_width_and_height_from_size(x):
    # Obtains width and height from an int or a list/tuple
    if isinstance(x, int): return x, x
    if isinstance(x, (list, tuple)): return x
    raise TypeError()




def calculate_output_image_size(input_image_size, stride):
    # Computes the output image size of a Conv2dSamePadding with the given stride
    if input_image_size is None: return None
    image_height, image_width = get_width_and_height_from_size(input_image_size)
    stride = stride if isinstance(stride, int) else stride[0]
    image_height = int(math.ceil(image_height / stride))
    image_width = int(math.ceil(image_width / stride))
    return [image_height, image_width]




class Conv2dStaticSamePadding(nn.Conv2d):
    # 2D convolution with TensorFlow-style "same" padding, for a fixed image size
    def __init__(self, in_channels, out_channels, kernel_size, image_size=None, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, **kwargs)
        self.stride = self.stride if len(self.stride) == 2 else [self.stride[0]] * 2
        # Calculate padding based on image size and save it
        assert image_size is not None
        ih, iw = (image_size, image_size) if isinstance(image_size, int) else image_size
        kh, kw = self.weight.size()[-2:]
        sh, sw = self.stride
        oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
        pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
        pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
        if pad_h > 0 or pad_w > 0:
            self.static_padding = nn.ZeroPad2d((pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2))
        else:
            self.static_padding = Identity()

    def forward(self, x):
        x = self.static_padding(x)
        x = F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
        return x


class Identity(nn.Module):
    def __init__(self, ):
        super(Identity, self).__init__()

    def forward(self, input):
        return input




# MBConvBlock
class MBConvBlock(nn.Module):
    # e.g. MBConv1: ksize 3x3, 32 input channels, 16 output channels, stride 1
    def __init__(self, ksize, input_filters, output_filters, expand_ratio, stride, image_size=None):
        super().__init__()
        self._bn_mom = 0.1
        self._bn_eps = 0.01
        self._se_ratio = 0.25
        self._input_filters = input_filters
        self._output_filters = output_filters
        self._expand_ratio = expand_ratio
        self._kernel_size = ksize
        self._stride = stride
        inp = self._input_filters
        oup = self._input_filters * self._expand_ratio
        # Expansion phase (1x1 conv), skipped when expand_ratio == 1
        if self._expand_ratio != 1:
            Conv2d = get_same_padding_conv2d(image_size=image_size)
            self._expand_conv = Conv2d(in_channels=inp, out_channels=oup, kernel_size=1, bias=False)
            self._bn0 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps)
        # Depthwise convolution phase (groups == channels)
        k = self._kernel_size
        s = self._stride
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._depthwise_conv = Conv2d(
            in_channels=oup, out_channels=oup, groups=oup,
            kernel_size=k, stride=s, bias=False)
        self._bn1 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps)
        image_size = calculate_output_image_size(image_size, s)
        # Squeeze and Excitation layer
        Conv2d = get_same_padding_conv2d(image_size=(1,1))
        num_squeezed_channels = max(1, int(self._input_filters * self._se_ratio))
        self._se_reduce = Conv2d(in_channels=oup, out_channels=num_squeezed_channels, kernel_size=1)
        self._se_expand = Conv2d(in_channels=num_squeezed_channels, out_channels=oup, kernel_size=1)
        # Output phase (1x1 projection, no activation afterwards)
        final_oup = self._output_filters
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._project_conv = Conv2d(in_channels=oup, out_channels=final_oup, kernel_size=1, bias=False)
        self._bn2 = nn.BatchNorm2d(num_features=final_oup, momentum=self._bn_mom, eps=self._bn_eps)
        self._swish = MemoryEfficientSwish()

    def forward(self, inputs, drop_connect_rate=None):
        #:param inputs: input tensor
        #:param drop_connect_rate: drop connect rate (float, between 0 and 1)
        #:return: output of block
        # Expansion and Depthwise Convolution
        x = inputs
        if self._expand_ratio != 1:
            expand = self._expand_conv(inputs)
            bn0 = self._bn0(expand)
            x = self._swish(bn0)
        depthwise = self._depthwise_conv(x)
        bn1 = self._bn1(depthwise)
        x = self._swish(bn1)
        # Squeeze and Excitation
        x_squeezed = F.adaptive_avg_pool2d(x, 1)
        x_squeezed = self._se_reduce(x_squeezed)
        x_squeezed = self._swish(x_squeezed)
        x_squeezed = self._se_expand(x_squeezed)
        x = t.sigmoid(x_squeezed) * x
        x = self._bn2(self._project_conv(x))
        # Skip connection and drop connect
        input_filters, output_filters = self._input_filters, self._output_filters
        if self._stride == 1 and input_filters == output_filters:
            if drop_connect_rate:
                x = drop_connect(x, p=drop_connect_rate, training=self.training)
            x = x + inputs  # skip connection
        return x




class EfficientNet(nn.Module):
    def __init__(self, cfgs, num_classes=1000, image_size=224):
        super().__init__()
        bn_mom = 0.01
        bn_eps = 0.001
        self.cfgs = cfgs
        # stem
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._conv_stem = Conv2d(3, 32, kernel_size=3, stride=2, bias=False)
        self._bn0 = nn.BatchNorm2d(num_features=32, momentum=bn_mom, eps=bn_eps)
        # MBConv
        self._blocks = nn.ModuleList([])
        for expand, ksize, input_filters, output_filters,  stride, image_size in self.cfgs:
            self._blocks.append(MBConvBlock(ksize, input_filters, output_filters, expand, stride, image_size))
        # Head
        in_channels = self.cfgs[-1][3]
        out_channels = in_channels * 4
        image_size = self.cfgs[-1][-1]
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._conv_head = Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self._bn1 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps)
        # Final linear layer
        self._avg_pooling = nn.AdaptiveAvgPool2d(1)
        self._dropout = nn.Dropout(0.2)
        self._fc = nn.Linear(out_channels, num_classes)
        self._swish = MemoryEfficientSwish()

    def _make_MBConv(self, inputs):
        x = self._swish(inputs)
        # Blocks: the drop connect rate grows linearly with the block index
        for idx, block in enumerate(self._blocks):
            drop_connect_rate = 0.2
            drop_connect_rate *= float(idx) / len(self._blocks)
            x = block(x, drop_connect_rate=drop_connect_rate)
        return x

    def forward(self, inputs):
        # Stem
        x = self._conv_stem(inputs)
        x = self._bn0(x)
        # Convolution layers
        x = self._make_MBConv(x)
        # Head
        x = self._conv_head(x)
        x = self._bn1(x)  # fixed: the result was assigned to an unused variable, so BN was skipped
        x = self._swish(x)
        # Pooling and final linear layer
        x = self._avg_pooling(x)
        x = x.view(inputs.size(0), -1)
        x = self._dropout(x)
        x = self._fc(x)
        return x




        

def EfficientNetB7(**kwargs):
    MBConv_cfgs = [
        # Each row: expand_ratio, ksize, input channels, output channels, stride, input resolution
        # Note: the block counts per layer follow B7's depth scaling, while the channel
        # widths and the 224x224 input used here are the unscaled base values.
        # layer1: base repeats 1 -> 4 blocks (MBConv1, k3x3)
        [1, 3, 32, 16, 1, [112, 112]],
        [6, 3, 16, 16, 1, [112, 112]],
        [6, 3, 16, 16, 1, [112, 112]],
        [6, 3, 16, 16, 1, [112, 112]],
        # layer2: base repeats 2 -> 7 blocks (MBConv6, k3x3)
        [6, 3, 16, 24, 2, [112, 112]],
        [6, 3, 24, 24, 1, [56, 56]],
        [6, 3, 24, 24, 1, [56, 56]],
        [6, 3, 24, 24, 1, [56, 56]],
        [6, 3, 24, 24, 1, [56, 56]],
        [6, 3, 24, 24, 1, [56, 56]],
        [6, 3, 24, 24, 1, [56, 56]],
        # layer3: base repeats 2 -> 7 blocks
        [6, 5, 24, 40, 2, [56, 56]],
        [6, 5, 40, 40, 1, [28, 28]],
        [6, 5, 40, 40, 1, [28, 28]],
        [6, 5, 40, 40, 1, [28, 28]],
        [6, 5, 40, 40, 1, [28, 28]],
        [6, 5, 40, 40, 1, [28, 28]],
        [6, 5, 40, 40, 1, [28, 28]],
        # layer4: base repeats 3 -> 10 blocks
        [6, 3, 40, 80, 2, [28, 28]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        [6, 3, 80, 80, 1, [14, 14]],
        # layer5: base repeats 3 -> 10 blocks; the first MBConv has stride=1
        [6, 5, 80, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        [6, 5, 112, 112, 1, [14, 14]],
        # layer6: base repeats 4 -> 13 blocks
        [6, 5, 112, 192, 2, [14, 14]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        [6, 5, 192, 192, 1, [7, 7]],
        # layer7: base repeats 1 -> 4 blocks; the first MBConv has stride=1
        [6, 3, 192, 320, 1, [7, 7]],
        [6, 3, 320, 320, 1, [7, 7]],
        [6, 3, 320, 320, 1, [7, 7]],
        [6, 3, 320, 320, 1, [7, 7]],
    ]
    return EfficientNet(MBConv_cfgs, **kwargs)

net=EfficientNetB7()

That is all.

Source: Zhihu, 糖糖家的老张

Link: https://zhuanlan.zhihu.com/p/433728480
