Simple API Notes ✍🏼 For Beginners!
These notes record and try to explain some common APIs, and partly cover how they work and how they are used in practice.

torch

Serialization

torch.save

torch.save(obj, f, pickle_module=pickle, pickle_protocol=2)
Saves an object to a file on disk.

torch.load

torch.load(f, map_location=None, pickle_module=pickle)
Loads an object saved with torch.save() from a file. Deserialization goes through the given pickle module.
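A minimal round-trip sketch of torch.save and torch.load. The in-memory buffer and the dictionary key "weights" are assumptions made for illustration; a file path works the same way.

```python
import io

import torch

# An in-memory buffer stands in for a file on disk (assumption for this sketch).
buffer = io.BytesIO()
original = {"weights": torch.arange(6, dtype=torch.float32).reshape(2, 3)}

torch.save(original, buffer)  # serialize the object into the buffer
buffer.seek(0)                # rewind before reading it back

# map_location="cpu" places the loaded tensors on the CPU.
restored = torch.load(buffer, map_location="cpu")

print(torch.equal(original["weights"], restored["weights"]))
```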

transpose_

transpose_(dim0, dim1) → Tensor
Swaps dimensions dim0 and dim1 of the tensor in place.
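A small sketch of what the in-place swap looks like on a 2 x 3 tensor. The variable names are illustrative only.

```python
import torch

x = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# The trailing underscore marks an in-place method: x itself is changed,
# and the same tensor object is returned.
y = x.transpose_(0, 1)

print(x.shape)  # torch.Size([3, 2])
print(y is x)   # True
```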

torch.autograd

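torch.autograd provides classes and functions that implement automatic differentiation of arbitrary scalar-valued functions. It requires only minimal changes to existing code: declare the tensors whose gradients should be computed with the keyword requires_grad=True.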
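As a rough illustration, the sketch below marks one tensor with requires_grad=True and asks torch.autograd.grad for the derivative of a scalar function. The function y = sum(x^2) is just an example chosen here.

```python
import torch

# Only tensors declared with requires_grad=True are tracked.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (x ** 2).sum()  # scalar-valued function of x

# dy/dx = 2 * x
(grad_x,) = torch.autograd.grad(y, x)
print(grad_x)  # tensor([2., 4., 6.])
```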

torch.Tensor

backward

backward(gradient=None, retain_graph=None, create_graph=False)
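Computes the gradient of the current tensor with respect to the graph leaves. The results accumulate in the .grad attribute of each leaf tensor.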
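A minimal sketch of backward() in two situations: a scalar result, where no argument is needed, and a non-scalar result, where the gradient argument has to be supplied. The formulas are examples chosen for this note.

```python
import torch

# Scalar case: loss = (2w - 1)^2, so d(loss)/dw = 4 * (2w - 1).
w = torch.tensor(3.0, requires_grad=True)
loss = (w * 2.0 - 1.0) ** 2
loss.backward()
print(w.grad)  # tensor(20.)

# Non-scalar case: backward() needs a gradient tensor of the same shape.
v = torch.tensor([1.0, 2.0], requires_grad=True)
out = v * 3.0
out.backward(gradient=torch.ones_like(out))
print(v.grad)  # tensor([3., 3.])
```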

torch.nn

Containers

torch.nn.Module

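The base class for all neural network models; every custom network should subclass it.
Modules can contain other Modules, so they can be nested into a tree structure.
Calling xxmodel.cuda() also converts the model's parameters to CUDA tensors.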

cuda

cuda(device=None)
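Moves all model parameters and buffers to the GPU.
Note that this makes the parameters and buffers different objects (CUDA tensors). So if the model will live on the GPU while it is being optimized, call this method before constructing the optimizer.

The counterpart method is cpu().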
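A sketch of the ordering described above: move the model first, then build the optimizer. The layer sizes and learning rate are arbitrary, and the availability check is added here so the snippet also runs on a machine without a GPU.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)

# Move the model before the optimizer is created, so the optimizer
# holds references to the tensors that will actually be trained.
if torch.cuda.is_available():
    model = model.cuda()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

device = next(model.parameters()).device
print(device)
```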

eval

eval()
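Puts the model in evaluation mode. This only affects certain modules (layers), such as Dropout and BatchNorm; when you use a specific module, check its documentation for how it behaves in each mode.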

train

train()
Puts the model in training mode. As with eval(), this only affects certain modules.
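A small sketch of how the two modes differ for a model containing Dropout. The model itself is a made-up example.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(2, 4)

# In evaluation mode Dropout is disabled, so repeated calls agree.
model.eval()
with torch.no_grad():
    a = model(x)
    b = model(x)
print(torch.equal(a, b))  # True

# train() switches the flag back on for the model and its submodules.
model.train()
print(model.training, model[1].training)  # True True
```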

forward

forward(*input)
Defines the computation performed at every call. All subclasses should override this method.
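A minimal sketch of a custom module that overrides forward(). TinyNet and its layer sizes are hypothetical names and values used only for illustration.

```python
import torch
from torch import nn


class TinyNet(nn.Module):  # hypothetical example network
    def __init__(self):
        super().__init__()
        # Submodules assigned as attributes are registered automatically.
        self.hidden = nn.Linear(4, 8)
        self.out = nn.Linear(8, 2)

    def forward(self, x):
        # The computation that runs every time the module is called.
        return self.out(torch.relu(self.hidden(x)))


net = TinyNet()
y = net(torch.randn(3, 4))  # call the module itself, not forward() directly
print(y.shape)  # torch.Size([3, 2])
```

Calling the module instance rather than forward() directly lets PyTorch run any registered hooks around the computation.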

to

to(*args, **kwargs)
Moves and/or casts all parameters and buffers.

It can be called as:
- to(device=None, dtype=None, non_blocking=False)
- to(dtype, non_blocking=False)
- to(tensor, non_blocking=False)

Here dtype is the desired floating-point type for the floating-point parameters and buffers in this module.
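A sketch of the three call forms on a small module. The reference tensor is an assumption made here to show the to(tensor) form, which takes its dtype and device from the given tensor.

```python
import torch
from torch import nn

model = nn.Linear(2, 2)

# to(dtype): cast floating-point parameters and buffers.
model.to(torch.float64)
dtype_after_cast = model.weight.dtype

# to(device=...): move parameters and buffers.
model.to(device="cpu")

# to(tensor): adopt the dtype and device of an existing tensor.
reference = torch.zeros(1, dtype=torch.float32)
model.to(reference)

print(dtype_after_cast, model.weight.dtype)
```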

modules

modules()
Returns an iterator that yields every module in the network.

load_state_dict

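load_state_dict(state_dict, strict=True)
Copies parameters and buffers from state_dict into this module.
If strict is True, the keys of state_dict must exactly match the keys returned by this module's state_dict() method.
This method can be used to load the weights of a pretrained model.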

state_dict

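state_dict(destination=None, prefix='', keep_vars=False)
Returns a dictionary containing the whole state of the module.
Both parameters and buffers are included. The keys are the corresponding parameter and buffer names.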
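A minimal sketch that copies the state of one module into another with state_dict() and load_state_dict(). The two Linear layers are stand-ins for a pretrained model and a freshly built one.

```python
import torch
from torch import nn

source = nn.Linear(3, 2)  # stands in for a pretrained model
target = nn.Linear(3, 2)  # freshly initialized model of the same shape

state = source.state_dict()
print(list(state.keys()))  # ['weight', 'bias']

# strict=True requires the keys to match target.state_dict() exactly.
target.load_state_dict(state, strict=True)
print(torch.equal(source.weight, target.weight))  # True
```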

named_modules

named_modules(memo=None, prefix='')
Returns an iterator over all modules in the network, yielding both the name of each module and the module itself.
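A short sketch comparing modules() and named_modules() on a two-layer container. Note that the container itself is included, under the empty name.

```python
from torch import nn

model = nn.Sequential(nn.Linear(2, 2), nn.ReLU())

# modules() yields the container first, then each submodule.
count = len(list(model.modules()))
print(count)  # 3

# named_modules() yields (name, module) pairs for the same modules.
names = [name for name, _ in model.named_modules()]
print(names)  # ['', '0', '1']
```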

torch.nn.ModuleList(modules=None)

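Holds submodules in a list.
It can be indexed like a regular Python list, but the modules it contains are properly registered, so they are visible through methods such as modules().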

append

append(module)
Appends a single module to the end of the list.

extend

extend(modules)
Appends multiple modules from an iterable.
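A sketch of ModuleList together with append and extend. The Stack class and its layer sizes are hypothetical.

```python
import torch
from torch import nn


class Stack(nn.Module):  # hypothetical example
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(4, 4)])
        self.layers.append(nn.Linear(4, 4))                       # add one
        self.layers.extend([nn.Linear(4, 4), nn.Linear(4, 2)])   # add several

    def forward(self, x):
        # A ModuleList has no forward of its own; iterate over it.
        for layer in self.layers:
            x = layer(x)
        return x


stack = Stack()
print(len(stack.layers))              # 4
print(len(list(stack.parameters())))  # 8 (a weight and a bias per layer)
```

With a plain Python list instead of ModuleList, the layers would not be registered and parameters() would come back empty.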

torch.nn.Sequential(*args)

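A sequential container. Modules are added to it in the order in which they are passed to the constructor. Alternatively, an OrderedDict of modules can be passed in.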
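A minimal sketch of the OrderedDict form. The names "fc1", "act" and "fc2" are labels chosen for this example.

```python
from collections import OrderedDict

import torch
from torch import nn

model = nn.Sequential(OrderedDict([
    ("fc1", nn.Linear(4, 8)),
    ("act", nn.ReLU()),
    ("fc2", nn.Linear(8, 2)),
]))

# The input flows through the modules in insertion order.
out = model(torch.randn(5, 4))
print(out.shape)  # torch.Size([5, 2])

# Named entries are also reachable as attributes.
print(model.fc1)
```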

Pooling layers

AdaptiveMaxPool2d

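torch.nn.AdaptiveMaxPool2d(output_size, return_indices=False)
Applies 2D adaptive max pooling over an input signal composed of several input planes.
The output is of size H x W for any input size. The number of output features equals the number of input planes.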

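Parameters:
- output_size – the target output size. Can be a tuple (H, W), or a single value for a square output. H and W can each be an int, or None, which means that dimension keeps the size of the input.
- return_indices – default False. If True, returns the indices along with the outputs. Useful to pass to nn.MaxUnpool2d.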
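A sketch of how output_size and return_indices affect the result. The input shape (1, 64, 8, 9) is an arbitrary choice for illustration.

```python
import torch
from torch import nn

x = torch.randn(1, 64, 8, 9)  # (N, C, H_in, W_in)

y_tuple = nn.AdaptiveMaxPool2d((5, 7))(x)     # explicit H x W
y_square = nn.AdaptiveMaxPool2d(7)(x)         # single value: 7 x 7
y_none = nn.AdaptiveMaxPool2d((None, 7))(x)   # None keeps the input height

print(y_tuple.shape)   # torch.Size([1, 64, 5, 7])
print(y_square.shape)  # torch.Size([1, 64, 7, 7])
print(y_none.shape)    # torch.Size([1, 64, 8, 7])

# With return_indices=True the module returns (output, indices).
out, idx = nn.AdaptiveMaxPool2d(3, return_indices=True)(x)
print(out.shape, idx.shape)
```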

Normalization layers

BatchNorm2d

BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(To be continued.)

Non-linear activations

ReLU

Input: (N, ∗), where ∗ means any number of additional dimensions
Output: (N, ∗), same shape as the input
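A tiny sketch showing that ReLU keeps the input shape and clamps negative values to zero. The sample values are arbitrary.

```python
import torch
from torch import nn

relu = nn.ReLU()
x = torch.tensor([[-1.0, 0.0, 2.0]])

y = relu(x)
print(y)        # tensor([[0., 0., 2.]])
print(y.shape)  # same shape as the input
```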

Dropout layers

Dropout

torch.nn.Dropout(p=0.5, inplace=False)
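During training, randomly zeroes some of the elements of the input tensor with probability p, using samples from a Bernoulli distribution. The zeroed elements are drawn anew on every forward call.
In addition, the outputs are scaled by a factor of 1/(1-p) during training.
- p – probability of an element being zeroed. Default: 0.5
- inplace – if set to True, performs the operation in place. Default: False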
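A sketch of the scaling behaviour with p=0.5 on a tensor of ones: in training mode every element ends up as either 0 or 1/(1-0.5) = 2, and in evaluation mode the input passes through unchanged. The seed and tensor size are arbitrary.

```python
import torch
from torch import nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # modules start in training mode
x = torch.ones(1000)

y = drop(x)
print(y.unique())  # only the values 0 and 2 can appear

drop.eval()
z = drop(x)
print(torch.equal(z, x))  # True: Dropout is a no-op in evaluation mode
```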

torch.nn.functional

Convolution functions

conv2d

conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor
- input – input tensor of shape (minibatch × in_channels × iH × iW)
- weight – filters of shape (out_channels × (in_channels/groups) × kH × kW)
- bias – optional bias tensor of shape (out_channels). Default: None
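- stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1
- padding – implicit zero padding on both sides of the input. Can be a single number or a tuple (padH, padW). Default: 0
- dilation – the spacing between kernel elements. Can be a single number or a tuple (dH, dW). Default: 1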
- groups – split input into groups, in_channels should be divisible by the number of groups. Default: 1
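A sketch of how the arguments above shape the output of conv2d. The tensor sizes are arbitrary examples; the weights are random, so only the shapes are meaningful.

```python
import torch
import torch.nn.functional as F

inp = torch.randn(1, 4, 5, 5)     # (minibatch, in_channels, iH, iW)
weight = torch.randn(8, 4, 3, 3)  # (out_channels, in_channels/groups, kH, kW)

same = F.conv2d(inp, weight, padding=1)   # padding=1 keeps 5 x 5 for a 3 x 3 kernel
strided = F.conv2d(inp, weight, stride=2) # floor((5 - 3) / 2) + 1 = 2

# With groups=2 each filter sees in_channels / groups = 2 input channels.
weight_g = torch.randn(8, 2, 3, 3)
grouped = F.conv2d(inp, weight_g, groups=2)

print(same.shape)     # torch.Size([1, 8, 5, 5])
print(strided.shape)  # torch.Size([1, 8, 2, 2])
print(grouped.shape)  # torch.Size([1, 8, 3, 3])
```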
