Merge branch 'master' of github.com:Jittor/jittor

Dun Liang 2022-09-16 01:01:20 +08:00
commit e661f19e20
14 changed files with 1593 additions and 624 deletions

View File

@ -2,6 +2,7 @@
![Jittor Logo](https://cg.cs.tsinghua.edu.cn/jittor/favicon_package_v0/JittorLogo_Final1220.svg)
[Quick Start](#快速开始) | [Install](#安装) | [Tutorial](#教程) | [English](./README.md)
@ -18,6 +19,7 @@ The front-end language of Jittor is Python. The front end uses a modular and dynamic-graph execution design
* [Jittor Documents](https://cg.cs.tsinghua.edu.cn/jittor/assets/docs/index.html)
* [Github](https://github.com/jittor/jittor), [Gitee](https://gitee.com/jittor/jittor)
* [Jittor Forum](https://discuss.jittor.org/)
* [Awesome Jittor List](https://github.com/Jittor/jittor/blob/master/AWESOME-JITTOR-LIST.md)
* IM: QQ Group(761222083)
@ -88,40 +90,18 @@ for i,(x,y) in enumerate(get_data(n)):
## Install
Jittor environment requirements:
Jittor supports **Linux** (e.g. Ubuntu/CentOS/Arch), **macOS**, and **Windows**. The dependencies for **Linux** and **macOS** are as follows:
* Python version >= 3.7
* C++ compiler (at least one of the following)
    - g++ >=5.4.0 for Linux
    - clang >=8.0 for macOS
* GPU compiler (optional): nvcc >=10.0
* GPU acceleration library (optional): cudnn-dev (the cuDNN development package; the tar installation method is recommended, [reference link](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar))
Jittor also supports mainstream domestic Chinese Linux distributions such as UOS, KylinOS, iSoft, and Loongnix; just follow the Linux pip installation method with python and g++ ready.
**Windows** requirements are:
* Python version >= 3.8 (installing from the official Python site is recommended, <https://www.python.org/downloads/windows/>)
* x86_64 processor
* Windows 10 or above.
If you prefer not to configure the environment manually, we recommend installing with Docker.
Alternatively, you can install via pip or manually.
Note 1: macOS users need to install additional dependencies, see [macOS install](#macOS-安装).
| OS | CPU | Python | Compiler | (Optional) GPU platform |
|--------------------------------------------------------|-------------------------------------|--------|--------------|---------------------------------------------|
| Linux<br>(Ubuntu, CentOS, Arch, <br>UOS, KylinOS, ...) | x86 <br>x86_64 <br>ARM <br>loongson | >= 3.7 | g++ >=5.4 | Nvidia CUDA >= 10.0, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar) <br> or [AMD ROCm](https://docs.amd.com/) >= 4.0 <br> or [Hygon DCU DTK](https://tycloud.hpccube.com/doc/1.0.6/11277/general-handbook/software-tutorial/jittor.html) >= 22.04 |
| macOS <br>(>= 10.14 Mojave) | intel<br>Apple Silicon | >= 3.7 | clang >= 8.0 | - |
| Windows 10 & 11 | x86_64 | [>= 3.8](https://www.python.org/downloads/windows/) | - | Nvidia CUDA >= 10.2 [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-windows) |
Jittor offers three installation methods: pip, docker, and manual installation:
## Pip install
@ -142,11 +122,11 @@ jittor will automatically search the path for a suitable compiler; if you want to manually specify
### macOS install
On macOS, please use [homebrew](https://brew.sh) to install the additional dependencies (python>=3.7, onednn).
On macOS, please use [homebrew](https://brew.sh) to install the additional dependencies.
```bash
brew install python@3.7 onednn libomp
brew install onednn libomp
```
Then you can install jittor via pip and test whether it runs successfully.
@ -157,7 +137,7 @@ python3.7 -m pip install jittor
python3.7 -m jittor.test.test_example
```
Currently, jittor only supports CPU computation on macOS.
### Windows install
@ -439,3 +419,4 @@ Jittor is currently maintained by the [Tsinghua CSCG Group](https://cg.cs.tsinghua.edu.cn
As shown in the LICENSE.txt file, Jittor is licensed under the Apache 2.0 license.

View File

@ -91,34 +91,14 @@ We provide some jupyter notebooks to help you quickly get started with Jittor.
Jittor environment requirements:
* System: **Linux** (e.g. Ubuntu/CentOS/Arch), **macOS**, or **Windows**; environment requirements of **Linux** and **macOS** are listed below:
| OS | CPU | Python | Compiler | (Optional) GPU platform |
|--------------------------------------------------------|-------------------------------------|--------|--------------|---------------------------------------------|
| Linux<br>(Ubuntu, CentOS, Arch, <br>UOS, KylinOS, ...) | x86 <br>x86_64 <br>ARM <br>loongson | >= 3.7 | g++ >=5.4 | Nvidia CUDA >= 10.0, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar) <br> or [AMD ROCm](https://docs.amd.com/) >= 4.0 <br> or [Hygon DCU DTK](https://tycloud.hpccube.com/doc/1.0.6/11277/general-handbook/software-tutorial/jittor.html) >= 22.04 |
| macOS <br>(>= 10.14 Mojave) | intel<br>Apple Silicon | >= 3.7 | clang >= 8.0 | - |
| Windows 10 & 11 | x86_64 | [>= 3.8](https://www.python.org/downloads/windows/) | - | Nvidia CUDA >= 10.2 [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-windows) |
* Python version >= 3.7
* CPU compiler (requires at least one of the following)
* g++ (>=5.4.0)
* clang (>=8.0)
* GPU compiler (optional)
* nvcc (>=10.0 for g++ or >=10.2 for clang)
* GPU library: cudnn-dev (recommend tar file installation, [reference link](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar))
**Windows** requirements are:
* Python: version >= 3.8 (recommended to install from <https://www.python.org/downloads/windows/>)
* x86_64 CPU processor
* Windows 10 or above
Note#1: macOS users have to install additional dependencies, see [macOS install](#macOS-install).
Jittor offers three ways to install: pip, docker, or manual.
@ -142,7 +122,7 @@ python3.7 -m jittor.test.test_example
Please first install additional dependencies with [homebrew](https://brew.sh).
```bash
brew install python@3.7 onednn libomp
brew install onednn libomp
```
@ -433,3 +413,4 @@ Jittor is currently maintained by the [Tsinghua CSCG Group](https://cg.cs.tsingh
Jittor is Apache 2.0 licensed, as found in the LICENSE.txt file.

View File

@ -5,7 +5,7 @@
[Quickstart](#quickstart) | [Install](#install) | [Tutorial](#tutorial) | [Chinese](./README.cn.md)
[Quick Start](#快速开始) | [Install](#安装) | [Tutorial](#教程)
[Quick Start](#快速开始) | [Install](#安装) | [Tutorial](#教程) | [English](./README.md)
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators. The whole framework and meta-operators are compiled just-in-time. A powerful op compiler and tuner are integrated into Jittor. It allows us to generate high-performance code specialized for your model. Jittor also contains a wealth of high-performance model libraries, including: image recognition, detection, segmentation, generation, differentiable rendering, geometric learning, reinforcement learning, etc.
@ -22,6 +22,7 @@ Related Links:
* [Jittor Documents](https://cg.cs.tsinghua.edu.cn/jittor/assets/docs/index.html)
* [Github](https://github.com/jittor/jittor), [Gitee](https://gitee.com/jittor/jittor)
* [Jittor Forum](https://discuss.jittor.org/)
* [Awesome Jittor List](https://github.com/Jittor/jittor/blob/master/AWESOME-JITTOR-LIST.md)
* IM: QQ Group(761222083)
Related Links:
@ -31,6 +32,7 @@ Related Links:
* [Jittor Documents](https://cg.cs.tsinghua.edu.cn/jittor/assets/docs/index.html)
* [Github](https://github.com/jittor/jittor), [Gitee](https://gitee.com/jittor/jittor)
* [Jittor Forum](https://discuss.jittor.org/)
* [Awesome Jittor List](https://github.com/Jittor/jittor/blob/master/AWESOME-JITTOR-LIST.md)
* IM: QQ Group(761222083)
@ -115,52 +117,17 @@ We provide some jupyter notebooks to help you quickly get started with Jittor.
## Install
Jittor environment requirements:
Jittor supports **Linux** (e.g. Ubuntu/CentOS/Arch), **macOS**, and **Windows**. The dependencies for **Linux** and **macOS** are as follows:
* Python version >= 3.7
* C++ compiler (at least one of the following)
    - g++ >=5.4.0 for Linux
    - clang >=8.0 for macOS
* GPU compiler (optional): nvcc >=10.0
* GPU acceleration library (optional): cudnn-dev (the cuDNN development package; the tar installation method is recommended, [reference link](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar))
Jittor also supports mainstream domestic Chinese Linux distributions such as UOS, KylinOS, iSoft, and Loongnix; just follow the Linux pip installation method with python and g++ ready.
**Windows** requirements are:
* Python version >= 3.8 (installing from the official Python site is recommended, <https://www.python.org/downloads/windows/>)
* x86_64 processor
* Windows 10 or above.
If you prefer not to configure the environment manually, we recommend installing with Docker.
Alternatively, you can install via pip or manually.
Note 1: macOS users need to install additional dependencies, see [macOS install](#macOS-安装).
Jittor offers three installation methods: pip, docker, and manual installation:
Jittor environment requirements:
* System: **Linux** (e.g. Ubuntu/CentOS/Arch), **macOS**, or **Windows**; environment requirements of **Linux** and **macOS** are listed below:
| OS | CPU | Python | Compiler | (Optional) GPU platform |
|--------------------------------------------------------|-------------------------------------|--------|--------------|---------------------------------------------|
| Linux<br>(Ubuntu, CentOS, Arch, <br>UOS, KylinOS, ...) | x86 <br>x86_64 <br>ARM <br>loongson | >= 3.7 | g++ >=5.4 | Nvidia CUDA >= 10.0, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar) <br> or [AMD ROCm](https://docs.amd.com/) >= 4.0 <br> or [Hygon DCU DTK](https://tycloud.hpccube.com/doc/1.0.6/11277/general-handbook/software-tutorial/jittor.html) >= 22.04 |
| macOS <br>(>= 10.14 Mojave) | intel<br>Apple Silicon | >= 3.7 | clang >= 8.0 | - |
| Windows 10 & 11 | x86_64 | [>= 3.8](https://www.python.org/downloads/windows/) | - | Nvidia CUDA >= 10.2 [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-windows) |
* Python version >= 3.7
* CPU compiler (requires at least one of the following)
* g++ (>=5.4.0)
* clang (>=8.0)
* GPU compiler (optional)
* nvcc (>=10.0 for g++ or >=10.2 for clang)
* GPU library: cudnn-dev (recommend tar file installation, [reference link](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-tar))
**Windows** requirements are:
* Python: version >= 3.8 (recommended to install from <https://www.python.org/downloads/windows/>)
* x86_64 CPU processor
* Windows 10 or above
Note#1: macOS users have to install additional dependencies, see [macOS install](#macOS-install).
Jittor offers three installation methods: pip, docker, and manual installation:
Jittor offers three ways to install: pip, docker, or manual.
@ -186,12 +153,12 @@ jittor will automatically search the path for a suitable compiler; if you want to manually specify
### macOS install
On macOS, please use [homebrew](https://brew.sh) to install the additional dependencies (python>=3.7, onednn).
On macOS, please use [homebrew](https://brew.sh) to install the additional dependencies.
Please first install additional dependencies with [homebrew](https://brew.sh).
```bash
brew install python@3.7 onednn libomp
brew install onednn libomp
```
Then you can install jittor via pip and test whether it runs successfully.
@ -203,7 +170,7 @@ python3.7 -m pip install jittor
python3.7 -m jittor.test.test_example
```
Currently, jittor only supports CPU computation on macOS.
Currently jittor only supports CPU in macOS.

View File

@ -9,7 +9,7 @@
# file 'LICENSE.txt', which is part of this source code package.
# ***************************************************************
__version__ = '1.3.5.11'
__version__ = '1.3.5.14'
from jittor_utils import lock
with lock.lock_scope():
ori_int = int

File diff suppressed because it is too large.

View File

@ -69,6 +69,10 @@ unordered_map<string, unsigned int> cutt_plan_cache;
EXTERN_LIB unordered_map<string, unsigned int> cutt_plan_cache;
void CuttTransposeOp::jit_run() {
// Return if x is empty
if (x->num == 0)
return;
cudaGetLastError();
auto* __restrict__ xp = x->mem_ptr;
auto* __restrict__ yp = y->mem_ptr;
@ -116,4 +120,4 @@ void CuttTransposeOp::jit_run() {
}
#endif // JIT
} // jittor
} // jittor

View File

@ -525,20 +525,224 @@ _triple = _ntuple(3)
_quadruple = _ntuple(4)
def unique(x):
def unique(
input: jt.Var,
return_inverse: bool=False,
return_counts: bool=False,
dim: int=None):
r'''
Returns the unique elements of the input tensor.
Args:
x the input tensor.
input (var) the input var
return_inverse (bool) Whether to also return the indices for where elements in the original input ended up in the returned unique list. default: False
return_counts (bool) Whether to also return the counts for each unique element. default: False
dim (int) the dimension to apply unique. If None, the unique of the flattened input is returned. default: None
Example:
>>> jittor.unique(jittor.array([1, 3, 2, 3]))
jt.Var([1 2 3], dtype=int32)
>>> jittor.unique(jittor.array([1, 3, 2, 3, 2]), return_inverse=True, return_counts=True)
(jt.Var([1 2 3], dtype=int32), jt.Var([0 2 1 2 1], dtype=int32), jt.Var([1 2 2], dtype=int32))
>>> jittor.unique(jittor.array([[1, 3], [2, 3]]), return_inverse=True)
(jt.Var([1 2 3], dtype=int32), jt.Var([[0 2]
[1 2]], dtype=int32))
>>> jittor.unique(jittor.array([[1, 3], [1, 3]]), dim=0)
jt.Var([[1 3]], dtype=int32)
'''
x = x.reshape(-1)
_,x = jt.argsort(x)
index,= jt.index((x.shape[0],))
y = x[1:][x[index[1:]] != x[index[:-1]]]
x = jt.concat([x[:1],y],dim=0)
return x
temp_shape = None
if dim is None:
temp_shape = list(input.shape)
input_flatten = input.flatten()
dim = 0
else:
input_flatten = input
input_flatten = input_flatten.transpose(dim, 0)
orig_shape = input_flatten.shape
input_flatten = input_flatten.view(orig_shape[0], -1)
with jt.flag_scope(compile_options = {"FLAGS: --extended-lambda ": 1} if jt.flags.use_cuda else {}):
indice = jt.code((input_flatten.shape[0], ), 'int32', [input_flatten],
cpu_header='''
#include <algorithm>
''',
cpu_src='''
@alias(input_flatten, in0)
@alias(indice, out)
int dimlen = input_flatten_shape0, dimsize = input_flatten_shape1;
for(int i = 0; i < dimlen; ++i) @indice(i) = i;
std::sort(&@indice(0), &@indice(dimlen), [&](int a, int b){
for(int i = 0; i < dimsize; ++i) {
int lhs = @input_flatten(a, i), rhs = @input_flatten(b, i);
if (lhs != rhs) return lhs < rhs;
}
return false;
});
''',
cuda_header='''
#undef out
#include <thrust/extrema.h>
#include <thrust/device_ptr.h>
#include <thrust/execution_policy.h>
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <cub/cub.cuh>
#include <executor.h>
''',
cuda_src=
'''
@alias(input_flatten, in0)
@alias(indice, out)
int dimlen = indice_shape0, dimsize = input_flatten_shape1;
if (dimsize == 1) {
size_t raw_allocation, d_allocation, temp_storage_bytes = 0;
void *d_temp_storage = NULL;
int32_t* raw_ptr = (int32_t*)exe.allocator->alloc(dimlen * (sizeof(int32_t) + sizeof(input_flatten_type)), raw_allocation);
thrust::device_ptr<int32_t> arange_ptr = thrust::device_pointer_cast(raw_ptr);
thrust::sequence(arange_ptr, arange_ptr + dimlen);
cub::DeviceRadixSort::SortPairs(d_temp_storage, temp_storage_bytes, input_flatten_p,
(input_flatten_type*)(raw_ptr + dimlen), thrust::raw_pointer_cast(arange_ptr), indice_p, dimlen);
d_temp_storage = exe.allocator->alloc(temp_storage_bytes, d_allocation);
cub::DeviceRadixSort::SortPairs(d_temp_storage, temp_storage_bytes, input_flatten_p,
(input_flatten_type*)(raw_ptr + dimlen), thrust::raw_pointer_cast(arange_ptr), indice_p, dimlen);
exe.allocator->free(raw_ptr, dimlen * (sizeof(int) + sizeof(input_flatten_type)), raw_allocation);
exe.allocator->free(d_temp_storage, temp_storage_bytes, d_allocation);
} else {
thrust::device_ptr<input_flatten_type> input_ptr = thrust::device_pointer_cast(input_flatten_p);
thrust::device_ptr<int32_t> indice_ptr = thrust::device_pointer_cast(indice_p);
thrust::sequence(indice_ptr, indice_ptr + dimlen);
thrust::sort(thrust::device, indice_ptr, indice_ptr + dimlen,
[=] __device__ (int32_t a, int32_t b)->bool {
for(int i = 0; i < dimsize; ++i) {
input_flatten_type lhs = input_ptr[i + a * dimsize],
rhs = input_ptr[i + b * dimsize];
if (lhs != rhs) return lhs < rhs;
}
return false;
});
}
'''
)
input_sorted = input_flatten[indice][:]
dimlen = indice.shape[0]
diff = jt.logical_not(jt.all(input_sorted[1:] == input_sorted[: -1], 1))
diff = jt.concat([jt.Var([False]), diff], 0)
diff = jt.array(diff, dtype = jt.int32)
with jt.flag_scope(compile_options = {"FLAGS: --extended-lambda ": 1} if jt.flags.use_cuda else {}):
output, inverse = jt.code(
[(-input_sorted.shape[0], ), (indice.shape)],
[input_sorted.dtype, indice.dtype],
[input_sorted, diff, indice],
cpu_header='''
#include <algorithm>
@alias(input_sorted, in0)
@alias(diff, in1)
@alias(indice, in2)
@alias(output, out0)
@alias(inverse, out1)
''',
cpu_src=
f"bool return_inverse = {int(return_inverse)};" +
'''
int tot = -1;
for (int i = 0; i < input_sorted_shape0; ++i) {
if (i == 0 || @diff(i)) {
++tot; @output(tot) = i;
}
if (return_inverse)
@inverse(@indice(i)) = tot;
}
output->set_shape({tot + 1});
''',
cuda_header='''
#undef out
#include <thrust/extrema.h>
#include <thrust/device_ptr.h>
#include <thrust/execution_policy.h>
#include <thrust/scan.h>
#include <executor.h>
@alias(input_sorted, in0)
@alias(diff, in1)
@alias(indice, in2)
@alias(output, out0)
@alias(inverse, out1)
''',
cuda_src=
f"bool return_inverse = {int(return_inverse)};" +
'''
int dimlen = input_sorted_shape0, dimsize = input_sorted_shape1;
size_t raw_allocation;
int32_t* raw_ptr = (int32_t*)exe.allocator->alloc(2 * dimlen * sizeof(int), raw_allocation);
thrust::device_ptr<int32_t> diff_ptr = thrust::device_pointer_cast(diff_p),
inverse_ptr = thrust::device_pointer_cast(inverse_p),
array_ptr = thrust::device_pointer_cast(raw_ptr),
sum_ptr = thrust::device_pointer_cast(raw_ptr + dimlen),
indice_ptr = thrust::device_pointer_cast(indice_p);
thrust::device_ptr<input_sorted_type> input_ptr = thrust::device_pointer_cast(input_sorted_p);
if (return_inverse) {
thrust::inclusive_scan(diff_ptr, diff_ptr + dimlen, sum_ptr);
thrust::scatter(sum_ptr, sum_ptr + dimlen, indice_ptr, inverse_ptr);
}
thrust::sequence(array_ptr, array_ptr + dimlen);
int32_t num = thrust::unique(array_ptr, array_ptr + dimlen,
[=] __device__ (int32_t a, int32_t b)->bool {
for(int i = 0; i < dimsize; ++i) {
input_sorted_type lhs = input_ptr[i + a * dimsize],
rhs = input_ptr[i + b * dimsize];
if (lhs != rhs) return false;
}
return true;
}) - array_ptr;
cudaMemcpy(output_p, raw_ptr, sizeof(int32_t) * num, cudaMemcpyDeviceToDevice);
exe.allocator->free(raw_ptr, 2 * dimlen * sizeof(int32_t), raw_allocation);
output->set_shape({ num });
'''
)
indice_shape = (output.shape[0], )
output = input_sorted[output][:]
new_shape = list(orig_shape[1:])
new_shape.insert(0, -1)
output = output.view(new_shape).transpose(dim, 0)
if temp_shape is not None:
inverse = inverse.view(temp_shape).transpose(dim, 0)
if return_inverse:
if return_counts:
counts = jt.zeros(indice_shape, dtype=jt.int32)
jt.scatter_(counts, 0, inverse.flatten(), jt.ones(dimlen), reduce='add')
return output, inverse, counts
else:
return output, inverse
else:
return output
jt.Var.unique = unique
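As a sanity check of the new signature, the result can be compared against numpy.unique (a quick sketch, assuming a working jittor build; `.numpy()` materializes a jt.Var):

```python
import numpy as np
import jittor as jt

a = np.array([1, 3, 2, 3, 2], dtype=np.int32)
out, inv, cnt = jt.unique(jt.array(a), return_inverse=True, return_counts=True)
ref_out, ref_inv, ref_cnt = np.unique(a, return_inverse=True, return_counts=True)
# unique values, inverse indices, and counts should all agree with numpy
assert np.allclose(out.numpy(), ref_out)
assert np.allclose(inv.numpy(), ref_inv)
assert np.allclose(cnt.numpy(), ref_cnt)
```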
@ -1176,8 +1380,11 @@ def numpy_cumprod(a, dim):
return func(a, dim)
def linspace(start, end, steps):
res = jt.index((steps,))[0]
res = res*float((end-start)/(steps-1))+start
if steps > 1:
res = jt.index((steps,))[0]
res = res*float((end-start)/(steps-1))+start
else:
res = jt.array([start])
return res
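A minimal sketch of the edge case this guard fixes: with a single step, the old formula divided by steps-1 == 0, while the new branch simply returns [start].

```python
import jittor as jt

print(jt.linspace(0, 10, 5))   # evenly spaced: 0, 2.5, 5, 7.5, 10
print(jt.linspace(3, 7, 1))    # a single step now yields [3] instead of dividing by zero
```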
def randperm(n, dtype="int32"):

View File

@ -12,7 +12,6 @@
# file 'LICENSE.txt', which is part of this source code package.
# ***************************************************************
from abc import abstractmethod
from sys import breakpointhook
import jittor as jt
from jittor import flatten, init, Module
import numpy as np
@ -262,7 +261,8 @@ def sign(x: jt.Var) -> jt.Var:
def gelu(x):
r''' Applies the element-wise function:
.. math:: \text{GELU}(x) = x * \Phi(x)
.. math::
\text{GELU}(x) = x * \Phi(x)
where :math:`\Phi(x)` is the Cumulative Distribution Function for Gaussian Distribution.
@ -546,6 +546,31 @@ class Dropout(Module):
def dropout(x,p=0.5,is_train=False):
return Dropout(p=p,is_train=is_train)(x)
class DropPath(Module):
'''Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
'''
def __init__(self, p=0.5, is_train=False):
'''
:param p: Specifies the probability of dropping each sample's path. Defaults to 0.5.
:type p: float
:param is_train: Specifies whether the module is in training mode. Defaults to False.
:type is_train: bool
'''
self.p = p
self.is_train = is_train
#TODO: test model.train() to change self.is_train
def execute(self, x):
if self.p == 0. or not self.is_train:
return x
keep_prob = 1 - self.p
shape = (x.shape[0], ) + (1, ) * (x.ndim - 1)
random_tensor = keep_prob + jt.rand(shape, dtype=x.dtype)
output = x.divide(keep_prob) * random_tensor.floor()
return output
def droppath(x,p=0.5,is_train=False):
return DropPath(p=p,is_train=is_train)(x)
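A usage sketch for the new module; `is_train=True` is passed explicitly here because, per the TODO above, `model.train()` does not yet toggle the flag:

```python
import jittor as jt
from jittor import nn

x = jt.rand(4, 3, 8)                 # a batch of 4 samples
dp = nn.DropPath(p=0.25, is_train=True)
y = dp(x)
# Each sample is zeroed as a whole with probability 0.25; survivors are
# scaled by 1/keep_prob so the expected value of the output matches x.
```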
class Linear(Module):
def __init__(self, in_features, out_features, bias=True):
self.in_features = in_features
@ -2799,4 +2824,91 @@ def _fft2(x, inverse=False):
y = jt.compile_extern.cufft_ops.cufft_fft(x, inverse)
if inverse:
y /= x.shape[1] * x.shape[2]
return y
return y
def one_hot(x: jt.Var, num_classes: int=-1) -> jt.Var:
''' Returns the one_hot encoding of inputs.
:param x: class values of any shape
:type x: jt.Var with bool or integer dtype
:param num_classes: Total number of classes. If set to -1, the number of classes will be inferred as one greater than the largest class value in the input tensor.
:type num_classes: int, optional
:return: a Var with one more dimension with 1 values at the index
of last dimension indicated by the input, and 0 everywhere else.
:rtype: jt.Var
.. note::
if the values in x are greater than or equal to num_classes or less than 0,
the corresponding rows of the returned one_hot will be all zeros.
Example:
>>> jt.nn.one_hot(jt.arange(5) % 3)
jt.Var([[1 0 0]
[0 1 0]
[0 0 1]
[1 0 0]
[0 1 0]], dtype=int32)
>>> jt.nn.one_hot(jt.arange(5) % 3, num_classes=5)
jt.Var([[1 0 0 0 0]
[0 1 0 0 0]
[0 0 1 0 0]
[1 0 0 0 0]
[0 1 0 0 0]], dtype=int32)
>>> jt.nn.one_hot(jt.arange(6).reshape(3,2) % 3)
jt.Var([[[1 0 0]
[0 1 0]]
[[0 0 1]
[1 0 0]]
[[0 1 0]
[0 0 1]]], dtype=int32)
'''
assert x.dtype in [jt.bool, jt.int8, jt.int16, jt.int32, jt.int64, jt.uint8, jt.uint16, jt.uint32, jt.uint64]
if num_classes == -1:
num_classes = x.max().item() + 1
N = len(x.shape)
indices = ["i"+str(i) for i in range(N)]
y = jt.ones_like(x).reindex(
x.shape + [num_classes],
indices,
extras=[x],
overflow_conditions=[f"i{N} != @e0({','.join(indices)})"],
overflow_value=0)
return y
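Conceptually, the reindex call above is equivalent to broadcasting a comparison against the class range; a reference sketch (not the actual implementation, assuming `unsqueeze` and the `int32` cast behave as in recent jittor releases):

```python
import jittor as jt

def one_hot_ref(x: jt.Var, num_classes: int) -> jt.Var:
    # positions where the class index equals the input value become 1;
    # out-of-range values match nothing, giving the all-zero rows noted above
    return (x.unsqueeze(-1) == jt.arange(num_classes)).int32()

print(one_hot_ref(jt.arange(5) % 3, 3))
```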
class KLDivLoss(Module):
''' Computes the Kullback-Leibler divergence loss.
'''
def __init__(self, reduction: str = 'mean', log_target: bool = False):
'''
:param reduction: Specifies the reduction to apply to the output. Can be 'mean', 'sum', 'batchmean', or 'none'. Defaults to 'mean'.
:type reduction: str, optional
:param log_target: Specifies whether target is given in log space. Defaults to False.
:type log_target: bool, optional
'''
self.reduction = reduction
self.log_target = log_target
def execute(self, input: jt.Var, target: jt.Var) -> jt.Var:
if not self.log_target:
loss_pointwise = target * (target.log() - input)
else:
loss_pointwise = target.exp() * (target - input)
if self.reduction == "mean":
loss = loss_pointwise.mean()
elif self.reduction == "batchmean":
loss = loss_pointwise.sum() / input.size(0)
elif self.reduction == "sum":
loss = loss_pointwise.sum()
else:
loss = loss_pointwise
return loss
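A usage sketch mirroring the PyTorch convention this class follows: the input is expected in log space and the target in probability space unless `log_target=True` (assuming `nn.log_softmax`/`nn.softmax` from jittor):

```python
import jittor as jt
from jittor import nn

log_pred = nn.log_softmax(jt.rand(3, 5), dim=1)   # log-probabilities
target = nn.softmax(jt.rand(3, 5), dim=1)         # probabilities (log_target=False)
loss = nn.KLDivLoss(reduction='batchmean')(log_pred, target)
print(loss)
```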

View File

@ -32,9 +32,9 @@ struct IndexOp : Op {
Example::
print(jt.index([2,2], 0)())
print(jt.index([2,2], 0))
# output: [[0,0],[1,1]]
print(jt.index([2,2], 1)())
print(jt.index([2,2], 1))
# output: [[0,1],[0,1]]
*/
IndexOp(NanoVector shape, int64 dim, NanoString dtype=ns_int32);

View File

@ -90,5 +90,14 @@ class TestCuttTransposeOp(unittest.TestCase):
assert ((da-jda.data)<1e-5).all(), (da, jda.data, da-jda.data)
assert ((db-jdb.data)<1e-5).all(), (db-jdb.data)
@unittest.skipIf(cutt_ops==None, "Not use cutt, Skip")
@jt.flag_scope(use_cuda=1)
def test_transpose_empty(self):
a = jt.zeros((0, 10))
b = a.transpose(1, 0)
c = b.data
assert c.shape[0] == 10
assert c.shape[1] == 0
if __name__ == "__main__":
unittest.main()

View File

@ -0,0 +1,57 @@
# ***************************************************************
# Copyright (c) 2022 Jittor. All Rights Reserved.
# Maintainers:
# Dun Liang <randonlang@gmail.com>.
# Xiangli Li <1905692338@qq.com>
# Jiapeng Zhang <zhangjp20@mails.tsinghua.edu.cn>
#
# This file is subject to the terms and conditions defined in
# file 'LICENSE.txt', which is part of this source code package.
# ***************************************************************
import unittest
import jittor as jt
import numpy as np
skip_this_test = False
try:
jt.dirty_fix_pytorch_runtime_error()
import torch
except:
torch = None
skip_this_test = True
def test_unique_with_torch(input, dim=None):
jt0, jt1, jt2 = jt.unique(jt.array(input), True, True, dim)
torch0, torch1, torch2 = torch.unique(torch.tensor(input), True, True, True, dim)
assert np.allclose(jt0, torch0) and np.allclose(jt1, torch1) and np.allclose(jt2, torch2)
@unittest.skipIf(skip_this_test, "No Torch found")
class TestUnique(unittest.TestCase):
def test_unique(self):
test_unique_with_torch(np.array([1, 3, 2, 3, 3, 3], dtype=np.int32))
test_unique_with_torch(np.array([[1, 3], [2, 3], [1, 2]], dtype=np.int64))
def test_unique_dim(self):
test_unique_with_torch(np.array([[1, 3], [2, 3], [1, 3], [2, 3]]), 0)
test_unique_with_torch(np.array([[1, 3], [2, 3], [1, 3], [2, 3]]), 1)
@unittest.skipIf(not jt.compiler.has_cuda, "No CUDA found")
@jt.flag_scope(use_cuda=1)
def test_unique_cuda(self):
self.test_unique()
@unittest.skipIf(not jt.compiler.has_cuda, "No CUDA found")
@jt.flag_scope(use_cuda=1)
def test_unique_dim_cuda(self):
self.test_unique_dim()
if __name__ == "__main__":
unittest.main()

View File

@ -16,7 +16,7 @@ In detail, autocompletion of the following functions is supported.
- methods of jittor.Var
Prerequisite:
- mypy for automatic stub generation
- mypy for automatic stub generation, installation: pip install mypy
Usage: python3 -m jittor.utils.gen_pyi
@ -35,7 +35,7 @@ def add_indent(s: str, n=1):
def ctype_to_python(type_str):
if type_str == "bool":
return "bool"
if type_str in ["int", "uint", "int64", "uint64", "size_t"]:
if type_str in ["int", "uint", "uint8", "int64", "uint64", "size_t"]:
return "int"
if type_str in ["float32", "float64"]:
return "float"
@ -49,6 +49,8 @@ def ctype_to_python(type_str):
return "Var"
if type_str in ["vector<VarHolder*>", "vector<VarHolder*>&&"]:
return "List[Var]"
if type_str in ["vector_to_tuple<VarHolder*>"]:
return "Tuple[Var]"
if type_str == "NanoVector":
return "Tuple[int]"
if type_str == "vector<NanoVector>&&":
@ -161,11 +163,22 @@ def gen_ops_stub(jittor_path):
if func_name == "bool":
continue
docstring = func.__doc__[:func.__doc__.find("Declaration:")]
docstring = docstring.replace("'''", '"""').strip()
declarations = re.findall(r"Declaration:\n(.+)\n", func.__doc__)
docstrings = []
declarations = []
for i, doc in enumerate(re.split(r"Declaration:\n(.+)\n", func.__doc__)):
if i % 2 == 0:
if not doc.strip() and docstrings:
# if the current docstring is empty, use the last docstring
docstrings.append(docstrings[-1])
else:
docstrings.append(doc.replace("'''", '"""').strip())
else:
declarations.append(doc)
for i in range(len(declarations)):
decl = declarations[i]
docstring = docstrings[i]
for decl in declarations:
decorators = "@overload\n" if len(declarations) > 1 else ""
return_type = ctype_to_python(decl.split(' ', maxsplit=1)[0])
param_hints = decl_to_param_hints(decl)
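To illustrate the parsing change above: `re.split` with a capturing group yields alternating docstring/declaration chunks, and an empty docstring chunk falls back to its predecessor. A standalone sketch with a hypothetical doc blob:

```python
import re

doc = ("Add two vars.\n"
       "Declaration:\n"
       "VarHolder* add(VarHolder* x, VarHolder* y)\n"
       "\n"
       "Declaration:\n"
       "VarHolder* add(VarHolder* x, float y)\n")
parts = re.split(r"Declaration:\n(.+)\n", doc)
# even indices are docstrings (the empty one reuses the previous docstring),
# odd indices are the overload declarations
print(parts)
```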

View File

@ -8,6 +8,22 @@ import numpy as np
from typing import Any, BinaryIO, cast, Dict, Optional, Type, Tuple, Union, IO, List
loaded_storages = {}
deserialized_objects = {}
def _is_zipfile(fn):
f = open(fn, "rb")
read_bytes = []
start = f.tell()
byte = f.read(1)
while byte != b"":
read_bytes.append(byte)
if len(read_bytes) == 4:
break
byte = f.read(1)
f.seek(start)
local_header_magic_number = [b'P', b'K', b'\x03', b'\x04']
return read_bytes == local_header_magic_number
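For reference, `PK\x03\x04` is the ZIP local-file-header magic, which is what the helper above looks for; a quick self-contained check:

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("archive/data.pkl", b"")
print(buf.getvalue()[:4])   # b'PK\x03\x04', the signature _is_zipfile detects
```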
def _maybe_decode_ascii(bytes_str: Union[bytes, str]) -> str:
if isinstance(bytes_str, bytes):
@ -19,14 +35,14 @@ def load_tensor(contents, dtype, numel, key, location):
loaded_storages[key] = np.frombuffer(contents[name], dtype).copy()
def get_dtype_size(dtype):
if dtype is np.float32 or dtype is np.int32:
dtype = dtype.__str__()
if dtype == "float32" or dtype == "int32":
return 4
elif dtype is np.float64 or dtype is np.int64:
if dtype == "float64" or dtype == "int64":
return 8
elif dtype is np.float16 or dtype is np.int16:
if dtype == "float16" or dtype == "int16":
return 2
else:
return 1
return 1
def persistent_load(saved_id):
global contents
@ -38,7 +54,7 @@ def persistent_load(saved_id):
storage_type, key, location, numel = data
dtype = storage_type.dtype
if key not in loaded_storages:
nbytes = numel * get_dtype_size(dtype)
nbytes = numel
load_tensor(contents, dtype, nbytes, key, _maybe_decode_ascii(location))
return loaded_storages[key]
@ -72,13 +88,12 @@ class StorageType():
return f'StorageType(dtype={self.dtype})'
def jittor_rebuild(storage, storage_offset, size, stride, requires_grad, backward_hooks):
# print(storage, size)
if len(size) == 0:
return jt.array(storage)
return jt.array(storage).reshape(size)
def jittor_rebuild_var(data, requires_grad, backward_hooks):
v = jt.array(data)
v = jt.array(data)
v.requires_grad = requires_grad
return v
@ -96,6 +111,38 @@ class UnpicklerWrapper(pickle.Unpickler): # type: ignore[name-defined]
return super().find_class(mod_name, name)
class ArrayWrapper:
def __init__(self, storage, size=None, requires_grad=None):
self.requires_grad = requires_grad
self.size = size
self.storage = storage
def __str__(self):
return self.storage.__str__()
def jittor_rebuild_direct(storage, storage_offset, size, stride, requires_grad, backward_hooks):
if len(size) == 0:
return ArrayWrapper(storage, size=size)
storage = storage.reshape(size)
return ArrayWrapper(storage, size=size)
def jittor_rebuild_var_direct(data, requires_grad, backward_hooks):
v = ArrayWrapper(data, requires_grad=requires_grad)
return v
class DirectUnpicklerWrapper(pickle.Unpickler): # type: ignore[name-defined]
def find_class(self, mod_name, name):
if type(name) is str and 'Storage' in name:
try:
return StorageType(name)
except KeyError:
pass
if type(name) is str and '_rebuild_tensor_v2' in name:
return super().find_class("jittor_utils.load_pytorch", "jittor_rebuild_direct")
if type(name) is str and '_rebuild_parameter' in name:
return super().find_class("jittor_utils.load_pytorch", "jittor_rebuild_var_direct")
return super().find_class(mod_name, name)
def _check_seekable(f) -> bool:
def raise_err_msg(patterns, e):
for p in patterns:
@ -117,18 +164,97 @@ def extract_zip(input_zip):
input_zip = ZipFile(input_zip)
return {name: input_zip.read(name) for name in input_zip.namelist()}
def _is_compressed_file(f):
compress_modules = ['gzip']
try:
return f.__module__ in compress_modules
except AttributeError:
return False
def _should_read_directly(f):
if _is_compressed_file(f):
return False
try:
return f.fileno() >= 0
except io.UnsupportedOperation:
return False
except AttributeError:
return False
def persistent_load_direct(saved_id):
global deserialized_objects
assert isinstance(saved_id, tuple)
typename = _maybe_decode_ascii(saved_id[0])
data = saved_id[1:]
if typename == 'module':
# Ignore containers that don't have any sources saved
return data[0]
elif typename == 'storage':
data_type, root_key, location, size, view_metadata = data
location = _maybe_decode_ascii(location)
if root_key not in deserialized_objects:
deserialized_objects[root_key] = np.zeros(size, dtype=data_type.dtype)
storage = deserialized_objects[root_key]
if view_metadata is not None:
view_key, offset, view_size = view_metadata
if view_key not in deserialized_objects:
deserialized_objects[view_key] = storage[offset:offset + view_size]
return deserialized_objects[view_key]
else:
return storage
else:
raise RuntimeError("Unknown saved id type: %s" % saved_id[0])
def load_pytorch(fn_name):
global contents
global contents, deserialized_objects
if not fn_name.endswith(".pth"):
print("This function is designed to load pytorch pth format files.")
return None
else:
contents = extract_zip(fn_name)
data_file = io.BytesIO(contents['archive/data.pkl'])
pickle_load_args = {'encoding': 'utf-8'}
unpickler = UnpicklerWrapper(data_file, **pickle_load_args)
unpickler.persistent_load = persistent_load
result = unpickler.load()
if _is_zipfile(fn_name):
contents = extract_zip(fn_name)
data_file = io.BytesIO(contents['archive/data.pkl'])
pickle_load_args = {'encoding': 'utf-8'}
unpickler = UnpicklerWrapper(data_file, **pickle_load_args)
unpickler.persistent_load = persistent_load
result = unpickler.load()
else:
deserialized_objects = {}
f = open(fn_name, "rb")
f_should_read_directly = _should_read_directly(f)
MAGIC_NUMBER = 0x1950a86a20f9469cfc6c
PROTOCOL_VERSION = 1001
pickle_load_args = {'encoding': 'utf-8'}
magic_number = pickle.load(f, **pickle_load_args)
if magic_number != MAGIC_NUMBER:
raise RuntimeError("Invalid magic number; corrupt file?")
protocol_version = pickle.load(f, **pickle_load_args)
if PROTOCOL_VERSION != protocol_version:
raise RuntimeError("Invalid protocol version.")
_sys_info = pickle.load(f, **pickle_load_args)
unpickler = DirectUnpicklerWrapper(f, **pickle_load_args)
unpickler.persistent_load = persistent_load_direct
result = unpickler.load()
offset = f.tell() if f_should_read_directly else None
deserialized_storage_keys = pickle.load(f, **pickle_load_args)
f.read(8)
for key in deserialized_storage_keys:
assert key in deserialized_objects
dtype = deserialized_objects[key].dtype
size = deserialized_objects[key].size * get_dtype_size(dtype)
byte_data = f.read(size)
deserialized_objects[key][:] = np.frombuffer(byte_data, dtype).copy()
f.read(8)
if offset is not None:
offset = f.tell()
for key, params in result.items():
requires_grad = params.requires_grad
shape = params.size
result[key] = jt.array(params.storage)
if shape is not None and len(shape) > 0:
result[key] = result[key].reshape(shape)
if requires_grad is not None:
result[key].requires_grad = requires_grad
return result
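A usage sketch (the checkpoint path is hypothetical); both the zip-based and the legacy pickle-based `.pth` layouts now land in the same dict of jt.Var:

```python
import jittor as jt
from jittor_utils.load_pytorch import load_pytorch

params = load_pytorch("checkpoint.pth")   # hypothetical file name
for name, var in params.items():
    print(name, var.shape)
```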
if __name__ == "__main__":

View File

@ -19,23 +19,69 @@ def check_is_en(src):
return en_cnt == len(src)
def check_is_both(src):
if src.startswith("!"):
return True
return len(src) < 2
def splite_markdown_blocks(src):
''' split markdown document into text, code, table blocks
'''
blocks = []
block = ""
status = "text"
def commit_block():
blocks.append((block, status))
for line in src.split('\n'):
line = line + "\n"
if line.startswith("```"):
assert status in ["text", "code"]
if status == "text":
commit_block()
status = "code"
block = line
elif status == "code":
block += line
commit_block()
status = "text"
block = ""
elif line.strip().startswith('|') and line.strip().endswith('|'):
assert status in ["text", "table"]
if status == "text":
commit_block()
status = "table"
block = line
else:
block += line
else:
if status == "table":
commit_block()
status = "text"
block = line
else:
block += line
if status != "code":
commit_block()
return blocks
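A small check of the splitter on a synthetic document (assuming the function above is in scope; the fence string is built at runtime to avoid nesting code fences here):

```python
fence = "`" * 3
sample = (
    "some text\n"
    + fence + "python\nprint('hi')\n" + fence + "\n"
    + "| a | b |\n| 1 | 2 |\n"
    + "more text"
)
for block, kind in splite_markdown_blocks(sample):
    print(kind, repr(block))
# code and table blocks come back verbatim and are later copied into both
# the English and Chinese outputs unchanged; only text blocks are split.
```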
for mdname in all_src_md:
print(mdname)
with open(mdname, "r", encoding='utf8') as f:
src = f.read()
src = src.split("```")
en_src = []
cn_src = []
for i, s in enumerate(src):
if i%2==1:
en_src.append(s)
cn_src.append(s)
src_blocks = splite_markdown_blocks(src)
en_src = ""
cn_src = ""
for block, status in src_blocks:
if status == "code" or status == "table":
en_src += block
cn_src += block
else:
en_s = []
cn_s = []
for line in s.split('\n'):
for line in block.split('\n'):
if check_is_both(line):
en_s.append(line)
cn_s.append(line)
@ -43,10 +89,9 @@ for mdname in all_src_md:
en_s.append(line)
else:
cn_s.append(line)
en_src.append("\n".join(en_s))
cn_src.append("\n".join(cn_s))
en_src = "```".join(en_src)
cn_src = "```".join(cn_src)
en_src += "\n".join(en_s)
cn_src += "\n".join(cn_s)
with open(mdname.replace(".src.md", ".md"), 'w', encoding='utf8') as f:
f.write(en_src)
with open(mdname.replace(".src.md", ".cn.md"), 'w', encoding='utf8') as f: