This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Commit

Update AGI startup for pytorch install
Signed-off-by: Jianliang Shen <[email protected]>
Jianliang Shen committed Jun 26, 2024
1 parent 1b406f1 commit 9fde626
Showing 8 changed files with 356 additions and 79 deletions.
411 changes: 334 additions & 77 deletions source/_posts/AGI-Startup.md

Large diffs are not rendered by default.

22 changes: 21 additions & 1 deletion source/_posts/CUDA——C.md
@@ -8,7 +8,7 @@ tags:
- GPU
- CUDA
categories:
- CUDA
- GPU
---

Source code: <https://www.wiley.com/en-us/Professional+CUDA+C+Programming-p-9781118739327>
@@ -89,6 +89,15 @@ categories:
- [4.1 CUDA Memory Model Overview](#41-cuda内存模型概述)
- [4.1.1 Benefits of a Memory Hierarchy](#411-内存层次结构的优点)
- [4.1.2 CUDA Memory Model](#412-cuda内存模型)
- [4.1.2.1 Registers](#4121-寄存器)
- [4.1.2.2 Local Memory](#4122-本地内存)
- [4.1.2.3 Shared Memory](#4123-共享内存)
- [4.1.2.4 Constant Memory](#4124-常量内存)
- [4.1.2.5 Texture Memory](#4125-纹理内存)
- [4.1.2.6 Global Memory](#4126-全局内存)
- [4.1.2.7 GPU Caches](#4127-gpu缓存)
- [4.1.2.8 CUDA Variable Declaration Summary](#4128-cuda变量声明总结)
- [4.1.2.9 Static Global Memory](#4129-静态全局内存)
- [4.2 Memory Management](#42-内存管理)
- [4.2.1 Memory Allocation and Deallocation](#421-内存分配和释放)
- [4.2.2 Memory Transfer](#422-内存传输)
@@ -99,12 +108,23 @@ categories:
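- [4.3 Memory Access Patterns](#43-内存访问模式)
- [4.3.1 Aligned and Coalesced Access](#431-对齐与合并访问)
- [4.3.2 Global Memory Reads](#432-全局内存读取)
- [4.3.2.1 Cached Loads](#4321-缓存加载)
- [4.3.2.2 Uncached Loads](#4322-没有缓存的加载)
- [4.3.2.3 Example of Misaligned Reads](#4323-非对齐读取的示例)
- [4.3.2.4 Read-Only Cache](#4324-只读缓存)
- [4.3.3 Global Memory Writes](#433-全局内存写入)
- [4.3.4 Array of Structures versus Structure of Arrays](#434-结构体数组与数组结构体)
- [4.3.5 Performance Tuning](#435-性能调整)
- [4.3.5.1 Unrolling Techniques](#4351-展开技术)
- [4.3.5.2 Exposing More Parallelism](#4352-增大并行性)
- [4.4 Bandwidth Achievable by a Kernel](#44-核函数可达到的带宽)
- [4.4.1 Memory Bandwidth](#441-内存带宽)
- [4.4.2 Matrix Transpose Problem](#442-矩阵转置问题)
- [4.4.2.1 Setting Upper and Lower Performance Bounds for Transpose Kernels](#4421-为转置核函数设置性能的上限和下限)
- [4.4.2.2 Naive Transpose: Reading Rows versus Reading Columns](#4422-朴素转置读取行与读取列)
- [4.4.2.3 Unrolled Transpose: Reading Rows versus Reading Columns](#4423-展开转置读取行与读取列)
- [4.4.2.4 Diagonal Transpose: Reading Rows versus Reading Columns](#4424-对角转置读取行与读取列)
- [4.4.2.5 Using Thin Blocks to Expose More Parallelism](#4425-使用瘦块来增加并行性)
- [4.5 Matrix Addition with Unified Memory](#45-使用统一内存的矩阵加法)
- [4.6 Summary](#46-总结)
- [Chapter 5: Shared Memory and Constant Memory](#第5章-共享内存和常量内存)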
2 changes: 1 addition & 1 deletion source/_posts/CUDA——CUDA并行程序设计.md
@@ -8,7 +8,7 @@ tags:
- GPU
- CUDA
categories:
- CUDA
- GPU
---

## Table of Contents
Binary file added themes/fluid/source/img/post_pics/ai/Pytorch.png

