Machine Learning and Computer Vision (OpenCL Programming)

Original

费晓行

2020-05-03 14:08

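Because I have been learning CUDA, I have recently become interested in GPU programming. As everyone knows, CUDA is an NVIDIA product, so I started wondering how GPUs from other vendors are programmed. The answer is OpenCL.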

1. OpenCL programming

OpenCL supports GPUs from multiple vendors, such as NVIDIA, ATI (now AMD), and ARM, and it can also run on embedded devices.

2. The OpenCL build environment

Most GPU development environments, for example the CUDA SDK, already provide what you need to develop OpenCL programs.

3. The simplest OpenCL code

https://github.com/giobauermeister/OpenCL-test-apps/blob/master/cl_sample_timer/cl_sample_timer.c

4. What to watch for when compiling OpenCL code

When linking, just add OpenCL.lib (with GCC or Clang on Linux, the equivalent is -lOpenCL).

5. Querying OpenCL device information

You can refer to this link: https://www.cnblogs.com/mtcnn/p/9411861.html

The two main APIs used are clGetPlatformIDs and clGetPlatformInfo.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // strstr
#include <iostream>
#include <CL/cl.h>


int main()
{
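	// cl_platform_id identifies one OpenCL platform, i.e. one vendor's implementation (for example NVIDIA or AMD)
	cl_platform_id *platforms;

	// Portable unsigned int and int types defined by OpenCL
	cl_uint num_platforms;
	cl_int i, err, platform_index = -1;

	char* ext_data;
	size_t ext_size;
	const char icd_ext[] = "cl_khr_icd";

	// Using platforms takes two steps: (1) allocate memory for the cl_platform_id array, (2) call clGetPlatformIDs to fill it in. Usually there is also a step 0: ask how many platforms the host has.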

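	// Query how many OpenCL platforms are available on this machine
	err = clGetPlatformIDs(0, NULL, &num_platforms);
	if (err != CL_SUCCESS)
	{
		// OpenCL does not set errno, so print the OpenCL error code instead of calling perror
		fprintf(stderr, "Couldn't find any platforms (error %d).\n", err);
		exit(1);
	}
	printf("Number of OpenCL platforms on this machine: %u\n", num_platforms);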

	// Allocate space for the platform IDs
	platforms = (cl_platform_id*)
		malloc(sizeof(cl_platform_id) * num_platforms);

	clGetPlatformIDs(num_platforms, platforms, NULL);

	// Print the details of each platform
	for (i = 0; i < (cl_int)num_platforms; i++)
	{
		// Query the size of the extension string
		err = clGetPlatformInfo(platforms[i],
			CL_PLATFORM_EXTENSIONS, 0, NULL, &ext_size);
		if (err != CL_SUCCESS)
		{
			fprintf(stderr, "Couldn't read extension data (error %d).\n", err);
			exit(1);
		}

		printf("Buffer size: %zu\n", ext_size);

		ext_data = (char*)malloc(ext_size);

		// Get the supported extensions
		clGetPlatformInfo(platforms[i], CL_PLATFORM_EXTENSIONS,
			ext_size, ext_data, NULL);
		printf("Extensions supported by platform %d: %s\n", i, ext_data);

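		// Get the platform name; size the buffer from its own query, not from the extension string length
		size_t name_size = 0;
		clGetPlatformInfo(platforms[i], CL_PLATFORM_NAME, 0, NULL, &name_size);
		char *name = (char*)malloc(name_size);
		clGetPlatformInfo(platforms[i], CL_PLATFORM_NAME,
			name_size, name, NULL);
		printf("Platform %d is: %s\n", i, name);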

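		// Get the platform vendor name
		size_t vendor_size = 0;
		clGetPlatformInfo(platforms[i], CL_PLATFORM_VENDOR, 0, NULL, &vendor_size);
		char *vendor = (char*)malloc(vendor_size);
		clGetPlatformInfo(platforms[i], CL_PLATFORM_VENDOR,
			vendor_size, vendor, NULL);
		printf("The vendor of platform %d is: %s\n", i, vendor);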

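		// Get the platform version
		size_t version_size = 0;
		clGetPlatformInfo(platforms[i], CL_PLATFORM_VERSION, 0, NULL, &version_size);
		char *version = (char*)malloc(version_size);
		clGetPlatformInfo(platforms[i], CL_PLATFORM_VERSION,
			version_size, version, NULL);
		printf("Version of platform %d: %s\n", i, version);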

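		// Query whether the platform implements the full profile or the embedded profile
		size_t profile_size = 0;
		clGetPlatformInfo(platforms[i], CL_PLATFORM_PROFILE, 0, NULL, &profile_size);
		char *profile = (char*)malloc(profile_size);
		clGetPlatformInfo(platforms[i], CL_PLATFORM_PROFILE,
			profile_size, profile, NULL);
		printf("Is platform %d full profile or embedded profile?: %s\n", i, profile);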

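		// Check whether this platform supports the ICD extension
		bool has_icd = strstr(ext_data, icd_ext) != NULL;
		if (has_icd)
			platform_index = i;
		std::cout << "Most recent platform index with ICD support = " << platform_index << std::endl;
		/* Report ICD support for the current platform only, not for one remembered from an earlier iteration */
		if (has_icd)
			printf("Platform %d supports the ICD extension: %s\n",
				i, icd_ext);
		std::cout << std::endl;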

		// Free the per-platform buffers
		free(ext_data);
		free(name);
		free(vendor);
		free(version);
		free(profile);
	}

	if (platform_index <= -1)
		printf("No platforms support the %s extension.\n", icd_ext);
	getchar();

	// Release resources
	free(platforms);
	return 0;
}
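
One detail in the listing is worth a small sketch: strstr matches substrings, so it would also report a hit for a longer extension name that merely starts with cl_khr_icd. Since the extension string is a space-separated list of names, a stricter check compares whole tokens. The helper name has_extension below is my own, not part of the OpenCL API, and this is only a minimal sketch of the idea.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not an OpenCL API): returns true only if `wanted`
// appears as a complete, whitespace-separated token in `ext_list`, the
// string reported for CL_PLATFORM_EXTENSIONS.
bool has_extension(const std::string& ext_list, const std::string& wanted)
{
    assert(!wanted.empty());       // an empty name can never be a token
    std::istringstream tokens(ext_list);
    std::string token;
    while (tokens >> token)        // operator>> splits on whitespace
    {
        if (token == wanted)
            return true;
    }
    return false;
}
```

In the loop above, has_extension(ext_data, icd_ext) could take the place of the strstr test. With a made-up list such as "cl_khr_icd_example cl_khr_fp64", strstr finds cl_khr_icd while this helper does not.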

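6. What OpenCL and CUDA have in common

OpenCL and CUDA are structured in much the same way: the host-side API code runs on the CPU, and only the kernel code runs on the GPU. Besides C, both can also be used from Python (through bindings such as PyOpenCL and PyCUDA). You hand the kernel source to Python, and it is not complicated to use.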
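
To make the host/kernel split concrete, here is a minimal sketch that needs no GPU or OpenCL SDK to compile. It keeps a vector-add kernel as OpenCL C source text, the way a host program would hold it before handing it to the runtime, and emulates the launch on the CPU with a plain loop. All names here (vec_add_source, vec_add_work_item, emulate_vec_add) are my own and purely illustrative. A real program would build and enqueue the kernel through the OpenCL API instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// OpenCL C source for a vector-add kernel. In a real program the host
// passes this text to the OpenCL runtime, which compiles it for the device.
const char* vec_add_source()
{
    return
        "__kernel void vec_add(__global const float* a,\n"
        "                      __global const float* b,\n"
        "                      __global float* out)\n"
        "{\n"
        "    size_t gid = get_global_id(0);\n"
        "    out[gid] = a[gid] + b[gid];\n"
        "}\n";
}

// CPU stand-in for one work-item of the kernel above; `gid` plays the
// role of get_global_id(0).
void vec_add_work_item(const float* a, const float* b, float* out, std::size_t gid)
{
    out[gid] = a[gid] + b[gid];
}

// CPU stand-in for launching the kernel over a 1-D range. A GPU would run
// the work-items in parallel; this loop simply runs them one after another.
std::vector<float> emulate_vec_add(const std::vector<float>& a,
                                   const std::vector<float>& b)
{
    assert(a.size() == b.size());  // both inputs must cover the same range
    std::vector<float> out(a.size());
    for (std::size_t gid = 0; gid < a.size(); ++gid)
        vec_add_work_item(a.data(), b.data(), out.data(), gid);
    return out;
}
```

Everything except the kernel body is ordinary host code. That is the same division of labor in both OpenCL and CUDA.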

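7. The relationship between OpenCL and CUDA

OpenCL and CUDA are somewhat like OpenGL and DirectX. For cross-platform work OpenCL is the better fit, but CUDA has the stronger ecosystem. Besides cuDNN, CUDA also comes with many other libraries, which is very convenient for anyone who needs to port math libraries.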

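8. How to learn and use them as an individual

If you develop on NVIDIA hardware, using CUDA as much as possible is never a mistake. On ARM-family GPUs, however, OpenCL is basically your only choice. Otherwise, all you can do is use NEON instructions to accelerate a few specific algorithms.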

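PS:

At present, the DSP, FPGA, GPU, and NN (neural network accelerator) blocks on an SoC can all be used for data computation. Because of power consumption, habit, and accumulated expertise, though, we usually use only one of them. That is exactly when it pays to learn a bit more about programming the other kinds of hardware. From any angle, OpenCL and CUDA are both very good choices: they meet the need for parallelism and are also much easier to learn. DSP and FPGA programming is too niche, and every vendor has its own NN solution, so GPU programming may be the only one you can really control yourself.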
