@@ -28,7 +28,7 @@ reduce memory usage and unnecessary ``memcpy`` calls.
28
28
.. _memory_allocation_virtual_memory :
29
29
30
30
Memory allocation
31
- ================================================================================
31
+ =================
32
32
33
33
Standard memory allocation uses the :cpp:func: `hipMalloc ` function to allocate a
34
34
block of memory on the device. However, when using virtual memory, this process
@@ -37,53 +37,89 @@ is separated into multiple steps using the :cpp:func:`hipMemCreate`,
37
37
:cpp:func: `hipMemSetAccess ` functions. This guide explains what these functions
38
38
do and how you can use them for virtual memory management.
39
39
40
+ .. _vmm_support :
41
+
42
+ Virtual memory management support
43
+ ---------------------------------
44
+
45
+ The first step is to check if the targeted device or GPU supports virtual memory management.
46
+ Use the :cpp:func: `hipDeviceGetAttribute ` function to get the
47
+ ``hipDeviceAttributeVirtualMemoryManagementSupported `` attribute for a specific GPU, as shown in the following example.
48
+
49
+ .. code-block :: cpp
50
+
51
+ int vmm = 0, currentDev = 0;
52
+ hipDeviceGetAttribute(
53
+ &vmm, hipDeviceAttributeVirtualMemoryManagementSupported, currentDev
54
+ );
55
+
56
+ if (vmm == 0) {
57
+ std::cout << "GPU " << currentDev << " doesn't support virtual memory management." << std::endl;
58
+ } else {
59
+ std::cout << "GPU " << currentDev << " support virtual memory management." << std::endl;
60
+ }
61
+
62
+ .. _allocate_physical_memory :
63
+
40
64
Allocate physical memory
41
- --------------------------------------------------------------------------------
65
+ ------------------------
42
66
43
- The first step is to allocate the physical memory itself with the
67
+ The next step is to allocate the physical memory using the
44
68
:cpp:func: `hipMemCreate ` function. This function accepts the size of the buffer,
45
69
an ``unsigned long long `` variable for the flags, and a
46
70
:cpp:struct: `hipMemAllocationProp ` variable. :cpp:struct: `hipMemAllocationProp `
47
71
contains the properties of the memory to be allocated, such as where the memory
48
72
is physically located and what kind of shareable handles are available. If the
49
73
allocation is successful, the function returns a value of
50
74
:cpp:enumerator: `hipSuccess `, with :cpp:type: `hipMemGenericAllocationHandle_t `
51
- representing a valid physical memory allocation. The allocated memory size must
52
- be aligned with the granularity appropriate for the properties of the
53
- allocation. You can use the :cpp:func: `hipMemGetAllocationGranularity ` function
54
- to determine the correct granularity.
75
+ representing a valid physical memory allocation.
76
+
77
+ The allocated memory must be aligned with the appropriate granularity. The
78
+ granularity value can be queried with :cpp:func: `hipMemGetAllocationGranularity `,
79
+ and its value depends on the target device hardware and the type of memory
80
+ allocation. If the allocation size is not aligned, meaning it is not cleanly
81
+ divisible by the minimum granularity value, :cpp:func: `hipMemCreate ` will return
82
+ an out-of-memory error.
55
83
56
84
.. code-block :: cpp
57
85
58
86
size_t granularity = 0;
59
87
hipMemGenericAllocationHandle_t allocHandle;
60
88
hipMemAllocationProp prop = {};
61
- prop.type = HIP_MEM_ALLOCATION_TYPE_PINNED;
62
- prop.location.type = HIP_MEM_LOCATION_TYPE_DEVICE;
89
+ // The pinned allocation type cannot be migrated from its current location
90
+ // while the application is actively using it.
91
+ prop.type = hipMemAllocationTypePinned;
92
+ // Set the location type to device, currently there are no other valid option.
93
+ prop.location.type = hipMemLocationTypeDevice;
94
+ // Set the device id, where the memory will be allocated.
63
95
prop.location.id = currentDev;
64
- hipMemGetAllocationGranularity(&granularity, &prop, HIP_MEM_ALLOC_GRANULARITY_MINIMUM );
96
+ hipMemGetAllocationGranularity(&granularity, &prop, hipMemAllocationGranularityMinimum );
65
97
padded_size = ROUND_UP(size, granularity);
66
98
hipMemCreate(&allocHandle, padded_size, &prop, 0);
67
99
100
+ .. _reserve_virtual_address :
101
+
68
102
Reserve virtual address range
69
- --------------------------------------------------------------------------------
103
+ -----------------------------
70
104
71
- After you have acquired an allocation of physical memory, you must map it before
72
- you can use it. To do so, you need a virtual address to map it to. Mapping
73
- means the physical memory allocation is available from the virtual address range
74
- it is mapped to. To reserve a virtual memory range, use the
75
- :cpp:func: ` hipMemAddressReserve ` function. The size of the virtual memory must
76
- match the amount of physical memory previously allocated. You can then map the
77
- physical memory allocation to the newly-acquired virtual memory address range
78
- using the :cpp:func: ` hipMemMap ` function.
105
+ After you have acquired an allocation of physical memory, you must map it to a
106
+ virtual address before you can use it. Mapping means the physical memory
107
+ allocation is available from the virtual address range it is mapped to. To
108
+ reserve a virtual memory range, use the :cpp:func: ` hipMemAddressReserve `
109
+ function. The size of the virtual memory must match the amount of physical
110
+ memory previously allocated. You can then map the physical memory allocation to
111
+ the newly-acquired virtual memory address range using the :cpp:func: ` hipMemMap `
112
+ function.
79
113
80
114
.. code-block :: cpp
81
115
82
116
hipMemAddressReserve(&ptr, padded_size, 0, 0, 0);
83
117
hipMemMap(ptr, padded_size, 0, allocHandle, 0);
84
118
119
+ .. _set_memory_access :
120
+
85
121
Set memory access
86
- --------------------------------------------------------------------------------
122
+ -----------------
87
123
88
124
Finally, use the :cpp:func: `hipMemSetAccess ` function to enable memory access.
89
125
It accepts the pointer to the virtual memory, the size, and a
@@ -103,16 +139,39 @@ devices.
103
139
.. code-block :: cpp
104
140
105
141
hipMemAccessDesc accessDesc = {};
106
- accessDesc.location.type = HIP_MEM_LOCATION_TYPE_DEVICE ;
142
+ accessDesc.location.type = hipMemLocationTypeDevice ;
107
143
accessDesc.location.id = currentDev;
108
- accessDesc.flags = HIP_MEM_ACCESS_FLAGS_PROT_READWRITE ;
144
+ accessDesc.flags = hipMemAccessFlagsProtReadwrite ;
109
145
hipMemSetAccess(ptr, padded_size, &accessDesc, 1);
110
146
111
147
At this point the memory is allocated, mapped, and ready for use. You can read
112
148
and write to it, just like you would a C style memory allocation.
113
149
150
+ .. _usage_virtual_memory :
151
+
152
+ Dynamically increase allocation size
153
+ ------------------------------------
154
+
155
+ To increase the amount of pre-allocated memory, use
156
+ :cpp:func: `hipMemAddressReserve `, which accepts the starting address, and the
157
+ size of the reservation in bytes. This allows you to have a continuous virtual
158
+ address space without worrying about the underlying physical allocation.
159
+
160
+ .. code-block :: cpp
161
+
162
+ hipMemAddressReserve(&new_ptr, (new_size - padded_size), 0, ptr + padded_size, 0);
163
+ hipMemMap(new_ptr, (new_size - padded_size), 0, newAllocHandle, 0);
164
+ hipMemSetAccess(new_ptr, (new_size - padded_size), &accessDesc, 1);
165
+
166
+ The code sample above assumes that :cpp:func: `hipMemAddressReserve ` was able to
167
+ reserve the memory address at the specified location. However, this isn't
168
+ guaranteed to be true, so you should validate that ``new_ptr `` points to a
169
+ specific virtual address before using it.
170
+
171
+ .. _free_virtual_memory :
172
+
114
173
Free virtual memory
115
- --------------------------------------------------------------------------------
174
+ -------------------
116
175
117
176
To free the memory allocated in this manner, use the corresponding free
118
177
functions. To unmap the memory, use :cpp:func: `hipMemUnmap `. To release the
@@ -128,27 +187,197 @@ synchronizes the device. This causes worse resource usage and performance.
128
187
hipMemRelease(allocHandle);
129
188
hipMemAddressFree(ptr, size);
130
189
131
- .. _usage_virtual_memory :
190
+ Example code
191
+ ============
192
+
193
+ The virtual memory management example follows these steps:
194
+
195
+ 1. Check virtual memory management :ref: `support <vmm_support >`:
196
+ The :cpp:func: `hipDeviceGetAttribute ` function is used to check the virtual
197
+ memory management support of the GPU with ID 0.
198
+
199
+ 2. Physical memory :ref: `allocation <allocate_physical_memory >`: Physical memory
200
+ is allocated using :cpp:func: `hipMemCreate ` with pinned memory on the
201
+ device.
202
+
203
+ 3. Virtual memory :ref: `reservation <reserve_virtual_address >`: Virtual address
204
+ range is reserved using :cpp:func: `hipMemAddressReserve `.
205
+
206
+ 4. Mapping virtual address to physical memory: The physical memory is mapped
207
+ to a virtual address (``virtualPointer ``) using :cpp:func: `hipMemMap `.
208
+
209
+ 5. Memory :ref: `access permissions<set_memory_access> `: Permission is set for
210
+ pointer to allow read and write access using :cpp:func: `hipMemSetAccess `.
211
+
212
+ 6. Memory operation: Data is written to the memory via ``virtualPointer ``.
213
+
214
+ 7. Launch kernels: The ``zeroAddr `` and ``fillAddr `` kernels are
215
+ launched using the virtual memory pointer.
216
+
217
+ 8. :ref: `Cleanup <free_virtual_memory >`: The mappings, physical memory, and
218
+ virtual address are released at the end to avoid memory leaks.
219
+
220
+ .. code-block :: cpp
132
221
133
- Memory usage
222
+ #include <hip/hip_runtime.h>
223
+ #include <iostream>
224
+
225
+ #define ROUND_UP(SIZE,GRANULARITY) ((1 + SIZE / GRANULARITY) * GRANULARITY)
226
+
227
+ #define HIP_CHECK(expression) \
228
+ { \
229
+ const hipError_t err = expression; \
230
+ if(err != hipSuccess){ \
231
+ std::cerr << "HIP error: " \
232
+ << hipGetErrorString(err) \
233
+ << " at " << __LINE__ << "\n"; \
234
+ } \
235
+ }
236
+
237
+ __global__ void zeroAddr(int* pointer) {
238
+ *pointer = 0;
239
+ }
240
+
241
+ __global__ void fillAddr(int* pointer) {
242
+ *pointer = 42;
243
+ }
244
+
245
+
246
+ int main() {
247
+
248
+ int currentDev = 0;
249
+
250
+ // Step 1: Check virtual memory management support on device 0
251
+ int vmm = 0;
252
+ HIP_CHECK(
253
+ hipDeviceGetAttribute(
254
+ &vmm, hipDeviceAttributeVirtualMemoryManagementSupported, currentDev
255
+ )
256
+ );
257
+
258
+ std::cout << "Virtual memory management support value: " << vmm << std::endl;
259
+
260
+ if (vmm == 0) {
261
+ std::cout << "GPU 0 doesn't support virtual memory management.";
262
+ return 0;
263
+ }
264
+
265
+ // Size of memory to allocate
266
+ size_t size = 4 * 1024;
267
+
268
+ // Step 2: Allocate physical memory
269
+ hipMemGenericAllocationHandle_t allocHandle;
270
+ hipMemAllocationProp prop = {};
271
+ prop.type = hipMemAllocationTypePinned;
272
+ prop.location.type = hipMemLocationTypeDevice;
273
+ prop.location.id = currentDev;
274
+ size_t granularity = 0;
275
+ HIP_CHECK(
276
+ hipMemGetAllocationGranularity(
277
+ &granularity,
278
+ &prop,
279
+ hipMemAllocationGranularityMinimum));
280
+ size_t padded_size = ROUND_UP(size, granularity);
281
+ HIP_CHECK(hipMemCreate(&allocHandle, padded_size * 2, &prop, 0));
282
+
283
+ // Step 3: Reserve a virtual memory address range
284
+ void* virtualPointer = nullptr;
285
+ HIP_CHECK(hipMemAddressReserve(&virtualPointer, padded_size, granularity, nullptr, 0));
286
+
287
+ // Step 4: Map the physical memory to the virtual address range
288
+ HIP_CHECK(hipMemMap(virtualPointer, padded_size, 0, allocHandle, 0));
289
+
290
+ // Step 5: Set memory access permission for pointer
291
+ hipMemAccessDesc accessDesc = {};
292
+ accessDesc.location.type = hipMemLocationTypeDevice;
293
+ accessDesc.location.id = currentDev;
294
+ accessDesc.flags = hipMemAccessFlagsProtReadWrite;
295
+
296
+ HIP_CHECK(hipMemSetAccess(virtualPointer, padded_size, &accessDesc, 1));
297
+
298
+ // Step 6: Perform memory operation
299
+ int value = 42;
300
+ HIP_CHECK(hipMemcpy(virtualPointer, &value, sizeof(int), hipMemcpyHostToDevice));
301
+
302
+ int result = 1;
303
+ HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));
304
+ if( result == 42) {
305
+ std::cout << "Success. Value: " << result << std::endl;
306
+ } else {
307
+ std::cout << "Failure. Value: " << result << std::endl;
308
+ }
309
+
310
+ // Step 7: Launch kernels
311
+ // Launch zeroAddr kernel
312
+ zeroAddr<<<1, 1>>>((int*)virtualPointer);
313
+ HIP_CHECK(hipDeviceSynchronize());
314
+
315
+ // Check zeroAddr kernel result
316
+ result = 1;
317
+ HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));
318
+ if( result == 0) {
319
+ std::cout << "Success. zeroAddr kernel: " << result << std::endl;
320
+ } else {
321
+ std::cout << "Failure. zeroAddr kernel: " << result << std::endl;
322
+ }
323
+
324
+ // Launch fillAddr kernel
325
+ fillAddr<<<1, 1>>>((int*)virtualPointer);
326
+ HIP_CHECK(hipDeviceSynchronize());
327
+
328
+ // Check fillAddr kernel result
329
+ result = 1;
330
+ HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));
331
+ if( result == 42) {
332
+ std::cout << "Success. fillAddr kernel: " << result << std::endl;
333
+ } else {
334
+ std::cout << "Failure. fillAddr kernel: " << result << std::endl;
335
+ }
336
+
337
+ // Step 8: Cleanup
338
+ HIP_CHECK(hipMemUnmap(virtualPointer, padded_size));
339
+ HIP_CHECK(hipMemRelease(allocHandle));
340
+ HIP_CHECK(hipMemAddressFree(virtualPointer, padded_size));
341
+
342
+ return 0;
343
+ }
344
+
345
+ Virtual aliases
134
346
================================================================================
135
347
136
- Dynamically increase allocation size
137
- --------------------------------------------------------------------------------
348
+ Virtual aliases are multiple virtual memory addresses mapping to the same
349
+ physical memory on the GPU. When this occurs, different threads, processes, or memory
350
+ allocations to access shared physical memory through different virtual
351
+ addresses on different devices.
352
+
353
+ Multiple virtual memory mappings can be created using multiple calls to
354
+ :cpp:func: `hipMemMap ` on the same memory allocation.
355
+
356
+ .. note ::
357
+
358
+ RDNA cards may not produce correct results, if users access two different
359
+ virtual addresses that map to the same physical address. In this case, the
360
+ L1 data caches will be incoherent due to the virtual-to-physical aliasing.
361
+ These GPUs will produce correct results if users access virtual-to-physical
362
+ aliases using volatile pointers.
138
363
139
- The :cpp:func: ` hipMemAddressReserve ` function allows you to increase the amount
140
- of pre-allocated memory. This function accepts a parameter representing the
141
- requested starting address of the virtual memory. This allows you to have a
142
- continuous virtual address space without worrying about the underlying physical
143
- allocation.
364
+ NVIDIA GPUs require special fences to produce correct results when
365
+ using virtual aliases.
366
+
367
+ In the following code block, the kernels input device pointers are virtual
368
+ aliases of the same memory allocation:
144
369
145
370
.. code-block :: cpp
146
371
147
- hipMemAddressReserve(&new_ptr, (new_size - padded_size), 0, ptr + padded_size, 0);
148
- hipMemMap(new_ptr, (new_size - padded_size), 0, newAllocHandle, 0);
149
- hipMemSetAccess(new_ptr, (new_size - padded_size), &accessDesc, 1);
372
+ __global__ void updateBoth(int* pointerA, int* pointerB) {
373
+ // May produce incorrect results on RDNA and NVIDIA cards.
374
+ *pointerA = 0;
375
+ *pointerB = 42;
376
+ }
377
+
378
+ __global__ void updateBoth_v2(volatile int* pointerA, volatile int* pointerB) {
379
+ // May produce incorrect results on NVIDIA cards.
380
+ *pointerA = 0;
381
+ *pointerB = 42;
382
+ }
150
383
151
- The code sample above assumes that :cpp:func: `hipMemAddressReserve ` was able to
152
- reserve the memory address at the specified location. However, this isn't
153
- guaranteed to be true, so you should validate that ``new_ptr `` points to a
154
- specific virtual address before using it.
0 commit comments