Skip to content

Commit 8e7a935

Browse files
neon60randyh62
authored andcommitted
Add virtual aliases support section
1 parent 90388dd commit 8e7a935

File tree

2 files changed

+272
-40
lines changed

2 files changed

+272
-40
lines changed

docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst

+268-39
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ reduce memory usage and unnecessary ``memcpy`` calls.
2828
.. _memory_allocation_virtual_memory:
2929

3030
Memory allocation
31-
================================================================================
31+
=================
3232

3333
Standard memory allocation uses the :cpp:func:`hipMalloc` function to allocate a
3434
block of memory on the device. However, when using virtual memory, this process
@@ -37,53 +37,89 @@ is separated into multiple steps using the :cpp:func:`hipMemCreate`,
3737
:cpp:func:`hipMemSetAccess` functions. This guide explains what these functions
3838
do and how you can use them for virtual memory management.
3939

40+
.. _vmm_support:
41+
42+
Virtual memory management support
43+
---------------------------------
44+
45+
The first step is to check if the targeted device or GPU supports virtual memory management.
46+
Use the :cpp:func:`hipDeviceGetAttribute` function to get the
47+
``hipDeviceAttributeVirtualMemoryManagementSupported`` attribute for a specific GPU, as shown in the following example.
48+
49+
.. code-block:: cpp
50+
51+
int vmm = 0, currentDev = 0;
52+
hipDeviceGetAttribute(
53+
&vmm, hipDeviceAttributeVirtualMemoryManagementSupported, currentDev
54+
);
55+
56+
if (vmm == 0) {
57+
std::cout << "GPU " << currentDev << " doesn't support virtual memory management." << std::endl;
58+
} else {
59+
std::cout << "GPU " << currentDev << " support virtual memory management." << std::endl;
60+
}
61+
62+
.. _allocate_physical_memory:
63+
4064
Allocate physical memory
41-
--------------------------------------------------------------------------------
65+
------------------------
4266

43-
The first step is to allocate the physical memory itself with the
67+
The next step is to allocate the physical memory using the
4468
:cpp:func:`hipMemCreate` function. This function accepts the size of the buffer,
4569
an ``unsigned long long`` variable for the flags, and a
4670
:cpp:struct:`hipMemAllocationProp` variable. :cpp:struct:`hipMemAllocationProp`
4771
contains the properties of the memory to be allocated, such as where the memory
4872
is physically located and what kind of shareable handles are available. If the
4973
allocation is successful, the function returns a value of
5074
:cpp:enumerator:`hipSuccess`, with :cpp:type:`hipMemGenericAllocationHandle_t`
51-
representing a valid physical memory allocation. The allocated memory size must
52-
be aligned with the granularity appropriate for the properties of the
53-
allocation. You can use the :cpp:func:`hipMemGetAllocationGranularity` function
54-
to determine the correct granularity.
75+
representing a valid physical memory allocation.
76+
77+
The allocated memory must be aligned with the appropriate granularity. The
78+
granularity value can be queried with :cpp:func:`hipMemGetAllocationGranularity`,
79+
and its value depends on the target device hardware and the type of memory
80+
allocation. If the allocation size is not aligned, meaning it is not cleanly
81+
divisible by the minimum granularity value, :cpp:func:`hipMemCreate` will return
82+
an out-of-memory error.
5583

5684
.. code-block:: cpp
5785
5886
size_t granularity = 0;
5987
hipMemGenericAllocationHandle_t allocHandle;
6088
hipMemAllocationProp prop = {};
61-
prop.type = HIP_MEM_ALLOCATION_TYPE_PINNED;
62-
prop.location.type = HIP_MEM_LOCATION_TYPE_DEVICE;
89+
// The pinned allocation type cannot be migrated from its current location
90+
// while the application is actively using it.
91+
prop.type = hipMemAllocationTypePinned;
92+
// Set the location type to device, currently there are no other valid option.
93+
prop.location.type = hipMemLocationTypeDevice;
94+
// Set the device id, where the memory will be allocated.
6395
prop.location.id = currentDev;
64-
hipMemGetAllocationGranularity(&granularity, &prop, HIP_MEM_ALLOC_GRANULARITY_MINIMUM);
96+
hipMemGetAllocationGranularity(&granularity, &prop, hipMemAllocationGranularityMinimum);
6597
padded_size = ROUND_UP(size, granularity);
6698
hipMemCreate(&allocHandle, padded_size, &prop, 0);
6799
100+
.. _reserve_virtual_address:
101+
68102
Reserve virtual address range
69-
--------------------------------------------------------------------------------
103+
-----------------------------
70104

71-
After you have acquired an allocation of physical memory, you must map it before
72-
you can use it. To do so, you need a virtual address to map it to. Mapping
73-
means the physical memory allocation is available from the virtual address range
74-
it is mapped to. To reserve a virtual memory range, use the
75-
:cpp:func:`hipMemAddressReserve` function. The size of the virtual memory must
76-
match the amount of physical memory previously allocated. You can then map the
77-
physical memory allocation to the newly-acquired virtual memory address range
78-
using the :cpp:func:`hipMemMap` function.
105+
After you have acquired an allocation of physical memory, you must map it to a
106+
virtual address before you can use it. Mapping means the physical memory
107+
allocation is available from the virtual address range it is mapped to. To
108+
reserve a virtual memory range, use the :cpp:func:`hipMemAddressReserve`
109+
function. The size of the virtual memory must match the amount of physical
110+
memory previously allocated. You can then map the physical memory allocation to
111+
the newly-acquired virtual memory address range using the :cpp:func:`hipMemMap`
112+
function.
79113

80114
.. code-block:: cpp
81115
82116
hipMemAddressReserve(&ptr, padded_size, 0, 0, 0);
83117
hipMemMap(ptr, padded_size, 0, allocHandle, 0);
84118
119+
.. _set_memory_access:
120+
85121
Set memory access
86-
--------------------------------------------------------------------------------
122+
-----------------
87123

88124
Finally, use the :cpp:func:`hipMemSetAccess` function to enable memory access.
89125
It accepts the pointer to the virtual memory, the size, and a
@@ -103,16 +139,39 @@ devices.
103139
.. code-block:: cpp
104140
105141
hipMemAccessDesc accessDesc = {};
106-
accessDesc.location.type = HIP_MEM_LOCATION_TYPE_DEVICE;
142+
accessDesc.location.type = hipMemLocationTypeDevice;
107143
accessDesc.location.id = currentDev;
108-
accessDesc.flags = HIP_MEM_ACCESS_FLAGS_PROT_READWRITE;
144+
accessDesc.flags = hipMemAccessFlagsProtReadwrite;
109145
hipMemSetAccess(ptr, padded_size, &accessDesc, 1);
110146
111147
At this point the memory is allocated, mapped, and ready for use. You can read
112148
and write to it, just like you would a C style memory allocation.
113149

150+
.. _usage_virtual_memory:
151+
152+
Dynamically increase allocation size
153+
------------------------------------
154+
155+
To increase the amount of pre-allocated memory, use
156+
:cpp:func:`hipMemAddressReserve`, which accepts the starting address, and the
157+
size of the reservation in bytes. This allows you to have a continuous virtual
158+
address space without worrying about the underlying physical allocation.
159+
160+
.. code-block:: cpp
161+
162+
hipMemAddressReserve(&new_ptr, (new_size - padded_size), 0, ptr + padded_size, 0);
163+
hipMemMap(new_ptr, (new_size - padded_size), 0, newAllocHandle, 0);
164+
hipMemSetAccess(new_ptr, (new_size - padded_size), &accessDesc, 1);
165+
166+
The code sample above assumes that :cpp:func:`hipMemAddressReserve` was able to
167+
reserve the memory address at the specified location. However, this isn't
168+
guaranteed to be true, so you should validate that ``new_ptr`` points to a
169+
specific virtual address before using it.
170+
171+
.. _free_virtual_memory:
172+
114173
Free virtual memory
115-
--------------------------------------------------------------------------------
174+
-------------------
116175

117176
To free the memory allocated in this manner, use the corresponding free
118177
functions. To unmap the memory, use :cpp:func:`hipMemUnmap`. To release the
@@ -128,27 +187,197 @@ synchronizes the device. This causes worse resource usage and performance.
128187
hipMemRelease(allocHandle);
129188
hipMemAddressFree(ptr, size);
130189
131-
.. _usage_virtual_memory:
190+
Example code
191+
============
192+
193+
The virtual memory management example follows these steps:
194+
195+
1. Check virtual memory management :ref:`support <vmm_support>`:
196+
The :cpp:func:`hipDeviceGetAttribute` function is used to check the virtual
197+
memory management support of the GPU with ID 0.
198+
199+
2. Physical memory :ref:`allocation <allocate_physical_memory>`: Physical memory
200+
is allocated using :cpp:func:`hipMemCreate` with pinned memory on the
201+
device.
202+
203+
3. Virtual memory :ref:`reservation <reserve_virtual_address>`: Virtual address
204+
range is reserved using :cpp:func:`hipMemAddressReserve`.
205+
206+
4. Mapping virtual address to physical memory: The physical memory is mapped
207+
to a virtual address (``virtualPointer``) using :cpp:func:`hipMemMap`.
208+
209+
5. Memory :ref:`access permissions<set_memory_access>`: Permission is set for
210+
pointer to allow read and write access using :cpp:func:`hipMemSetAccess`.
211+
212+
6. Memory operation: Data is written to the memory via ``virtualPointer``.
213+
214+
7. Launch kernels: The ``zeroAddr`` and ``fillAddr`` kernels are
215+
launched using the virtual memory pointer.
216+
217+
8. :ref:`Cleanup <free_virtual_memory>`: The mappings, physical memory, and
218+
virtual address are released at the end to avoid memory leaks.
219+
220+
.. code-block:: cpp
132221
133-
Memory usage
222+
#include <hip/hip_runtime.h>
223+
#include <iostream>
224+
225+
#define ROUND_UP(SIZE,GRANULARITY) ((1 + SIZE / GRANULARITY) * GRANULARITY)
226+
227+
#define HIP_CHECK(expression) \
228+
{ \
229+
const hipError_t err = expression; \
230+
if(err != hipSuccess){ \
231+
std::cerr << "HIP error: " \
232+
<< hipGetErrorString(err) \
233+
<< " at " << __LINE__ << "\n"; \
234+
} \
235+
}
236+
237+
__global__ void zeroAddr(int* pointer) {
238+
*pointer = 0;
239+
}
240+
241+
__global__ void fillAddr(int* pointer) {
242+
*pointer = 42;
243+
}
244+
245+
246+
int main() {
247+
248+
int currentDev = 0;
249+
250+
// Step 1: Check virtual memory management support on device 0
251+
int vmm = 0;
252+
HIP_CHECK(
253+
hipDeviceGetAttribute(
254+
&vmm, hipDeviceAttributeVirtualMemoryManagementSupported, currentDev
255+
)
256+
);
257+
258+
std::cout << "Virtual memory management support value: " << vmm << std::endl;
259+
260+
if (vmm == 0) {
261+
std::cout << "GPU 0 doesn't support virtual memory management.";
262+
return 0;
263+
}
264+
265+
// Size of memory to allocate
266+
size_t size = 4 * 1024;
267+
268+
// Step 2: Allocate physical memory
269+
hipMemGenericAllocationHandle_t allocHandle;
270+
hipMemAllocationProp prop = {};
271+
prop.type = hipMemAllocationTypePinned;
272+
prop.location.type = hipMemLocationTypeDevice;
273+
prop.location.id = currentDev;
274+
size_t granularity = 0;
275+
HIP_CHECK(
276+
hipMemGetAllocationGranularity(
277+
&granularity,
278+
&prop,
279+
hipMemAllocationGranularityMinimum));
280+
size_t padded_size = ROUND_UP(size, granularity);
281+
HIP_CHECK(hipMemCreate(&allocHandle, padded_size * 2, &prop, 0));
282+
283+
// Step 3: Reserve a virtual memory address range
284+
void* virtualPointer = nullptr;
285+
HIP_CHECK(hipMemAddressReserve(&virtualPointer, padded_size, granularity, nullptr, 0));
286+
287+
// Step 4: Map the physical memory to the virtual address range
288+
HIP_CHECK(hipMemMap(virtualPointer, padded_size, 0, allocHandle, 0));
289+
290+
// Step 5: Set memory access permission for pointer
291+
hipMemAccessDesc accessDesc = {};
292+
accessDesc.location.type = hipMemLocationTypeDevice;
293+
accessDesc.location.id = currentDev;
294+
accessDesc.flags = hipMemAccessFlagsProtReadWrite;
295+
296+
HIP_CHECK(hipMemSetAccess(virtualPointer, padded_size, &accessDesc, 1));
297+
298+
// Step 6: Perform memory operation
299+
int value = 42;
300+
HIP_CHECK(hipMemcpy(virtualPointer, &value, sizeof(int), hipMemcpyHostToDevice));
301+
302+
int result = 1;
303+
HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));
304+
if( result == 42) {
305+
std::cout << "Success. Value: " << result << std::endl;
306+
} else {
307+
std::cout << "Failure. Value: " << result << std::endl;
308+
}
309+
310+
// Step 7: Launch kernels
311+
// Launch zeroAddr kernel
312+
zeroAddr<<<1, 1>>>((int*)virtualPointer);
313+
HIP_CHECK(hipDeviceSynchronize());
314+
315+
// Check zeroAddr kernel result
316+
result = 1;
317+
HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));
318+
if( result == 0) {
319+
std::cout << "Success. zeroAddr kernel: " << result << std::endl;
320+
} else {
321+
std::cout << "Failure. zeroAddr kernel: " << result << std::endl;
322+
}
323+
324+
// Launch fillAddr kernel
325+
fillAddr<<<1, 1>>>((int*)virtualPointer);
326+
HIP_CHECK(hipDeviceSynchronize());
327+
328+
// Check fillAddr kernel result
329+
result = 1;
330+
HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));
331+
if( result == 42) {
332+
std::cout << "Success. fillAddr kernel: " << result << std::endl;
333+
} else {
334+
std::cout << "Failure. fillAddr kernel: " << result << std::endl;
335+
}
336+
337+
// Step 8: Cleanup
338+
HIP_CHECK(hipMemUnmap(virtualPointer, padded_size));
339+
HIP_CHECK(hipMemRelease(allocHandle));
340+
HIP_CHECK(hipMemAddressFree(virtualPointer, padded_size));
341+
342+
return 0;
343+
}
344+
345+
Virtual aliases
134346
================================================================================
135347

136-
Dynamically increase allocation size
137-
--------------------------------------------------------------------------------
348+
Virtual aliases are multiple virtual memory addresses mapping to the same
349+
physical memory on the GPU. When this occurs, different threads, processes, or memory
350+
allocations to access shared physical memory through different virtual
351+
addresses on different devices.
352+
353+
Multiple virtual memory mappings can be created using multiple calls to
354+
:cpp:func:`hipMemMap` on the same memory allocation.
355+
356+
.. note::
357+
358+
RDNA cards may not produce correct results, if users access two different
359+
virtual addresses that map to the same physical address. In this case, the
360+
L1 data caches will be incoherent due to the virtual-to-physical aliasing.
361+
These GPUs will produce correct results if users access virtual-to-physical
362+
aliases using volatile pointers.
138363

139-
The :cpp:func:`hipMemAddressReserve` function allows you to increase the amount
140-
of pre-allocated memory. This function accepts a parameter representing the
141-
requested starting address of the virtual memory. This allows you to have a
142-
continuous virtual address space without worrying about the underlying physical
143-
allocation.
364+
NVIDIA GPUs require special fences to produce correct results when
365+
using virtual aliases.
366+
367+
In the following code block, the kernels input device pointers are virtual
368+
aliases of the same memory allocation:
144369

145370
.. code-block:: cpp
146371
147-
hipMemAddressReserve(&new_ptr, (new_size - padded_size), 0, ptr + padded_size, 0);
148-
hipMemMap(new_ptr, (new_size - padded_size), 0, newAllocHandle, 0);
149-
hipMemSetAccess(new_ptr, (new_size - padded_size), &accessDesc, 1);
372+
__global__ void updateBoth(int* pointerA, int* pointerB) {
373+
// May produce incorrect results on RDNA and NVIDIA cards.
374+
*pointerA = 0;
375+
*pointerB = 42;
376+
}
377+
378+
__global__ void updateBoth_v2(volatile int* pointerA, volatile int* pointerB) {
379+
// May produce incorrect results on NVIDIA cards.
380+
*pointerA = 0;
381+
*pointerB = 42;
382+
}
150383
151-
The code sample above assumes that :cpp:func:`hipMemAddressReserve` was able to
152-
reserve the memory address at the specified location. However, this isn't
153-
guaranteed to be true, so you should validate that ``new_ptr`` points to a
154-
specific virtual address before using it.

include/hip/hip_runtime_api.h

+4-1
Original file line numberDiff line numberDiff line change
@@ -1089,7 +1089,10 @@ typedef enum hipMemAccessFlags {
10891089
hipMemAccessFlagsProtReadWrite = 3 ///< Set the address range read-write accessible
10901090
} hipMemAccessFlags;
10911091
/**
1092-
* Memory access descriptor
1092+
* Memory access descriptor structure is used to specify memory access
1093+
* permissions for a virtual memory region in Virtual Memory Management API.
1094+
* This structure changes read, and write permissions for
1095+
* specific memory regions.
10931096
*/
10941097
typedef struct hipMemAccessDesc {
10951098
hipMemLocation location; ///< Location on which the accessibility has to change

0 commit comments

Comments
 (0)