Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cmake nvhpc support #1286

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

jim-p-w
Copy link
Contributor

@jim-p-w jim-p-w commented Feb 7, 2025

This PR adds support for using the nvhpc toolchain when building with cmake.

Note: When using cmake to build with the nvhpc toolchain, OpenACC will always be enabled.

The primary changes include the following when the nvhpc toolchain is detected:

  1. setting the MPI_Fortran_COMPILER variable to mpifort prior to searching for the MPI package.
  2. adding various defines and compiler switches for the compilation stage.

For simplicity, the dependencies on netcdf are removed. MPAS-Model only depends on the parallel netcdf packages.

To test the nvhpc toolchain, a regional test case was submitted to gpu nodes, and the output
was compared to the output from running it against an nvhpc build built with make.
The test output was the same.

To ensure changing the netcdf dependencies is backward compatible, the mpas-bundle software was built with the gnu toolchain. Everything built properly.
The result of the build was used to run the panda-c cylc test suite, all tests passed.

@mgduda mgduda requested review from amstokely, islas and mgduda February 7, 2025 22:52
@mgduda mgduda added the feature label Feb 7, 2025
Copy link
Contributor

@amstokely amstokely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the FindNetCDF.cmake file is no longer used, you may want to delete it. Other than that, I was able to build this branch on my personal Linux desktop and on Aspire2a (both use GNU and OpenMPI) without any problems.

@mgduda
Copy link
Contributor

mgduda commented Feb 8, 2025

Since the FindNetCDF.cmake file is no longer used, you may want to delete it. Other than that, I was able to build this branch on my personal Linux desktop and on Aspire2a (both use GNU and OpenMPI) without any problems.

It's possible that PIO would require netCDF, however. So perhaps the find_package invocation for NetCDF should depend on PIO being requested?

@jim-p-w
Copy link
Contributor Author

jim-p-w commented Feb 8, 2025

Since the FindNetCDF.cmake file is no longer used, you may want to delete it. Other than that, I was able to build this branch on my personal Linux desktop and on Aspire2a (both use GNU and OpenMPI) without any problems.

It's possible that PIO would require netCDF, however. So perhaps the find_package invocation for NetCDF should depend on PIO being requested?

I will test that and respond accordingly. Thanks for the heads up about that possibility.

@mgduda netcdf is only required by src/core_atmosphere/physics/physics_noahmp/utility/ErrorHandleMod.F90, which contains one subroutine, ErrorHandle.
ErrorHandle is not called anywhere in the MPAS code base (in the develop branch).

I noticed that ErrorHandleMod.F90 is not built by make, and I removed it from the cmake build (see the change to src/core_atmosphere/CMakeLists.txt). That allowed me to remove the netcdf dependencies from cmake.

Perhaps ErrorHandleMod.F90 should be deleted from the repo?
That file was added as part of importing the Noah-MP land surface scheme, see commit 148dc18.

@jim-p-w
Copy link
Contributor Author

jim-p-w commented Feb 28, 2025

Given that the PIO version of mpas_atmosphere (what atmosphere_model is named when building with cmake) builds successfully without using FindNetCDF.cmake I think FindNetCDF.cmake should be deleted.

Other than that, are there other issues I need to address before we can consider merging this PR? (I have two downstream PR's in the mpas-bundle repository which depend on this PR.)

@islas
Copy link
Contributor

islas commented Feb 28, 2025

@jim-p-w I think it might be getting away with building if you are linking against a dynamic library of PIO. If it is static, it does require netCDF :

target_link_libraries(PIO::PIO_Fortran_STATIC INTERFACE NetCDF::NetCDF_C)

I would also think if it needs it, it should call find_package() inside of the call to resolve that dependency, but that is outside the scope of this PR. Mostly noting that this might be addressed in the future.

@jim-p-w
Copy link
Contributor Author

jim-p-w commented Feb 28, 2025

@jim-p-w I think it might be getting away with building if you are linking against a dynamic library of PIO. If it is static, it does require netCDF :

target_link_libraries(PIO::PIO_Fortran_STATIC INTERFACE NetCDF::NetCDF_C)

I would also think if it needs it, it should call find_package() inside of the call to resolve that dependency, but that is outside the scope of this PR. Mostly noting that this might be addressed in the future.

@islas @amstokely @michalakes In that case I will add back the call to find the netcdf package if building PIO. Thanks for the info Anthony!

The netcdf package is now required when building with PIO.

@jim-p-w
Copy link
Contributor Author

jim-p-w commented Mar 24, 2025

@amstokely @islas @mgduda A gentle reminder, I think I have addressed all issues which have been raised so far.
Is there anything else I need to change? (I have a couple of PR's in the mpas-bundle repo which are dependent on this)

Copy link
Contributor

@islas islas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configures and compiles when nvhpc SDK is in path and the compiler is set to nvfortran with only PnetCDF additionally in path 👍

Copy link
Contributor

@mgduda mgduda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objection to the changes themselves, but I would suggest doing one of two things concerning commit 55c6391:

  1. The commit message could be improved by mentioning in the one-line summary that the change affects CMake builds specifically, and explaining further in the commit message why we are adding back a line (now within a conditional check) that was removed by the previous commit.
  2. Or, just squashing 55c6391 into the previous commit.

@jim-p-w jim-p-w force-pushed the cmake-nvhpc-support branch from 55c6391 to ae56965 Compare March 24, 2025 21:33
@jim-p-w
Copy link
Contributor Author

jim-p-w commented Mar 24, 2025

I squashed the 2 commits into 1, and updated the commit message.

@mgduda
Copy link
Contributor

mgduda commented Mar 26, 2025

I'm not able to build stand-alone MPAS-A with the changes in this PR using the NVHPC compilers. Here's where the build fails:

[ 25%] CORE atmosphere: Parse Registry

Error: Could not open file Registry_processed.xml for reading.

make[2]: *** [src/core_atmosphere/CMakeFiles/gen_atmosphere.dir/build.make:83: src/core_atmosphere/block_dimension_routines.inc] Error 1
make[1]: *** [CMakeFiles/Makefile2:600: src/core_atmosphere/CMakeFiles/gen_atmosphere.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

On Derecho, I have the following modules loaded:

Currently Loaded Modules:
  1) ncarenv/24.12 (S)   4) ncarcompilers/1.0.0   7) cuda/12.3.2
  2) craype/2.7.31       5) libfabric/1.15.2.0    8) parallel-netcdf/1.14.0
  3) nvhpc/24.11         6) cray-mpich/8.1.29     9) cmake/3.26.6

@jim-p-w
Copy link
Contributor Author

jim-p-w commented Mar 26, 2025

Using the same modules as Michael,

Currently Loaded Modules:
  1) ncarenv/24.12 (S)   4) ncarcompilers/1.0.0   7) cuda/12.3.2
  2) craype/2.7.31       5) libfabric/1.15.2.0    8) parallel-netcdf/1.14.0
  3) nvhpc/24.11         6) cray-mpich/8.1.29     9) cmake/3.26.6

I built successfully using the following commands:
cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DMPAS_DOUBLE_PRECISION=OFF ../MPAS |&tee cmake.log
qcmd -A nmmm0015 -q develop -- make -j4 |&tee make.log

My build dir was parallel to the MPAS source dir.

From the source directory:

git status
On branch cmake-nvhpc-support
Your branch is up to date with 'myrepo/cmake-nvhpc-support'.

@mgduda what cmake and make commands did you use?

@mgduda
Copy link
Contributor

mgduda commented Mar 26, 2025

@jim-p-w I think I've figure out what was going on. In the MPAS-Model directory I was pointing to with cmake, the code had already been compiled with the Make-based build system. If I clean the MPAS-Model source directory, it looks like all is well.

@mgduda mgduda self-requested a review March 26, 2025 23:32
$<$<COMPILE_LANGUAGE:Fortran>:-DMPAS_OPENACC -DSINGLE_PRECISION -DMPAS_BUILD_TARGET=nvhpc>
)
list(APPEND MPAS_FORTRAN_TARGET_COMPILE_OPTIONS_PRIVATE
$<$<COMPILE_LANGUAGE:Fortran>: -Mnofma -acc -gpu=math_uniform,cc70,cc80 -Minfo=accel -byteswapio>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to state clearly in the commit message (and to clarify in the PR description) that building with the NVHPC compilers using CMake will always enable OpenACC.

Note: When using cmake to build with the nvhpc toolchain,
OpenACC will always be enabled.
Note: These changes only impact building via cmake.

This includes conditionally setting the MPI_Fortran_COMPILER prior to
searching for the MPI package.

This also includes adding a number of defines and compiler switches
when the nvhpc toolchain is detected.
@jim-p-w jim-p-w force-pushed the cmake-nvhpc-support branch from ae56965 to 5f99ec6 Compare March 27, 2025 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants