Question: What should I do if I get this error when running distributed jobs across nodes on my Windows HPC cluster?

Error:
Fatal error in MPI_Comm_create: Other MPI error, error stack:
MPI_Comm_create(MPI_COMM_WORLD, group=0x88000001, new_comm=0x000000E071458E90) failed
unable to connect to on port #####, no endpoint matches the netmask

Jobs requiring a single node run without issues.
January 25, 2023 at 7:28 am, posted by FAQ (Participant)
Answer: Either the bind order of the network interfaces or an incorrectly set MPI netmask is causing the issue. A typical error looks like this:

unable to connect to 10.0.0.12 node12 on port 52935, no endpoint matches the netmask 10.0.1.0/255.255.255.0

Note the difference in subnets: the node's address is on 10.0.0.*, while the netmask specifies 10.0.1.*.

Please have your cluster or network administrator review the suggestions below.

a) Check how many network interfaces the compute nodes have. If there are multiple interfaces, make sure that the bind order is set correctly.

b) If there is only one interface and the error persists, the MPI netmask may need to be corrected. For this example it needs to be set to the 10.0.0.* subnet, so the command will look like this:

cluscfg setenvs CCP_MPI_NETMASK=10.0.0.0/255.255.255.0

Additional information

If RSM is used to submit the job to the cluster, the RSM job log may show errors like the example below:

Running Solver : C:\Program Files\ANSYS Inc\v192\ansys\bin\winx64\ANSYS192.exe -b nolist -s noread -p ansys -i remote.dat -o solve.out -dis -mpi msmpi -np 12 -dir "C:/scratch/n3r39eoc.i2n"
job aborted:
[ranks] message
fatal error
Fatal error in MPI_Comm_create: Other MPI error, error stack:
MPI_Comm_create(MPI_COMM_WORLD, group=0x88000001, new_comm=0x000000E071458E90) failed
[ch3:sock] rank 0 unable to connect to rank 8 using business card
unable to connect to 10.0.0.12 node12 on port 52935, no endpoint matches the netmask 10.0.1.0/255.255.255.0
[1-11] terminated
---- error analysis -----
on node01 mpi has detected a fatal error and aborted C:\Program Files\ANSYS Inc\v192\ANSYS\bin\winx64\ANSYS.EXE
---- error analysis -----
. . .
Command Exit Code: -4
ClusterJobs Exiting with code: -4
Individual Command Exit Codes are: [-4]
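The subnet mismatch in the example error can also be confirmed with a short script. The following is a minimal sketch, not an Ansys or Microsoft tool: the function name `diagnose` and the regular expression are hypothetical and only match the message format quoted in this answer. It also assumes that the node's real subnet uses the same mask as the configured netmask, which your network administrator should confirm before changing anything.

```python
import ipaddress
import re

# Hypothetical pattern: matches only the single-line message format quoted in this answer.
ERROR_PATTERN = re.compile(
    r"unable to connect to (?P<ip>\d+\.\d+\.\d+\.\d+).*?"
    r"no endpoint matches the netmask "
    r"(?P<net>\d+\.\d+\.\d+\.\d+)/(?P<mask>\d+\.\d+\.\d+\.\d+)"
)


def diagnose(message):
    """Compare the node address in an MPI netmask error with the configured netmask."""
    match = ERROR_PATTERN.search(message)
    if match is None:
        return None  # not the error format this sketch understands

    address = ipaddress.ip_address(match["ip"])
    configured = ipaddress.ip_network(
        f"{match['net']}/{match['mask']}", strict=False
    )
    # Assumption: the node's actual subnet uses the same mask as the configured one.
    suggested = ipaddress.ip_network(f"{address}/{match['mask']}", strict=False)

    return {
        "address": str(address),
        "configured": configured.with_netmask,
        "matches": address in configured,
        "suggested": suggested.with_netmask,
    }


if __name__ == "__main__":
    line = (
        "unable to connect to 10.0.0.12 node12 on port 52935, "
        "no endpoint matches the netmask 10.0.1.0/255.255.255.0"
    )
    result = diagnose(line)
    if result is not None and not result["matches"]:
        # Prints: cluscfg setenvs CCP_MPI_NETMASK=10.0.0.0/255.255.255.0
        print(f"cluscfg setenvs CCP_MPI_NETMASK={result['suggested']}")
```

For the example line, the address 10.0.0.12 is not inside 10.0.1.0/255.255.255.0, so the script prints the same `cluscfg` command given in suggestion b). Treat the printed value as a starting point for review rather than something to apply blindly, particularly on nodes with more than one interface.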