Date: Feb-2024

This document provides information about VEOS version 3.3 or later.

- CAUTIONS
  (1) Unauthorized reproduction of the contents of this document, in part or in its entirety, is prohibited.
  (2) The contents of this document may change without prior notice.
  (3) Every effort has been made to ensure the completeness of this document. However, if you have any concerns, or discover errors or omissions, please contact your retailer.
  (4) NEC assumes no liability for any loss, including loss of earnings, arising from the use of this SW, irrespective of (3) above.
  (5) This SW is not intended for use in medical, nuclear, aerospace, mass transit or other applications where human life may be at stake or high reliability is required, nor is it intended for use in controlling such applications. We disclaim liability for any personal injury and property damages caused by such use of this SW.

- About VEOS
  VEOS is the software that runs on the Linux/Vector Host and provides OS functionality for VE programs running on the Vector Engine.

- Supported Platforms and Operating Systems
  Operating System   Platform
  RHEL7.9            x86_64
  RHEL8.6            x86_64
  RHEL8.8            x86_64

  VE30 is available on RHEL8.6 or later.

- Notes regarding compatibility between different versions
  - To use VEOS 3.0.2 or later with NQSV, NQSV R1.13-143 or later is required. The data structure of process accounting changed in VEOS 3.0.2, so NQSV versions prior to R1.13-143 fail to obtain process accounting information.
  - To use accelerated I/O with VEOS 3.0.2 or later, veos-3.0.2 or later, libsysve-ve3-3.0.2 or later, and libsysve-ve1-3.0.2 or later are required. If any of these packages is older, accelerated I/O is disabled. Installing NEC Compiler 5.0.0 or later installs both libsysve-ve3-3.0.2 or later and libsysve-ve1-3.0.2 or later as dependencies, but does not update veos. In that case, please manually update to veos-3.0.2 or later to enable accelerated I/O.
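The accelerated I/O package requirement above can be checked mechanically. The following is a minimal sketch, not part of VEOS: the helper names (parse_version, outdated_packages) are hypothetical, the installed versions are passed in as plain strings (for example, collected by hand from the package manager), and the "2.14.1" value in the example is made up for illustration.

```python
import re

# Minimum versions quoted in the compatibility note above.
REQUIRED = {
    "veos": (3, 0, 2),
    "libsysve-ve3": (3, 0, 2),
    "libsysve-ve1": (3, 0, 2),
}


def parse_version(text):
    """Turn '3.0.2' or '3.0.2-1.el8' into a tuple of ints such as (3, 0, 2)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def outdated_packages(installed):
    """Return the sorted names of required packages that are missing or too old.

    `installed` maps a package name to its version string.
    """
    stale = []
    for name, minimum in REQUIRED.items():
        version = installed.get(name)
        if version is None or parse_version(version) < minimum:
            stale.append(name)
    return sorted(stale)


# The situation described in the note: installing the compiler brought in both
# libsysve packages, but veos itself was left at an older (made-up) version.
example = {"veos": "2.14.1", "libsysve-ve3": "3.0.2", "libsysve-ve1": "3.0.2"}
print(outdated_packages(example))
```

An empty result means all three packages meet the stated minimum. Any name in the result identifies a package that would, according to the note, leave accelerated I/O disabled until it is updated.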
- VEOS doesn't support Linux suspend and resume. Please do not suspend and resume a Linux/x86 machine in which VEs are installed. In the worst case, suspending and resuming the machine brings Linux down (for example, with a kernel panic), which can cause loss of important data.
- VEOS doesn't support Linux running on a virtual machine, such as QEMU. VEOS works correctly only on Linux running directly on a physical machine.
- The ve_exec command, which is a part of VEOS, uses over 1TB of VH virtual address space. VEOS requires that the Linux kernel parameter vm.overcommit_memory is not '2', so that overcommit is enabled.
- Please do not apply excessive load to your Linux/x86 machine, so that VEOS can provide OS functionality steadily. Excessive load on the Linux/x86 machine may cause unexpected performance degradation in VEOS, such as long response times.
- To achieve the best performance, the number of threads of VE processes running on a VE should be less than or equal to the number of available VE cores.
- The getaddrinfo() function in VE glibc fails if a program invokes it in a docker container in which IPv6 support is disabled. Please enable IPv6 support in the docker environment.
- MPIPROGINF/PROGINF and some commands such as "ps" and "veswap" show the VE memory usage and the non-swappable VE memory usage, both of which are obtained from VEOS. The getrusage() function also obtains VE memory usage from VEOS. These values include pinned VE memory to be transferred by the MPI API, even if it is already unmapped from the virtual address space of a VE process. However, these values do not include pinned memory that is already unmapped from the virtual address space of a VE process if the memory is one of the following types.
  Private filebacked memory (mmap() with MAP_PRIVATE)
  Shared filebacked memory (mmap() with MAP_SHARED)
  Shared anonymous memory (mmap() with MAP_SHARED and MAP_ANONYMOUS)
  System V shared memory (shmat())
  POSIX shared memory (shm_open())
- The VEO implementation has changed to Alternative VE Offload (AVEO). AVEO is a faster, much lower latency replacement for the previous VEO implementation, and it adds multi-VE support, simultaneous debugging of the VE and VH sides, and API extensions. You can migrate from the previous VEO implementation by installing the AVEO packages and re-linking your program with AVEO, without modifying makefiles. Please see the VEOS document "The Tutorial and API Reference of Alternative VE Offloading" for VEO migration and AVEO installation. Support for the previous VEO ended at the end of Mar. 2021.
- If the operating system of a vector host is RHEL8.x and the VE is in partitioning mode (NUMA mode), the following functions work incorrectly in rare cases.
  - vfork() system call
  - posix_spawn() with POSIX_SPAWN_USEVFORK flag
  When this issue occurs, child VE processes are created repeatedly until the total number of VE processes on the VE node reaches the upper limit, and the VE programs on the VE node may hang. If you encounter this issue, please forcibly terminate the VE process that invoked vfork() and all the children it created by sending the SIGKILL signal. To avoid this issue, please use fork() instead of these functions. Note that with fork(), creating a child process takes more time and the child uses more memory than with these functions. This issue will be fixed in a future release.
- If the operating system of a vector host is RHEL8.4, the behavior of the Linux kernel has changed from previous versions in order to fix a security issue (CVE-2020-29374). Due to this change, loading a VE program takes more time and more VH memory than in previous versions if the file size of the VE program is very large. The issue doesn't occur in RHEL8.5 or later.
- If the operating system of a vector host is RHEL8.5 or later, the performance of I/O such as read or write requests from a user program may be degraded due to a change in Linux kernel behavior. To avoid the issue, please enable Accelerated I/O when you execute a VE program. Please refer to the VEOS document "How to Execute VE Program" for the usage of Accelerated I/O.
- The "ve_exec" process uses about 32TB plus several hundred megabytes of virtual memory. If you limit the size of virtual memory using the "-l vememsz_prc" option, the "-l vmemsz_job" option or the "--vmemsz-lhost" option of NQSV, please raise the limit so that the "ve_exec" process can use 32TB plus several hundred megabytes of virtual memory. You do not need to change the configuration if you use the default values of these options, because the default values are UNLIMITED. Please see the descriptions of qsub, qrsh, qlogin and qalter in "NQSV User's Guide Reference" for these options. If you limit the size of virtual memory using /etc/security/limits.conf, please likewise raise the limit so that the "ve_exec" process can use 32TB plus several hundred megabytes of virtual memory.
- In some cases, VEOS fails to count Performance Monitor Counter (PMC) attributes if child threads are created before the constructors of the executable are invoked. In such cases, the following may occur.
  - PMC attributes in process accounting become N/A.
  - Some performance items in PROGINF/MPI runtime performance information are not displayed on VE30.
- When running VE10 executable binaries on VE30, the following may occur. To avoid this, please rebuild the binary with the latest compiler and the '-march=ve3' option.
  - backtrace_symbols() in glibc cannot display symbol names.
  - ngprof cannot display performance information.
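The two virtual address space conditions for "ve_exec" (the vm.overcommit_memory setting noted earlier and the virtual memory limit noted above) can be checked from the shell session that will launch the program. The following is a rough sketch, not a VEOS tool: the function names are hypothetical, the 1 GiB margin stands in for "several hundred megabytes" and is an assumption, and the check only sees the limits of the current process, not NQSV job limits.

```python
import resource

TIB = 1024 ** 4
# About 32TB plus several hundred megabytes, per the note above.
# The 1 GiB margin is an assumption chosen to cover "several hundred megabytes".
REQUIRED_BYTES = 32 * TIB + 1024 ** 3


def address_space_ok(soft_limit, required=REQUIRED_BYTES):
    """True if an RLIMIT_AS soft limit leaves room for `required` bytes."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


def overcommit_ok(mode_text):
    """True unless vm.overcommit_memory is '2' (strict accounting)."""
    return mode_text.strip() != "2"


def preflight():
    """Check the current process limits and the kernel overcommit mode."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    try:
        with open("/proc/sys/vm/overcommit_memory") as handle:
            overcommit = overcommit_ok(handle.read())
    except OSError:
        overcommit = None  # setting could not be read on this system
    return {"address_space": address_space_ok(soft), "overcommit": overcommit}


if __name__ == "__main__":
    print(preflight())
```

A False value for either key points at the setting to revisit before running a VE program. A None value for "overcommit" only means the setting could not be read.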
- When hooking a function using __malloc_hook, __memalign_hook, __free_hook, or __realloc_hook, the address provided as the caller is different from the address where the binary or library is loaded.
- A VE30 program created with the "-static" and "-fno-fast-math" options does not raise an INV exception if the divisor is zero in a quadruple-precision floating-point remainder operation (fmodl()), even if the environment variable VE_FPE_ENABLE="INV" is set. If you want to raise an INV exception from a floating-point remainder operation, use the single-precision or double-precision floating-point remainder operation. Alternatively, setting the environment variable VE_FPE_ENABLE="DIV" raises a division-by-zero exception.
- Multi-threaded programs running on a VE may terminate abnormally when swap-out and swap-in are performed by Partial Process Swapping and the program resumes execution (SIGCONT) after swap-in.
- veo_proc_create() in programs that use AVEO may fail on rare occasions when swap-out and swap-in are performed by Partial Process Swapping and the program resumes execution (SIGCONT) after swap-in.
- VEOS may terminate abnormally on rare occasions when only some of the processes on a VE are swapped out by Partial Process Swapping. In the case of urgent request execution with NQSV, all processes on a VE are swapped out, so this limitation does not apply.
- Specifications such as the maximum number of processes and the maximum number of threads are described in the "Specifications List(VE3)" and "Specifications List(VE1)" sections of the "VEOS High Level Design".
- The VE allocates memory in units of 64MB pages to increase address translation efficiency and improve performance. As a result, even a simple program consumes around 300MB of memory.
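The effect of 64MB allocation units on small programs can be illustrated with simple arithmetic. The following is a simplified model, not a description of how VEOS lays out memory: it assumes each segment is rounded up independently to whole pages, it reads "64MB" as 64 * 1024 * 1024 bytes, and the segment names and sizes are hypothetical.

```python
# 64MB page size quoted in the note; the binary interpretation is an assumption.
VE_PAGE_BYTES = 64 * 1024 * 1024


def pages_needed(request_bytes, page_bytes=VE_PAGE_BYTES):
    """Number of whole pages needed to hold request_bytes (ceiling division)."""
    if request_bytes < 0:
        raise ValueError("request_bytes must be non-negative")
    return -(-request_bytes // page_bytes)


def reserved_bytes(request_bytes, page_bytes=VE_PAGE_BYTES):
    """Bytes actually reserved once the request is rounded up to whole pages."""
    return pages_needed(request_bytes, page_bytes) * page_bytes


# Hypothetical segment sizes, in bytes, for a small program.
segments = {
    "text": 2_000_000,
    "data": 500_000,
    "heap": 10_000_000,
    "stack": 8_000_000,
}

total = sum(reserved_bytes(size) for size in segments.values())
requested = sum(segments.values())
print(f"requested {requested} bytes, reserved {total} bytes")
```

In this model each of the four small segments occupies a full page, so about 20 million requested bytes reserve four pages. That is the kind of rounding that makes even a simple program consume hundreds of megabytes.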