llvm-vec - NEC LLVM-IR Vectorizer
Synopsis
llvm-vec [options] filename
Example
$ /opt/nec/ve/llvm-vec/<version>/bin/llvm-vec -o a.s -x ir a.ll
Description
The llvm-vec command compiles LLVM source inputs into Vector Engine (VE) assembly language. The assembly language output can then be passed through a VE assembler (nas) and linker (nld) to generate a VE native executable.
The filename argument specifies a single LLVM source file.
Options
Many options have names beginning with -f, -m or -W. Most of them have positive and negative forms. The negative forms begin with -fno-, -mno- or -Wno-.
- --help
- Displays usage of the compiler.
- --version
- Displays the version number and copyrights of NEC LLVM-IR Vectorizer.
- -O<n>
- Specifies the optimization level n. (default: -O2)
  - -O4
  - Enables aggressive optimization that violates the language standard.
  - -O3
  - Enables optimization that causes side effects, as well as nested loop optimization.
  - -O2
  - Enables optimization that causes side effects.
  - -O1
  - Enables optimization that does not cause any side effects.
  - -O0
  - Disables all optimization, automatic vectorization, parallelization, and inlining.
 
- -faggressive-associative-math
- Allows aggressive re-association of operands in series during optimization. This optimization causes side effects. (default: -fno-aggressive-associative-math)
- -fargument-alias
- Does not allow the compiler to assume, in any optimization, that arguments do not alias each other or non-local objects. (default)
- -fargument-noalias
- Allows the compiler to assume, in all optimization, that arguments do not alias each other or non-local objects. (default: -fargument-alias)
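To see why the no-alias assumption matters, the following C sketch compares an element-by-element loop with a version that reads all inputs before writing any output. That reordering is roughly the freedom a compiler gains under -fargument-noalias. This is only a source-level illustration (llvm-vec itself consumes LLVM IR), and the function names shift_add_in_order and shift_add_as_if_noalias are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Element-by-element semantics: each store is visible to later loads. */
void shift_add_in_order(long *dst, const long *src, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + 1;
}

/* Emulates a reordering that is valid only when dst and src do not overlap:
   all loads happen before any store.
   Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int shift_add_as_if_noalias(long *dst, const long *src, size_t n) {
    if (n == 0)
        return 0;
    long *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, src, n * sizeof *tmp);   /* load everything first */
    for (size_t i = 0; i < n; ++i)
        dst[i] = tmp[i] + 1;             /* then store */
    free(tmp);
    return 0;
}
```

With a = {0, 0, 0, 0}, dst = a + 1, src = a and n = 3, the first function leaves {0, 1, 2, 3} in a, while the second leaves {0, 1, 1, 1}. The two agree whenever the arguments really do not overlap.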
- -fassociative-math
- Allows re-association of operands in series during optimization and loop transformation. (default)
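Re-association can change floating-point results because floating-point addition is not associative. The C sketch below sums the same array in source order and with two partial accumulators, which is one possible re-associated form. The function names are hypothetical, and the exact re-association llvm-vec performs is not specified here.

```c
#include <stddef.h>

/* Sums x[0..n-1] strictly left to right, as written in the source. */
double sum_in_order(const double *x, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Same sum with separate accumulators for even and odd positions,
   combined at the end: a re-associated evaluation order. */
double sum_reassociated(const double *x, size_t n) {
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i];
        s1 += x[i + 1];
    }
    if (i < n)
        s0 += x[i];   /* leftover element when n is odd */
    return s0 + s1;
}
```

For the input {1e16, 1.0, -1e16, 1.0}, an implementation that evaluates double arithmetic in IEEE 754 binary64 returns 1.0 from sum_in_order and 2.0 from sum_reassociated, because 1e16 + 1.0 rounds back to 1e16.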
- -fcse-after-vectorization
- Re-applies common subexpression elimination after vectorization. (default: -fno-cse-after-vectorization)
- -fdiag-vector=<n>
- Specifies the vector diagnostics level n (0: no output, 1: information, 2: detail). Vector diagnostics are written to standard error. (default: -fdiag-vector=1)
- -ffast-math
- Uses fast scalar versions of math functions outside of vectorized loops. (default)
- -finstrument-functions
- Inserts instrumentation calls at the entry and exit of functions. The instrumentation functions are:
  void __cyg_profile_func_enter(void *this_fn, void *call_site);
  void __cyg_profile_func_exit(void *this_fn, void *call_site);
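A program built with -finstrument-functions needs definitions of these two hooks at link time. The following C sketch shows minimal definitions that only count calls. The counters and the accessor functions profile_enter_calls and profile_exit_calls are hypothetical helpers, not part of llvm-vec.

```c
/* Call counters updated by the hooks (hypothetical bookkeeping). */
static unsigned long enter_calls = 0;
static unsigned long exit_calls = 0;

/* Hook invoked at function entry; this sketch only counts the call. */
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    (void)this_fn;
    (void)call_site;
    ++enter_calls;
}

/* Hook invoked at function exit; this sketch only counts the call. */
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    (void)this_fn;
    (void)call_site;
    ++exit_calls;
}

/* Accessors so that other code can read the counters. */
unsigned long profile_enter_calls(void) { return enter_calls; }
unsigned long profile_exit_calls(void) { return exit_calls; }
```

The hooks themselves should not be instrumented, otherwise they would call themselves recursively. How to exclude them depends on the compiler that builds them. With GCC, for example, the no_instrument_function attribute serves this purpose.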
- -fivdep
- Inserts an ivdep directive before all loops. (default: -fno-ivdep)
- -floop-collapse
- Allows loop collapsing. -On (n=2,3,4) must be in effect. (default: -fno-loop-collapse)
- -floop-count=<n>
- Specifies the iteration count n that the compiler assumes for loops whose iteration count cannot be determined at compilation. (default: -floop-count=5000)
- -fmove-loop-invariants
- Enables loop invariant motion under if-conditions. (default)
- -fmove-loop-invariants-unsafe
- Moves unsafe code that may cause side effects. (default: -fno-move-loop-invariants-unsafe) Examples of unsafe code are:
  - divide
  - memory reference to a 1-byte or 2-byte area
  -fmove-loop-invariants must be in effect when you specify this option.
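The division case can be pictured with the C sketch below. In sum_guarded the invariant division runs only when the condition holds, so a zero divisor is harmless if the condition is never true. Moving the division in front of the loop without a guard, which is what the description of this option suggests, would execute it unconditionally. The hand-written sum_hoisted keeps a guard to stay safe. Both function names are hypothetical.

```c
#include <stddef.h>

/* Original form: the loop-invariant division num / den is executed
   only for iterations where a[i] > 0. */
long sum_guarded(const long *a, size_t n, long num, long den) {
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        if (a[i] > 0)
            s += num / den;
    return s;
}

/* Hand-hoisted form: the division is computed once before the loop.
   The den != 0 check keeps the hoist from trapping on a zero divisor;
   the LONG_MIN / -1 overflow case is not handled in this sketch. */
long sum_hoisted(const long *a, size_t n, long num, long den) {
    long q = (den != 0) ? num / den : 0;
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        if (a[i] > 0)
            s += q;
    return s;
}
```

Both functions return the same value whenever den is nonzero. With den equal to zero and no positive element in a, sum_guarded never divides, which is exactly the case an unguarded hoist would break.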
- -fmove-nested-loop-invariants-outer
- Allows the compiler to move loop invariant expressions to an outer loop. When -fno-move-nested-loop-invariants-outer is specified, they are moved to just before the current loop. (default)
- -fnamed-alias
- Does not allow the compiler to assume, in vectorization, that the objects pointed to by named pointers do not alias. (default)
- -fnamed-noalias
- Allows the compiler to assume, in vectorization, that the objects pointed to by named pointers do not alias. (default: -fnamed-alias)
- -fpic, -fPIC
- Generates position-independent code.
- -fprecise-math
- Applies a high-resolution algorithm in the vector version of the power operation when the exponent is an integer value. The result becomes more exact but the calculation becomes slower. (default: -fno-precise-math)
- -freplace-loop-equation
- Replaces the “!=” and “==” operators with “<=” or “>=” at the loop backedge. (default: -fno-replace-loop-equation)
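The kind of rewrite that -freplace-loop-equation describes can be sketched in C as follows. One loop exits on an equality test and the other on a relational test. The exact form llvm-vec produces is not documented here, so this is only an assumed illustration with hypothetical function names. The two loops agree when the counter starts at or below the bound and advances by one.

```c
/* Loop that exits on an equality test. For n < 0 the counter would never
   equal n before overflowing, so callers must pass n >= 0. */
long count_until_equal(long n) {
    long i = 0, iterations = 0;
    for (;;) {
        if (i == n)
            break;
        ++iterations;
        ++i;
    }
    return iterations;
}

/* Same loop with the equality replaced by a relational test.
   It also terminates immediately for n < 0. */
long count_until_reached(long n) {
    long i = 0, iterations = 0;
    for (;;) {
        if (i >= n)
            break;
        ++iterations;
        ++i;
    }
    return iterations;
}
```

A relational exit test gives the compiler a bounded trip count even when it cannot prove that the counter hits the bound exactly. For negative n the two loops differ, and a mismatch of that kind is a plausible reason for the option being off by default.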
- -fstrict-aliasing
- Allows the compiler to assume the ANSI aliasing rules in all optimization. Under these rules, the compiler assumes that a stored value is accessed only through one of the following types:
  - A type compatible with the effective type of the object
  - A qualified version of a type compatible with the effective type of the object
  - A type that is the signed or unsigned type corresponding to the effective type of the object
  - A type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object
  - An aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union)
  - A character type
  (default: -fno-strict-aliasing)
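As a source-level illustration of these rules, the C sketch below reads the bit pattern of a float without violating them. Copying through memcpy accesses the object as bytes, which falls under the character type rule, whereas the cast shown in the comment reads a float object through an unrelated integer type. The function name float_bits is hypothetical, and the sketch assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of f.
   Not allowed under the aliasing rules:
       return *(uint32_t *)&f;     (float object read through uint32_t)
   Allowed: copy the bytes, which is a character-type access. */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Code written like the memcpy variant behaves the same whether or not -fstrict-aliasing is in effect. Code relying on the cast is only safe under -fno-strict-aliasing.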
- -fthis-pointer-alias
- Does not allow the compiler to assume, in any optimization, that the this pointer has no aliases. (default: -fthis-pointer-noalias)
- -fthis-pointer-noalias
- Allows the compiler to assume, in all optimization, that the this pointer has no aliases. (default)
- -ftrace
- Creates an object file and an executable file for the ftrace function.
- -minit-stack=<value>
- Initializes the stack area with the specified value at run time. The following are available as value:
  - zero
  - Initializes with zeroes.
  - nan
  - Initializes with quiet NaN in double type (0x7fffffff7fffffff).
  - nanf
  - Initializes with quiet NaN in float type (0x7fffffff).
  - snan
  - Initializes with signaling NaN in double type (0x7ff4000000000000).
  - snanf
  - Initializes with signaling NaN in float type (0x7fa00000).
  - 0xXXXX
  - Initializes with the value specified in hexadecimal format, up to 16 digits. When the specified value has more than 8 hexadecimal digits, the initialization is done on an 8-byte cycle. Otherwise it is done on a 4-byte cycle.
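The four NaN patterns listed above can be checked against the IEEE 754 encodings with a few bit tests. The C sketch below classifies a raw pattern as NaN and as quiet or signaling. It assumes the common convention that a set most significant fraction bit marks a quiet NaN, and its function names are hypothetical.

```c
#include <stdint.h>

/* binary64 NaN: exponent field all ones and a nonzero fraction. */
int is_nan64(uint64_t bits) {
    return ((bits >> 52) & 0x7ffu) == 0x7ffu
        && (bits & 0x000fffffffffffffULL) != 0;
}

/* Quiet binary64 NaN: a NaN whose top fraction bit (bit 51) is set. */
int is_quiet_nan64(uint64_t bits) {
    return is_nan64(bits) && ((bits >> 51) & 1u);
}

/* binary32 NaN: exponent field all ones and a nonzero fraction. */
int is_nan32(uint32_t bits) {
    return ((bits >> 23) & 0xffu) == 0xffu
        && (bits & 0x007fffffu) != 0;
}

/* Quiet binary32 NaN: a NaN whose top fraction bit (bit 22) is set. */
int is_quiet_nan32(uint32_t bits) {
    return is_nan32(bits) && ((bits >> 22) & 1u);
}
```

Under that convention, 0x7fffffff7fffffff and 0x7fffffff classify as quiet NaNs, while 0x7ff4000000000000 and 0x7fa00000 are NaNs with the quiet bit clear, that is, signaling NaNs.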
 
- -mlist-vector
- Allows vectorization of a statement in a loop when an array element with a vector subscript expression appears on both the left and right sides of an assignment operator. (default: -mno-list-vector)
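A statement of this shape is a[idx[i]] = a[idx[i]] + b[i]. The C sketch below contrasts the scalar loop with a simple gather, add, scatter emulation of a vectorized step. When idx contains duplicate entries the two can disagree, which is a plausible reason for the option being off by default. The emulation is an assumption made for illustration, not a description of how the VE hardware resolves duplicates, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Scalar semantics: updates through idx are applied one at a time. */
void scatter_add_scalar(double *a, const size_t *idx,
                        const double *b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        a[idx[i]] = a[idx[i]] + b[i];
}

/* Emulates one vector step over n elements: gather, add, scatter.
   Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int scatter_add_vector_like(double *a, const size_t *idx,
                            const double *b, size_t n) {
    if (n == 0)
        return 0;
    double *v = malloc(n * sizeof *v);
    if (v == NULL)
        return -1;
    for (size_t i = 0; i < n; ++i)
        v[i] = a[idx[i]];          /* gather */
    for (size_t i = 0; i < n; ++i)
        v[i] += b[i];              /* element-wise add */
    for (size_t i = 0; i < n; ++i)
        a[idx[i]] = v[i];          /* scatter; the last write wins */
    free(v);
    return 0;
}
```

With a = {0, 0}, idx = {0, 0, 1} and b = {1, 1, 1}, the scalar loop yields {2, 1} and the emulation yields {1, 1}. When all entries of idx are distinct, both give the same result.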
- -mretain-<keyword>
- Sets the priority for retaining data in the LLC according to keyword. The following keywords can be specified. (default: -mretain-all)
  - all
  - Sets higher priority for retaining vector load/store/gather/scatter results in the LLC. (default)
  - list-vector
  - Sets higher priority for retaining vector gather/scatter results in the LLC.
  - none
  - Does not set higher priority for retaining vector load/store/gather/scatter results in the LLC.
 
- -msched-<keyword>
- Specifies the kind of instruction scheduling by keyword. (default: -msched-insns)
  - none
  - Does not perform instruction scheduling.
  - insns
  - Performs instruction scheduling within a basic block. (default)
  - block
  - Performs instruction scheduling within a basic block, but over a wider range than -msched-insns, in order to schedule instructions aggressively.
  - interblock
  - Performs instruction scheduling across two or more basic blocks, in order to schedule instructions aggressively. The compiler may require more time and memory at compilation.
 
- -mvector
- Enables automatic vectorization. (default)
- -mvector-advance-gather
- Moves vector gather operations so that they start as early as possible. (default)
- -mvector-advance-gather-limit=<n>
- Limits the number of vector gather operations moved by -mvector-advance-gather to n. (default: -mvector-advance-gather-limit=56)
- -mvector-floating-divide-instruction
- Uses the vector floating divide instruction for division. By default, an approximate instruction sequence is used. (default: -mno-vector-floating-divide-instruction)
- -mvector-fma
- Allows the use of the vector fused-multiply-add instruction. (default)
- -mvector-intrinsic-check
- Checks the value ranges of arguments to the vectorized versions of mathematical functions. (default: -mno-vector-intrinsic-check) This option applies to the following mathematical functions:
  acos, acosh, asin, atan, atan2, atanh, cos, cosh, cotan, exp, exp10, exp2, expm1, log10, log2, log, pow, sin, sinh, sqrt, tan, tanh
 
- -mvector-iteration
- Allows the use of vector iteration instructions in vectorization. (default)
- -mvector-iteration-unsafe
- Allows the use of vector iteration instructions in vectorization even when they may give incorrect results. (default: -mno-vector-iteration-unsafe)
- -mvector-low-precise-divide-function
- Uses a low-precision divide function for vector floating-point division. (default: -mno-vector-low-precise-divide-function)
- -mvector-merge-conditional
- Allows merging of vector loads and stores in THEN, ELSE IF, and ELSE blocks. (default: -mno-vector-merge-conditional)
- -mvector-packed
- Allows the use of packed vector instructions in vectorization. (default: -mno-vector-packed)
- -mvector-power-to-explog
- Allows pow(R1,R2) and/or the ** operator in a vectorized loop to be replaced with exp(R2*log(R1)). The replacement shortens execution time, but in rare cases introduces numerical error. (default: -mno-vector-power-to-explog)
- -mvector-power-to-sqrt
- Allows pow(R1,R2) in a vectorized loop to be replaced with an expression using sqrt(3C) or cbrt(3C) when R2 is a special value such as 0.5 or 1.0/3.0. The replacement shortens execution time, but in rare cases introduces numerical error. (default)
- -mvector-reduction
- Allows the use of vector reduction instructions in vectorization. (default)
- -mvector-sqrt-instruction
- Uses the vector sqrt instruction for SQRT. By default, an approximate instruction sequence is used. (default: -mno-vector-sqrt-instruction)
- -mvector-threshold=<n>
- Specifies the minimum iteration count n of a loop for vectorization. (default: -mvector-threshold=5)
- -o <filename>
- Specifies the file name filename to which output is written. The output is an assembler source file.
- -p, -pg
- Creates an executable file that outputs profiler information (ngprof).
- -proginf
- Creates an executable file for the PROGINF function. (default: -proginf)
- -traceback[=verbose]
- Generates extra information in the object file and links the run-time library so that traceback information is provided when a fatal error occurs and the environment variable VE_TRACEBACK is set at run time. When verbose is specified, file name and line number information is also generated so that it appears in the traceback output. Set the environment variable VE_TRACEBACK=VERBOSE to output this information at run time.
- -v
- Displays the invoked commands at each stage of compilation.
Metadata
The following Metadata are used to control NEC LLVM-IR Vectorizer optimization and correspond to Compiler Directives. There are two kinds of Metadata: one has two operands and the other has one operand.
In the former kind, the second operand is a single bit (i1). ‘llvm.loop.nec.vector.enable’ is representative of this kind. It corresponds to the ‘vector’ and ‘novector’ Compiler Directives. Its notation is as follows.
!{!"llvm.loop.nec.vector.enable", i1 true}
!{!"llvm.loop.nec.vector.enable", i1 false}
The latter kind has no second operand. ‘llvm.loop.nec.gather_reorder’ is representative of this kind. It corresponds to the ‘gather_reorder’ Compiler Directive. Its notation is as follows.
!{!"llvm.loop.nec.gather_reorder"}
Although there is a difference in the presence or absence of a second operand, both follow the rules of ‘llvm.loop’. For more information on ‘llvm.loop’, please see https://llvm.org/docs/LangRef.html#llvm-loop.
New Metadata list
| Metadata | Compiler directives | Second Operand | 
|---|---|---|
| llvm.loop.nec.advance_gather.enable | advance_gather | true | 
| llvm.loop.nec.advance_gather.enable | noadvance_gather | false | 
| llvm.loop.nec.assume.enable | assume | true | 
| llvm.loop.nec.assume.enable | noassume | false | 
| llvm.loop.nec.gather_reorder | gather_reorder | |
| llvm.loop.nec.ivdep | ivdep | |
| llvm.loop.nec.list_vector.enable | list_vector | true | 
| llvm.loop.nec.list_vector.enable | nolist_vector | false | 
| llvm.loop.nec.lstval.enable | lstval | true | 
| llvm.loop.nec.lstval.enable | nolstval | false | 
| llvm.loop.nec.move.enable | move | true | 
| llvm.loop.nec.move.enable | nomove | false | 
| llvm.loop.nec.move_unsafe | move_unsafe | |
| llvm.loop.nec.nofma | nofma | |
| llvm.loop.nec.packed_vector.enable | packed_vector | true | 
| llvm.loop.nec.packed_vector.enable | nopacked_vector | false | 
| llvm.loop.nec.shortloop | shortloop | |
| llvm.loop.nec.sparse.enable | sparse | true | 
| llvm.loop.nec.sparse.enable | nosparse | false | 
| llvm.loop.nec.vector.enable | vector | true | 
| llvm.loop.nec.vector.enable | novector | false | 
| llvm.loop.nec.verror_check.enable | verror_check | true | 
| llvm.loop.nec.verror_check.enable | noverror_check | false | 
| llvm.loop.nec.vob.enable | vob | true | 
| llvm.loop.nec.vob.enable | novob | false | 
| llvm.loop.nec.vovertake.enable | vovertake | true | 
| llvm.loop.nec.vovertake.enable | novovertake | false | 
| llvm.loop.nec.vwork.enable | vwork | true | 
| llvm.loop.nec.vwork.enable | novwork | false | 
Remarks
LLVM source inputs must be legal, that is, they must be compilable by the LLVM llc command.
The assembly language output generated by llvm-vec is not compatible with that generated by the NEC C/C++ Compiler (ncc/nc++), and building an executable from a mixture of assembly language outputs generated by them is not supported.
llvm-vec uses the ‘SjLj’ exception handling mechanism.