llvm-vec - NEC LLVM-IR Vectorizer

Synopsis

llvm-vec [options] filename

Example

$ /opt/nec/ve/llvm-vec/<version>/bin/llvm-vec -o a.s -x ir a.ll

Description

The llvm-vec command compiles LLVM source inputs into Vector Engine (VE) assembly language. The assembly language output can then be passed through a VE assembler (nas) and linker (nld) to generate a VE native executable.

The filename argument specifies a single LLVM source file.

Options

Many options have names beginning with -f, -m or -W. Most of them have positive and negative forms. The negative forms begin with -fno-, -mno- or -Wno-.

--help

Displays usage of the compiler.

--version

Displays the version number and copyrights of NEC LLVM-IR Vectorizer.

-O<n>

Specifies optimization level by n. (default: -O2)

-O4

Enables aggressive optimization that violates the language standard.

-O3

Enables optimization which causes side-effects and nested loop optimization.

-O2

Enables optimization which causes side-effects.

-O1

Enables optimization which does not cause any side-effects.

-O0

Disables any optimizations, automatic vectorization, parallelization, and inlining.

-faggressive-associative-math
Allows aggressive re-association of operands in series during optimization. This optimization causes side effects.
(default: -fno-aggressive-associative-math)
-fargument-alias
Prevents the compiler from assuming that arguments do not alias each other or non-local objects, in all optimizations.
(default)
-fargument-noalias
Allows the compiler to assume that arguments do not alias each other or non-local objects, in all optimizations.
(default: -fargument-alias)
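
The difference between the two assumptions shows up when a caller passes overlapping arguments. The following C sketch is an illustration only: the function names are hypothetical and it does not show llvm-vec output. It runs a loop whose destination is the source shifted by one element.

```c
#include <assert.h>
#include <stddef.h>

/* Adds 1.0 to each source element and stores the result in dst.
 * Nothing in the signature says whether dst and src overlap. */
void add_one(double *dst, const double *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + 1.0;
}

/* Calls add_one with overlapping arguments: dst is src shifted by one
 * element, so each iteration reads the value the previous one wrote. */
double overlapping_call(void)
{
    double buf[4] = {0.0, 0.0, 0.0, 0.0};
    add_one(buf + 1, buf, 3);
    return buf[3];
}
```

Executed in source order, overlapping_call() returns 3.0. If all loads were performed before any store, which is only valid when the arguments do not alias, buf[3] would be 1.0 instead. This is why a no-alias assumption is safe only for code that never passes overlapping arguments.
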
-fassociative-math
Allows re-association of operands in series during optimization and loop transformation.
(default)
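
Re-association can change results because floating-point addition is not associative. A minimal C sketch of the effect follows. The function names are illustrative and the values are chosen by hand. It assumes IEEE 754 double arithmetic and shows nothing about which grouping llvm-vec actually selects.

```c
#include <assert.h>

/* Sums three values with the grouping (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Sums the same values with the grouping a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}

/* Self-checking demonstration: 1.0 is far below the rounding step of
 * 1e20, so it is lost when it is added to -1e20 first. */
void demo_reassociation(void)
{
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
}
```

Both functions compute the same mathematical sum, yet the first yields 1.0 and the second 0.0 for these inputs.
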
-fcse-after-vectorization
Re-applies common subexpression elimination after vectorization.
(default: -fno-cse-after-vectorization)
-fdiag-vector=<n>
Specifies the vector diagnostics level n (0: no output, 1: information, 2: detail). Vector diagnostics are written to standard error.
(default: -fdiag-vector=1)
-ffast-math
Uses fast scalar versions of math functions outside of vectorized loops.
(default)
-finstrument-functions

Inserts instrumentation function calls at the entry and exit of functions. The inserted calls are:

void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
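
The two hooks must be supplied by the program being instrumented. A minimal C sketch of such definitions follows. The counters and the profile_depth helper are hypothetical names chosen for illustration, and it is assumed here that the file defining the hooks is itself built without instrumentation, so that the hooks do not call themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Counters updated by the hooks (illustrative names). */
static unsigned long enter_count;
static unsigned long exit_count;

/* Called at the entry of each instrumented function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    ++enter_count;
}

/* Called at the exit of each instrumented function. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    ++exit_count;
}

/* Number of instrumented calls entered but not yet exited. */
long profile_depth(void)
{
    assert(enter_count >= exit_count);
    return (long)(enter_count - exit_count);
}
```

Here this_fn is the address of the function being entered or left and call_site is the address it was called from. A real profiler would typically record them rather than discard them.
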
-fivdep
Inserts an ivdep directive before all loops.
(default: -fno-ivdep)
-floop-collapse
Allows loop collapsing. -On (n=2,3,4) must be in effect.
(default: -fno-loop-collapse)
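
Loop collapsing in general means turning a loop nest into a single loop with a longer iteration count. The C sketch below shows the idea by hand for a contiguous two-dimensional array. The function names are illustrative, and the conditions under which llvm-vec collapses a loop nest are not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Nested form: the inner loop runs only cols iterations per entry. */
double sum_nested(const double *a, size_t rows, size_t cols)
{
    double s = 0.0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            s += a[i * cols + j];
    return s;
}

/* Collapsed form: one loop of rows * cols iterations over the same
 * contiguous storage, visiting the elements in the same order. */
double sum_collapsed(const double *a, size_t rows, size_t cols)
{
    /* The combined iteration count must not overflow size_t. */
    assert(cols == 0 || rows <= (size_t)-1 / cols);
    double s = 0.0;
    for (size_t k = 0; k < rows * cols; ++k)
        s += a[k];
    return s;
}
```

The collapsed loop has a longer iteration count than the original inner loop, which is what makes the form attractive for vectorization when cols is small.
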
-floop-count=<n>
Specifies the iteration count n that the compiler assumes for a loop whose iteration count cannot be determined at compile time.
(default: -floop-count=5000)
-fmove-loop-invariants
Enables motion of loop-invariant code that is under an if-condition.
(default)
-fmove-loop-invariants-unsafe
Moves unsafe code that may cause side effects.
(default: -fno-move-loop-invariants-unsafe)

Examples of unsafe code are:

  • divide

  • memory reference to a 1-byte or 2-byte area

-fmove-loop-invariants must be in effect when you specify this option.
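
The divide case can be illustrated by hand in C. In the sketch below (function names are illustrative, and this is not llvm-vec output), the loop-invariant division sits under an if-condition. Moving it out of the loop is harmless only if the guard moves with it.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the loop-invariant division n / d is under an
 * if-condition, so it is never evaluated when d == 0. */
void fill_guarded(int *out, size_t len, int n, int d)
{
    assert(out != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        if (d != 0)
            out[i] = n / d;
        else
            out[i] = 0;
    }
}

/* Hand-hoisted form that stays safe: the division is computed once
 * before the loop, but the guard is hoisted together with it. An
 * unsafe motion would evaluate n / d unconditionally before the loop
 * and could trap when d == 0. */
void fill_hoisted(int *out, size_t len, int n, int d)
{
    assert(out != NULL || len == 0);
    int q = (d != 0) ? n / d : 0;
    for (size_t i = 0; i < len; ++i)
        out[i] = q;
}
```

Both functions produce the same array for every d, including d == 0.
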

-fmove-nested-loop-invariants-outer
Allows the compiler to move loop-invariant expressions to an outer loop. When -fno-move-nested-loop-invariants-outer is specified, they are moved to just before the current loop.
(default)
-fnamed-alias
Prevents the compiler from assuming, during vectorization, that objects pointed to by named pointers do not alias.
(default)
-fnamed-noalias
Allows the compiler to assume, during vectorization, that objects pointed to by named pointers do not alias.
(default: -fnamed-alias)
-fpic, -fPIC

Generates position-independent code.

-fprecise-math
Applies a high-precision algorithm in the vector version of the power operation when the exponent is an integer value. The result is more accurate, but the calculation is slower.
(default: -fno-precise-math)
-freplace-loop-equation
Replaces the “!=” and “==” operators with “<=” or “>=” at the loop backedge.
(default: -fno-replace-loop-equation)
-fstrict-aliasing

Allows the compiler to assume the ANSI aliasing rules in all optimizations. The compiler assumes that a stored value is accessed only through one of the following types.

  • A type compatible with the effective type of the object

  • A qualified version of a type compatible with the effective type of the object

  • A type that is the signed or unsigned type corresponding to the effective type of the object

  • A type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object

  • An aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union)

  • A character type

(default: -fno-strict-aliasing)
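
The access types listed above can be illustrated in C. The sketch below is an illustration with a hypothetical helper name. It assumes a 32-bit IEEE 754 float and reads the bit pattern of a float in a way that stays within the rules, since memcpy accesses the object as bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of f. memcpy accesses the float object as
 * bytes (a character type), which the aliasing rules permit. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* For contrast, the following access is outside the rules, because a
 * float object is read through a uint32_t lvalue:
 *
 *     uint32_t bad = *(uint32_t *)&f;
 *
 * Code written this way relies on the aliasing assumption being off. */
```

With a 32-bit IEEE 754 float, float_bits(1.0f) is 0x3f800000.
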

-fthis-pointer-alias
Prevents the compiler from assuming that the this pointer does not have any aliases, in all optimizations.
(default: -fthis-pointer-noalias)
-fthis-pointer-noalias

Allows the compiler to assume that the this pointer does not have any aliases, in all optimizations. (default)

-ftrace

Creates an object file and an executable file for the ftrace function.

-minit-stack=<value>

Initializes the stack area with the specified value at run time. The following values are available:

zero

Initializes with zeroes.

nan

Initializes with quiet NaN in double type (0x7fffffff7fffffff).

nanf

Initializes with quiet NaN in float type (0x7fffffff).

snan

Initializes with signaling NaN in double type (0x7ff4000000000000).

snanf

Initializes with signaling NaN in float type (0x7fa00000).

0xXXXX

Initializes with the value specified in a hexadecimal format up to 16 digits. When the specified value has more than 8 hexadecimal digits, the initialization is done on an 8-byte cycle. Otherwise it is done on a 4-byte cycle.
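
The NaN patterns above can be checked against the IEEE 754 binary32 and binary64 layouts. The C sketch below is an illustration with hypothetical helper names. It assumes the common convention that the most significant fraction bit distinguishes quiet from signaling NaNs, and it restates the 4-byte and 8-byte cycle rule for 0xXXXX values.

```c
#include <assert.h>
#include <stdint.h>

/* binary32 NaN: exponent bits all ones and a non-zero fraction. */
int is_nan32(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0x7f800000u
        && (bits & 0x007fffffu) != 0;
}

/* Quiet NaN: a NaN whose most significant fraction bit is set. */
int is_quiet_nan32(uint32_t bits)
{
    return is_nan32(bits) && (bits & 0x00400000u) != 0;
}

/* binary64 NaN: exponent bits all ones and a non-zero fraction. */
int is_nan64(uint64_t bits)
{
    return (bits & 0x7ff0000000000000ull) == 0x7ff0000000000000ull
        && (bits & 0x000fffffffffffffull) != 0;
}

/* Quiet NaN: a NaN whose most significant fraction bit is set. */
int is_quiet_nan64(uint64_t bits)
{
    return is_nan64(bits) && (bits & 0x0008000000000000ull) != 0;
}

/* Fill cycle in bytes for a 0xXXXX value with the given number of
 * hexadecimal digits: more than 8 digits gives an 8-byte cycle. */
int fill_cycle_bytes(int hex_digits)
{
    assert(hex_digits >= 1 && hex_digits <= 16);
    return hex_digits > 8 ? 8 : 4;
}
```

Under these assumptions, 0x7fffffff and 0x7fffffff7fffffff classify as quiet NaNs, while 0x7fa00000 and 0x7ff4000000000000 classify as NaNs that are not quiet, that is, signaling.
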

-mlist-vector
Allows the vectorization of the statement in a loop when an array element with a vector subscript expression appears on both the left and right sides of an assignment operator.
(default: -mno-list-vector)
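
A statement of the kind this option targets looks as follows in C. The sketch uses an illustrative function name and shows only the source-level pattern, not how llvm-vec vectorizes it.

```c
#include <assert.h>
#include <stddef.h>

/* The array element a[idx[i]], whose subscript is itself taken from a
 * vector of indices, appears on both sides of the assignment. */
void scatter_add(double *a, const int *idx, const double *b, size_t n)
{
    assert(a != NULL && idx != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i)
        a[idx[i]] = a[idx[i]] + b[i];
}
```

When idx contains duplicate values, later iterations read what earlier ones wrote. For example, with a = {0, 0}, idx = {0, 0, 1} and b = {1, 2, 4}, execution in source order leaves a[0] equal to 3 and a[1] equal to 4.
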
-mretain-<keyword>
Sets the priority for retaining data in the LLC (last-level cache) according to keyword. The following keywords can be specified.
(default: -mretain-all)
all

Gives higher priority to retaining vector load/store/gather/scatter results in the LLC. (default)

list-vector

Gives higher priority to retaining vector gather/scatter results in the LLC.

none

Does not give higher priority to retaining vector load/store/gather/scatter results in the LLC.

-msched-<keyword>
Specifies the kind of instruction scheduling by keyword.
(default: -msched-insns)
none

Does not perform instruction scheduling.

insns

Performs the instruction scheduling in a basic block. (default)

block

Performs the instruction scheduling in a basic block, but over a wider range than -msched-insns does, in order to schedule instructions aggressively.

interblock

Performs the instruction scheduling in two or more basic blocks, in order to schedule instructions aggressively. The compiler may require more time and memory at compilation.

-mvector
Enables automatic vectorization.
(default)
-mvector-advance-gather
Moves vector gather operations so that they can be started as early as possible.
(default)
-mvector-advance-gather-limit=<n>
Limits the number of vector gather operations moved by -mvector-advance-gather to n.
(default: -mvector-advance-gather-limit=56)
-mvector-floating-divide-instruction
Uses the vector floating-point divide instruction for division. By default, an approximate instruction sequence is used.
(default: -mno-vector-floating-divide-instruction)
-mvector-fma
Allows the use of vector fused multiply-add instructions.
(default)
-mvector-intrinsic-check
Checks the value ranges of arguments to the vectorized versions of mathematical functions.
(default: -mno-vector-intrinsic-check)
The target mathematical functions of this option are as follows.

acos, acosh, asin, atan, atan2, atanh, cos, cosh, cotan, exp, exp10, exp2, expm1, log10, log2, log, pow, sin, sinh, sqrt, tan, tanh

-mvector-iteration
Allows the use of vector iteration instructions in vectorization.
(default)
-mvector-iteration-unsafe
Allows the use of vector iteration instructions in vectorization even when they may give an incorrect result.
(default: -mno-vector-iteration-unsafe)
-mvector-low-precise-divide-function
Uses a low-precision divide function for vector floating-point division.
(default: -mno-vector-low-precise-divide-function)
-mvector-merge-conditional
Allows merging of vector loads and stores in THEN, ELSE IF, and ELSE blocks.
(default: -mno-vector-merge-conditional)
-mvector-packed
Allows the use of packed vector instructions in vectorization.
(default: -mno-vector-packed)
-mvector-power-to-explog
Allows replacement of pow(R1,R2) and/or the ** operator in a vectorized loop with exp(R2*log(R1)). The replacement shortens execution time, but in rare cases introduces a numerical error in the calculation.
(default: -mno-vector-power-to-explog)
-mvector-power-to-sqrt
Allows replacement of pow(R1,R2) in a vectorized loop with an expression including sqrt(3C) or cbrt(3C) when R2 is a special value such as 0.5 or 1.0/3.0. The replacement shortens execution time, but in rare cases introduces a numerical error in the calculation.
(default)
-mvector-reduction
Allows the use of vector reduction instructions in vectorization.
(default)
-mvector-sqrt-instruction
Uses the vector sqrt instruction for SQRT. By default, an approximate instruction sequence is used.
(default: -mno-vector-sqrt-instruction)
-mvector-threshold=<n>
Specifies the minimum iteration count n of a loop for vectorization.
(default: -mvector-threshold=5)
-o <filename>

Specifies the file to which output is written. The output is an assembler source file.

-p, -pg

Creates an executable file that outputs profiler information (ngprof).

-proginf
Creates an executable file for the PROGINF function.
(default: -proginf)
-traceback[=verbose]
Generates extra information in the object file and links the run-time library in order to provide traceback information when a fatal error occurs and the environment variable VE_TRACEBACK is set at run time.
When verbose is specified, also generates file name and line number information so that it can be included in the traceback output. Set the environment variable VE_TRACEBACK=VERBOSE to output this information at run time.
-v

Displays the invoked commands at each stage of compilation.

Metadata

The following Metadata are used to control NEC LLVM-IR Vectorizer optimization and correspond to Compiler Directives. There are two kinds of Metadata: one has two operands and the other has one operand.

In the former kind, the second operand is a one-bit value (i1). ‘llvm.loop.nec.vector.enable’ Metadata is representative of this kind. It corresponds to the ‘vector’ and ‘novector’ Compiler Directives. Its notation is as follows.

!{!"llvm.loop.nec.vector.enable", i1 true}
!{!"llvm.loop.nec.vector.enable", i1 false}

The latter kind has no second operand. ‘llvm.loop.nec.gather_reorder’ Metadata is representative of this kind. It corresponds to the ‘gather_reorder’ Compiler Directive. Its notation is as follows.

!{!"llvm.loop.nec.gather_reorder"}

Although there is a difference in the presence or absence of a second operand, both follow the rules of ‘llvm.loop’. For more information on ‘llvm.loop’, please see https://llvm.org/docs/LangRef.html#llvm-loop.

New Metadata list

Metadata                             Compiler directive  Second operand
llvm.loop.nec.advance_gather.enable  advance_gather      true
llvm.loop.nec.advance_gather.enable  noadvance_gather    false
llvm.loop.nec.assume.enable          assume              true
llvm.loop.nec.assume.enable          noassume            false
llvm.loop.nec.gather_reorder         gather_reorder      (none)
llvm.loop.nec.ivdep                  ivdep               (none)
llvm.loop.nec.list_vector.enable     list_vector         true
llvm.loop.nec.list_vector.enable     nolist_vector       false
llvm.loop.nec.lstval.enable          lstval              true
llvm.loop.nec.lstval.enable          nolstval            false
llvm.loop.nec.move.enable            move                true
llvm.loop.nec.move.enable            nomove              false
llvm.loop.nec.move_unsafe            move_unsafe         (none)
llvm.loop.nec.nofma                  nofma               (none)
llvm.loop.nec.packed_vector.enable   packed_vector       true
llvm.loop.nec.packed_vector.enable   nopacked_vector     false
llvm.loop.nec.shortloop              shortloop           (none)
llvm.loop.nec.sparse.enable          sparse              true
llvm.loop.nec.sparse.enable          nosparse            false
llvm.loop.nec.vector.enable          vector              true
llvm.loop.nec.vector.enable          novector            false
llvm.loop.nec.verror_check.enable    verror_check        true
llvm.loop.nec.verror_check.enable    noverror_check      false
llvm.loop.nec.vob.enable             vob                 true
llvm.loop.nec.vob.enable             novob               false
llvm.loop.nec.vovertake.enable       vovertake           true
llvm.loop.nec.vovertake.enable       novovertake         false
llvm.loop.nec.vwork.enable           vwork               true
llvm.loop.nec.vwork.enable           novwork             false

Remarks

LLVM source inputs must be valid, that is, they must be compilable by the LLVM llc command.

llvm-vec generates assembly language output that is not compatible with the output generated by the NEC C/C++ Compiler (ncc/nc++). Building an executable from a mixture of assembly language output generated by both is not supported.

llvm-vec uses the ‘SjLj’ exception handling mechanism.