Name

    NV_tessellation_program5

Name Strings

    (none)

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Status

    Shipping.

Version

    Last Modified Date:         12/19/2011
    NVIDIA Revision:            3

Number

    391

Dependencies

    OpenGL 1.1 is required.

    This extension is written against the OpenGL 2.1 specification.

    NV_gpu_program5 is required.  This extension is supported if and only if
    "GL_NV_gpu_program5" is found in the extension string.  This extension is
    written against the NV_gpu_program5 extension.

    This specification interacts with ARB_tessellation_shader.

    This specification interacts with NV_parameter_buffer_object.

Overview

    This extension, in conjunction with the ARB_tessellation_shader extension,
    introduces a new tessellation stage to the OpenGL primitive processing
    pipeline.  The ARB_tessellation_shader extension provides programmable
    shading functionality using the OpenGL Shading Language as its base; this
    extension provides assembly programmable shaders building on the family of
    assembly programmability extensions including ARB_vertex_program,
    ARB_fragment_program, NV_gpu_program4, and NV_geometry_program4.

    This extension adds a new basic primitive type, called a patch, which
    consists of an array of vertices plus some associated per-patch state.  It
    also adds two new assembly program types:  a tessellation control program
    that transforms a patch into a new patch and a tessellation evaluation
    program that computes the position and attributes of each vertex produced
    by the tessellator.

    When tessellation is active, it begins by running the optional
    tessellation control program, if enabled.  This program consumes a
    variable-size input patch and produces a new fixed-size output patch.  The
    output patch consists of an array of vertices, and a set of per-patch
    attributes.  The per-patch attributes include tessellation levels that
    control how finely the patch will be tessellated.  For each patch
    processed, multiple tessellation control program invocations are performed
    -- one per output patch vertex.  Each tessellation control program
    invocation writes all the attributes of its corresponding output patch
    vertex.  A tessellation control program may also read the per-vertex
    outputs of other tessellation control program invocations, as well as read
    and write shared per-patch outputs.  The tessellation control program
    invocations for a single patch effectively run as a group.  The GL
    automatically synchronizes threads to ensure that when executing a given
    instruction, all previous instructions have completed for all program
    invocations in the group.

    The tessellation primitive generator then decomposes a patch into a new
    set of primitives using the tessellation levels to determine how finely
    tessellated the output should be.  The primitive generator begins with
    either a triangle or a quad, and splits each outer edge of the primitive
    into a number of segments approximately equal to the corresponding element
    of the outer tessellation level array.  The interior of the primitive is
    tessellated according to elements of the inner tessellation level array.
    The primitive generator has three modes:  TRIANGLES and QUADS split a
    triangular or quad-shaped patch into a set of triangles that cover the
    original patch; ISOLINES_NV splits a quad-shaped patch into a set of line
    strips spanning the patch.  Each vertex generated by the tessellation
    primitive generator is assigned a (u,v) or (u,v,w) coordinate indicating
    its relative location in the subdivided triangle or quad.

    For each vertex produced by the tessellation primitive generator, the
    tessellation evaluation program is run to compute its position and other
    attributes of the vertex, using its (u,v) or (u,v,w) coordinate.  When
    computing the final vertex attributes, the tessellation evaluation program
    can also read the attributes of any of the vertices of the patch written
    by the tessellation control program.  Tessellation evaluation program
    invocations are completely independent, although all invocations for a
    single patch share the same collection of input vertices and per-patch
    attributes.

    The tessellator operates on vertices after they have been transformed by a
    vertex program or fixed-function vertex processing.  The primitives
    generated by the tessellator are passed further down the OpenGL pipeline,
    where they can be used as inputs to geometry programs, transform feedback,
    and the rasterizer.

    The tessellation control and evaluation programs are both optional.  If
    neither program type is present, the tessellation stage has no effect.  If
    no tessellation control program is present, the input patch provided by
    the application is passed directly to the tessellation primitive
    generator, and a set of fixed tessellation level parameters (specified via
    the PatchParameterfv function) is used to control primitive generation.
    If no tessellation evaluation program is present, the output patch
    produced by the tessellation control program is passed as a patch to
    subsequent pipeline stages, where it can be consumed by geometry programs,
    transform feedback, or the rasterizer.


New Procedures and Functions

    None

    (Note:  The PatchParameteri and PatchParameterfv functions from
     ARB_tessellation_shader will also be used by this extension.)

New Tokens

    Accepted by the <cap> parameter of Disable, Enable, and IsEnabled, 
    by the <pname> parameter of GetBooleanv, GetIntegerv, GetFloatv, 
    and GetDoublev, and by the <target> parameter of ProgramStringARB,
    BindProgramARB, ProgramEnvParameter4[df][v]ARB,
    ProgramLocalParameter4[df][v]ARB, GetProgramEnvParameter[df]vARB, 
    GetProgramLocalParameter[df]vARB, GetProgramivARB and
    GetProgramStringARB:

        TESS_CONTROL_PROGRAM_NV                         0x891E
        TESS_EVALUATION_PROGRAM_NV                      0x891F

    Accepted by the <target> parameter of ProgramBufferParametersfvNV,
    ProgramBufferParametersIivNV, ProgramBufferParametersIuivNV,
    BindBufferRangeNV, BindBufferOffsetNV, BindBufferBaseNV, and BindBuffer,
    and by the <value> parameter of GetIntegerIndexedvEXT:

        TESS_CONTROL_PROGRAM_PARAMETER_BUFFER_NV        0x8C74
        TESS_EVALUATION_PROGRAM_PARAMETER_BUFFER_NV     0x8C75

    Accepted by the <pname> parameter of GetProgramivARB:

        MAX_PROGRAM_PATCH_ATTRIBS_NV                    0x86D8

    (Note:  Various enumerants from ARB_tessellation_shader will also be used
     by this extension.)


Additions to Chapter 2 of the OpenGL 2.1 Specification (OpenGL Operation)

    (Incorporate Section 2.X of the ARB_tessellation_shader specification,
     Tessellation, in its entirety.)

    Insert a new section after Section 2.X.1 in ARB_tessellation_shader,
    Tessellation Control Shaders

    Tessellation Control Programs

    Each patch primitive may be optionally processed by a tessellation control
    program, which operates similarly to the tessellation control shader
    described above.  Tessellation control programs are enabled by calling
    Enable with the value TESS_CONTROL_PROGRAM_NV.  If a GLSL program is
    active, the tessellation control program enable is ignored and treated as
    disabled unless the program contains only fragment shaders.

    When enabled, each patch primitive received by the GL will be processed by
    the tessellation control program to produce a new patch.  The tessellation
    control program emits a patch with a fixed number of vertices, given by
    the value specified in the VERTICES_OUT declaration.  It computes the
    attributes of each vertex of the output patch in parallel, and assembles
    the emitted vertices into an output patch.  The program also computes
    per-patch tessellation level values that control the number of vertices
    produced by the tessellation primitive generator when that patch is
    processed.  The program may also compute additional generic per-patch
    attributes that may be accessed by invocations of the tessellation
    evaluation program or a subsequent geometry program when processing the
    patch.  When the tessellation control program completes, the input patch
    is discarded and the output patch is processed by the remainder of the GL
    pipeline.

    Each patch processed by the tessellation control program will result in
    multiple program invocations (threads), with one invocation per output
    patch vertex.  Each program invocation has a corresponding output patch
    vertex, and can write per-vertex attributes only for that vertex.  All
    program invocations may read and write per-patch attributes of the output
    patch, and may read per-vertex attributes of any vertex in the output
    patch.

    The tessellation control program threads are run as a group, and execute
    effectively in lock-step.  In this model, the execution of each
    instruction completes for all active threads before the execution of the
    subsequent instruction is started.  All threads in the group are initially
    active, but the set of active threads changes as flow-control instructions
    are encountered.  Full details on the execution model are specified in
    Section 2.X.5.
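
    The lock-step model can be illustrated with a small host-side simulation.
    The sketch below is not part of the extension and does not use any GL
    entry point; the type PatchState and the functions instr_write_own,
    instr_read_neighbor, and run_lockstep are hypothetical names chosen for
    illustration.  It assumes a two-instruction program with no flow control,
    so all threads stay active.

```c
#include <stddef.h>

#define MAX_THREADS 32  /* illustrative limit; callers must keep n <= MAX_THREADS */

/* Hypothetical model: one "thread" per output patch vertex. */
typedef struct {
    float out_vertex[MAX_THREADS]; /* per-vertex outputs, one slot per thread */
    float result[MAX_THREADS];     /* value each thread holds after instruction 1 */
} PatchState;

/* Instruction 0: each thread writes only its own output patch vertex. */
static void instr_write_own(PatchState *p, const float *in, size_t tid)
{
    p->out_vertex[tid] = in[tid] * 2.0f;
}

/* Instruction 1: each thread reads the output vertex of its neighbor. */
static void instr_read_neighbor(PatchState *p, size_t tid, size_t n)
{
    p->result[tid] = p->out_vertex[(tid + 1) % n];
}

/* Lock-step: instruction 0 completes on every thread before instruction 1
 * starts on any thread, so every neighbor read sees a written value. */
void run_lockstep(PatchState *p, const float *in, size_t n)
{
    for (size_t tid = 0; tid < n; ++tid)
        instr_write_own(p, in, tid);
    for (size_t tid = 0; tid < n; ++tid)
        instr_read_neighbor(p, tid, n);
}
```

    Without the per-instruction barrier, a thread could read a neighbor's
    output before it was written, which is the undefined case described for
    output patch reads.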

    Tessellation control programs execute using the instruction set documented
    in the GL_NV_gpu_program5 extension specification.  Tessellation control
    programs can read attributes from all vertices of the input patch, and
    each vertex attribute access must identify the vertex number being
    accessed.  For example, "vertex[1].position" and "vertex.in[1].position"
    identify the position of the second vertex (numbered "1") in the input
    patch.  Programs may also read attributes of all vertices of the output
    patch (e.g., "vertex.out[2].position") and per-patch attributes of the
    output patch (e.g., "primitive.out.patch.attrib[3]").  In both cases, the
    patch vertices or attributes accessed in this manner are undefined unless
    written by a previous instruction executed on one of the threads.
    Programs may also write attributes of their corresponding vertex in the
    output patch (e.g., "result.attrib[0]") and shared per-patch attributes
    (e.g., "result.patch.attrib[4]").  When writing output patch vertex
    attributes, a vertex number is not supplied.
    
    The only input primitives supported by tessellation control programs are
    patches.  The error INVALID_OPERATION is generated by Begin (or vertex
    array functions that implicitly call Begin) if a tessellation control
    program is active and <mode> is not PATCHES_NV.


    Modify Section 2.X.2 of ARB_tessellation_shader,
    Tessellation Primitive Generation

    (add to the end of the section describing the operation of the
     tessellation primitive generator when assembly tessellation evaluation
     programs are used)

    If no GLSL program object is active, or if the active program contains
    only a fragment shader, the tessellation primitive generator will be
    active if and only if an assembly tessellation evaluation program is
    enabled.  When a tessellation evaluation program is used, the tessellation
    primitive generator will operate in exactly the manner described above,
    except that the parameters controlling tessellation will be taken from
    declaration statements in the tessellation evaluation program.  The
    declaration statements used to specify each tessellation parameter are as
    described in Table X.1.  

          GLSL Program Parameter        TEP Declaration
          ------------------------      -----------------
          TESS_GEN_MODE_NV              TESS_MODE
          TESS_GEN_SPACING_NV           TESS_SPACING
          TESS_GEN_VERTEX_ORDER_NV      TESS_VERTEX_ORDER
          TESS_GEN_POINT_MODE_NV        TESS_POINT_MODE

      Table X.1, Parameters used to control tessellation when a program object
      with a tessellation evaluation shader is active and their tessellation
      evaluation program equivalents.

    If no tessellation control program is enabled, the tessellation levels
    are taken from the default tessellation levels specified by calling
    PatchParameterfv with a <pname> of PATCH_DEFAULT_OUTER_LEVEL_NV or
    PATCH_DEFAULT_INNER_LEVEL_NV.
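
    The choice of tessellation level source can be sketched as follows.  This
    is an illustrative model only, not driver code:  the TessLevels type and
    the select_tess_levels function are hypothetical, and the array sizes
    (four outer levels, two inner levels) follow the default level arrays of
    ARB_tessellation_shader.

```c
/* Hypothetical container for the tessellation levels of one patch. */
typedef struct {
    float outer[4];  /* outer tessellation levels */
    float inner[2];  /* inner tessellation levels */
} TessLevels;

/* Pick the levels seen by the primitive generator:  the values written by
 * the tessellation control program when one is enabled, otherwise the
 * application-specified defaults. */
TessLevels select_tess_levels(int tcp_enabled,
                              const TessLevels *tcp_written,
                              const TessLevels *defaults)
{
    return tcp_enabled ? *tcp_written : *defaults;
}
```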

    If a GLSL program containing only a fragment shader is active, any
    tessellation-related program parameters in effect when the program was
    linked have no effect on tessellation.


    Insert a new section after Section 2.X.3 in ARB_tessellation_shader,
    Tessellation Evaluation Shaders

    Tessellation Evaluation Programs

    If a tessellation evaluation program is active, the tessellation primitive
    generator will subdivide a basic primitive and run the tessellation
    evaluation program on each generated vertex.  Tessellation evaluation
    programs are enabled by calling Enable with the value
    TESS_EVALUATION_PROGRAM_NV.  If a GLSL program is active, the tessellation
    evaluation program enable is ignored and treated as disabled unless the
    program contains only fragment shaders.

    When tessellation evaluation programs are enabled, each patch primitive
    received by the GL will trigger the tessellation primitive generator to
    perform primitive subdivision and generate a new set of vertices.  For
    each generated vertex, the tessellation evaluation program is invoked.
    Each tessellation evaluation program invocation produces a single output
    vertex.  These vertices are assembled into primitives according to the
    subdivision produced by the tessellation primitive generator, and these
    primitives are processed by the remainder of the GL pipeline.  The input
    patch used by the tessellation evaluation program is discarded.

    Tessellation evaluation programs execute using the instruction set
    documented in the GL_NV_gpu_program5 extension specification and in a
    manner similar to vertex programs.  Tessellation evaluation programs can
    read attributes from all vertices of the input patch, and each vertex
    attribute access must identify the vertex number being accessed.  For
    example, "vertex[1].position" identifies the transformed position of
    "vertex[1]", which is the second vertex in the input patch.  Additionally,
    the special attribute variable "vertex.tesscoord" is available to specify
    the location of the vertex within the subdivided primitive.  Per-patch
    attributes, including the tessellation levels, are also available.
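
    As an illustration of how an evaluation program might combine the
    tessellation coordinate with the input patch vertices, the sketch below
    models one possible computation on the host.  It assumes a three-vertex
    patch tessellated in TRIANGLES mode and treats (u,v,w) as barycentric
    weights; the Vec4 type and the eval_triangle_position function are
    hypothetical, and nothing in this extension requires a program to
    interpolate this way.

```c
/* Hypothetical four-component vector, matching an (x,y,z,w) position. */
typedef struct { float x, y, z, w; } Vec4;

/* Weighted sum of the three patch vertex positions, using the tessellation
 * coordinate (u,v,w) of the vertex being evaluated as the weights. */
Vec4 eval_triangle_position(const Vec4 patch[3], float u, float v, float w)
{
    Vec4 r;
    r.x = u * patch[0].x + v * patch[1].x + w * patch[2].x;
    r.y = u * patch[0].y + v * patch[1].y + w * patch[2].y;
    r.z = u * patch[0].z + v * patch[1].z + w * patch[2].z;
    r.w = u * patch[0].w + v * patch[1].w + w * patch[2].w;
    return r;
}
```

    A coordinate of (1,0,0) reproduces the position of patch vertex 0, and
    (0.5,0.5,0) yields the midpoint of the edge between vertices 0 and 1.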

    The only input primitives supported by tessellation evaluation programs
    are patches.  The error INVALID_OPERATION is generated by Begin (or vertex
    array functions that implicitly call Begin) if a tessellation evaluation
    program is active and <mode> is not PATCHES_NV.
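
    The primitive mode check that applies to both program types can be
    modeled as shown below.  This is a sketch, not an implementation:  the
    function validate_begin_mode is hypothetical, and the value used for the
    patch primitive mode is the PATCHES value from ARB_tessellation_shader,
    which is assumed here to match PATCHES_NV.

```c
/* Token values; GL_PATCHES is the ARB_tessellation_shader value and is
 * assumed (not confirmed by this text) to equal PATCHES_NV. */
enum {
    GL_NO_ERROR          = 0,
    GL_TRIANGLES         = 0x0004,
    GL_PATCHES           = 0x000E,
    GL_INVALID_OPERATION = 0x0502
};

/* Returns the error Begin would generate for <mode>, given which assembly
 * tessellation programs are active. */
unsigned validate_begin_mode(int tcp_active, int tep_active, unsigned mode)
{
    if ((tcp_active || tep_active) && mode != GL_PATCHES)
        return GL_INVALID_OPERATION;
    return GL_NO_ERROR;
}
```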


    Modify Section 2.X.2 of NV_gpu_program4, Program Grammar

    (replace third paragraph)

    Tessellation control programs are required to begin with the header string
    "!!NVtcp5.0".  Tessellation evaluation programs are required to begin with
    the header string "!!NVtep5.0".  These header strings identify the
    subsequent program body as being a tessellation control or evaluation
    program, respectively, and indicate that they should be parsed according
    to the base NV_gpu_program5 grammar plus the additions below.  Program
    string parsing begins with the character immediately following the header
    string.
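
    An application or tool that handles program strings might use the header
    to decide which <target> to load a program under.  The helper below is a
    minimal sketch under that assumption; classify_tess_program is a
    hypothetical name and is not a GL entry point.  It returns the matching
    program target token, or zero if the string has neither header, and
    reports where parsing of the program body would begin.

```c
#include <string.h>

enum {
    TESS_CONTROL_PROGRAM_NV    = 0x891E,
    TESS_EVALUATION_PROGRAM_NV = 0x891F
};

/* Classify a program string by its header.  On a match, *body points at
 * the character immediately following the header string. */
unsigned classify_tess_program(const char *src, const char **body)
{
    static const char tcp[] = "!!NVtcp5.0";
    static const char tep[] = "!!NVtep5.0";

    if (strncmp(src, tcp, sizeof tcp - 1) == 0) {
        *body = src + (sizeof tcp - 1);
        return TESS_CONTROL_PROGRAM_NV;
    }
    if (strncmp(src, tep, sizeof tep - 1) == 0) {
        *body = src + (sizeof tep - 1);
        return TESS_EVALUATION_PROGRAM_NV;
    }
    *body = src;  /* unrecognized header: leave the string untouched */
    return 0;
}
```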

    (For tessellation control programs, add the following grammar rules to the
     NV_gpu_program5 base grammar)

    <declSequence>          ::= <declaration> <declSequence>

    <attribUseV>            ::= <attribColor> "." <faceType> <swizzleSuffix>
                              | <attribColor> "." <faceType> "." <colorType> 
                                <swizzleSuffix>

    <resultUseW>            ::= <resultVarName> <arrayMem> <optWriteMask>
                              | <resultColor> <optWriteMask>
                              | <resultColor> "." <colorType> <optWriteMask>
                              | <resultColor> "." <faceType> <optWriteMask>
                              | <resultColor> "." <faceType> "." <colorType>
                                <optWriteMask>

    <resultUseD>            ::= <resultColor> <optFaceColorType>
                              | <resultMulti>

    <optFaceColorType>      ::= <optColorType>
                              | "." <faceType> <optColorType>

    <declaration>           ::= "VERTICES_OUT" <int>

    <attribBasic>           ::= <vtxPrefix> "position"
                              | <vtxPrefix> "fogcoord"
                              | <vtxPrefix> "pointsize"
                              | <vtxPrefix> "id"
                              | <attribTexCoord> <optArrayMemAbs>
                              | <attribClip> <arrayMemAbs>
                              | <attribGeneric> <arrayMemAbs>
                              | <primPrefix> "id"
                              | <primPrefix> "invocation"
                              | <primPrefix> "vertexcount"
                              | <attribTessOuter> <arrayMemAbs>
                              | <attribTessInner> <arrayMemAbs>
                              | <attribPatchGeneric> <arrayMemAbs>

    <attribColor>           ::= <vtxPrefix> "color"

    <attribMulti>           ::= <attribTexCoord> <arrayRange>
                              | <attribClip> <arrayRange>
                              | <attribGeneric> <arrayRange>
                              | <attribTessOuter> <arrayRange>
                              | <attribTessInner> <arrayRange>
                              | <attribPatchGeneric> <arrayRange>

    <attribTexCoord>        ::= <vtxPrefix> "texcoord"

    <attribClip>            ::= <vtxPrefix> "clip"

    <attribGeneric>         ::= <vtxPrefix> "attrib"

    <attribTessOuter>       ::= <primPrefix> "tessouter"

    <attribTessInner>       ::= <primPrefix> "tessinner"

    <attribPatchGeneric>    ::= <primPrefix> "patch" "." "attrib"

    <vtxPrefix>             ::= "vertex" "."
                              | "vertex" <arrayMemAbs> "."
                              | "vertex" "." "in" <optArrayMemAbs> "."
                              | "vertex" "." "out" <optArrayMemAbs> "."

    <primPrefix>            ::= "primitive" "."
                              | "primitive" "." "in" "."
                              | "primitive" "." "out" "."

    <resultBasic>           ::= <resPrefix> "position"
                              | <resPrefix> "fogcoord"
                              | <resPrefix> "pointsize"
                              | <resultTexCoord> <optArrayMemAbs>
                              | <resultClip> <arrayMemAbs>
                              | <resultGeneric> <arrayMemAbs>
                              | <resPrefix> "id"
                              | <resultTessOuter> <arrayMemAbs>
                              | <resultTessInner> <arrayMemAbs>
                              | <resultPatchGeneric> <arrayMemAbs>

    <resultColor>           ::= <resPrefix> "color"

    <resultMulti>           ::= <resultTexCoord> <arrayRange>
                              | <resultClip> <arrayRange>
                              | <resultGeneric> <arrayRange>
                              | <resultTessOuter> <arrayRange>
                              | <resultTessInner> <arrayRange>
                              | <resultPatchGeneric> <arrayRange>

    <resultTexCoord>        ::= <resPrefix> "texcoord"

    <resultClip>            ::= <resPrefix> "clip"

    <resultGeneric>         ::= <resPrefix> "attrib"

    <resultTessOuter>       ::= <resPrefix> "patch" "." "tessouter"

    <resultTessInner>       ::= <resPrefix> "patch" "." "tessinner"

    <resultPatchGeneric>    ::= <resPrefix> "patch" "." "attrib"

    <resPrefix>             ::= "result" "."


    (For tessellation evaluation programs, add the following grammar rules to
     the NV_gpu_program5 base grammar)

    <declSequence>          ::= <declaration> <declSequence>

    <attribUseV>            ::= <attribColor> "." <faceType> <swizzleSuffix>
                              | <attribColor> "." <faceType> "." <colorType> 
                                <swizzleSuffix>

    <resultUseW>            ::= <resultVarName> <arrayMem> <optWriteMask>
                              | <resultColor> <optWriteMask>
                              | <resultColor> "." <colorType> <optWriteMask>
                              | <resultColor> "." <faceType> <optWriteMask>
                              | <resultColor> "." <faceType> "." <colorType>
                                <optWriteMask>

    <resultUseD>            ::= <resultColor> <optFaceColorType>
                              | <resultMulti>

    <optFaceColorType>      ::= <optColorType>
                              | "." <faceType> <optColorType>

    <declaration>           ::= "TESS_MODE" <declTessMode>
                              | "TESS_SPACING" <declTessSpacing>
                              | "TESS_VERTEX_ORDER" <declTessVtxOrder>
                              | "TESS_POINT_MODE"

    <declTessMode>          ::= "TRIANGLES"
                              | "QUADS"
                              | "ISOLINES"

    <declTessSpacing>       ::= "EQUAL"
                              | "FRACTIONAL_ODD"
                              | "FRACTIONAL_EVEN"

    <declTessVtxOrder>      ::= "CW"
                              | "CCW"

    <attribBasic>           ::= <vtxPrefix> "position"
                              | <vtxPrefix> "fogcoord"
                              | <vtxPrefix> "pointsize"
                              | <vtxPrefix> "id"
                              | <attribTexCoord> <optArrayMemAbs>
                              | <attribClip> <arrayMemAbs>
                              | <attribGeneric> <arrayMemAbs>
                              | "vertex" "." "tesscoord"
                              | <primPrefix> "id"
                              | <primPrefix> "vertexcount"
                              | <attribTessOuter> <optArrayMemAbs>
                              | <attribTessInner> <optArrayMemAbs>
                              | <attribPatchGeneric> <optArrayMemAbs>

    <attribColor>           ::= <vtxPrefix> "color"

    <attribMulti>           ::= <attribTexCoord> <arrayRange>
                              | <attribClip> <arrayRange>
                              | <attribGeneric> <arrayRange>
                              | <attribTessOuter> <arrayRange>
                              | <attribTessInner> <arrayRange>
                              | <attribPatchGeneric> <arrayRange>

    <attribTexCoord>        ::= <vtxPrefix> "texcoord"

    <attribClip>            ::= <vtxPrefix> "clip"

    <attribGeneric>         ::= <vtxPrefix> "attrib"

    <attribTessOuter>       ::= <primPrefix> "tessouter"

    <attribTessInner>       ::= <primPrefix> "tessinner"

    <attribPatchGeneric>    ::= <primPrefix> "patch" "." "attrib"

    <vtxPrefix>             ::= "vertex" "."
                              | "vertex" <arrayMemAbs> "."
                              | "vertex" "." "in" <optArrayMemAbs> "."
                              | "vertex" "." "out" <optArrayMemAbs> "."

    <primPrefix>            ::= "primitive" "."
                              | "primitive" "." "in" "."

    <resultBasic>           ::= <resPrefix> "position"
                              | <resPrefix> "fogcoord"
                              | <resPrefix> "pointsize"
                              | <resultTexCoord> <optArrayMemAbs>
                              | <resultClip> <arrayMemAbs>
                              | <resultGeneric> <arrayMemAbs>
                              | <resPrefix> "id"

    <resultColor>           ::= <resPrefix> "color"

    <resultMulti>           ::= <resultTexCoord> <arrayRange>
                              | <resultClip> <arrayRange>
                              | <resultGeneric> <arrayRange>

    <resultTexCoord>        ::= <resPrefix> "texcoord"

    <resultClip>            ::= <resPrefix> "clip"

    <resultGeneric>         ::= <resPrefix> "attrib"

    <resPrefix>             ::= "result" "."
   

    (add the following subsection to section 2.X.3.2 of NV_gpu_program4, 
     Program Attribute Variables)

    Tessellation control and evaluation program attribute variables describe
    inputs accessible to the program.  There are several different classes of
    attribute bindings available, identified by the binding prefix.  The set
    of attribute binding classes and their corresponding prefixes are
    described in Table X.2.  The specific attributes for each class are
    identified by a binding suffix.

      Attribute Binding Prefix   Description
      ------------------------   ------------------------------------------
      vertex[m]                  Vertex <m> of the input patch
      vertex.in[m]               Vertex <m> of the input patch
      vertex                     Array spanning vertices of the input patch
                                   or the specific vertex being evaluated
      vertex.in                  Array spanning vertices of the input patch
                                   or the specific vertex being evaluated
      primitive                  Per-patch value of the input patch
      primitive.in               Per-patch value of the input patch
      vertex.out[m]              Vertex <m> of the output patch
      vertex.out                 Array spanning vertices of the output patch
      primitive.out              Per-patch value of the output patch

      Table X.2, Tessellation Control and Evaluation Program Attribute Binding
      Prefixes.  <m> refers to a constant integer vertex number in the input
      or output patch.

    If an attribute binding prefix matches "vertex[m]" or "vertex.in[m]", the
    attribute binding refers to an attribute of the vertex numbered <m> in the
    input patch.  If <m> is greater than or equal to the number of vertices in
    the input patch, the values corresponding to the binding are undefined.

    If an attribute binding prefix matches "vertex" or "vertex.in" and the
    suffix identifies an attribute of the vertex being processed by a
    tessellation evaluation program (e.g., "tesscoord"), the attribute binding
    refers to that attribute.

    If an attribute binding prefix matches "vertex" or "vertex.in" and the
    suffix identifies any other vertex attribute, the attribute binding refers
    to that specific attribute for each of the vertices of the input patch.
    Bindings of this form may only be used in explicit variable declarations.
    If the variable declaration identifies an array, the program will fail to
    load unless each binding in the binding list uses an attribute prefix of
    this form.  When such variables are used in instructions, they must be
    accessed as an array, with the first array index identifying the vertex
    number.  If such variables are declared as an array, a second array index
    must be provided to identify the specific per-vertex attribute to select.
    If the first array index is negative or greater than or equal to the
    number of vertices in the input patch, the value obtained is undefined.

    If an attribute binding prefix matches "primitive" or "primitive.in", the
    attribute binding refers to an attribute of the input patch.

    If a tessellation control program attribute binding prefix matches
    "vertex.out[m]", the attribute binding refers to an attribute of the
    vertex numbered <m> in the output patch.  These attributes correspond to
    per-vertex output values written by the tessellation control program
    thread numbered <m>.  A program will fail to load if the vertex number <m>
    is greater than or equal to the number of vertices in the output patch.
    Tessellation evaluation programs do not have an output patch and do not
    support this attribute binding prefix.
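
    The different outcomes for an out-of-range constant vertex number can be
    summarized in code.  The sketch below is illustrative only; the
    BindingStatus type and both functions are hypothetical names.  It
    reflects the rules above:  an out-of-range input patch vertex yields
    undefined values, while an out-of-range output patch vertex causes the
    program to fail to load.

```c
typedef enum {
    BINDING_OK,               /* vertex number is in range */
    BINDING_UNDEFINED_VALUE,  /* program loads, but values read are undefined */
    BINDING_LOAD_FAILURE      /* program fails to load */
} BindingStatus;

/* "vertex[m]" or "vertex.in[m]": the input patch size is only known when a
 * patch is processed, so an out-of-range <m> gives undefined values. */
BindingStatus check_input_vertex(unsigned m, unsigned input_patch_vertices)
{
    return m < input_patch_vertices ? BINDING_OK : BINDING_UNDEFINED_VALUE;
}

/* "vertex.out[m]": the output patch size is fixed by VERTICES_OUT, so an
 * out-of-range <m> is rejected when the program is loaded. */
BindingStatus check_output_vertex(unsigned m, unsigned vertices_out)
{
    return m < vertices_out ? BINDING_OK : BINDING_LOAD_FAILURE;
}
```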

    If a tessellation control program attribute binding prefix matches
    "vertex.out", the attribute binding identifies a specific attribute for
    each of the vertices of the output patch.  Bindings of this form may only
    be used in explicit variable declarations, and all the usage rules
    described above for bindings using the prefix "vertex.in" apply.  If the
    vertex number identified when accessing such variables is negative or
    greater than or equal to the number of vertices in the output patch, the
    resulting values are undefined.  Tessellation evaluation programs do not
    have an output patch and do not support this attribute binding prefix.

    If an attribute binding prefix matches "primitive.out", the attribute
    binding refers to a per-patch attribute of the output patch.  These
    attributes correspond to per-patch result values written by one of the
    tessellation control program threads.  Tessellation evaluation programs do
    not have an output patch and do not support this attribute binding prefix.

    The following examples illustrate various legal and illegal program
    bindings and their meanings.

      ATTRIB pos = vertex.position;
      ATTRIB pos2 = vertex.in[2].position;
      ATTRIB outpos = vertex.out.position; 
      ATTRIB outpos2 = vertex.out[2].position; 
      ATTRIB texcoords[] = { vertex.texcoord[0..3] };
      ATTRIB tcoords1[4] = { vertex[1].texcoord[1..4] };
      ATTRIB outattr[2] = { vertex.out.attrib[0..1] };
      INT TEMP A0;
      ...
      MOV R0, pos[1];                   # position of input vertex 1
      MOV R0, vertex[1].position;       # position of input vertex 1
      MOV R0, pos2;                     # position of input vertex 2
      MOV R0, outpos;                   # ILLEGAL - needs a vertex number
      MOV R0, outpos[1];                # position of output vertex 1 (TCP)
      MOV R0, outpos2;                  # position of output vertex 2 (TCP)
      MOV R0, texcoords[A0.x][1];       # texcoord 1 of input vertex A0.x
      MOV R0, texcoords[A0.x][A0.y];    # texcoord A0.y of input vertex A0.x
      MOV R0, tcoords1[2];              # texcoord 3 of input vertex 1
      MOV R0, outattr[A0.x][1];         # generic attr 1 of output vertex 
                                        # A0.x (TCP)
      MOV R0, vertex[A0.x].texcoord[1]; # ILLEGAL -- vertex number must be 
                                        # constant or must use variables like
                                        # "texcoords" using bindings w/o
                                        # vertex numbers
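
    The indexing rules used by "texcoords" and "tcoords1" in the examples
    above can be modeled on the host as follows.  This is a sketch with
    hypothetical names (TexcoordBinding, Resolved, resolve_access) and is
    restricted to texture coordinate ranges.  A binding that names a vertex
    takes one index, which selects within the bound range; a binding without
    a vertex number takes two, with the vertex number first.

```c
/* Hypothetical description of an ATTRIB array bound to a texcoord range. */
typedef struct {
    int      has_vertex; /* nonzero if the binding names a vertex, e.g. vertex[1] */
    unsigned vertex;     /* the fixed vertex number when has_vertex is set */
    unsigned first;      /* first texcoord number in the bound range */
    unsigned count;      /* number of texcoords in the bound range */
} TexcoordBinding;

typedef struct { unsigned vertex; unsigned texcoord; } Resolved;

/* Resolve an array access to (vertex, texcoord).  Returns 0 on success and
 * -1 if the index count does not fit the binding form or the attribute
 * index falls outside the bound range. */
int resolve_access(const TexcoordBinding *b, const unsigned *idx,
                   unsigned nidx, Resolved *out)
{
    if (b->has_vertex) {
        if (nidx != 1 || idx[0] >= b->count)
            return -1;
        out->vertex = b->vertex;
        out->texcoord = b->first + idx[0];
    } else {
        if (nidx != 2 || idx[1] >= b->count)
            return -1;
        out->vertex = idx[0];            /* first index is the vertex number */
        out->texcoord = b->first + idx[1];
    }
    return 0;
}
```

    With "tcoords1" described as { 1, 1, 1, 4 }, the access "tcoords1[2]"
    resolves to texcoord 3 of input vertex 1, matching the comment in the
    example.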

    Attributes from input patch vertices will be obtained from the per-vertex
    outputs of the previous program used to generate the vertex in question.
    For tessellation evaluation programs, that previous program would be the
    tessellation control program, if enabled, or the vertex program otherwise.
    For tessellation control programs, the previous program is always the
    vertex program.  Tessellation control and evaluation program attributes
    should be read using the same component data type used to write the
    corresponding vertex program results.  If input patch vertices are
    specified to come from vertex program outputs but no vertex program is
    enabled, the values are instead produced from fixed-function vertex
    processing.  The value of any attribute corresponding to a vertex output
    not written by the previous program stage is undefined, as are the values
    of all generic attributes if the vertex was produced by fixed-function
    vertex processing.
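
    The rules for locating the producer of input patch vertex attributes can
    be condensed into a small decision function.  The sketch below is
    illustrative; VertexSource, TessStage, and input_patch_source are
    hypothetical names, not part of the GL API.

```c
typedef enum {
    SRC_FIXED_FUNCTION,       /* fixed-function vertex processing */
    SRC_VERTEX_PROGRAM,       /* vertex program outputs */
    SRC_TESS_CONTROL_PROGRAM  /* tessellation control program outputs */
} VertexSource;

typedef enum { STAGE_TESS_CONTROL, STAGE_TESS_EVALUATION } TessStage;

/* Identify which stage produced the input patch vertices read by <stage>. */
VertexSource input_patch_source(TessStage stage, int vp_enabled,
                                int tcp_enabled)
{
    /* An evaluation program reads control program outputs when a control
     * program is enabled. */
    if (stage == STAGE_TESS_EVALUATION && tcp_enabled)
        return SRC_TESS_CONTROL_PROGRAM;
    /* Otherwise the vertices come from vertex processing. */
    return vp_enabled ? SRC_VERTEX_PROGRAM : SRC_FIXED_FUNCTION;
}
```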

    Attributes from output patch vertices are only available in tessellation
    control programs, and will be obtained from the per-vertex outputs of the
    same program.  When executing an instruction, the values of any output
    patch vertex attribute are undefined unless the corresponding program
    output was written by a previously executed instruction.

    Per-patch attributes of the input patch are only available in tessellation
    evaluation and geometry programs.  If a tessellation control program is
    enabled, they will be obtained from the corresponding per-patch outputs of
    the tessellation control program producing the patch, and any attributes
    not written by any thread of the control program are undefined.  If no
    tessellation control program is enabled, the inner and outer tessellation
    levels are taken from the default tessellation levels, and all other
    per-patch attributes are undefined.

    Per-patch attributes of the output patch are available only in
    tessellation control programs and will be obtained from the per-patch
    outputs of the same program.  When executing an instruction, the values of
    any output patch attribute are undefined unless the corresponding program
    output was written by a previously executed instruction.

    The attributes of the vertices of an input or output patch are selected
    by an attribute binding suffix, as identified in Table X.3.  All
    such bindings correspond to one of multiple patch vertices and require a
    vertex number, either in the binding prefix used in the instruction or as
    the first array index when using an explicitly declared attribute variable
    whose bindings have no vertex number.

      Vertex Binding Suffix      Components   Description
      ------------------------   ----------   ----------------------------
      position                    (x,y,z,w)   clip coordinates
      color                       (r,g,b,a)   front primary color
      color.primary               (r,g,b,a)   front primary color
      color.secondary             (r,g,b,a)   front secondary color
      color.front                 (r,g,b,a)   front primary color
      color.front.primary         (r,g,b,a)   front primary color
      color.front.secondary       (r,g,b,a)   front secondary color
      color.back                  (r,g,b,a)   back primary color
      color.back.primary          (r,g,b,a)   back primary color
      color.back.secondary        (r,g,b,a)   back secondary color
      fogcoord                    (f,-,-,-)   fog coordinate
      pointsize                   (s,-,-,-)   point size
      texcoord                    (s,t,r,q)   texture coordinate, unit 0
      texcoord[n]                 (s,t,r,q)   texture coordinate, unit n
      attrib[n]                   (x,y,z,w)   generic interpolant n
      clip[n]                     (d,-,-,-)   clip plane distance
      texcoord[n..o]              (s,t,r,q)   array of texture coordinates
      attrib[n..o]                (x,y,z,w)   array of generic interpolants
      clip[n..o]                  (d,-,-,-)   array of clip distances
      id                          (id,-,-,-)  vertex id

      Table X.3, Tessellation Control and Evaluation Program Per-Patch Vertex
      Attribute Bindings.  <n> and <o> refer to integer constants.

    If an attribute binding suffix matches "position", the "x", "y", "z" and
    "w" components of the attribute variable are filled with the "x", "y",
    "z", and "w" components, respectively, of the transformed position of the
    specified vertex, in clip coordinates.

    If an attribute binding suffix matches any binding in Table X.3 beginning
    with "color", the "x", "y", "z", and "w" components of the attribute
    variable are filled with the "r", "g", "b", and "a" components,
    respectively, of the corresponding color of the specified vertex.
    Bindings containing "front" and "back" refer to the front and back colors,
    respectively.  Bindings containing "primary" and "secondary" refer to
    primary and secondary colors, respectively.  If face or color type is
    omitted in the binding, the binding is treated as though "front" and
    "primary", respectively, were specified.

    If an attribute binding suffix matches "fogcoord", the "x" component of
    the attribute variable is filled with the fog coordinate of the specified
    vertex.  The "y", "z", and "w" components are undefined.

    If an attribute binding suffix matches "pointsize", the "x" component of
    the attribute variable is filled with the point size of the specified
    vertex.  If the vertex was produced by fixed-function vertex processing,
    the point size attribute is undefined.  The "y", "z", and "w" components
    are always undefined.

    If an attribute binding suffix matches "texcoord" or "texcoord[n]", the
    "x", "y", "z", and "w" coordinates of the attribute variable are filled
    with the "s", "t", "r", and "q" coordinates of texture coordinate set <n>
    of the specified vertex.  If <n> is omitted, texture coordinate set zero
    is used.

    If an attribute binding suffix matches "attrib[n]", the "x", "y", "z", and
    "w" components of the attribute variable are filled with the "x", "y",
    "z", and "w" coordinates of generic interpolant <n> of the specified
    vertex.  All generic interpolants will be undefined when the vertex is
    produced by fixed-function vertex processing.

    If an attribute binding suffix matches "clip[n]", the "x" component of the
    attribute variable is filled with the clip distance of the specified
    vertex for clip plane <n>, as written by the vertex program.  If the
    vertex was produced by fixed-function vertex processing or a
    position-invariant vertex program, the clip distance is obtained by
    computing the per-clip plane dot product:

      (p_1' p_2' p_3' p_4') dot (x_e y_e z_e w_e),

    at the vertex location, as described in section 2.12.  The clip distance
    for clip plane <n> is undefined if clip plane <n> is disabled.  The "y",
    "z", and "w" components of the attribute are undefined.

    If an attribute binding suffix matches "texcoord[n..o]", "attrib[n..o]",
    or "clip[n..o]", a sequence of 1+<o>-<n> texture coordinate, generic
    attribute, or clip distance bindings is created.  For texture coordinate
    bindings, it is as though the sequence "vertex[m].texcoord[n],
    vertex[m].texcoord[n+1], ...  vertex[m].texcoord[o]" were specified.  These
    bindings are available only in explicit declarations of array variables.
    A program will fail to load if <n> is greater than <o>.
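
    The expansion of a range binding can be sketched outside the assembly
    language.  The Python fragment below is illustrative only and is not part
    of this specification.  The function name and the use of ValueError to
    stand in for a program load failure are assumptions made for the sketch.

```python
def expand_range_binding(prefix, name, n, o):
    """Expand '<prefix>.<name>[n..o]' into its 1+o-n individual bindings."""
    if n > o:
        # Models "a program will fail to load if <n> is greater than <o>".
        raise ValueError("program fails to load: %d > %d" % (n, o))
    return ["%s.%s[%d]" % (prefix, name, i) for i in range(n, o + 1)]


# Hypothetical usage: the bindings behind "vertex[1].texcoord[1..4]".
bindings = expand_range_binding("vertex[1]", "texcoord", 1, 4)
```

    Here "bindings" holds four entries, matching the 1+<o>-<n> count given
    above.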

    If an attribute binding suffix matches "id", the "x" component is filled
    with the vertex ID of the specified vertex.  If the vertex was generated
    by a previous program, the attribute variable is filled with the vertex ID
    result written by that program.  Otherwise, the vertex ID is undefined.
    The "y", "z", and "w" components of the attribute are undefined.

    Attribute bindings other than those corresponding to individual vertices
    in input and output patches are identified in Table X.4.  All of these items
    except for "vertex.tesscoord" are per-patch attributes, and require one of
    the prefixes beginning with "primitive".

      Primitive Binding Suffix   Components  Description
      ------------------------   ----------  ----------------------------
      id                         (id,-,-,-)  primitive number
      invocation                 (id,-,-,-)  tess. control invocation
      vertexcount                (c,-,-,-)   vertices in primitive
      tessouter[n]               (x,-,-,-)   outer tess. level n
      tessinner[n]               (x,-,-,-)   inner tess. level n
      patch.attrib[n]            (x,y,z,w)   generic patch attribute n
      tessouter[n..o]            (x,-,-,-)   outer tess. levels n to o
      tessinner[n..o]            (x,-,-,-)   inner tess. levels n to o
      patch.attrib[n..o]         (x,y,z,w)   generic patch attrib n to o
      vertex.tesscoord (*)       (u,v,w,-)   tess. coordinate in [0,1]

      Table X.4, Tessellation Control and Evaluation Miscellaneous Attribute
      Bindings.  <n> and <o> refer to integer constants.

    If an attribute binding suffix matches "id", the "x" component is filled
    with the number of primitives received by the GL since the last time Begin
    was called (directly or indirectly via vertex array functions).  The first
    primitive generated after a Begin is numbered zero, and the primitive ID
    counter is incremented after every individual point, line, or polygon
    primitive is processed.  Restarting a primitive topology using the
    primitive restart index has no effect on the primitive ID counter.  The
    "y", "z", and "w" components of the variable are always undefined.  This
    suffix may only be used with the prefixes "primitive", "primitive.in", or
    "primitive.out", and produces the same value in all cases.

    If a tessellation control program attribute binding suffix matches
    "invocation", the "x" component is filled with the thread number of the
    program invocation.  The invocation number identifies the number of the
    vertex in the output patch whose attributes are produced by this
    invocation, and is in the range [0..<n>-1], where <n> is given by the
    VERTICES_OUT declaration.  The "y", "z", and "w" components of the
    variable are always undefined.  This suffix is not available to
    tessellation evaluation programs and may only be used with the prefixes
    "primitive", "primitive.in", or "primitive.out", and produces the same
    value in all cases.

    If an attribute binding suffix matches "vertexcount", the "x" component is
    filled with the number of vertices in the input primitive being processed.
    The "y", "z", and "w" components of the variable are always undefined.
    This suffix is available only with the prefixes "primitive" and
    "primitive.in".

    If an attribute binding suffix matches "tessouter[n]", the "x" component
    is filled with the per-patch outer tessellation level numbered <n> of the
    identified input or output patch.  <n> must be less than four.  The "y",
    "z", and "w" components are always undefined.  This suffix is available
    only with the prefixes "primitive", "primitive.in", and "primitive.out".
    For tessellation control programs, this suffix is available only with
    "primitive.out".

    If an attribute binding suffix matches "tessinner[n]", the "x" component
    is filled with the per-patch inner tessellation level numbered <n> of the
    identified input or output patch.  <n> must be less than two.  The "y",
    "z", and "w" components are always undefined.  This suffix is available
    only with the prefixes "primitive", "primitive.in", and "primitive.out".
    For tessellation control programs, this suffix is available only with
    "primitive.out".

    If an attribute binding suffix matches "patch.attrib[n]", the "x", "y",
    "z", and "w" components are filled with the corresponding components of
    the per-patch generic attribute numbered <n> of the identified input or
    output patch.  This suffix is available only with the prefixes
    "primitive", "primitive.in", and "primitive.out".  For tessellation
    control programs, this suffix is available only with "primitive.out".

    If an attribute binding suffix matches "tessouter[n..o]",
    "tessinner[n..o]", or "patch.attrib[n..o]", a sequence of 1+<o>-<n> outer
    tessellation level, inner tessellation level, or per-patch generic
    attribute bindings is created.  For per-patch generic attribute bindings,
    it is as though the sequence "primitive.patch.attrib[n],
    primitive.patch.attrib[n+1], ...  primitive.patch.attrib[o]" were
    specified.  These bindings are available only in explicit declarations of
    array variables.  A program will fail to load if <n> is greater than <o>.

    If a tessellation evaluation program attribute binding suffix matches
    "vertex.tesscoord", the "x", "y", and "z" components are filled with the
    floating-point (u,v,w) values, respectively, corresponding to the vertex
    being processed by the tessellation evaluation program.  For triangle
    tessellation, the (u,v,w) values are barycentric coordinates that specify
    the location of the vertex relative to the three corners of the subdivided
    triangle.  The (u,v,w) values are in the range [0,1] and sum to one.  For
    quad and isoline tessellation, the (u,v) values are in the range [0,1] and
    specify the relative horizontal and vertical position in the subdivided
    quad.  The third component of the (u,v,w) vector is undefined for quad and
    isoline tessellation.  The "w" component of the variable is always
    undefined.  This suffix is not available to tessellation control programs
    and may only be used with the prefix "vertex".
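
    As an illustration of how an evaluation program might use barycentric
    (u,v,w) values, the Python sketch below weights three corner positions by
    the tessellation coordinate.  It is not part of this specification:  the
    function name is hypothetical, and the pairing of u, v, and w with patch
    vertices 0, 1, and 2 is an assumption, since the evaluation program itself
    decides how the coordinate is used.

```python
def interpolate_triangle(corners, tesscoord):
    """Weight three (x, y, z, w) corner positions by a (u, v, w) tesscoord."""
    u, v, w = tesscoord
    # zip(*corners) groups the three x values, then the three y values, etc.
    return tuple(u * a + v * b + w * c for a, b, c in zip(*corners))


# Hypothetical clip-space corners of a triangle patch.
corners = ((0.0, 0.0, 0.0, 1.0), (4.0, 0.0, 0.0, 1.0), (0.0, 4.0, 0.0, 1.0))
position = interpolate_triangle(corners, (0.5, 0.25, 0.25))
```

    Because the weights sum to one, a coordinate of (1,0,0) reproduces the
    first corner exactly.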


    (add the following subsection to section 2.X.3.5 of NV_gpu_program4,
     Program Results.)

    The attributes of individual output vertices are written by tessellation
    control and evaluation programs.  For tessellation control programs, these
    attributes are those of the output patch vertex corresponding to the
    program invocation.  For tessellation evaluation programs, these
    attributes specify the attributes of the vertex in the tessellated patch
    corresponding to the program invocation.  The set of allowable per-vertex
    result variable bindings is the same for tessellation control and
    evaluation programs.  These bindings correspond to attributes of output
    vertices and are given in Table X.5.

      Binding                        Components  Description
      -----------------------------  ----------  ----------------------------
      result.position                (x,y,z,w)   position in clip coordinates
      result.color                   (r,g,b,a)   front-facing primary color
      result.color.primary           (r,g,b,a)   front-facing primary color
      result.color.secondary         (r,g,b,a)   front-facing secondary color
      result.color.front             (r,g,b,a)   front-facing primary color
      result.color.front.primary     (r,g,b,a)   front-facing primary color
      result.color.front.secondary   (r,g,b,a)   front-facing secondary color
      result.color.back              (r,g,b,a)   back-facing primary color
      result.color.back.primary      (r,g,b,a)   back-facing primary color
      result.color.back.secondary    (r,g,b,a)   back-facing secondary color
      result.fogcoord                (f,*,*,*)   fog coordinate
      result.pointsize               (s,*,*,*)   point size
      result.texcoord                (s,t,r,q)   texture coordinate, unit 0
      result.texcoord[n]             (s,t,r,q)   texture coordinate, unit n
      result.attrib[n]               (x,y,z,w)   generic interpolant n
      result.clip[n]                 (d,*,*,*)   clip plane distance
      result.texcoord[n..o]          (s,t,r,q)   texture coordinates n thru o
      result.attrib[n..o]            (x,y,z,w)   generic interpolants n thru o
      result.clip[n..o]              (d,*,*,*)   clip distances n thru o

      Table X.5:  Tessellation Control and Evaluation Program Per-Vertex 
      Result Variable Bindings.  Components labeled "*" are unused.

    If a result variable binding matches "result.position", updates to the
    "x", "y", "z", and "w" components of the result variable modify the "x",
    "y", "z", and "w" components, respectively, of the transformed vertex's
    clip coordinates.  Final window coordinates of vertices used for
    rasterization will be generated for the vertex as described in section
    2.14.4.4.

    If a result variable binding begins with "result.color", updates to
    the "x", "y", "z", and "w" components of the result variable modify the
    "r", "g", "b", and "a" components, respectively, of the corresponding
    vertex color attribute in Table X.3.  Color bindings that do not specify
    "front" or "back" are considered to refer to front-facing colors.  Color
    bindings that do not specify "primary" or "secondary" are considered to
    refer to primary colors.

    If a result variable binding matches "result.fogcoord", updates to the "x"
    component of the result variable set the transformed vertex's fog
    coordinate.  Updates to the "y", "z", and "w" components of the result
    variable have no effect.

    If a result variable binding matches "result.pointsize", updates to the
    "x" component of the result variable set the transformed vertex's point
    size.  Updates to the "y", "z", and "w" components of the result variable
    have no effect.

    If a result variable binding matches "result.texcoord" or
    "result.texcoord[n]", updates to the "x", "y", "z", and "w" components of
    the result variable set the "s", "t", "r" and "q" components,
    respectively, of the transformed vertex's texture coordinates for texture
    unit <n>.  If "[n]" is omitted, texture unit zero is selected.

    If a result variable binding matches "result.attrib[n]", updates to the
    "x", "y", "z", and "w" components of the result variable set the "x", "y",
    "z", and "w" components of the generic interpolant <n>.

    If a result variable binding matches "result.clip[n]", updates to the "x"
    component of the result variable set the clip distance for clip plane <n>.

    If a result variable binding matches "result.texcoord[n..o]",
    "result.attrib[n..o]", or "result.clip[n..o]", a sequence of 1+<o>-<n>
    bindings is created.  For texture coordinates, it is as though the
    sequence "result.texcoord[n], result.texcoord[n+1],
    ... result.texcoord[o]" were specified.  These bindings are available only
    in explicit declarations of array variables.  A program will fail to load
    if <n> is greater than <o>.

    In addition to per-vertex attribute bindings, a set of per-patch result
    bindings are available to tessellation control programs, as described in
    Table X.6.  These bindings are not available to tessellation evaluation
    programs.

      Binding                        Components  Description
      -----------------------------  ----------  ----------------------------
      result.patch.tessouter[n]      (x,*,*,*)   tessctl outer level n
      result.patch.tessinner[n]      (x,*,*,*)   tessctl inner level n
      result.patch.attrib[n]         (x,y,z,w)   per-patch generic attrib n
      result.patch.tessouter[n..o]   (x,*,*,*)   tessctl outer levels n thru o
      result.patch.tessinner[n..o]   (x,*,*,*)   tessctl inner levels n thru o
      result.patch.attrib[n..o]      (x,y,z,w)   per-patch attribs n thru o

      Table X.6:  Tessellation Control Per-Patch Result Variable Bindings.
      Components labeled "*" are unused.

    If a result variable binding matches "result.patch.tessouter[n]", updates
    to the "x" component set the outer tessellation level numbered <n> for the
    output patch.  Updates to the "y", "z", and "w" components have no effect.

    If a result variable binding matches "result.patch.tessinner[n]", updates
    to the "x" component set the inner tessellation level numbered <n> for the
    output patch.  Updates to the "y", "z", and "w" components have no effect.

    If a result variable binding matches "result.patch.attrib[n]", updates to
    the "x", "y", "z", and "w" components of the result variable set the "x",
    "y", "z", and "w" components of the per-patch generic attribute numbered
    <n> for the output patch.

    If a result variable binding matches "result.patch.tessouter[n..o]",
    "result.patch.tessinner[n..o]", or "result.patch.attrib[n..o]", a sequence
    of 1+<o>-<n> bindings is created.  For per-patch generic attributes, it is
    as though the sequence "result.patch.attrib[n], result.patch.attrib[n+1],
    ...  result.patch.attrib[o]" were specified.  These bindings are available
    only in explicit declarations of array variables.  A program will fail to
    load if <n> is greater than <o>.

    
    Modify Section 2.X.5 of NV_gpu_program4, Program Flow Control

    (modify spec language at the end of the section to account for the
     different flow control model for tessellation control programs)

    Tessellation Control Program Flow Control

    For tessellation control programs, there are multiple program invocations
    for each patch processed that run as a group.  Any given program
    invocation can read per-vertex or per-patch attributes of the output
    patch, which may be computed during the execution of the program and may
    be computed by a different program invocation.  To provide defined
    behavior for such accesses, we specify that all threads for each patch run
    as a group.  When executing any block of instructions, all active threads
    will complete the execution of one instruction before starting the
    execution of the subsequent instruction.  Flow control instructions may
    cause the flow of threads in a group to diverge and will modify the set of
    active threads.  The handling of flow control instructions is described in
    more detail below.

    A tessellation control program is handled by executing all instructions in
    a block of instructions corresponding to the main subroutine, with all
    threads initially active.  This block consists of all instructions between
    the "main" label and the next subroutine label.  If no "main" label is
    present, the block starts with the first instruction in the program.  If
    there is no subroutine label following the beginning of the block, the
    block ends at the END instruction.  Instructions in the block are executed
    in order until all threads reach a termination condition.  A thread will
    terminate:

      * if it executes a RET anywhere within the main subroutine, unless the
        RET instruction is conditional and the condition code test fails; or

      * if it completes the execution of all instructions in the subroutine
        block.

    When an individual thread terminates processing of the main subroutine,
    the thread will become inactive and remain inactive for the remainder of
    program execution.  When all threads have terminated the main subroutine
    block, program execution is complete and the output patch is passed to
    subsequent pipeline stages.

    When a CAL instruction is executed, the current set of active threads will
    execute a block of instructions corresponding to the specified subroutine
    label.  This block consists of all instructions between the specified
    label and the next subroutine label.  If there is no subroutine label
    following the beginning of the block, the block ends at the END
    instruction.  Instructions in the block are executed in order until all
    active threads reach a termination condition.  A thread will complete
    execution of a subroutine block:

      * if the CAL instruction is conditional and the condition code test
        fails;

      * if it executes a RET anywhere within the subroutine block, unless the
        RET instruction is conditional and the condition code test fails; or

      * if it completes the execution of all instructions in the subroutine
        block.

    When an individual thread terminates processing of a called subroutine,
    the thread will become inactive and remain inactive until all threads have
    reached their termination condition.  When all threads have terminated the
    subroutine, execution continues at the instruction following the CAL
    instruction.  All threads active for the initial CAL instruction become
    active again; all other threads will remain inactive.
    
    When a REP instruction is executed, the current set of active threads will
    repeatedly execute the instructions between the REP and corresponding
    ENDREP instruction in order.  Execution of this instruction loop will
    continue until all threads active when the REP instruction is executed
    reach a termination condition.  A thread will terminate the processing of
    a REP/ENDREP block:

      * if the REP instruction specifies a loop count, and the initial loop
        count is not positive;

      * if the REP instruction specifies a loop count, and the current value
        of the loop count for the thread reaches zero when decremented by an
        ENDREP instruction;

      * if a RET instruction is executed anywhere within the REP/ENDREP block,
        unless the RET instruction is conditional and the condition code test
        fails; or

      * if a BRK instruction is executed inside the REP/ENDREP block, unless
        the BRK instruction is contained inside a more-deeply nested
        REP/ENDREP block or the BRK instruction is conditional and the
        condition code test fails.

    When an individual thread terminates processing of a REP/ENDREP loop, the
    thread will become inactive and remain inactive until all threads have
    terminated the loop.  When all threads have terminated the loop, execution
    continues at the instruction following the ENDREP instruction.  All
    threads active for the initial REP instruction become active again, unless
    they executed a RET instruction inside the REP/ENDREP block.  All other
    threads will be inactive.
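
    The loop-count termination rules can be modeled with a short simulation.
    The Python sketch below is illustrative only and is not part of this
    specification.  It covers only REP instructions with a loop count and
    ignores RET, BRK, and CONT; the function name and the dictionary
    representation of per-thread loop counts are assumptions.

```python
def run_rep_loop(loop_counts):
    """Return the sorted list of active threads for each loop iteration.

    loop_counts maps a thread number to its initial REP loop count.
    """
    # A thread whose initial count is not positive terminates immediately.
    remaining = {t: c for t, c in loop_counts.items() if c > 0}
    trace = []
    while remaining:
        trace.append(sorted(remaining))
        # ENDREP decrements each count; a thread reaching zero goes inactive.
        remaining = {t: c - 1 for t, c in remaining.items() if c - 1 > 0}
    return trace


# Hypothetical patch with three threads and differing loop counts.
trace = run_rep_loop({0: 2, 1: 0, 2: 3})
```

    In this model thread 1 never executes the loop body, thread 0 drops out
    after two iterations, and thread 2 runs a third iteration alone.  All
    three threads would become active again after the ENDREP.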

    If a conditional CONT instruction is executed inside a REP/ENDREP block,
    all active threads passing the condition code test will become inactive
    and remain inactive until the next ENDREP instruction.  If all active
    threads become inactive following the completion of a CONT instruction,
    processing continues at the next ENDIF or ENDREP instruction.  An
    unconditional CONT instruction is treated identically to a conditional
    CONT instruction where all active threads pass the condition code test.

    When an IF instruction belonging to an IF/ELSE/ENDIF block is executed,
    the current set of active threads is split into two groups.  The first
    group consists of all active threads passing the condition code test, and
    will execute a block of instructions between the IF and ELSE.  The second
    group consists of all active threads failing the condition code test, and
    will execute a block of instructions between the ELSE and ENDIF.
    Instructions within each group are executed in lock-step order.  However,
    the order of execution of instructions for threads in the first group is
    undefined relative to those in the second group.

    When executing a block of instructions for either of the two groups in an
    IF/ELSE/ENDIF block, instructions within the block will be executed in
    order with only the threads in that group active.  The instructions of the
    block are executed until all threads in the group reach a block
    termination condition.  A thread will terminate the processing of its
    block:

      * if it executes a RET instruction, unless the RET instruction is
        conditional and the condition code test fails;

      * if it executes a BRK or CONT instruction inside the IF/ENDIF block,
        unless that instruction is contained in a more-deeply nested
        REP/ENDREP block or if the instruction is conditional and the
        condition code test fails; or

      * if it completes the execution of all instructions in the instruction
        block.

    When both groups have completed their instruction blocks, execution
    continues at the instruction following the ENDIF.  No instruction
    following the ENDIF will be executed until both groups have completed.  At
    that point, any thread active for the IF instruction will become active
    again unless the execution of its instruction block was terminated due to
    the execution of a RET, BRK, or CONT instruction.  All other threads will
    be inactive.

    An IF instruction belonging to an IF/ENDIF block (with no corresponding
    ELSE) is handled as above, except that only one thread group is created.
    That group consists of all active threads passing the condition code
    test, and it executes a block of instructions between the IF and ENDIF.
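
    The division of threads at an IF instruction, and their reactivation at
    the ENDIF, can be sketched as follows.  This Python fragment is
    illustrative only and is not part of this specification; both function
    names and the data representation are assumptions.

```python
def split_if_groups(active, passes_test):
    """Split the active threads at an IF into (IF block, ELSE block) groups."""
    then_group = [t for t in active if passes_test[t]]
    else_group = [t for t in active if not passes_test[t]]
    return then_group, else_group


def active_after_endif(active_at_if, terminated):
    """Threads active again after ENDIF.

    'terminated' holds threads whose block ended due to RET, BRK, or CONT.
    """
    return [t for t in active_at_if if t not in terminated]


# Hypothetical state: threads 0, 1, and 3 are active; thread 1 fails the test.
then_group, else_group = split_if_groups([0, 1, 3], {0: True, 1: False, 3: True})
resumed = active_after_endif([0, 1, 3], {3})
```

    For an IF/ENDIF block with no ELSE, only the first group executes any
    instructions; threads in the second group simply wait for the ENDIF.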

    The order of execution imposed by this flow control model typically
    produces defined results when a tessellation control program writes an
    output patch attribute, and then reads it (possibly on a different thread)
    for further computation.  There are two cases where undefined instruction
    execution order will lead to undefined attribute values.  When two or more
    threads access an attribute in a single executed instruction:

      * the value of the attribute after the instruction completes will be
        undefined if multiple threads write different values; and

      * the value of the attribute read by one thread will be undefined if the
        same attribute is written by another thread executing the same
        instruction.

    Also, when an IF/ELSE/ENDIF block is executed and a thread from each of
    the two thread groups accesses an attribute within its block:

      * the value of the attribute after the completion of the block will be
        undefined if both threads write different values; and

      * the value of the attribute read by one thread will be undefined if the
        same attribute is written by the other thread.

    If either thread group in an IF/ELSE/ENDIF block issues CAL instructions,
    these restrictions also apply to the instructions executed in the called
    subroutine.
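
    The two single-instruction cases above can be expressed as a small check
    over the accesses made by all threads executing one instruction.  The
    Python sketch below is illustrative only and is not part of this
    specification; the function name and the tuple layout are assumptions.

```python
def find_hazards(accesses):
    """Classify accesses made by all threads for one executed instruction.

    accesses is a list of (thread, kind, attribute, value) tuples, where kind
    is "read" or "write" and value is ignored for reads.
    Returns (undefined_final, undefined_reads).
    """
    writes, reads = {}, {}
    for thread, kind, attr, value in accesses:
        if kind == "write":
            writes.setdefault(attr, {})[thread] = value
        else:
            reads.setdefault(attr, set()).add(thread)
    # Final value is undefined if multiple threads wrote different values.
    undefined_final = {a for a, w in writes.items() if len(set(w.values())) > 1}
    # A read is undefined if a different thread wrote the same attribute.
    undefined_reads = {(r, a) for a, w in writes.items()
                       for r in reads.get(a, ()) if any(t != r for t in w)}
    return undefined_final, undefined_reads


# Hypothetical accesses from four threads executing the same instruction.
final, stale = find_hazards([
    (0, "write", "attrib[0]", 1.0),
    (1, "write", "attrib[0]", 2.0),
    (2, "read", "attrib[1]", None),
    (3, "write", "attrib[1]", 5.0),
])
```

    The same test could be applied across the two thread groups of an
    IF/ELSE/ENDIF block by collecting the accesses of each whole block rather
    than of a single instruction.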

    The additional complexities of this tessellation control program flow
    control model are not fundamentally incompatible with the simpler flow
    control rules above.  They are simply intended to provide a useful model
    allowing for multiple cooperating threads.  In particular, the two models
    are completely equivalent if the number of tessellation control program
    threads per patch is one.


    (add the following subsections to section 2.X.6 of NV_gpu_program4,
     Program Options.)

    Section 2.X.6.Y, Tessellation Control Program Options

    No options are supported at present for tessellation control programs.


    Section 2.X.6.Y, Tessellation Evaluation Program Options

    No options are supported at present for tessellation evaluation programs.


    (add the following subsections to section 2.X.7 of NV_gpu_program4,
     Program Declarations.)

    Section 2.X.7.Y, Tessellation Control Program Declarations

    Tessellation control programs support one type of declaration statement,
    as described below.

    - Output Vertex Count (VERTICES_OUT)

    The VERTICES_OUT statement declares the number of vertices in the output
    patch produced by the tessellation control program, which also specifies
    the number of program invocations for each input patch.  The single
    argument must be a positive integer less than or equal to the value of the
    implementation-dependent limit MAX_PATCH_VERTICES_NV.  Each program
    invocation will have the same inputs except for the built-in input
    variable "primitive.invocation".  This variable will be an integer between
    0 and <n>-1, where <n> is the declared number of invocations.  A program
    will fail to load unless it contains exactly one VERTICES_OUT declaration.
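
    The constraints on VERTICES_OUT can be summarized in a short validation
    routine.  The Python sketch below is illustrative only and is not part of
    this specification.  The function name is hypothetical, ValueError stands
    in for a program load failure, and the limit of 32 used in the example is
    an assumed value for the implementation-dependent MAX_PATCH_VERTICES_NV.

```python
def invocation_ids(vertices_out_decls, max_patch_vertices):
    """Return the values of "primitive.invocation" seen across invocations.

    vertices_out_decls lists the argument of each VERTICES_OUT declaration
    found in the program.
    """
    if len(vertices_out_decls) != 1:
        raise ValueError("exactly one VERTICES_OUT declaration is required")
    n = vertices_out_decls[0]
    if not 1 <= n <= max_patch_vertices:
        raise ValueError("VERTICES_OUT argument out of range: %d" % n)
    # One invocation per output patch vertex, numbered 0 through n-1.
    return list(range(n))


# Hypothetical program declaring "VERTICES_OUT 4" with an assumed limit of 32.
ids = invocation_ids([4], 32)
```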


    Section 2.X.7.Y, Tessellation Evaluation Program Declarations

    Tessellation evaluation programs support several declaration statements.
    Each of these may be included at most once in a tessellation evaluation
    program.

    - Tessellation Primitive Generation Mode (TESS_MODE)

    The TESS_MODE statement declares the type of subdivision performed by the
    tessellation primitive generator when the tessellation evaluation program
    is active, as described for the TESS_GEN_MODE_NV parameter in Section
    2.X.2.  The single argument must be "TRIANGLES", "QUADS", or "ISOLINES".
    A tessellation evaluation program will fail to load if it has no
    primitive generation mode declaration.

    - Tessellation Primitive Spacing (TESS_SPACING)

    The TESS_SPACING statement declares the type of spacing the tessellation
    primitive generator applies when subdividing primitive edges, as described
    for the TESS_GEN_SPACING_NV parameter in Section 2.X.2.  The single
    argument must be "EQUAL", "FRACTIONAL_ODD", or "FRACTIONAL_EVEN".  If a
    program omits a spacing declaration, "EQUAL" will be used.

    - Tessellation Vertex Order (TESS_VERTEX_ORDER)

    The TESS_VERTEX_ORDER statement declares the order of the vertices in the
    triangles emitted by the tessellation primitive generator in TRIANGLES or
    QUADS mode, as described for the TESS_GEN_VERTEX_ORDER_NV parameter in
    Section 2.X.2.  The single argument must be "CW" or "CCW".  If a program
    omits a vertex order declaration, "CCW" will be used.

    - Tessellation Point Mode (TESS_POINT_MODE)

    The TESS_POINT_MODE statement declares that the tessellation primitive
    generator will emit points for each vertex in the subdivided primitive
    instead of lines or triangles, as described for the TESS_GEN_POINT_MODE_NV
    parameter in Section 2.X.2.  The declaration takes no arguments.  If a
    program omits a point mode declaration, the primitives emitted will be
    lines (for ISOLINES mode) or triangles (for TRIANGLES and QUADS mode).
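
    The defaults and load-time requirements described above can be summarized
    in a short Python sketch.  It is illustrative only:  the function
    "resolve_tess_declarations" and its pre-parsed input format are
    hypothetical, and it assumes each statement may appear at most once.

```python
ALLOWED_ARGS = {
    "TESS_MODE": {"TRIANGLES", "QUADS", "ISOLINES"},
    "TESS_SPACING": {"EQUAL", "FRACTIONAL_ODD", "FRACTIONAL_EVEN"},
    "TESS_VERTEX_ORDER": {"CW", "CCW"},
    "TESS_POINT_MODE": {None},  # takes no argument
}

def resolve_tess_declarations(declarations):
    """declarations: list of (keyword, argument) pairs in program order;
    argument is None for statements that take no argument."""
    seen = {}
    for keyword, arg in declarations:
        if keyword not in ALLOWED_ARGS:
            raise ValueError("unknown declaration: " + keyword)
        if keyword in seen:
            raise ValueError(keyword + " declared more than once")
        if arg not in ALLOWED_ARGS[keyword]:
            raise ValueError("bad argument for " + keyword)
        seen[keyword] = arg
    if "TESS_MODE" not in seen:
        raise ValueError("TESS_MODE is required")
    point_mode = "TESS_POINT_MODE" in seen
    if point_mode:
        output = "POINTS"
    elif seen["TESS_MODE"] == "ISOLINES":
        output = "LINES"
    else:  # TRIANGLES and QUADS modes both emit triangles
        output = "TRIANGLES"
    return {
        "mode": seen["TESS_MODE"],
        "spacing": seen.get("TESS_SPACING", "EQUAL"),
        "vertex_order": seen.get("TESS_VERTEX_ORDER", "CCW"),
        "point_mode": point_mode,
        "output_primitive": output,
    }
```

    A program declaring only TESS_MODE QUADS resolves to EQUAL spacing, CCW
    vertex order, and triangle output; a program with no TESS_MODE statement
    is rejected.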


Additions to Chapter 3 of the OpenGL 1.5 Specification (Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 1.5 Specification (Per-Fragment
Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 1.5 Specification (Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 1.5 Specification (State and
State Requests)

    None.

Additions to Appendix A of the OpenGL 1.5 Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Errors

    The error INVALID_OPERATION is generated if Begin, or any command that
    implicitly calls Begin, is called when tessellation control programs are
    enabled and the currently bound tessellation control program object does
    not contain a valid program.

    The error INVALID_OPERATION is generated if Begin, or any command that
    implicitly calls Begin, is called when tessellation evaluation programs
    are enabled and the currently bound tessellation evaluation program object
    does not contain a valid program.

    The error INVALID_OPERATION is generated if Begin, or any command that
    implicitly calls Begin, is called when tessellation control programs are
    enabled and <mode> is not PATCHES_NV.

    The error INVALID_OPERATION is generated if Begin, or any command that
    implicitly calls Begin, is called when tessellation evaluation programs
    are enabled and <mode> is not PATCHES_NV.
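
    The four conditions above can be expressed as a single predicate.  The
    Python sketch below is illustrative only; the function "begin_error" and
    its boolean parameters are hypothetical stand-ins for GL state, and Begin
    errors defined elsewhere are not modeled.

```python
def begin_error(mode, tcp_enabled, tcp_valid, tep_enabled, tep_valid):
    """Return "INVALID_OPERATION" if Begin would fail one of the four
    checks listed above, otherwise None."""
    # An enabled program type must have a valid program bound.
    if tcp_enabled and not tcp_valid:
        return "INVALID_OPERATION"
    if tep_enabled and not tep_valid:
        return "INVALID_OPERATION"
    # With either program type enabled, only patches may be drawn.
    if (tcp_enabled or tep_enabled) and mode != "PATCHES_NV":
        return "INVALID_OPERATION"
    return None
```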

New State

    (Modify ARB_vertex_program, Table X.6 -- Program State)

                                                     Initial
    Get Value                  Type    Get Command    Value  Description              Sec.    Attribute
    -------------------------  ----    -----------   ------- ------------------------ ------  ---------
    TESS_CONTROL_PROGRAM_NV     B      IsEnabled      FALSE  Tessellation control     2.14.6  enable
                                                             program enable
    TESS_EVALUATION_PROGRAM_NV  B      IsEnabled      FALSE  Tess. evaluation         2.14.6  enable
                                                             program enable

    TESS_CONTROL_PROGRAM_       Z+     GetIntegerv      0    Active tess control      2.14.1  -
      PARAMETER_BUFFER_NV                                    program buffer object
                                                             binding
    TESS_CONTROL_PROGRAM_       nxZ+   GetInteger-      0    Buffer objects bound for 2.14.1  -
      PARAMETER_BUFFER_NV              IndexedvEXT           tess. control program use

    TESS_EVALUATION_PROGRAM_    Z+     GetIntegerv      0    Active tess evaluation   2.14.1  -
      PARAMETER_BUFFER_NV                                    program buffer object
                                                             binding
    TESS_EVALUATION_PROGRAM_    nxZ+   GetInteger-      0    Buffer objects bound for 2.14.1  -
      PARAMETER_BUFFER_NV              IndexedvEXT           tess. eval. program use


    Additionally, some tessellation-related state applicable to this extension
    is added by ARB_tessellation_shader.

New Implementation Dependent State

                                                             Minimum
    Get Value                         Type  Get Command       Value   Description             Sec.     Attrib
    --------------------------------  ----  ---------------  -------  ----------------------- -------- ------
    MAX_PROGRAM_PATCH_ATTRIBS_NV       Z+   GetProgramivARB     30    number of generic patch 2.X.3.2    -
                                                                      attribute vectors
                                                                      supported

    Additionally, some tessellation-related state applicable to this extension
    is added by ARB_tessellation_shader.


Dependencies on ARB_tessellation_shader

    This spec incorporates the text of ARB_tessellation_shader in its
    entirety.  If ARB_tessellation_shader is not supported, language
    documenting GLSL tessellation control and evaluation shaders should be
    removed; tessellation would be available only using the assembly
    interface.  Language describing the operation of patch primitives and the
    tessellation primitive generator would be retained.

Dependencies on NV_parameter_buffer_object

    The NV_parameter_buffer_object (PaBO) extension provides the ability to
    bind buffer objects to be read by vertex, geometry, and fragment programs.

    If NV_parameter_buffer_object is supported, this extension adds the
    ability to bind buffer objects to be accessed by tessellation control and
    evaluation programs.  The NV_parameter_buffer_object specification should
    be modified to accept the enums TESS_CONTROL_PROGRAM_PARAMETER_BUFFER_NV
    and TESS_EVALUATION_PROGRAM_PARAMETER_BUFFER_NV wherever the three
    previously defined enums (for vertex, geometry, and fragment programs) are
    accepted.

    If NV_parameter_buffer_object is not supported, references to the two new
    buffer object binding points should be removed.

Issues

    (1) How does tessellation fit into the existing GL pipeline?

      RESOLVED:  See issue (1) in the ARB_tessellation_shader specification,
      which contains beautifully crafted ASCII art depicting the pipeline.

    (2) What other considerations were involved in the design of the
        tessellation API?

      RESOLVED:  Go look at the detailed issues section of the GLSL-based
      ARB_tessellation_shader specification.  There are a good number of
      issues that apply equally to the assembly APIs that won't be duplicated
      here.

    (3) Should the tessellation-related parameters (e.g., the primitive
        decomposition, spacing, vertex orientation) be context state or
        provided with the program?  If the latter, how should they be
        provided?

      RESOLVED:  We are providing declaration statements to specify each of
      these parameters in the tessellation evaluation program.  Because they
      are part of the program text, they can't be changed independently of the
      program.  We don't think that limitation is serious, and the same
      limitation applies to GLSL shaders (you need to re-link when changing
      these parameters).  

      Putting these declarations in the shader means that it wasn't necessary
      to create a new "tessellation parameter" API to set this state.  Such an
      API would only apply to assembly programs and could be a source of
      confusion if developers thought it might apply to GLSL shaders as well.

    (4) The programming model for tessellation control programs supports
        multiple threads, each providing attributes for a single vertex.  But
        it also supports the ability to read the per-vertex outputs written by
        other threads and to read and write shared per-patch attribute
        outputs.  The latter capabilities require some sort of synchronization
        to ensure consistently ordered reads and writes whenever possible.
        How should this be handled?

      RESOLVED:  We will expose a programming model where we run groups of <N>
      parallel threads in lock-step.  In this model, all <N> threads
      effectively retire one instruction before starting the next.  This
      execution model provides a simple abstraction, and provides an obvious
      instruction order allowing an application to avoid most read-write and
      write-write hazards.

      There are three places where we have explicitly undefined behavior:

        * If flow control diverges in an IF/ELSE/ENDIF block, the relative
          order of writes in the "IF" side of the block and those in the
          "ELSE" side of the block is undefined.

        * If multiple threads write different values to the same per-patch
          attribute in the same instruction, the order in which the writes
          land is undefined.

        * If any single instruction has one thread reading a per-vertex output
          or a per-patch attribute and another thread writing the same output,
          the order in which the reads and writes land is undefined.

      Implementations need not actually run the threads in this manner, as
      long as the compiler properly synchronizes threads at the points where
      execution order dependencies do occur.  Since the NV_gpu_program4
      programming model uses structured branching (e.g., IF/ELSE/ENDIF
      blocks), the points at which threads may diverge and converge again are
      easily identified.  We expect that the number of such synchronization
      points will be low for most tessellation control programs.
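
      The second and third undefined cases above (conflicts within a single
      instruction) can be modeled as follows.  This Python sketch is
      illustrative only and does not describe how any implementation detects
      or handles hazards; "find_hazards" and the way an instruction is
      represented are hypothetical.  Divergent IF/ELSE blocks are not modeled.

```python
def find_hazards(instruction, num_threads):
    """Report conflicts among threads executing one instruction in lock-step.

    instruction(tid) returns (reads, writes): a set of locations read and a
    dict mapping each location written to the value written by thread tid.
    """
    accesses = [instruction(t) for t in range(num_threads)]
    hazards = set()
    for a in range(num_threads):
        _, writes_a = accesses[a]
        for b in range(num_threads):
            if a == b:
                continue
            reads_b, writes_b = accesses[b]
            for loc, value in writes_a.items():
                # Two threads writing different values to one location.
                if loc in writes_b and writes_b[loc] != value:
                    hazards.add(("write-write", loc))
                # One thread reading what another thread writes.
                if loc in reads_b:
                    hazards.add(("read-write", loc))
    return hazards
```

      An instruction in which each thread writes only its own per-vertex
      output produces no hazards.  An instruction in which every thread writes
      its own thread index to the same per-patch attribute is reported as a
      write-write conflict, while all threads writing the same value is not.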

      One other approach considered is to limit the flow control model and the
      capabilities of the system to result in a minimal number of required
      synchronization points.  For example, the tessellation control program
      might be split into phases where the capabilities of each thread to
      access outputs would be limited.  One might have a three-phase model
      like the following:

                      Per-Vertex Outputs           Per-Patch Outputs
         Phase     can read?    can write?      can read?    can write?
         -----     ---------    ----------      ---------    ----------
           1          NO           YES             NO           NO
           2          YES          NO              NO           YES(a)
           3          YES          NO              YES(a)       YES(b)

      In this model, there would be two explicit synchronization points, one
      between each pair of adjacent phases.  The limits on access prevent most
      cases where conflicts could occur (e.g., you can't read any per-vertex
      outputs until you're completely done writing all of them).  To further
      limit conflicts, per-patch attributes might be divided into two sets:
      set (a) can be written only in phase 2 and read only in phase 3, and set
      (b) can be written only in phase 3.
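
      The access rules of this alternative (which was not adopted) can be
      written down as a lookup table.  The Python sketch below is illustrative
      only; the names "PHASE_ACCESS" and "access_allowed" and the target
      labels are hypothetical.

```python
# (target, operation) pairs permitted in each phase of the three-phase model.
PHASE_ACCESS = {
    1: {("vertex", "write")},
    2: {("vertex", "read"), ("patch_a", "write")},
    3: {("vertex", "read"), ("patch_a", "read"), ("patch_b", "write")},
}

def access_allowed(phase, target, operation):
    """True if a thread in the given phase may perform the operation
    ("read" or "write") on "vertex", "patch_a", or "patch_b" outputs."""
    return (target, operation) in PHASE_ACCESS[phase]
```

      For instance, access_allowed(1, "vertex", "read") is False, since no
      per-vertex output may be read until phase 1 has completed.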

      We decided to expose a general model on the grounds that having the
      compiler automatically determine possible synchronization points was
      easy enough.  Optimizing compilers that reorder instructions already
      have to deal with this exact type of issue -- they can't move
      instructions that write a variable past subsequent instructions that
      read it.

      The programming model adopted for GLSL in ARB_tessellation_shader
      similarly has a set of parallel threads running one executable, but it
      provides a barrier() call that serves as a synchronization point and can
      be used to split shader execution into phases.

      Note that while all previous OpenGL programmability extensions exposed a
      model of completely independent threads (i.e., one thread can't read the
      outputs of another), threads weren't always completely independent!  In
      fragment programs/shaders, some texture built-ins and all partial
      derivative built-ins (dFdx, dFdy in GLSL) require screen-space
      derivatives.  If the quantity used for derivatives is computed by the
      shader, OpenGL implementations generally run threads in groups arranged
      by screen-space location and approximate derivatives by computing
      differences of the inputs between threads.  This approach requires the
      same sort of automatic synchronization between threads, since
      derivatives implicitly read values computed by other threads.
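
      The differencing described above can be sketched as follows.  This
      Python example is illustrative only and assumes, hypothetically, that
      threads are grouped into 2x2 pixel quads; actual group sizes and
      differencing schemes are implementation details not specified here.

```python
def quad_derivatives(quad):
    """Approximate screen-space derivatives within one 2x2 thread group.

    quad[y][x] is the value computed by the thread at pixel (x, y).
    Returns (ddx, ddy): per-row horizontal and per-column vertical
    differences, each formed from values held by two different threads.
    """
    ddx = [quad[y][1] - quad[y][0] for y in range(2)]
    ddy = [quad[1][x] - quad[0][x] for x in range(2)]
    return ddx, ddy
```

      Every difference combines values from two threads, so all four threads
      must have computed the quantity before any derivative can be formed.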


Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------
     3    12/19/11  pbrown    Clarify that "primitive.tessouter[n]",
                              "primitive.tessinner[n]", and "primitive.
                              patch.attrib[n]" are not available on the input
                              patch for tessellation control programs.  Remove
                              stray language referring to a non-existent
                              vector tessellation level.

     2    03/22/10  pbrown    Rename references to ARB_tessellation_shader
                              (formerly EXT).  Minor other cleanups, including
                              the issues section.

     1              pbrown    Internal revisions.

