    --------------------------------------------------------------------------
                               GC FIFO Details.
    --------------------------------------------------------------------------

    This document was written after a hard reverse-engineering pass over the
    GX fifo module. It was not easy to handle the whole thing, but I made
    it :) I hope it will be useful for everyone who is stuck on the GX fifo.

    Send your comments or questions to kvzorganic@mail.ru.

    --------------------------------------------------------------------------

    This is how I understand the GC graphics fifo logic.

    There are two fifos: CPU (PI) and GP (CP). Both are implemented as
    hardware circular buffers, and each fifo has its own register set.

    CP registers control the GP fifo flow, and PI registers control the
    CPU fifo.
    
    Fifos can run in two modes: immediate and multibuffer. In multibuffer
    mode the CPU fifo is used together with the GP fifo to imitate N64
    double-buffered display lists, which makes porting N64 games easier.

    Immediate fifo mode is the same as "linked" fifo mode.

    Details of GP Fifo
    ------------------

    Important detail : the GP command processor (which is inside the ArtX GP
    logic, not the 'CP' interface) always reads data in 32-byte chunks. I'm
    pretty sure it doesn't matter when a 32-byte "packet" boundary tears some
    item in the middle, for example a 'float' vertex X-position. The GP
    stalls all command execution until it gets the full data set for the
    current primitive or its parameters. I don't think it has a cache for
    big primitives like triangle strips; it just stalls until it has the
    whole current vertex to render. But I'm sure emulators can collect a
    whole single primitive in an accumulation buffer and draw it all at
    once - it will be much faster :)
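
    The accumulation idea above could look roughly like this on the emulator
    side. This is only a sketch under assumptions: prim_accum, prim_begin,
    prim_feed, prim_complete and MAX_PRIM_BYTES are made-up names, and the
    vertex size and vertex count are assumed to be already known from the
    draw command. Data torn across a 32-byte chunk boundary is simply carried
    over to the next call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIFO_CHUNK      32u     /* GP reads fifo data in 32-byte chunks      */
#define MAX_PRIM_BYTES  4096u   /* hypothetical accumulation buffer size     */

/* Hypothetical emulator-side accumulator for one primitive. */
typedef struct {
    uint8_t data[MAX_PRIM_BYTES];
    size_t  used;               /* bytes collected so far                    */
    size_t  needed;             /* vertex_size * vertex_count                */
} prim_accum;

/* Start collecting a primitive of known size. */
static void prim_begin(prim_accum *a, size_t vertex_size, size_t vertex_count)
{
    assert(a != NULL);
    assert(vertex_size * vertex_count <= MAX_PRIM_BYTES);
    a->used = 0;
    a->needed = vertex_size * vertex_count;
}

/* Copy bytes from one fifo chunk into the accumulator.
   Returns how many bytes were consumed from the chunk. */
static size_t prim_feed(prim_accum *a, const uint8_t *chunk, size_t len)
{
    size_t want, take;

    assert(a != NULL && chunk != NULL);
    want = a->needed - a->used;
    take = (len < want) ? len : want;
    memcpy(a->data + a->used, chunk, take);
    a->used += take;
    return take;
}

/* Non-zero once every byte of the primitive has arrived. */
static int prim_complete(const prim_accum *a)
{
    assert(a != NULL);
    return a->used == a->needed;
}
```

    The renderer would be called only after prim_complete() returns non-zero,
    so a primitive is drawn all at once instead of vertex by vertex.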

    The 'read idle' and 'command idle' bits show the GP state. When the GP
    has no new data to read, 'read idle' is set. When the GP command
    processor is not busy drawing a primitive or executing a command,
    'command idle' is set. When both bits are set, the GP is free and ready
    for action.
    Note : 'read idle' can also be set in a breakpoint condition, when the GP
    stalls until the breakpoint is cleared.
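
    Going by the bit definitions above (each bit is set when its unit has
    nothing to do), an idle test could be sketched like this. The bit values
    are the ones from the appendix; gp_is_idle is a made-up helper name, and
    the pointer may refer either to an emulated register or to the mapped
    CP status register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CP_SR_RD_IDLE   (1 << 2)    /* read idle    */
#define CP_SR_CMD_IDLE  (1 << 3)    /* command idle */

/* Non-zero when the GP has nothing to read and nothing to execute. */
static int gp_is_idle(const volatile uint16_t *cp_sr)
{
    const uint16_t mask = CP_SR_RD_IDLE | CP_SR_CMD_IDLE;

    assert(cp_sr != NULL);
    return (*cp_sr & mask) == mask;
}
```

    Remember that 'read idle' alone is not enough: it can also be set while
    the GP is stalled on a breakpoint.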

    If the 'GP read' bit is enabled in the CP control register, the GP logic
    starts to read fifo data in 32-byte chunks, using the CP read pointer.
    When the CP read pointer passes the 'top' address, it automatically wraps
    back to 'base'. No interrupt is generated.

    If the 'GP link' bit is enabled in CP CR, the following features become
    enabled : breakpoint, low watermark and high watermark. These features
    are used only in immediate ('linked') mode.
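
    A possible way to model the read pointer step in an emulator is sketched
    below. cp_advance_rdptr is a made-up name, and the sketch assumes that
    'top' holds the address of the last 32-byte chunk of the buffer, so the
    pointer returns to 'base' right after the chunk at 'top' was read. If
    'top' turns out to be an exclusive end address, the comparison has to be
    adjusted.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_CHUNK 32u   /* GP reads fifo data in 32-byte chunks */

/* Return the CP read pointer after one chunk was read.
   Assumption: 'top' is the address of the last chunk in the buffer. */
static uint32_t cp_advance_rdptr(uint32_t rdptr, uint32_t base, uint32_t top)
{
    assert(rdptr % FIFO_CHUNK == 0);
    assert(base % FIFO_CHUNK == 0 && top % FIFO_CHUNK == 0);
    assert(base <= rdptr && rdptr <= top);

    return (rdptr == top) ? base : rdptr + FIFO_CHUNK;
}
```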

    Conditions for immediate fifo mode :

    Breakpoint : if the CP read pointer reaches the breakpoint address, CP
    generates a breakpoint interrupt. The GP stops reading fifo data until
    the breakpoint is cleared. This condition can actually be set in
    multibuffer mode too, but there is no reason to use it there.

    Low watermark : if the CP read pointer reaches the low watermark address,
    CP generates a low watermark interrupt.

    High watermark : if the CP write pointer reaches the high watermark
    address, CP generates a high watermark interrupt.
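
    The three conditions can be sketched as one check that an emulator runs
    after every pointer update. This follows the description above literally
    (plain address compares). The cp_state structure, the CP_INT_* flags and
    cp_check_conditions are made-up names, and gating all three conditions
    on the 'GP link' bit is a simplification that ignores the remark about
    breakpoints in multibuffer mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CP_CR_LINK   (1 << 4)    /* from the appendix */

/* Hypothetical flags returned by the check. */
#define CP_INT_BP    (1u << 0)   /* breakpoint reached     */
#define CP_INT_LOW   (1u << 1)   /* low watermark reached  */
#define CP_INT_HIGH  (1u << 2)   /* high watermark reached */

/* Hypothetical emulator mirror of the CP registers. */
typedef struct {
    uint16_t cr;
    uint32_t rdptr, wrptr;
    uint32_t bpptr, lowmark, himark;
} cp_state;

/* Return the set of conditions that are met right now. */
static unsigned cp_check_conditions(const cp_state *cp)
{
    unsigned pending = 0;

    assert(cp != NULL);
    if (!(cp->cr & CP_CR_LINK))
        return 0;                       /* not in immediate (linked) mode */

    if (cp->rdptr == cp->bpptr)   pending |= CP_INT_BP;
    if (cp->rdptr == cp->lowmark) pending |= CP_INT_LOW;
    if (cp->wrptr == cp->himark)  pending |= CP_INT_HIGH;
    return pending;
}
```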

    Watermarks are used to sync GP and CPU. When the high watermark is
    asserted, the CPU stops writing commands/data to the fifo until the GP
    has read some of it. When the GP reaches the low watermark, the CPU
    continues to write data. This also protects fifo data from being
    overwritten by the CPU. On real hardware GP reading will never overtake
    CPU writing, so they work fine together in a single circular fifo buffer.

    When the CP write pointer passes its 'top' address, it automatically
    wraps back to 'base'. No interrupt is generated. The CP write pointer is
    actually a dummy in fifo processing, because the GP can only read data;
    it is used only to calculate the CP fifo "stride" = wrptr - rdptr. I
    believe the CP write pointer is always the same as the PI write pointer
    in immediate mode.
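
    The "stride" formula is straightforward while the write pointer is ahead
    of the read pointer. For the case where the write pointer has already
    wrapped, the sketch below adds the buffer size. That wrapped case, the
    helper name cp_stride and the assumption that 'top' is the address of
    the last 32-byte chunk are mine, not taken from the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_CHUNK 32u

/* Number of bytes written by the CPU but not yet read by the GP.
   Assumption: the buffer covers base .. top + 31 (top = last chunk). */
static uint32_t cp_stride(uint32_t wrptr, uint32_t rdptr,
                          uint32_t base, uint32_t top)
{
    assert(base <= rdptr && rdptr <= top);
    assert(base <= wrptr && wrptr <= top);

    if (wrptr >= rdptr)
        return wrptr - rdptr;           /* stride = wrptr - rdptr */

    /* Write pointer wrapped to 'base' while the read pointer did not. */
    return (top - base + FIFO_CHUNK) - (rdptr - wrptr);
}
```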

    All pointers should be 32-byte aligned when set.

    Details of CPU Fifo
    -------------------

    The CPU fifo is used only to write data. Its purpose is to build display
    lists while the GP is executing others from the CP fifo (in multibuffer
    mode).

    When the CPU fifo wraps past the top of the fifo buffer, a special 'wrap'
    bit is set in the PI write pointer register. When the CPU writes any new
    fifo data after the wrap, this bit is cleared. No interrupt is generated.
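
    The wrap bit behaviour could be modelled as below. PI_WRPTR_WRAP is the
    value from the appendix. The pi_state structure and pi_after_burst are
    made-up names, and the sketch assumes that 'top' is the address of the
    last 32-byte chunk and that the wrap bit is stored in the same word as
    the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PI_WRPTR_WRAP (1u << 27)    /* PI wrap bit, from the appendix */
#define FIFO_CHUNK    32u

/* Hypothetical emulator mirror of the PI fifo registers. */
typedef struct {
    uint32_t base, top;
    uint32_t wrptr;                 /* pointer plus wrap bit */
} pi_state;

/* Update the PI write pointer after one 32-byte burst was stored. */
static void pi_after_burst(pi_state *pi)
{
    uint32_t addr;

    assert(pi != NULL);
    addr = pi->wrptr & ~PI_WRPTR_WRAP;  /* any new write clears the bit */

    if (addr == pi->top)
        pi->wrptr = pi->base | PI_WRPTR_WRAP;   /* wrapped : set the bit */
    else
        pi->wrptr = addr + FIFO_CHUNK;
}
```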

    All pointers should be 32-byte aligned when set.

    Application of write gather buffer
    ----------------------------------

    The write gather buffer collects CPU writes and then quickly blasts them
    out to memory in 32-byte chunks. That is why all fifos are aligned to a
    32-byte boundary. See YAGCD or any Gekko manual for more information
    about it.

    There are two applications for the gather buffer. In the first case it
    is used together with the PI fifo; otherwise it is said to be
    "redirected" and is used for other purposes. When it points to
    0x0C008000, the gather buffer is said to be not redirected. I haven't
    yet seen any program redirect the gather buffer, so you can assume it is
    used only together with 0x0C008000.

    So what exactly is 0x0C008000 (or 0xCC008000, when used with Dolphin OS) ?
    It is simply a hardware "pointer to pointer". When you write any data
    through the bus to this address, the Flipper hardware (PI) redirects all
    writes to the PI write pointer and controls the further fifo flow. So the
    programmer doesn't need to care about incrementing pointers, wrapping
    the fifo from the top back to the base etc.
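
    For an emulator, the gather buffer and the 0x0C008000 redirection can be
    sketched together as below. Bytes written to the fifo address are
    collected, and every full 32 bytes are stored at the PI write pointer,
    which then advances and wraps by itself. fifo_port and fifo_port_write
    are made-up names, the wrap bit is not modelled, and 'top' is assumed to
    be the address of the last 32-byte chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GATHER_SIZE 32u

/* Hypothetical emulator state behind the fifo address. */
typedef struct {
    uint8_t *ram;                   /* emulated main memory             */
    size_t   ram_size;
    uint32_t base, top, wrptr;      /* PI fifo registers (no wrap bit)  */
    uint8_t  gather[GATHER_SIZE];   /* write gather buffer              */
    uint32_t filled;                /* bytes waiting in the buffer      */
} fifo_port;

/* Handle a CPU write of 'len' bytes to the fifo address. */
static void fifo_port_write(fifo_port *p, const void *src, size_t len)
{
    const uint8_t *bytes = src;

    assert(p != NULL && (src != NULL || len == 0));
    while (len--) {
        p->gather[p->filled++] = *bytes++;
        if (p->filled == GATHER_SIZE) {
            /* Blast one full chunk to memory at the PI write pointer. */
            assert((size_t)p->wrptr + GATHER_SIZE <= p->ram_size);
            memcpy(p->ram + p->wrptr, p->gather, GATHER_SIZE);
            p->wrptr = (p->wrptr == p->top) ? p->base
                                            : p->wrptr + GATHER_SIZE;
            p->filled = 0;
        }
    }
}
```

    The caller never touches wrptr directly, which is the whole point of the
    "pointer to pointer" : it only keeps writing to the same address.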

    --------------------------------------------------------------------------

    Appendix : Hardware registers.

    Related hardware registers and bit layouts (cut from Dolwin source code) :

    // CP registers (16-bit access)
    #define CP_SR       (0x0C000000)
    #define CP_CR       (0x0C000002)
    #define CP_CLR      (0x0C000004)
    #define CP_BASE     (0x0C000020)
    #define CP_TOP      (0x0C000024)
    #define CP_HIWMARK  (0x0C000028)
    #define CP_LOWMARK  (0x0C00002C)
    #define CP_CNT      (0x0C000030)
    #define CP_WRPTR    (0x0C000034)
    #define CP_RDPTR    (0x0C000038)
    #define CP_BPPTR    (0x0C00003C)

    // PI registers (32-bit access)
    #define PI_BASE     (0x0C00300C)
    #define PI_TOP      (0x0C003010)
    #define PI_WRPTR    (0x0C003014)

    // CP status register
    #define CP_SR_OVF       (1 << 0)
    #define CP_SR_UVF       (1 << 1)
    #define CP_SR_RD_IDLE   (1 << 2)    // read idle
    #define CP_SR_CMD_IDLE  (1 << 3)    // command idle
    #define CP_SR_BPINT     (1 << 4)

    // CP control register
    #define CP_CR_RDEN      (1 << 0)
    #define CP_CR_BPCLR     (1 << 1)
    #define CP_CR_OVFEN     (1 << 2)
    #define CP_CR_UVFEN     (1 << 3)
    #define CP_CR_LINK      (1 << 4)
    #define CP_CR_BPEN      (1 << 5)

    // CP clear register
    #define CP_CLR_OVFCLR   (1 << 0)
    #define CP_CLR_UVFCLR   (1 << 1)

    // PE status register
    #define PE_SR_DONE      (1 << 0)
    #define PE_SR_TOKEN     (1 << 1)
    #define PE_SR_DONEMSK   (1 << 2)
    #define PE_SR_TOKENMSK  (1 << 3)

    // PI wrap bit
    #define PI_WRPTR_WRAP   (1 << 27)

    --------------------------------------------------------------------------

    References :

        gxfifo.txt      - GX fifo module reversing.
        tips.txt        - For fifo emulation tips.

    This document is maintained by the Dolwin team.
    Last edited by org, 16 Aug 2004
