nes_ntsc 0.2.0: NES NTSC Video Filter
-------------------------------------
Author  : Shay Green <gblargg@gmail.com>
Website : http://www.slack.net/~ant/
Forum   : http://groups.google.com/group/blargg-sound-libs
License : GNU Lesser General Public License (LGPL)
Language: C or C++


Overview
--------
To perform NTSC filtering, first allocate memory for a nes_ntsc_t object
and call nes_ntsc_init(), then call nes_ntsc_blit() to perform
filtering. You can call nes_ntsc_init() at any time to change image
parameters.

The nes_ntsc_blit() function reads NES pixels and writes RGB pixels
(16-bit by default). The NES pixels are raw palette values (0 to 0x3F)
stored in the 8-bit unsigned char type. If your emulator outputs RGB
pixels and can't easily be made to output raw NES palette indices, use
my snes_ntsc library instead, which accepts 16-bit RGB pixels and will
produce the same result as this library. If you use snes_ntsc, read the
burst phase section below since it still applies.

For support of the three color emphasis bits at the top of PPU register
0x2001, use nes_ntsc_emph_t, nes_ntsc_init_emph(), and
nes_ntsc_blit_emph(). Input pixels must be stored in the 16-bit unsigned
short type to allow for the three emphasis bits stored just above the
6-bit palette index (9 bits per pixel):

    index = ((ppu_2001 << 1) & 0x1C0) | (palette_index & 0x3F)
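
As a minimal sketch, the index calculation above could be wrapped in a
small helper. The name make_emph_index is only illustrative and is not
part of the library; the arithmetic is the formula given above.

```c
/* Combine the three emphasis bits of PPU register 0x2001 (bits 5-7) with
   a 6-bit palette index into a 9-bit input pixel.
   make_emph_index is a hypothetical helper, not a library function. */
static unsigned short make_emph_index( unsigned ppu_2001, unsigned palette_index )
{
	/* Shifting left by 1 moves bits 5-7 up to bits 6-8; the 0x1C0 mask
	   keeps only those three bits. */
	return (unsigned short) (((ppu_2001 << 1) & 0x1C0) | (palette_index & 0x3F));
}
```

With all three emphasis bits set (0xE0) and palette index 0x21, this
gives 0x1E1. Other bits of the register value are discarded by the mask.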


RGB Palette Generation
----------------------
An RGB palette can be generated for use in a normal blitter: either 64
colors (192 bytes), or 512 colors (1536 bytes) covering all 8 color
emphasis combinations. The 64-color version matches the format of the
common NES .pal file. In your nes_ntsc_setup_t structure, point
palette_out to a buffer to hold the palette, then call nes_ntsc_init()
(nes_ntsc_init_emph() for the 512-color version). If you only need the
palette and aren't going to be using the NTSC filter, pass 0 for the
first parameter.
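
To illustrate how a normal blitter might consume the 64-color palette,
here is a rough sketch that converts one palette entry to a 16-bit
pixel. It assumes the buffer holds 3 bytes per color in R, G, B order
(as in a typical .pal file) and that the target format is RGB 5-6-5.
The helper name pal_to_rgb16 is hypothetical, not part of the library.

```c
#include <assert.h>

/* Look up one color in a 64-entry palette (3 bytes per color, assumed to
   be in R, G, B order) and pack it into a 16-bit RGB 5-6-5 pixel.
   pal_to_rgb16 is a hypothetical helper, not a library function. */
static unsigned short pal_to_rgb16( const unsigned char palette [64 * 3], unsigned index )
{
	const unsigned char* c;
	assert( index < 64 );
	c = &palette [index * 3];
	/* Keep the top 5 bits of red, 6 of green, and 5 of blue. */
	return (unsigned short) (((c [0] >> 3) << 11) | ((c [1] >> 2) << 5) | (c [2] >> 3));
}
```

A blitter would call this once per palette entry to build a 64-entry
lookup table, then index that table with each NES pixel.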


Image Parameters
----------------
Many image parameters can be adjusted and presets are provided for
composite video, S-video, RGB, and monochrome. Most are floating-point
values with a general range of -1.0 to 1.0, where 0 is normal. The
ranges are adjusted so that one parameter at an extreme (-1 or +1) and
the rest at zero shouldn't result in any internal overflow (garbage
pixels). Setting multiple parameters to their extreme can produce
garbage. Put another way, the state space defined by all parameters
within the range -1 to +1 is not fully usable, but some extreme corners
are very useful so I don't want to reduce the parameter ranges.

The sharpness and resolution parameters have similar effects. Resolution
affects how crisp pixels are. Sharpness merely enhances the edges by
increasing contrast, which makes things brighter at the edges. Artifacts
sets how much "junk" is around the edges where colors and brightness
change in the image, where -1 completely eliminates them. (Color) bleed
affects how much colors blend together and the artifact colors at the
edges of pixels surrounded by black. (Color) fringing affects how much
color fringing occurs around the edges of bright objects, especially
white text on a black background.


Image Size
----------
Use the NES_NTSC_OUT_WIDTH() and NES_NTSC_IN_WIDTH() macros to convert
between input and output widths that the blitter uses. For example, if
you are blitting an image 256 pixels wide, use NES_NTSC_OUT_WIDTH( 256 )
to find out how many output pixels are written per row. As another
example, use NES_NTSC_IN_WIDTH( 640 ) to find how many input pixels will
fit within 640 output pixels. The blitter rounds the input width down in
some cases, so the requested width might not be possible. Use
NES_NTSC_IN_WIDTH( NES_NTSC_OUT_WIDTH( in_width ) ) to find what a given
in_width would be rounded down to.
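
The rounding behavior can be sketched with stand-in functions. This is
only an approximation under a stated assumption: that the blitter turns
each chunk of 3 input pixels into 7 output pixels, after one leading
input pixel. That assumption agrees with the 256 to 602 figure given in
the Limitations section, but the macros in nes_ntsc.h are the authority
and should be used in real code. The sketch_ names are hypothetical.

```c
#include <assert.h>

/* Assumption: 3 input pixels per chunk become 7 output pixels.
   These are illustrative stand-ins, not the library's macros. */
enum { sketch_in_chunk = 3, sketch_out_chunk = 7 };

/* Output pixels written per row for a given input width. */
static int sketch_out_width( int in_width )
{
	assert( in_width > 0 );
	return ((in_width - 1) / sketch_in_chunk + 1) * sketch_out_chunk;
}

/* Input pixels that fit within a given output width. */
static int sketch_in_width( int out_width )
{
	assert( out_width >= sketch_out_chunk );
	return (out_width / sketch_out_chunk - 1) * sketch_in_chunk + 1;
}
```

Under this assumption, an input width of 257 maps to the same 602 output
pixels as 256, so converting back yields 256: the extra pixel is rounded
away.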

For proper aspect ratio, the image generated by the library should be
doubled vertically. This can be done in software, as the demo does; in a
custom blitter for more efficiency (see below); or with a modern video
card's hardware rescaling. On a TV, pixels never have crisp edges, so
simply doubling scanlines doesn't give a very authentic image. I've
found that reducing the doubled scanline's brightness to 75% looks
significantly better without darkening the image much. To drop
brightness to 75%, calculate 25% brightness using simple bit shifting
and masking, then subtract this from the original.
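
The shift-and-mask step might look like the following sketch. It assumes
16-bit output pixels in RGB 5-6-5 layout; other depths need a different
mask. The names darken_75 and double_row are hypothetical and this is
not necessarily how the demo does it.

```c
/* Reduce a 16-bit RGB 5-6-5 pixel to 75% brightness.
   Hypothetical helper; assumes the 5-6-5 layout. */
static unsigned short darken_75( unsigned short pixel )
{
	/* Shift every field right by 2 (divide by 4), then mask off bits that
	   spilled in from the field above: 0x39E7 keeps the low 3 bits of the
	   red field, the low 4 of green, and the low 3 of blue. */
	unsigned quarter = (pixel >> 2) & 0x39E7u;
	/* Each field's quarter is no larger than the field, so no borrows
	   cross field boundaries. */
	return (unsigned short) (pixel - quarter);
}

/* Write one source row to two output rows: a full-brightness copy and a
   75% copy for the doubled scanline. */
static void double_row( const unsigned short* in, int width,
		unsigned short* out, unsigned short* out_dark )
{
	int x;
	for ( x = 0; x < width; x++ )
	{
		out [x] = in [x];
		out_dark [x] = darken_75( in [x] );
	}
}
```

For example, white (0xFFFF) becomes 0xC618, which is 24/31 red, 48/63
green, and 24/31 blue.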


RGB Format
----------
By default, the blitters output 16-bit RGB pixels. To use a different
format, #define NES_NTSC_OUT_DEPTH to 15, 16, 24, or 32 (same as 24).
You can specify this either in the compiler command-line options

	-DNES_NTSC_OUT_DEPTH=32

or by wrapping nes_ntsc.c with your own source file and using this in
place of nes_ntsc.c:

	/* my_nes_ntsc.c */
	#define NES_NTSC_OUT_DEPTH 32
	#include "nes_ntsc.c"


Burst Phase
-----------
The burst_phase parameter to nes_ntsc_blit() should generally toggle
between 0 and 1 on successive frames, i.e. 0 on the first call to
nes_ntsc_blit(), 1 on the second call, 0 on the third, 1 on the fourth,
etc. If merge_fields is enabled (see below), you should always pass 0.
Read on for more detail.
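
The simple scheme above could be sketched as follows. The function name
and the idea of keeping a frame counter are assumptions made for
illustration; any equivalent toggle in your emulator works as well.

```c
/* Value to pass as burst_phase for a given frame, using the simple
   toggling scheme. frame_count starts at 0 for the first blitted frame.
   simple_burst_phase is a hypothetical helper, not a library function. */
static int simple_burst_phase( unsigned long frame_count, int merge_fields )
{
	if ( merge_fields )
		return 0; /* always 0 when merge_fields is enabled */
	return (int) (frame_count & 1); /* 0, 1, 0, 1, ... */
}
```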

If you're using nes_ntsc_blit() to do partial screen updates, the value
passed for each call should be calculated as (burst_phase + row) % 3,
where burst_phase is the value for the current frame and row is the
call's starting row (0 through 239). For example, if burst_phase is 1
for the current frame and you make two calls to nes_ntsc_blit() to blit
rows 0 to 100, then rows 101 to 239, for the first call you should pass
1 for burst_phase, and for the second call you should pass 0 for
burst_phase: (1 + 101) % 3 = 0.
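
The partial-update calculation above, written as a small sketch. The
name row_burst_phase is hypothetical; the arithmetic is taken directly
from the formula above.

```c
#include <assert.h>

/* burst_phase to pass when a blit starts at start_row rather than row 0.
   frame_burst_phase is the value for the current frame.
   row_burst_phase is a hypothetical helper, not a library function. */
static int row_burst_phase( int frame_burst_phase, int start_row )
{
	assert( frame_burst_phase >= 0 && frame_burst_phase < 3 );
	assert( start_row >= 0 && start_row < 240 );
	return (frame_burst_phase + start_row) % 3;
}
```

For the two-call example, row_burst_phase( 1, 0 ) is 1 and
row_burst_phase( 1, 101 ) is 0.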

For more accurate burst_phase operation, it should be adjusted at the
beginning of a frame based on the length of scanline 20: if 341 clocks
(normal), burst_phase = (burst_phase + 1) % 3, otherwise burst_phase =
(burst_phase + 2) % 3 (for shorter 340 clock scanline).
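
As a sketch of the more accurate method, the per-frame adjustment could
be written like this. How your emulator measures the length of scanline
20 is up to you; the parameter scanline_20_clocks and the function name
are assumptions for illustration only.

```c
#include <assert.h>

/* Advance burst_phase at the beginning of a frame, based on how many
   clocks scanline 20 lasted (341 normally, 340 when it is shortened).
   advance_burst_phase is a hypothetical helper, not a library function. */
static int advance_burst_phase( int burst_phase, int scanline_20_clocks )
{
	assert( burst_phase >= 0 && burst_phase < 3 );
	if ( scanline_20_clocks == 341 )
		return (burst_phase + 1) % 3; /* normal scanline */
	return (burst_phase + 2) % 3; /* shorter 340-clock scanline */
}
```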

The included test ROMs verify proper burst_phase implementation. They
must pass in order; an earlier failing test means that later tests will
give meaningless results. The first two tests will pass with either
method of burst_phase handling described above; the third will only pass
with the more accurate handling. The tests flash sets of dots quickly
with the dot color being the only important aspect.

1.line_phase.nes - Tests for proper burst_phase on each scanline. All
dots on screen should be the same color.

2.frame_phase.nes - Tests for proper burst_phase toggling between
frames. Each row of dots should alternate between the same two colors
(if merge_fields is set to 1, they should all be the same color).

3.special_frame_phase.nes - Tests for proper burst_phase incrementing
between frames when $2001 rendering is enabled late in the frame. Each
rectangle of dots should be one color, and there should be three
different colors of rectangles (if merge_fields is set to 1, each
rectangle should be made of three colors of dots). There is a visual
glitch near the top of the screen for the first line of dots; this is
unrelated to the test and should be ignored.


Flickering
----------
The displayed image toggles between two different pixel artifact
patterns at a steady rate, making it appear stable. For an emulator to
duplicate this effect, its frame rate must match the host monitor's
refresh rate, it must be synchronizing to the refresh (vsync), and it
must not be skipping any frames. If any of these don't hold, the image
will probably flicker much more than it would on a TV. It is important
that you play around with these factors to get a good feel for the
issue, and document it clearly for end-users, otherwise they will have
difficulty getting an authentic image.

The library includes a partial workaround for this issue, for the cases
where all the conditions can't be met. When merge_fields is set to 1,
nes_ntsc_blit() does the equivalent of blitting the image twice with the
two different phases and then mixes them together, but without any
performance impact. The result is similar to what you'd see if the
monitor's refresh rate were the same as the emulator's. It does,
however, reduce the shimmer effect when scrolling, so it's not a
complete solution to the refresh rate issue.

The merge_fields option is also useful when taking a screenshot. If you
capture without merge_fields set to 1, you'll only get the even or odd
artifacts, which will make the image look more grainy than when the
emulator is running. Again, play around with this to get an idea of the
difference. It might be best to simply allow the user to choose when to
enable this option.

Note that when you have merge_fields set to 1, you should always pass 0
for the burst_phase parameter to nes_ntsc_blit(). If you don't, you'll
still get some flicker.


Custom Blitter
--------------
You can write your own blitter, allowing customization of how NES pixels
are obtained (you might want to run through a palette first), the format
of output pixels (15, 16, or 32-bit RGB), optimizations for your
platform, and additional effects like efficient scanline doubling during
blitting.

Macros are included in nes_ntsc.h for writing your blitter so that your
code can be carried over without changes to improved versions of the
library. The default blitters at the end of nes_ntsc.c show how to use
the macros. For an example of a flexible blitter written in C++ and
optimized for the x86 architecture, refer to Nestopia 1.28's
NstVideoFilterNtsc.hpp.

The NES_NTSC_BEGIN_ROW macro allows starting up to three pixels. The
first pixel is cut off; its use is in specifying a background color
other than black for the sliver on the left edge. The next two pixels
can be used to handle the extra one or two pixels not handled by the
main chunks of three pixels. For example, if you want to blit 257 NES
pixels on a row (for whatever odd reason), you would start the first two
with NES_NTSC_BEGIN_ROW( ... nes_ntsc_black, nes_in [0], nes_in [1] ),
then do the remaining 255 in chunks of three (255 / 3 = 85 chunks).


Limitations
-----------
The library's horizontal rescaling is too wide by about 3% in order to
allow a much more optimal implementation. This means that a 256 pixel
wide NES image should appear as 581 output pixels, but with this library
appears as 602 output pixels. TV aspect ratios probably vary by this
much anyway. If you really need unscaled output, contact me and I'll see
about adding it.


Thanks
------
Thanks to NewRisingSun for his original code, which was a starting point
for me learning about NTSC video and decoding. Thanks to the Nesdev
forum for feedback and encouragement. Thanks to byuu (bsnes author) and
pagefault (ZSNES team) for integrating a SNES version of this in their
emulators. Thanks to Martin Freij (Nestopia author) for using an earlier
development version in Nestopia and for feedback about the custom
blitter interface. Thanks to Charles MacDonald for feedback on the SMS
version's interface.

-- 
Shay Green <gblargg@gmail.com>
