Author Topic: ReGBA, GBA emulator version 1.45.5 (hardware scaling) (Read 206020 times)

Nebuleon · « **Reply #160 on:** November 21, 2013, 08:09:45 pm »

Quote from: caiolreboucas on November 20, 2013, 10:37:57 pm

Nebuleon, thanks for all the hard work! Your REGBA runs smooth on my ds! Is there any chance we will receive a PSP port?

While this appears to be ironic at first sight ("there's a ReGBA for PSP; it's gpSP"), I see what you mean: you're asking if there's a port of gpSP with the improvements of ReGBA on the PSP.

I do not have a PSP, and even if I had the PSP SDK, I would essentially have to make a blind port, compile it and hope it works on a PSP first try. However, the PSP does have a head start: it uses a MIPS32 kinda-R2 processor, just like the GCW Zero.

If you are interested in a PSP port, you (or someone else who's also interested) could take the original gpSP-Exophase/PSP sources, copy the GUI, adapt the Makefile and hook up the per-platform functions ReGBA requires (e.g. the function that tells ReGBA what file a saved state at slot #n should be, the one that allocates memory for the GBA ROM, and so on). You'd find these declarations in common.h. A few definitions are already in the source/psp directory, but they are quite incomplete. I'd be happy to accept patches from interested people!
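As a rough illustration of what two such per-platform hooks might look like: the names, signatures and the "_slotN.state" file naming below are hypothetical, made up for this sketch. The real declarations are the ones in common.h.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical hook: build the path of the saved state for `slot`.
 * The "<rom>_slot<n>.state" naming is an assumption for illustration only.
 * Returns 0 on success, -1 if the path does not fit in `out`. */
static int example_savestate_path(char *out, size_t out_size,
    const char *save_dir, const char *rom_name, unsigned slot)
{
    int n = snprintf(out, out_size, "%s/%s_slot%u.state",
                     save_dir, rom_name, slot);
    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}

/* Hypothetical hook: allocate the buffer that will hold the GBA ROM.
 * Returns NULL on failure; the caller releases it with free(). */
static uint8_t *example_alloc_rom(size_t size)
{
    return (uint8_t *) malloc(size);
}
```

A port would implement functions of this kind once per platform, so the shared emulator core never has to know where files live or how memory is obtained on the device.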

K-77 · « **Reply #161 on:** November 21, 2013, 10:15:32 pm »

For those of you who can't get past the first Kingdom Hearts cutscene: you can play through it on a PC emulator, save afterwards, and copy the save file to your console.

Skyline969 · « **Reply #162 on:** November 21, 2013, 10:31:33 pm »

Quote from: K-77 on November 21, 2013, 10:15:32 pm

For those of you who can't get past the first Kingdom Hearts cutscene: you can play through it on a PC emulator, save afterwards, and copy the save file to your console.

I believe I did this on gpSP, but it still works - when the freeze happens, use a save state. Close and re-open the game, and then load from the save state. The video should progress after that.

K-77 · « **Reply #163 on:** November 21, 2013, 11:09:03 pm »

Here you can find a save file if needed. Just put it in the .gpsp folder in your home directory.

Code: [Select]

http://www.qfpost.com/file/d?g=mcTzMwann

kuwanger · « **Reply #164 on:** November 24, 2013, 03:42:21 am »

So, yeah, yeah, subpixel rendering again. It's not exactly a perfect fit, but it does make blurry text more readable, at the expense of giving everything the Apple II colored-text look (the Apple II did the whole subpixel thing too).

Anyways, here's most of the code (minus the bits that add the "Sub-Pixel" option to the menu, since that's trivial):

Code: [Select]

static inline uint16_t bgr555_to_rgb565_16(uint16_t px)
{
 return ((px & 0x7c00) >> 10)
   | ((px & 0x03e0) << 1)
   | ((px & 0x001f) << 11);
}

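/* X: red from a, green and blue from b.
 * Y: red and green from b, blue from c.
 * Z: per-channel average of A and B (RGB 565). */
#define X(a,b) (((a) & 0xf800) | ((b) & 0x07ff))
#define Y(b,c) (((b) & 0xffe0) | ((c) & 0x001f))
#define Z(A,B) ((((A) >> 1) & 0x7bef) + (((B) >> 1) & 0x7bef))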

/* Upscales an image based on subpixel rendering; also does color conversion
 * using the function above.
 * Input:
 *   from: A pointer to the pixels member of a src_x by src_y surface to be
 *     read by this function. The pixel format of this surface is XBGR 1555.
 *   src_x: The width of the source.
 *   src_y: The height of the source.
 *   src_pitch: The number of bytes making up a scanline in the source
 *     surface.
 *   dst_pitch: The number of bytes making up a scanline in the destination
 *     surface.
 * Output:
 *   to: A pointer to the pixels member of a (src_x * 4/3) by (src_y * 3/2)
 *     surface to be filled with the upscaled GBA image. The pixel format of
 *     this surface is RGB 565.
 */
static inline void gba_upscale_subpixel(uint16_t *to, uint16_t *from,
   uint32_t src_x, uint32_t src_y, uint32_t src_pitch, uint32_t dst_pitch)
{
 /* Before:
  *    a b c
  *    d e f
  * After (multiple letters = (average)/subpixel overlap):
  *    a    ab        bc      c
  *    (ad) (ad)(be) (be)(cf) (cf)
  *    d    de       ef       f
  */
 uint16_t a, b, c, d, e, f;
 uint16_t *src, *dst;

 uint32_t x, y;

 const uint32_t sp = src_pitch / sizeof(uint16_t), dp = dst_pitch / sizeof(uint16_t);

 for (y = 0; y < src_y/2; y++) {
  src = from;
  dst = to;
  for (x = 0; x < src_x/3; x++) {
   a = bgr555_to_rgb565_16(src[0]);
   b = bgr555_to_rgb565_16(src[1]);
   c = bgr555_to_rgb565_16(src[2]);
   d = bgr555_to_rgb565_16(src[sp]);
   e = bgr555_to_rgb565_16(src[sp+1]);
   f = bgr555_to_rgb565_16(src[sp+2]);

   dst[0]    = a;      dst[1]      = X(a,b);           dst[2]      = Y(b,c);           dst[3]      = c;
   dst[dp]   = Z(a,d); dst[dp+1]   = X(Z(a,d),Z(b,e)); dst[dp+2]   = Y(Z(b,e),Z(c,f)); dst[dp+3]   = Z(c,f);
   dst[dp*2] = d;      dst[dp*2+1] = X(d,e);           dst[dp*2+2] = Y(e,f);           dst[dp*2+3] = f;

   src += 3;
   dst += 4;
  }
  from = (uint16_t *) (((uint8_t *) from) + src_pitch * 2);
  to   = (uint16_t *) (((uint8_t *) to  ) + dst_pitch * 3);
 }
}

And here's the package: regba 1.45 w/ sub-pixel rendering
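To see where the colored fringe comes from, here is a small worked example that reuses the conversion function and the X/Y/Z macros from the code above. The helper `example_expand_row` is hypothetical and only mirrors how the top and bottom output rows are built; it is a sketch, not part of the posted scaler.

```c
#include <stdint.h>

/* Copied from the post above: XBGR 1555 -> RGB 565. */
static inline uint16_t bgr555_to_rgb565_16(uint16_t px)
{
    return ((px & 0x7c00) >> 10)
         | ((px & 0x03e0) << 1)
         | ((px & 0x001f) << 11);
}

#define X(a,b) (((a) & 0xf800) | ((b) & 0x07ff)) /* red from a, green+blue from b */
#define Y(b,c) (((b) & 0xffe0) | ((c) & 0x001f)) /* red+green from b, blue from c */
#define Z(A,B) ((((A) >> 1) & 0x7bef) + (((B) >> 1) & 0x7bef))

/* Hypothetical helper: expands one row of 3 RGB 565 pixels to 4,
 * the same way dst[0..3] is filled from a, b, c above. */
static void example_expand_row(uint16_t out[4], const uint16_t in[3])
{
    out[0] = in[0];
    out[1] = X(in[0], in[1]);
    out[2] = Y(in[1], in[2]);
    out[3] = in[2];
}
```

Feeding it white, black, white (0xFFFF, 0x0000, 0xFFFF) gives white, pure red (0xF800), pure blue (0x001F), white. On an RGB-striped panel the dark span is four subpixels wide, so the black line keeps its 4/3 width, but its two edges are drawn as saturated red and blue pixels. That is the fringe.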

Nebuleon · « **Reply #165 on:** November 24, 2013, 07:56:16 am »

I've made ReGBA version 1.45.1 with two "rainbow fringe" subpixel scalers: kuwanger's full-screen one and an aspect-ratio-preserving one based on it.

Download: https://dl.dropboxusercontent.com/u/106475413/gcw-zero/regba-1.45.1.opk

(For the record, I still prefer the original one, and I think this one is way too rainbow-fringey; it should average more like FreeType.)

TimeDevouncer · « **Reply #166 on:** November 24, 2013, 04:16:23 pm »

Thanks for the update.

I still prefer "None" scaling too.

kuwanger · « **Reply #167 on:** November 24, 2013, 04:24:33 pm »

Something more like this, perhaps?

Code: [Select]

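/* X: red from a, green and blue from b.
 * Y: red and green from b, blue from c.
 * Z: per-channel average of A and B (RGB 565). */
#define X(a,b) (((a) & 0xf800) | ((b) & 0x07ff))
#define Y(b,c) (((b) & 0xffe0) | ((c) & 0x001f))
#define Z(A,B) ((((A) >> 1) & 0x7bef) + (((B) >> 1) & 0x7bef))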

static inline void gba_upscale_mostly_subpixel(uint16_t *to, uint16_t *from,
    uint32_t src_x, uint32_t src_y, uint32_t src_pitch, uint32_t dst_pitch)
{
  /* Before:
   *    a b c
   *    d e f
   * After (multiple letters = (average)/subpixel overlap):
   *    a    ab        bc      c
   *    (ad) (be) (be) (cf)
   *    d    de       ef       f
   */
  uint16_t a, b, c, d, e, f;
  uint16_t *src, *dst;

  uint32_t x, y;

  const uint32_t sp = src_pitch / sizeof(uint16_t), dp = dst_pitch / sizeof(uint16_t);

  for (y = 0; y < src_y/2; y++) {
    src = from;
    dst = to;
    for (x = 0; x < src_x/3; x++) {
      a = bgr555_to_rgb565_16(src[0]);
      b = bgr555_to_rgb565_16(src[1]);
      c = bgr555_to_rgb565_16(src[2]);
      d = bgr555_to_rgb565_16(src[sp]);
      e = bgr555_to_rgb565_16(src[sp+1]);
      f = bgr555_to_rgb565_16(src[sp+2]);
      dst[0]    = a;      dst[1]      = X(a,b);     dst[2]      = Y(b,c);     dst[3]      = c;
      dst[dp]   = Z(a,d); dst[dp+1]   = Z(b,e); dst[dp+2]   = Z(b,e); dst[dp+3]   = Z(c,f);
      dst[dp*2] = d;      dst[dp*2+1] = X(d,e);     dst[dp*2+2] = Y(e,f);     dst[dp*2+3] = f;
      src += 3;
      dst += 4;
    }
    from = (uint16_t *) (((uint8_t *) from) + src_pitch * 2);
    to   = (uint16_t *) (((uint8_t *) to  ) + dst_pitch * 3);
  }
}

And, yeah, the rainbow-fringey thing is rather annoying. But then you have games like Scurge: Hive whose status bar text is blurry under the original. My various attempts to blur/average in other ways have either (a) resulted in equally blurry text, (b) produced left/right or top/bottom ripples, or (c) produced a grid of dots. I'm sure there's a better way to do it. I'm just not readily aware of that way.

segakiki · « **Reply #168 on:** November 24, 2013, 07:08:11 pm »

Hey kuwanger, would it be possible to make a custom scaler for snes9x?
It's a great emu but the fullscreen blurriness really puts me off using it.

Not in this thread, Sonic.

Nebuleon · « **Reply #169 on:** November 24, 2013, 10:21:04 pm »

Quote from: kuwanger on November 24, 2013, 04:24:33 pm

Something more like this, perhaps?

<code>

And, yeah, the rainbow-fringey thing is rather annoying. But then you have games like Scurge: Hive whose status bar text is blurry under the original. My various attempts to blur/average in other ways have either (a) resulted in equally blurry text, (b) produced left/right or top/bottom ripples, or (c) produced a grid of dots. I'm sure there's a better way to do it. I'm just not readily aware of that way.

It still appears to be rainbow-fringey (and worse, now I can see a grid of dots every 3 by 3 pixels). What I meant was more like the technique most commonly known as "ClearType" today, where the spatial anti-aliasing applies to subpixels like they were whole new pixels. So your yellow would really be a kind of dark orange, and your cyan would really be a kind of dark blue, after averaging with all the new pixels.

http://en.wikipedia.org/wiki/File:Subpixel_demonstration_%28Quartz%29.png

kuwanger · « **Reply #170 on:** November 25, 2013, 04:38:06 am »

Third time's a charm, maybe?

Code: [Select]

/*  Takes pixels a, b, and c and writes out four pixels to out.
 *   Does a blend of pixels based upon R G B order of LCD.
 *   Note:  Left most and right most pixels have extra red from a and blue from c, respectively,
 *             as the alternative is to blend from pixels to the left and right of the grouping
 */
static inline void F(uint16_t *out, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = (((a & 0xf800)*3/4) & 0xf800) |
           (((a & 0x07e0)*3/4) & 0x07e0) |
            ((a & 0x001f)*3/4);
  *out++ = ( ((a >> 1)&0x7800) + ((b >> 2)&0x3800) ) |
           ( ((a >> 2)&0x01e0) + ((b >> 1)&0x03e0) ) |
            ((b & 0x001f)*3/4);
  *out++ = (((b & 0xf800)*3/4) & 0xf800) |
           ( ((b >> 1)&0x03e0) + ((c >> 2)&0x01e0) ) |
           ( ((b >> 2)&0x0007) + ((c >> 1)&0x000f) );
  *out++ = (((c & 0xf800)*3/4) & 0xf800) |
           (((c & 0x07e0)*3/4) & 0x07e0) |
            ((c & 0x001f)*3/4);
}

static inline void gba_upscale_subpixel(uint16_t *to, uint16_t *from,
   uint32_t src_x, uint32_t src_y, uint32_t src_pitch, uint32_t dst_pitch)
{
 /* Before:
    RRRR   RRRRrr   rrrrrr   RRRRRR   RR
     GGGGGG   GGgggg   ggggGG   GGGGGG
   BB   BBBBBB   bbbbbb   bbBBBB   BBBB
  * After (merges r/R, g/G, b/B groups into four pixels)
    RR GG BB rr gg bb RR GG BB rr gg bb 
    RR GG BB rr gg bb RR GG BB rr gg bb 
    RR GG BB rr gg bb RR GG BB rr gg bb 
         */
 const uint32_t dst_x = src_x * 4 / 3;
 uint16_t a, b, c, d, e, f;
 uint16_t *src, *dst;

 uint32_t x, y;

 const uint32_t sp = src_pitch / sizeof(uint16_t), dp = dst_pitch / sizeof(uint16_t);

 for (y = 0; y < src_y/2; y++) {
  src = from;
  dst = to;
  for (x = 0; x < src_x/3; x++) {
   a = bgr555_to_rgb565_16(src[0]);
   b = bgr555_to_rgb565_16(src[1]);
   c = bgr555_to_rgb565_16(src[2]);
   d = bgr555_to_rgb565_16(src[sp]);
   e = bgr555_to_rgb565_16(src[sp+1]);
   f = bgr555_to_rgb565_16(src[sp+2]);

   F(dst, a, b, c);
   F(&dst[dp], Z(a,d), Z(b,e), Z(c,f));
   F(&dst[dp*2], d, e, f);
   src += 3;
   dst += 4;
  }
  from = (uint16_t *) (((uint8_t *) from) + src_pitch * 2);
  to   = (uint16_t *) (((uint8_t *) to  ) + dst_pitch * 3);
 }
}

Note, this results in a darker image, which might be the real reason there's less rainbow fringe, but then it could just be me seeing things. :/
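The darkening is easy to check by pushing solid white through F. The block below copies F from the post unchanged; only the surrounding scaffolding is left out.

```c
#include <stdint.h>

/* F as posted above: blends a, b, c (RGB 565) into four output pixels. */
static inline void F(uint16_t *out, uint16_t a, uint16_t b, uint16_t c)
{
    *out++ = (((a & 0xf800)*3/4) & 0xf800) |
             (((a & 0x07e0)*3/4) & 0x07e0) |
              ((a & 0x001f)*3/4);
    *out++ = ( ((a >> 1)&0x7800) + ((b >> 2)&0x3800) ) |
             ( ((a >> 2)&0x01e0) + ((b >> 1)&0x03e0) ) |
              ((b & 0x001f)*3/4);
    *out++ = (((b & 0xf800)*3/4) & 0xf800) |
             ( ((b >> 1)&0x03e0) + ((c >> 2)&0x01e0) ) |
             ( ((b >> 2)&0x0007) + ((c >> 1)&0x000f) );
    *out++ = (((c & 0xf800)*3/4) & 0xf800) |
             (((c & 0x07e0)*3/4) & 0x07e0) |
              ((c & 0x001f)*3/4);
}
```

With a = b = c = 0xFFFF the four outputs are 0xBDF7, 0xB5D7, 0xBDD6 and 0xBDF7. Every channel lands at roughly three quarters of full scale, which accounts for the darker picture. The four values are also not identical, because of the integer rounding in each branch, so even a solid block of one color comes out with a faint repeating pattern every four pixels.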

Nebuleon · « **Reply #171 on:** November 25, 2013, 05:55:52 am »

Quote from: kuwanger on November 25, 2013, 04:38:06 am

Third time's a charm, maybe?

<code>

Note, this results in a darker image, which might be the real reason there's less rainbow fringe, but then it could just be me seeing things. :/

Yeah, must be the *3/4 everywhere. But if I take a screenshot of the full-screen sub-pixel scaler now, and raise its brightness with an image editing program, there is much less rainbow fringe, so maybe this is good and just needs some adjustment.

kuwanger · « **Reply #172 on:** November 25, 2013, 04:16:53 pm »

A little bit more blending and further brightened:

Code: [Select]

static inline void F0(uint16_t *out, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = ( ((a&0xf800)*2/3) & 0xf800 ) |
             (a&0x07e0) |
             (a&0x001f);
  *out++ = ( ((a&0xf800)*2/3 + (b&0xf800)/3)   & 0xf800 ) |
           ( ((a&0x07e0)/3   + (b&0x07e0)*2/3) & 0x07e0 ) |
             (b&0x001f);
  *out++ = (b & 0xf800) |
           ( ((b&0x07e0)*2/3 + (c&0x07e0)/3)   & 0x07e0 ) |
           ( ((b&0x001f)/3   + (c&0x001f)*2/3) & 0x001f );
  *out++ = c;
}

static inline void F(uint16_t *out, uint16_t z, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = (z & 0xf800) |
           (z & 0x07e0) |
           ( ((z&0x001f)*2/3 + (a&0x001f)/3)   & 0x001f );
  *out++ = ( ((z&0xf800)/3   + (a&0xf800)*2/3) & 0xf800 ) |
             (a&0x07e0) |
             (a&0x001f);
  *out++ = ( ((a&0xf800)*2/3 + (b&0xf800)/3)   & 0xf800 ) |
           ( ((a&0x07e0)/3   + (b&0x07e0)*2/3) & 0x07e0 ) |
             (b&0x001f);
  *out++ = (b & 0xf800) |
           ( ((b&0x07e0)*2/3 + (c&0x07e0)/3)   & 0x07e0 ) |
           ( ((b&0x001f)/3   + (c&0x001f)*2/3) & 0x001f );
  *out++ = c;
}

...
static inline void gba_upscale_subpixel(uint16_t *to, uint16_t *from,
   uint32_t src_x, uint32_t src_y, uint32_t src_pitch, uint32_t dst_pitch)
{

...
 for (y = 0; y < src_y/2; y++) {
  src = from;
  dst = to;
  for (x = 0; x < src_x/3; x++) {
   a = bgr555_to_rgb565_16(src[0]);
   b = bgr555_to_rgb565_16(src[1]);
   c = bgr555_to_rgb565_16(src[2]);
   d = bgr555_to_rgb565_16(src[sp]);
   e = bgr555_to_rgb565_16(src[sp+1]);
   f = bgr555_to_rgb565_16(src[sp+2]);

   if (x == 0) {
    F0(dst, a, b, c);
    F0(&dst[dp], Z(a,d), Z(b,e), Z(c,f));
    F0(&dst[dp*2], d, e, f);
   } else {
    F(dst-1, dst[-1], a, b, c);
    F(&dst[dp-1], dst[dp-1], Z(a,d), Z(b,e), Z(c,f));
    F(&dst[dp*2-1], dst[dp*2-1], d, e, f);
   }
...

There's still that noticeable pattern unfortunately, but I think it looks better. :/

Edit: I've been playing with the scaler for a while. Trying to eliminate the pattern directly seems to restore the fringe (which makes sense since the whole point of sub-pixel rendering is to add a fringe that blends in, but it obviously doesn't do well with large solid blocks of the same color). So, I decided to use a PocketNES trick and do dual-frame blending. It gets rid of the pattern, but it does cause more of a visible fringe I think. :/ Feel free to toy with this.

Code: [Select]

static inline uint16_t bgr555_to_rgb565_16(uint16_t px)
{
 return ((px & 0x7c00) >> 10)
   | ((px & 0x03e0) << 1)
   | ((px & 0x001f) << 11);
}

/* Tries to keep the least significant bits when averaging to avoid rippling */
#define Z0_(A,B) (((((A) >> 1) & 0x7bef) + (((B) >> 1) & 0x7bef)))
#define Z0(A,B) (Z0_(A,B) | ((Z0_(A,B)>>3) & 0x1803) | ((Z0_(A,B)>>4) & 0x60))

/* Directly uses A if A == B or mix the two. */
#define Z(A,B) ((A == B) ? A : Z0(A,B))

/* These macros mix the RGB components of A and B. Each digit in the name is
 *  the number of thirds taken from A for red, green and blue; the rest comes
 *  from B.  So,  M210 == 2/3 red A + 1/3 red B, 1/3 green A + 2/3 green B, 3/3 blue B
 *  Note:  M233 and M123 are special, only using A as they're used on the left border of the
 *  screen and hence there's nothing else to mix with.
 */

#define M333(A,B) (A)
#define M233(A) ( ((((A)&0xF800)*2/3) & 0xF800) | ((A) & 0x07FF) )
#define M123(A) ( ((((A)&0xF800)/3) & 0xF800) | \
                    ((((A)&0x07E0)*2/3) & 0x07E0) | \
                     ((A) & 0x001F) )
#define M210(A,B) ( ((((A)&0xF800)*2/3 + ((B)&0xF800)/3) & 0xF800) | \
                    ((((A)&0x07E0)/3 + ((B)&0x07E0)*2/3) & 0x07E0) | \
                    ((B) & 0x001F) )
#define M321(A,B) ( ((A)&0xF800) | \
                    ((((A)&0x07E0)*2/3 + ((B)&0x07E0)/3) & 0x07E0) | \
                    ((((A)&0x001F)/3 + ((B)&0x001F)*2/3) & 0x001F) )
#define M332(A,B) ( ((A) & 0xFFE0) | \
                    ((((A)&0x001F)*2/3 + ((B)&0x001F)/3) & 0x001F) )
#define M100(A,B) ( ((((A)&0xF800)/3 + ((B)&0xF800)*2/3) & 0xF800) | \
                    ((B) & 0x07FF) )

static inline void F0(uint16_t *out, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = M233(a);
  *out++ = M210(a,b);
  *out++ = M321(b,c);
  *out++ = c;
}

static inline void F1(uint16_t *out, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = M123(a);
  *out++ = M321(b,c);
  *out++ = M332(b,c);
  *out++ = c;
}

static inline void F2(uint16_t *out, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = M321(a,b);
  *out++ = M332(b,c);
  *out++ = M100(b,c);
  *out++ = c;
}

static inline void F_0(uint16_t *out, uint16_t z, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = M332(z,a);
  *out++ = M100(z,a);
  *out++ = M210(a,b);
  *out++ = M321(b,c);
  *out++ = c;
}

static inline void F_1(uint16_t *out, uint16_t y, uint16_t z, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = M100(y,z);
  *out++ = M210(z,a);
  *out++ = M321(a,b);
  *out++ = M332(b,c);
  *out++ = c;
}

static inline void F_2(uint16_t *out, uint16_t z, uint16_t a, uint16_t b, uint16_t c)
{
  *out++ = M210(z,a);
  *out++ = M321(a,b);
  *out++ = M332(b,c);
  *out++ = M100(b,c);
  *out++ = c;
}

static inline void gba_upscale_subpixel(uint16_t *to, uint16_t *from,
   uint32_t src_x, uint32_t src_y, uint32_t src_pitch, uint32_t dst_pitch)
{
 /* Before:
  * RRRR   RRRRrr   rrrrrr   RRRRRR   RR
  *  GGGGGG   GGgggg   ggggGG   GGGGGG
  *BB   BBBBBB   bbbbbb   bbBBBB   BBBB
  * After (merges r/R, g/G, b/B groups into four pixels)
  * RR GG BB rr gg bb RR GG BB rr gg bb 
  * RR GG BB rr gg bb RR GG BB rr gg bb 
  * RR GG BB rr gg bb RR GG BB rr gg bb 
   */
 const uint32_t dst_x = src_x * 4 / 3;
 uint16_t a, b, c, d, e, f;
 uint16_t *src, *dst;

 uint32_t x, y;

 static int frame = 0;

 frame = (frame + 1) % 3;

 const uint32_t sp = src_pitch / sizeof(uint16_t), dp = dst_pitch / sizeof(uint16_t);

 for (y = 0; y < src_y/2; y++) {
  src = from;
  dst = to;
  if (frame == 0) {
   for (x = 0; x < src_x/3; x++) {
    a = bgr555_to_rgb565_16(src[0]);
    b = bgr555_to_rgb565_16(src[1]);
    c = bgr555_to_rgb565_16(src[2]);
    d = bgr555_to_rgb565_16(src[sp]);
    e = bgr555_to_rgb565_16(src[sp+1]);
    f = bgr555_to_rgb565_16(src[sp+2]);

    if (x == 0) {
     F0(dst, a, b, c);
     F0(&dst[dp*2], d, e, f);
     dst[dp]   = Z(dst[0],dst[dp*2]);
     dst[dp+1] = Z(dst[1],dst[dp*2+1]);
     dst[dp+2] = Z(dst[2],dst[dp*2+2]);
     dst[dp+3] = Z(dst[3],dst[dp*2+3]);
    } else {
     F_0(dst-1, dst[-1], a, b, c);
     F_0(&dst[dp*2-1], dst[dp*2-1], d, e, f);
     dst[dp-1] = Z(dst[-1],dst[dp*2-1]);
     dst[dp]   = Z(dst[0],dst[dp*2]);
     dst[dp+1] = Z(dst[1],dst[dp*2+1]);
     dst[dp+2] = Z(dst[2],dst[dp*2+2]);
     dst[dp+3] = Z(dst[3],dst[dp*2+3]);
    }

    src += 3;
    dst += 4;
   }
  } else if (frame == 1) {
   for (x = 0; x < src_x/3; x++) {
    a = bgr555_to_rgb565_16(src[0]);
    b = bgr555_to_rgb565_16(src[1]);
    c = bgr555_to_rgb565_16(src[2]);
    d = bgr555_to_rgb565_16(src[sp]);
    e = bgr555_to_rgb565_16(src[sp+1]);
    f = bgr555_to_rgb565_16(src[sp+2]);

    if (x == 0) {
     F1(dst, a, b, c);
//     F1(&dst[dp], Z0(a,d), Z0(b, e), Z0(c, f));
     F1(&dst[dp*2], d, e, f);
     dst[dp]   = Z(dst[0],dst[dp*2]);
     dst[dp+1] = Z(dst[1],dst[dp*2+1]);
     dst[dp+2] = Z(dst[2],dst[dp*2+2]);
     dst[dp+3] = Z(dst[3],dst[dp*2+3]);
    } else {
     F_1(dst-1, dst[-2], dst[-1], a, b, c);
//     F_1(&dst[dp-1], dst[dp-2], dst[dp-1], Z0(a,d), Z0(b,e), Z0(c,f));
     F_1(&dst[dp*2-1], dst[dp*2-2], dst[dp*2-1], d, e, f);
     dst[dp-1] = Z(dst[-1],dst[dp*2-1]);
     dst[dp]   = Z(dst[0],dst[dp*2]);
     dst[dp+1] = Z(dst[1],dst[dp*2+1]);
     dst[dp+2] = Z(dst[2],dst[dp*2+2]);
     dst[dp+3] = Z(dst[3],dst[dp*2+3]);
    }

    src += 3;
    dst += 4;
   }
  } else {
   for (x = 0; x < src_x/3; x++) {
    a = bgr555_to_rgb565_16(src[0]);
    b = bgr555_to_rgb565_16(src[1]);
    c = bgr555_to_rgb565_16(src[2]);
    d = bgr555_to_rgb565_16(src[sp]);
    e = bgr555_to_rgb565_16(src[sp+1]);
    f = bgr555_to_rgb565_16(src[sp+2]);

    if (x == 0) {
     F2(dst, a, b, c);
//     F1(&dst[dp], Z0(a,d), Z0(b, e), Z0(c, f));
     F2(&dst[dp*2], d, e, f);
     dst[dp]   = Z(dst[0],dst[dp*2]);
     dst[dp+1] = Z(dst[1],dst[dp*2+1]);
     dst[dp+2] = Z(dst[2],dst[dp*2+2]);
     dst[dp+3] = Z(dst[3],dst[dp*2+3]);
    } else {
     F_2(dst-1, dst[-1], a, b, c);
//     F_1(&dst[dp-1], dst[dp-2], dst[dp-1], Z0(a,d), Z0(b,e), Z0(c,f));
     F_2(&dst[dp*2-1], dst[dp*2-1], d, e, f);
     dst[dp-1] = Z(dst[-1],dst[dp*2-1]);
     dst[dp]   = Z(dst[0],dst[dp*2]);
     dst[dp+1] = Z(dst[1],dst[dp*2+1]);
     dst[dp+2] = Z(dst[2],dst[dp*2+2]);
     dst[dp+3] = Z(dst[3],dst[dp*2+3]);
    }

    src += 3;
    dst += 4;
   }
  }
  from = (uint16_t *) (((uint8_t *) from) + src_pitch * 2);
  to   = (uint16_t *) (((uint8_t *) to  ) + dst_pitch * 3);
 }
}
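For anyone toying with this, a quick check of what the Z0 macros above actually do to a pixel may help. The two macros are copied from the code; the wrapper functions `example_plain_average` and `example_z0` are hypothetical names added only so the results can be inspected.

```c
#include <stdint.h>

/* Copied from the code above. */
#define Z0_(A,B) (((((A) >> 1) & 0x7bef) + (((B) >> 1) & 0x7bef)))
#define Z0(A,B) (Z0_(A,B) | ((Z0_(A,B)>>3) & 0x1803) | ((Z0_(A,B)>>4) & 0x60))

/* Hypothetical wrapper: the plain per-channel average, which drops the
 * lowest bit of each channel before adding. */
static uint16_t example_plain_average(uint16_t a, uint16_t b)
{
    return (uint16_t) Z0_(a, b);
}

/* Hypothetical wrapper: the average with the top two bits of each channel
 * ORed into that channel's bottom two bits. */
static uint16_t example_z0(uint16_t a, uint16_t b)
{
    return (uint16_t) Z0(a, b);
}
```

Averaging white with white the plain way gives 0xF7DE, one step short of white in every channel, while Z0 gives 0xFFFF back. Because the correction is an OR, it can only brighten: two identical mid-gray pixels (0x8410) come out as 0x9452. That is presumably why Z returns A directly when A == B.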

Nebuleon · « **Reply #173 on:** December 01, 2013, 07:55:46 am »

ReGBA version 1.45.2, the Scalers Galore Edition, is now available.

Download: https://dl.dropboxusercontent.com/u/106475413/gcw-zero/regba-1.45.2.opk

In this release, software bilinear scaling is available, and subpixel-aware scaling is now a special case of software bilinear scaling and is much less rainbow-fringey.
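One way to read "subpixel-aware scaling as a special case of bilinear scaling" is that each color channel is interpolated at the position of its own subpixel instead of at the pixel center. The sketch below shows that idea for a single row and a 3:4 horizontal ratio. It is not ReGBA's actual scaler: the function name is hypothetical, it works on 8-bit R, G, B triples rather than RGB 565, and it assumes an RGB-striped panel with the red subpixel one third of a pixel left of center and blue one third right.

```c
#include <stdint.h>

/* Hypothetical sketch: resample a row of src_w pixels to src_w * 4 / 3
 * pixels, interpolating each channel linearly at its subpixel's centre.
 * src and dst hold 8-bit R, G, B triples. */
static void example_subpixel_linear_row(uint8_t *dst, const uint8_t *src,
    int src_w)
{
    int dst_w = src_w * 4 / 3;
    for (int j = 0; j < dst_w; j++) {
        for (int k = 0; k < 3; k++) {   /* 0 = R, 1 = G, 2 = B */
            /* Subpixel centre measured from the centre of source pixel 0,
             * in eighths of a source pixel:
             * ((j + 1/2 + (k - 1)/3) * 3/4 - 1/2) * 8 = 6j + 2k - 3 */
            int n = 6 * j + 2 * k - 3;
            int i = 0, w = 0;
            if (n > 0) {
                i = n / 8;              /* left neighbour */
                w = n % 8;              /* weight of right neighbour, /8 */
            }
            int i1 = (i + 1 < src_w) ? i + 1 : src_w - 1; /* clamp at edge */
            dst[j * 3 + k] = (uint8_t) ((src[i * 3 + k] * (8 - w)
                                       + src[i1 * 3 + k] * w + 4) / 8);
        }
    }
}
```

A solid color passes through unchanged, so large flat areas get no fringe at all. A single white pixel between two black ones becomes (0,0,32), (96,159,223), (223,159,96), (32,0,0): the channels still differ within each output pixel, which is the subpixel positioning, but the steps are gradual instead of pure red, yellow or cyan.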

Awakened · « **Reply #174 on:** December 01, 2013, 08:46:41 am »

I think Aspect, bilinear is my favorite. Very consistent looking compared to fast and without the slight rainbow effect of subpixel. The blur you get doesn't look bad at all on such a low res screen either. Thanks! I really like having all the options if my mood changes.

It'll be awesome if other emulators implement that type of scaling too.

Surkow · « **Reply #175 on:** December 01, 2013, 12:57:36 pm »

Quote from: Awakened on December 01, 2013, 08:46:41 am

I think Aspect, bilinear is my favorite. Very consistent looking compared to fast and without the slight rainbow effect of subpixel. The blur you get doesn't look bad at all on such a low res screen either. Thanks! I really like having all the options if my mood changes.

It'll be awesome if other emulators implement that type of scaling too.

In the future we'll be able to use the IPU for bicubic and bilinear scaling. This means no overhead for the CPU and programs won't have to implement scaling in software.

BlockABoots · « **Reply #176 on:** December 01, 2013, 06:21:48 pm »

Hmm, in this release a few games don't appear to work and just show a white screen after the boot logo. The games I've tried are:

F-Zero
F-Zero: GP Legend
Double Dragon Advance

any ideas?

hi-ban · « **Reply #177 on:** December 01, 2013, 06:33:22 pm »

They work for me. Which BIOS are you using? You might have to try a different one...

Gab1975 · « **Reply #178 on:** December 01, 2013, 07:31:13 pm »

They work for me too... (I use the original GBA BIOS)

Awakened · « **Reply #179 on:** December 01, 2013, 07:55:55 pm »

Quote from: Surkow on December 01, 2013, 12:57:36 pm

In the future we'll be able to use the IPU for bicubic and bilinear scaling. This means no overhead for the CPU and programs won't have to implement scaling in software.

I think I saw someone mention that in #gcw. That's gonna be a really nice feature.