I Restored the RPG Maker 2000 RTP Sound Effects with UniverSR

I grew up with RPG Maker 2000. Today, let's see if the classic RTP sound effects can be restored with modern audio super-resolution techniques.

RPG Maker 2000 (RM2k) is a game development tool released in 2000. It came with a set of pre-made assets, including sound effects, that many indie game developers used in their games. These sound effects are iconic and nostalgic for many people who grew up playing RM2k games, but they are low quality by today's standards.

Could we restore them using super-resolution techniques? Super-resolution is basically image upscaling, but for sound.

320x240 reference for low-detail analogy

Original

High quality reference for restored-detail analogy

Super resolution

Why is this difficult? A tiny image can be made larger in Paint, but that does not magically add real detail. It just makes the image blurry and pixelated. The same is true for audio: simply converting it to a larger format does not restore what was missing. The goal is to add the missing detail back.
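To make the "bigger format, same detail" point concrete, here is a minimal sketch. It assumes NumPy (my choice, not something the restoration itself depends on) and uses one second of synthetic noise as a stand-in for a real 22.05 kHz sound. It resamples the signal to 44.1 kHz the plain way and measures how much energy lands in the newly available frequency range.

```python
import numpy as np

low_rate, high_rate = 22050, 44100
rng = np.random.default_rng(0)
# Stand-in signal (not an RTP file): 1 s of noise, which has energy
# at every frequency a 22.05 kHz file can hold.
low = rng.standard_normal(low_rate)

# "Save as a bigger format": band-limited resampling to twice the sample rate.
# irfft zero-fills the spectrum bins that the 22.05 kHz signal never had.
high = np.fft.irfft(np.fft.rfft(low), n=2 * len(low)) * 2

spectrum = np.abs(np.fft.rfft(high)) ** 2
freqs = np.fft.rfftfreq(len(high), d=1 / high_rate)
new_band = freqs > low_rate / 2  # the range only the larger file can hold
share = spectrum[new_band].sum() / spectrum.sum()
print(f"Energy above {low_rate / 2:.0f} Hz: {share:.2e} of the total")
```

The file is twice as large, but the share of energy in the new band is zero up to rounding error. A super-resolution model has to generate that content instead.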

Let's take an example. Here is the original "Sword2.wav" effect from the RM2k RTP.

Original Sword2 spectrogram

The image above is a spectrogram: it displays the frequency content of the sound over time. Brighter areas are frequencies and moments where the sound is louder.

The audio sounds like it comes from a 1980s cassette of a karate movie. The original is a .WAV file, but it is encoded at 22.05 kHz and 16-bit to save disk space. That is why the frequency scale only goes up to about 10 kHz: a 22.05 kHz file cannot hold anything above 11.025 kHz, half its sample rate. Humans hear up to around 20 kHz, so we're missing a lot of detail.
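If you want to check those numbers on your own files, here is a small sketch using only Python's standard `wave` module. The helper name `describe_wav` is mine, and the file is a stand-in tone built in memory with the same format, not the actual Sword2.wav.

```python
import io
import math
import struct
import wave

def describe_wav(file_obj):
    """Return (sample_rate_hz, bit_depth, nyquist_hz) read from a WAV header."""
    with wave.open(file_obj, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8
    # The highest frequency a file can represent is half its sample rate.
    return sample_rate, bit_depth, sample_rate / 2

# Build a stand-in file in memory: 0.1 s of a 440 Hz tone, mono, 16-bit, 22.05 kHz.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)      # 2 bytes per sample = 16-bit
    wav.setframerate(22050)
    samples = [int(12000 * math.sin(2 * math.pi * 440 * n / 22050)) for n in range(2205)]
    wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
buffer.seek(0)

sample_rate, bit_depth, nyquist = describe_wav(buffer)
print(f"{sample_rate} Hz, {bit_depth}-bit, highest representable frequency: {nyquist} Hz")
```

To inspect a real file, pass its path to `describe_wav` instead of the in-memory buffer.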

Now, here is the restored version, processed with UniverSR on Neural Analog:

Restored Sword2 spectrogram

Look at the spectrogram: the scale now goes up to 24 kHz. The new audio is 48 kHz, 24-bit. On paper, that is better than CD quality (44.1 kHz, 16-bit)! When you listen to it, you hear a brighter sound.

The sound is not just louder, equalized differently, or run through a reverb effect. The spectrogram now shows plausible content in the previously missing frequencies, consistent with the original, as if we had asked someone to "paint" the details back in.

This is basically audio restoration. And I did that for all of the RTP sound effects.

Download all of the restored RPG Maker 2000 RTP sound effects

I processed the 203 sound effects from the RPG Maker 2000 RTP to turn them into high-quality 48 kHz versions.

Download all of the restored sounds

More examples

Jump1.wav

Original RTP

Restored

Absorption1.wav

Original RTP

Restored

Bite.wav

Original RTP

Restored

Bell.wav

Original RTP

Restored

Explosion2.wav

Original RTP

Restored

Sheep.wav

Original RTP

Restored

How does it work? The UniverSR restoration model

How did I do it? I didn't find the original high-quality versions of these files. I didn't record them from scratch. I didn't look through sound banks. I used a machine learning model.

UniverSR is an audio super resolution model developed by the University of Seoul. It takes low quality audio as input, and reconstructs a higher quality version of the same sound.

To explain the model simply, the audio is transformed into an image (the spectrogram) thanks to STFT (short-time Fourier transform). Then, the model reconstructs frequencies using flow matching, a popular technique in image generation behind the super realistic image models like Nano Banana or Midjourney. Finally, the spectrogram is transformed back into audio.
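The first and last steps of that pipeline can be sketched in a few lines. This is a minimal illustration assuming NumPy, with a hand-rolled STFT and a frame size I picked arbitrarily. It is not UniverSR's code, and the model step in the middle is left as a no-op.

```python
import numpy as np

def stft(x, n_fft=512):
    """Audio -> complex spectrogram, one row per time frame (50% overlap)."""
    hop = n_fft // 2
    # Periodic Hann window: copies shifted by one hop sum to exactly 1.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n_fft) / n_fft)
    padded = np.pad(x, (hop, hop + (-len(x)) % hop))
    n_frames = (len(padded) - n_fft) // hop + 1
    frames = np.stack([padded[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, length, n_fft=512):
    """Complex spectrogram -> audio, by overlap-adding the inverse FFT frames."""
    hop = n_fft // 2
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + n_fft] += frame
    return out[hop : hop + length]  # drop the padding added in stft()

rng = np.random.default_rng(0)
audio = rng.standard_normal(22050)       # stand-in for 1 s of audio
spec = stft(audio)                       # the "image" a model would work on
# A super-resolution model would fill in new high-frequency content here.
restored = istft(spec, len(audio))
print(spec.shape[1], np.allclose(audio, restored))
```

With nothing changed in between, the round trip returns the original samples. Whatever a model adds to the spectrogram is therefore what you hear as new detail.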

UniverSR was trained on a huge variety of copyright-free audio datasets (speech, music, and sound effects). These recordings are downsampled to create training pairs. The model looks at the lower-resolution version and infers what a believable higher-resolution version should sound like.
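Here is a rough sketch of how one such training pair could be built, assuming NumPy. The function name `make_training_pair` and the FFT-based downsampling are my own illustration, not the actual UniverSR data pipeline.

```python
import numpy as np

def make_training_pair(high, factor):
    """Return (low, high): `low` is `high` band-limited and decimated by `factor`."""
    if len(high) % factor:
        raise ValueError("signal length must be a multiple of factor")
    n_low = len(high) // factor
    spectrum = np.fft.rfft(high)
    # Keep only the bins at or below the new Nyquist frequency, then rescale
    # because the inverse transform is shorter than the forward one.
    low = np.fft.irfft(spectrum[: n_low // 2 + 1], n=n_low) / factor
    return low, high

sr_high, factor = 48000, 3               # 48 kHz target, 16 kHz input
t = np.arange(sr_high) / sr_high
# A 1 kHz tone survives a 16 kHz sample rate; the 10 kHz tone does not.
high = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 10000 * t)
low, target = make_training_pair(high, factor)

t_low = np.arange(len(low)) * factor / sr_high
print(np.allclose(low, np.sin(2 * np.pi * 1000 * t_low), atol=1e-8))
```

The model sees `low` as input and is trained to produce something close to `target`, so it has to learn what the discarded 10 kHz content usually looks like.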

The model learns to "paint in" the missing details in a way that is consistent with the original sound. Expect it to work better on sounds similar to its training data.

How to run UniverSR?

UniverSR is open source, but running it requires some technical knowledge and a beefy GPU. I was able to run it on Neural Analog. I batch-imported the 203 RTP sounds. Then, I selected them all and ran the UniverSR-based restoration in one click (config: universr-audio model, mono, 4khz to 24khz preset).

The processing took about 20 minutes for the whole pack. This model is not the fastest, but the results were better than the other options I tried (AudioSR).

I downloaded the files in a zip. Here is the link again if you want to check them out:

Download the restored RPGRT archive

If you are participating in a game jam or just poking around in old RM2k nostalgia, I hope these are useful.

What other audio can be restored?

As time passes and our digital footprints get bigger, we are accumulating more and more low-quality audio: voice memos, social media videos, music, field recordings, and more.

Some of this audio is irreplaceable, and deserves to be heard in the best possible quality. Audio super-resolution models like UniverSR can help us restore and preserve these sounds, giving them new life and making them more enjoyable to listen to on modern equipment.