Aaru: A vision of the past, present and future

The paleolithic: The FOBOS API

When I was young, I studied the filesystem APIs for OS/2 and Windows NT, and decided I would create an interface that would allow me to write filesystems that worked on both of them.

The design of the interface went well, especially thanks to the similarities between their filesystem APIs (called Installable File System APIs).

In no time I had implemented a parser for the Master Boot Record and the Apple Partition Map.
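
To give an idea of what such a parser deals with, here is a minimal sketch in Python (not the original FOBOS code, which was C). It relies on the classic MBR layout: four 16-byte entries starting at offset 446, followed by the 0x55 0xAA signature at offset 510. The names `MbrPartition` and `parse_mbr` are made up for this illustration.

```python
import struct
from typing import NamedTuple

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # the partition table starts after the boot code
ENTRY_SIZE = 16
SIGNATURE_OFFSET = 510  # boot signature bytes 0x55 0xAA


class MbrPartition(NamedTuple):
    """Hypothetical container for the fields this sketch extracts."""
    bootable: bool
    type_id: int
    first_lba: int
    sector_count: int


def parse_mbr(sector: bytes) -> list[MbrPartition]:
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("an MBR needs at least 512 bytes")
    if sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")

    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        # Entry layout: status, CHS start (3 bytes), type,
        # CHS end (3 bytes), first LBA, sector count (little-endian).
        status, _chs_first, type_id, _chs_last, first_lba, count = (
            struct.unpack_from("<B3sB3sII", sector, start))
        if type_id != 0:  # type 0x00 marks an unused slot
            partitions.append(
                MbrPartition(status == 0x80, type_id, first_lba, count))
    return partitions
```

Extended partitions and the Apple Partition Map need more work than this; the sketch only covers the four primary entries.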

However, a problem soon arose: I needed to dynamically load filesystem drivers from DLLs (libraries), which is basically a plugin system, and I didn’t know how to do that in C. I lacked the expertise.

So I left the project aside for a very long time (about a decade).

The bronze age: FileSystemIDAndChk

At the start of the century a new language came around, C#. It offered an object-oriented syntax closer to C than what C++ offered. I liked it.

It was not very good at the start, in the good ol’ hateful .NET Framework 1.0 days.

And it was Windows-only, and by then I was running away from Windows like the plague.

But then came Mono, compatible with .NET Framework 2.0, on Linux, with all the goodness of C#. For me it was like discovering metal forging. I could do C, with objects! Just amazing!

At the end of the first decade of the century a necessity arose. I had been preserving operating systems, and not all of them came from the most reliable places. If you know what I mean.

So I needed a way to separate the wheat from the chaff.

So I started porting the partition scheme parsers from FOBOS to C# in a new project called FileSystemIDAndChk.

The whole goal of the project was to be able to easily tell a modified image from an unmodified one. Microsoft used CDIMAGE to create Compact Discs, not Nero. An Apple disc that shows a last write time a decade later than the publication date of the software it contains has been mounted read-write, and is therefore modified. Etc.

In no time I had to add more filesystems, image formats (Nero, Alcohol, etc.), and filters (for files that came compressed with GZip, BZip2, etc., or encoded with MacBinary et al.).

And it grew and grew and grew until…
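
The idea of a filter can be sketched like this in Python (an illustration only, not how FileSystemIDAndChk implemented it in C#): look at the first bytes of the file, and if they match a known compression signature, hand back a stream that decompresses transparently. `open_filtered` is a hypothetical name, and MacBinary and the other encodings are left out.

```python
import bz2
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
BZIP2_MAGIC = b"BZh"      # first three bytes of every bzip2 stream


def open_filtered(data: bytes) -> io.BufferedIOBase:
    """Return a readable stream over the payload, decompressing if needed."""
    raw = io.BytesIO(data)
    if data.startswith(GZIP_MAGIC):
        return gzip.GzipFile(fileobj=raw)
    if data.startswith(BZIP2_MAGIC):
        return bz2.BZ2File(raw)
    return raw  # no known filter: hand back the bytes untouched
```

Whatever reads the image afterwards only sees a stream of bytes, and does not need to care whether the file on disk was compressed.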

The iron age: DiscImageChef

…it had grown beyond the simple ability to identify filesystems.

It was able to do all kinds of common operations with disc (and disk) images.

It could analyze them and identify their contents. It could hash and checksum them. It could print sectors on screen. And many other operations.

But it could not create images from the physical media. I was relying on other software, most of which would run only on Windows, and I was fully on Mac OS X at the time, with Mono.

At the same time I was reverse engineering the Lisa filesystem.

This added two unexpected features to the project:

  • Creation of images from real media
  • Complete read support for filesystems: list, access, and extract files from images

The first thing I added was the simpler filesystems: UCSD Pascal, MFS, CP/M, and others, setting down the interfaces that would later be used by my Lisa filesystem implementation.

Then I added support for dumping media. I started with Linux, but I wanted it to also work on Windows. And obviously on Mac OS X, as it was my primary platform.

Soon I learnt that Mac OS X doesn’t allow user applications to send commands to drives, unless the drive is a recognized CD/DVD/BD recorder. So don’t come asking for the feature, it’s not gonna happen, ask Apple first, sorry.

Classical age: DiscImageChef dumping, deep analysis for image formats

I had it dumping images from all kinds of ATA and SCSI devices. Tapes, CDs, DVDs, USB floppies, hard disks, etc. In no time.

At first I was creating only raw images. That is, the user data of each sector, one after the other. What you call ISO, BIN, IMG, DSK, etc.

And I added support for dumping in FreeBSD. Just out of boredom.
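
Because a raw image is nothing more than sectors laid end to end, finding a sector is plain arithmetic: its byte offset is the sector number multiplied by the sector size. A minimal Python sketch (`read_sector` is a made-up helper, not part of DiscImageChef; the sector size is typically 512 bytes for hard disks and floppies, and 2048 for the user data of a data CD):

```python
import io
from typing import BinaryIO


def read_sector(image: BinaryIO, lba: int, sector_size: int = 512) -> bytes:
    """Read one sector of user data from a raw image by its block address."""
    if lba < 0:
        raise ValueError("a raw image has no room for negative sectors")
    image.seek(lba * sector_size)  # offset = sector number * sector size
    data = image.read(sector_size)
    if len(data) != sector_size:
        raise EOFError(f"sector {lba} is past the end of the image")
    return data


# Usage: a tiny in-memory "image" with two sectors filled with 0x00 and 0x01.
image = io.BytesIO(b"\x00" * 512 + b"\x01" * 512)
second = read_sector(image, 1)
```

This simplicity is also the limitation: there is no place in such a file for anything that is not sector user data.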

But this felt incomplete.

So, I expanded the interfaces, so media images would not only be readable, but also writable.

Then I took the most common, and best known, image formats: Alcohol, CloneCD, CDRWin, QEMU Copy On Write, VirtualBox, VirtualPC, etc.

And added support to create them all from real media, and to convert images from one format to another.

This allowed me to learn a lot that would influence a decision later on.

Renaissance: DiscImageChef becomes public knowledge, DicFormat is born

I was doing all my work out of passion, in the dark, not really looking for users.

But as part of my philosophy, software for preservation must be public. So all of this was on GitHub, open source.

And people found it. It became public.

I was invited to join some videogame preservation groups. Got into contact with some museums.

A lot of people, quite a bit of stress. But all looked brilliant, except for one thing.

Image formats…

Disc/disk/tape image formats were not designed for archival, for preservation.

They were designed for piracy (Alcohol) or for emulation (QEMU CoW).

All of them had problems, big problems:

  • Most of them were proprietary secrets, so compatibility was not really possible. That’s why there are many more readable formats than writable ones implemented. It would take an insurmountable amount of effort to reverse engineer them to the level of being able to write them.
  • They do not store all the data that a disk contains, which may, or may not, be needed for preservation. It is not my call to decide if a museum wants this or that of a disc. I should be able to offer the end user the ability to knowingly lose this information.

These problems created a new necessity: an open source, documented, known format, able to store everything a disc, disk, tape, or any other media has on it, and expandable to things we don’t know, things that don’t yet exist, etc.

So DicFormat was born.

I designed that format to be able to store all kinds of data a drive returns from the medium it has inserted (or that it is part of, in the case of non-removable media).

A format designed for the proper preservation, and archival, of any kind of media, past, present, or future. And expandable. And open source. Though I’m not so good yet at documenting it.

Baroque: Rename to Aaru. The limitations start to arise

After all these years of evolution, it was no longer just something to handle disc images. That name no longer made sense.

So I looked for another name, one that would fit the project better.

Aaru, the ancient Egyptian word for the afterworld. Where your soul is reborn into paradisiacal lands. (And DicFormat became AaruFormat.)

This was the software to take media and preserve it, so it can be reborn into paradise. Or an emulator. Or a museum study.

Archeology of computer storage media.

It fits.

But then the limitations started to appear.

The whole of it was designed around what already existed. Around tracks, because all other CD images used tracks. Around LBAs, because all other disk images used LBA.

But this is not how media works.

A CD can have negative sectors. No image had them. Therefore AaruFormat didn’t support them.
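
One place where negative sectors show up is the standard mapping between a CD’s minute:second:frame (MSF) addresses and LBAs: LBA 0 sits at 00:02:00, so the two seconds before it get negative numbers. A small Python illustration (a sketch of the arithmetic only, not code from Aaru; the function name is made up):

```python
FRAMES_PER_SECOND = 75  # CD sectors ("frames") per second of playing time
PREGAP_FRAMES = 150     # the two seconds that come before LBA 0


def msf_to_lba(minute: int, second: int, frame: int) -> int:
    """Convert a CD minute:second:frame address to a logical block address."""
    return (minute * 60 + second) * FRAMES_PER_SECOND + frame - PREGAP_FRAMES


first_data_sector = msf_to_lba(0, 2, 0)  # 0
start_of_pregap = msf_to_lba(0, 0, 0)    # -150
```

An image format that only counts sectors from 0 upwards has nowhere to put those 150 sectors.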

And tracks are a logical construct that copy protections usually violate. Again, AaruFormat depends on tracks.

People want to archive things now. Big disks. AaruFormat was not designed to store 1 TB of data.

Floppies, old hard disks, they do not use LBAs.
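
They are addressed by cylinder, head, and sector (C/H/S). Flattening that into LBAs uses the well-known formula LBA = (C × heads + H) × sectors per track + (S − 1), which only works when the geometry is known and uniform across the whole medium. A Python sketch under that assumption (`chs_to_lba` is a hypothetical helper, not Aaru code):

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads: int, sectors_per_track: int) -> int:
    """Map a C/H/S address to an LBA, assuming a uniform geometry."""
    if not 0 <= head < heads:
        raise ValueError("head out of range for this geometry")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sectors are numbered from 1 within a track")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# Usage: a 1.44 MB floppy has 80 cylinders, 2 heads, 18 sectors per track.
last_sector = chs_to_lba(79, 1, 18, heads=2, sectors_per_track=18)
```

Once only the LBA is stored, the original geometry is gone unless the image keeps it somewhere else, and media whose tracks do not all have the same number of sectors do not fit the formula at all.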

This is my fault. I know. I didn’t do a good enough job at the time.

Industrial revolution: Errors found, the infrastructure needs to be destroyed. Aaru 5.3 LTS

I’ve studied the problems. And I have the solutions.

But they require tearing out a lot.

New buildings require old buildings to be demolished.

But there are still people who use, and need to use, Aaru, even with its current limitations.

Here comes: Aaru 5.3 Long Term Support

Yes, while there are known limitations, what Aaru does now, it does well. And what needs to be fixed will be fixed for the next version, 5.3.

And this will be the last of an era.

I’m currently fixing all bugs I can. I’m testing all image formats, creation, conversion, etc.

All that is currently implemented is checked, twice, thrice, and more.

And when all is working well I will publish 5.3.

And it will be a Long Term Support release because, I guarantee you, if you find any critical bug that does not require the new infrastructure to fix, I will do a point fix release, and the images will still be archival quality.

Coming into the modern age: The road to Aaru 6.0

Aaru 6.0 will have a long (undetermined) development time.

AaruFormat needs to be changed from the ground up. V2 is designed, but it needs to be implemented.

All plugins, interfaces, filesystems, and images need to change to support how real media is, not how all formats thought it was.

And even if I get full funding (which I have not yet gotten, so this is as good a time as any to go to my Patreon) it would not come soon.

So while I work on Aaru 6.0, and make breaking changes that might TEMPORARILY affect the quality of dumps, Aaru 5.3 will still be there and supported with fixes.

Nonetheless, after each major change of Aaru 6.0 I will publish beta releases. However, use them at your own risk: unless I’ve specifically said “doing X the image will be good”, assume it will not, and use Aaru 5.3.

The Future

  • AaruFormat V2 with support for negative sectors, C/H/S addressable media, linearly (byte) addressable media, a portable C library usable by emulators (e.g. MAME), support for very big disks (tebibytes).
  • Mountable support. You could mount any supported image with any supported filesystem (read-only implementation, not just detection) on Linux and Windows.
  • Better support for AaruRemote. Including dumping from a Wii. Possibly other consoles as well.
  • A fully fledged GUI.
  • Image editing: changing the metadata of an image without converting.
  • Richer console support.
  • Multilanguage.
  • And many, many, many more.

Thanks for your confidence in me. Special thanks to:

  • Silas for his endless testing, even when it becomes an economic burden for him
  • Rebecca and Michael for their code contributions
  • Matt for his confidence in the quality compared to alternatives
  • Joseph and Robin for their input on museums’ needs
  • Jonas for his (still ongoing) help in making this a more open, less in-the-dark kind of project
  • Noah for his massive economic support to get rare drives, and the confidence and friendship he gladly offered unasked for
  • Anubis, for kisses, paw giving, and looks, when I’m stuck on a stupid bug

Merry Yule, and a better 2021 than any year before.