Category Archives: software

flacfs, again

I wrote a piece on a software that exposes a flac file as a virtual file system a few years back, and promised that I would release the code when it was “fit for release.” Left to my own devices, that would never happen. So… someone came across the post and asked about the code. I promised that I would do something about it in a few days. Here it is, warts and all.

They sat, they surveyed, and …

From Shannon’s 1950 paper [pdf] on programming a chess-playing computer-

… this approximate evaluation f(P) has a more or less continuous range of possible values, while with an exact evaluation there are only three possible values. This is as it should be. In practical play a position may be an “easy win” if a player is, for example, a queen ahead, or a very difficult win with only a pawn advantage. The unlimited intellect assumed in the theory of games, on the other hand, never make a mistake and a smallest winning advantage is as good as mate in one. A game between two such mental giants, Mr. A and Mr. B, would proceed as follows. They sit down at the chessboard, draw the colours, and then survey the pieces for a moment. Then either:-
(1) Mr. A says, “I resign” or
(2) Mr. B says, “I resign” or
(3) Mr. A says, “I offer a draw,” and Mr. B replies, “I accept.”


Regarding hdd wiping, Gutmann writes

In the time since this paper was published, some people have treated the 35-pass overwrite technique described in it more as a kind of voodoo incantation to banish evil spirits than the result of a technical analysis of drive encoding techniques. As a result, they advocate applying the voodoo to PRML and EPRML drives even though it will have no more effect than a simple scrubbing with random data…. As the paper says, “A good scrubbing with random data will do about as well as can be expected”. This was true in 1996, and is still true now.

Looking at this from the other point of view, with the ever-increasing data density on disk platters and a corresponding reduction in feature size and use of exotic techniques to record data on the medium, it’s unlikely that anything can be recovered from any recent drive except perhaps a single level via basic error-cancelling techniques. In particular the drives in use at the time that this paper was originally written have mostly fallen out of use, so the methods that applied specifically to the older, lower-density technology don’t apply any more. Conversely, with modern high-density drives, even if you’ve got 10KB of sensitive data on a drive and can’t erase it with 100% certainty, the chances of an adversary being able to find the erased traces of that 10KB in 80GB of other erased traces are close to zero.

Another point that a number of readers seem to have missed is that this paper doesn’t present a data-recovery solution but a data-deletion solution. In other words it points out in its problem statement that there is a potential risk, and then the body of the paper explores the means of mitigating that risk.

Stallman’s stall

CNET (read it) links to this very interesting letter addressed to the European Competition Commission which has Richard Stallman among its co-signatories. Oracle’s acquisition of Sun will give it, among other things, control over MySQL. And the sole purpose of the letter is to keep that from happening. The irony is that the “problem” has probably been created because of the GPL, at least that’s what Stallman seems to thinks. The interesting bit-

Defenders of the Oracle acquisition of its competitor naively say Oracle cannot harm MySQL, because a free version of the software is available to anyone under GNU GPL version 2.0, and if Oracle is not a good host for the GPL version of the code, future development will be taken up by other businesses and individual programmers, who could freely and easily “fork” the GPL’d code into a new platform. This defense fails for the reasons that follow.

MySQL uses the parallel licensing approach to generate revenue to continue the FLOSS development of the software. If Oracle acquired MySQL, it would then be the only entity able to release the code other than under the GPL. Oracle would not be obligated to diligently sell or reasonably price the MySQL commercial licenses. More importantly, Oracle is under no obligation to use the revenues from these licenses to advance MySQL. In making decisions in these matters, Oracle is facing an obvious conflict of interest – the continued development of a powerful, feature rich free alternative to its core product.

As only the original rights holder can sell commercial licenses, no new forked version of the code will have the ability to practice the parallel licensing approach, and will not easily generate the resources to support continued development of the MySQL platform.

The acquisition of MySQL by Oracle will be a major setback to the development of a FLOSS database platform, potentially alienating and dispersing MySQL’s core community of developers. It could take several years before another database platform could rival the progress and opportunities now available to MySQL, because it will take time before any of them attract and cultivate a large enough team of developers and achieve a similar customer base.

Yet another way in which Oracle will have the ability to determine the forking of MySQL relates to the evolution of the GNU GPL license. GPL version 2.0 (GPLv2) and GPL version 3.0 (GPLv3) are different licenses and each requires that any modified program carry the same license as the original. There are fundamental and unavoidable legal obstacles to combining code from programs licensed under the different GPL versions. Today MySQL is only available to the public under GPLv2.

What the GPL was intended to do, and rightly so, was to give users the right to do anything they wished with the source code of the software as long as they gave similar rights to downstream users. No one has to choose the GPL, and quality of service is not a function of the software license. But such a license does allow those using the software to hire other people to work on the code in case the original developers stop providing the service for any number of reasons, or fork it into a different project if they are not satisfied with the direction the current project is taking. There is a small problem in all this, as Stallman recognizes, and I am sure he did not have that epiphany when he wrote that letter—those who own the copyright to the code have the ability to license it any which way they please, and this ability is denied to downstream users. But the GPL was not meant to prevent this.

If MySQL has the rights to its code, and Oracle decides to torpedo the db in favor of its own line, the only two alternatives available to those affected by the situation are to fork the code and continue its development as free software, and somehow find the cash to sustain it, or move to some other db. If “intellectual property” is bad, using state power rather than individual initiative to force a company to behave in a particular manner is worse. That most “big” companies no longer do business but lobby and play venal politics is a different matter altogether.

The idea behind flacFS and virtualFLAC

[I am not providing the complete source code for what I have described here right now. It is not fit for release. If and when it will be, I will GPL it.]

A very substantial part of my audio library consists of cds ripped in the [single flac image + embedded cue sheet] format. All metadata is stored in .apl files—one for each track—which foobar2000 can generate based on the cue sheet. While this is a neat arrangement, the problem with it is that foobar is one of the very few players which can deal with the situation. If I want to access the collection on linux, I have to run foobar under wine. To my knowledge, no native application is capable of playing the tracks.

The problem
There are three main parts to the problem. One, the apl files. These files are “link” files which specify a particular sample range within an audio image and any player that supports Monkey’s Audio should be able to handle them. But from what I have read, while the MA decoder is free and may even be termed as open source, the license agreement is not compatible with GPL. Hence many players on linux (I have tried three or four) don’t use the SDK to support either ape files or apl ones. Those which do use gstreamer don’t play apl files. Two, cue sheets. Not many players support cue sheets either. Three, metadata. The metadata is locked in the apl files along with the track locations and timings. Thus if a player doesn’t support apl, but does support cue sheets, you lose access to metadata. If it doesn’t support cue sheets either, you are left with a huge flac file with no way to identify the contained tracks.

Further, even if some player does meet all these expectations, there will always be “something” that it will miss. For example (and this is only an example—I haven’t thought about how this might work), I want the ability to create “virtual albums.” Not play lists, but albums. Many times, older albums are not available, but some songs from them are available in special collections. In other cases, cds come with bonus tracks from other albums, or worse, advertisement tracks! (I have one such cd.) Virtual albums will allow one to use the same track in multiple albums, or use only a part of an album. The same track could have different metadata for different albums. And so on.

One more thing. The fact that the metadata is presently stored in apl files doesn’t mean it has to be stored there for eternity. If there were some platform agnostic way to manage it, the same could also be stored in a database—sqlite, firebird, anything—and be applied to the track when it is played, and carried back when it is changed. Thus, the single most important idea would be that the audio file and metadata management should be independent of any audio/ media player. Note that these are mere musings. As I said I haven’t given much thought to how all this might be implemented. Now to the specific problem. Getting the apl + flac combination playing on linux.

The theory
The problem, one should note again, lies at the player end. Players don’t understand that the single image actually has multiple tracks within it. There are three ways I can think of to attempt a solution. Somehow write a decoder/ plugin for the player of your choice. Or stream the tracks to the player. Or use FUSE. I don’t know if I can do the first (apart from the fact that the solution is player specific), and I don’t even know where to begin as far as the second is concerned. But FUSE has promise. If you are not aware of it, FUSE stands for Filesystem in Userspace. (Read the wiki article, and also try the one on IFS for more information.) In short, while filesystems—FAT, NTFS etc—are generally implemented in the kernel space, as drivers, FUSE allows you to create a fully functional filesystem within a regular application/ service. ntfs-3g on linux, which allows access to NTFS volumes, is a great example. pytagsfs and mp3fs too are fuse-based applications.

Deciding on FUSE is the easy part. FUSE the software, with bindings for many languages including .net (see Mono-Fuse) is available on linux. FUSE the concept also has a few Windows implementations, including one called Dokan. One can thus show the media player files which do not really exist, and it will play them. The question is how to show one flac file as many. Its here that I hit a roadblock. I tried on-the-fly decoding using the flac reference decoder. And the application went crazy with multiple decoding processes starting every time a directory was accessed. Maybe I missed something and it can still be implemented in this way, but I dropped the idea quite soon. The only alternative was creating the file, on-the-fly, as it was being read.

Its here that the need to understand the flac format specification arose. The overview, and the specification on the flac site were helpful when it came to understanding the file structure. But that still didn’t answer the question about the contents of the file. I have no experience when it comes to the technicalities of digital audio. So I didn’t know why some things were happening the way they were. Its here that Brian Langenberger’s “audio formats reference” was extremely useful. And jFLAC which I could observe under a debugger when I faced some problems, and from which I borrowed some utility functions.

I will only provide a brief description of the format and then move on to what can be done to tie all of it together. Anyone who is interested should refer to the many links I have provided above. Once you understand the specification, a lot of what follows will make sense. A FLAC file is made up of a header (which is a 4 byte signature), one or more metadata blocks, and one or more frames. The first metadata block is a streaminfo one which contains information on the flac stream. Other blocks follow. The last block is identified by a flag set in the block header (every block has one). And the frames follow after that. Each frame consists of one or more subframes.

A note on audio before I proceed. Consider cd-audio. It is described as being 16-bit stereo audio at 44.1kHz. What does that mean? What it means is that the audio has two channels left and right, and every channel is sampled 44,100 times every second. Further, every sample is stored using 16 bits, or 2 bytes. Thus one second of uncompressed audio of the above description will require 44,100 * 2 (channels) * 2 (bytes) = 176,400 bytes of space. The 44.1 is called the sample rate, and the 16-bit is referred to as the sample size.

Back to flac. Each frame has a “block size” which specifies the number of samples it contains. And each frame has as many subframes as the number of channels. Thus, in the above example, a frame with a blocksize of 4096 will store 4096 samples for channel 0 and another 4096 for channel 1. The encoded data lies in the subframes. A lot of unary encoding takes place and the only way to know the byte size of the frame is by seeking through it to the end.

This is the end of the specification in general. But how does it help us in locating tracks? Every frame has a header (all data structures have one) which contains, among other things, three important pieces of data. First is the blocking strategy. In theory every frame can be of a different size (meaning they can contain a variable number of samples) – a variable strategy can be used. But in practice, all frames except the last one have the same size. The residual data is put in the last frame. This is called a fixed strategy. But make note of the possibility that variable blocking is legal. The second is the frame/ sample number. If the strategy is fixed this field contains a zero-based frame number and is stored as a UTF-8 encoded 32-bit integer; if the strategy followed is variable, the field contains the number of the first sample in the frame, and is stored as a UTF-8 encoded 64-bit integer. The last piece of information is the block size, whatever it is. After all we need to know how many samples exist in the frame.

The idea is this. Given the starting and ending sample numbers of a track, we can locate the first and last frames of the track in the stream. The track will start either at the beginning of a frame (like the first track of a cd) or somewhere in the middle of a frame, and will end similarly (except in case of the last track of the cd where the last sample of the cd is the last sample of the track). A legal track can be extracted from a flac image by writing, at the very least, the stream header, and the streaminfo block updated to the values of the track in question, and a variable first frame which will only contain the samples relevant to the track, and the complete samples of all the frames except the last one, and a variable last frame. Because the specification states that a stream can either be variable or fixed, that is, a variable frame, and a fixed frame cannot legally exist in the same stream, we have to set the “variable” flag on every frame even when the block size is fixed. Further the header of every frame will have to be updated to record the starting sample number of the frame instead of the frame number. Another very important step. The subframes, two of them at least, the fixed one and the lpc one, contain encoded samples. This is no problem when we grab the complete frames. But when we need a frame where the sample we are interested in is located in the middle, that frame will have to be decoded, the data will have to be shifted to the front, and the frame will have to be re-encoded. This process will only have to be done at most once per track. At the end of all this we end up with a perfectly legal track. And we hardly did any encoding or decoding.

In my solution, all of the above is taken care of by the flac module. Before calling its decode function, one must subscribe to the “metadata read” and “before frame read” events. The subscriber can hand in a writable flacstream to any flac structure and the structure will write itself to the stream. It is the subscriber’s responsibility to set the correct values of the variables I mentioned above based on values already read from the input stream, and the samples targeted.

I wrote three test functions to verify the accuracy of the flac module. The first one analyzed a supplied flac file, calculating various checksums along the way and checking them against those in the stream. The second one copied a supplied file to another one – one flacstream did the reading, the other did the writing. The third function did what was my intention all the time—extract a track if the starting and ending sample numbers, and a flac image were provided.

But track extraction is a partial solution. If a FUSE application is implemented, we have to deal in offsets, not in samples. The audio player may be seeking to a particular sample, but it is actually issuing a read call to the underlying file system which has to be trapped and the necessary data provided. This is the function of the virtualFlac module. It contains two of the most complex classes in the whole suite. I think I have complicated things more than is necessary. If this part of the process can be implemented more elegantly, I would definitely do it. But I will describe what I have done anyway.

The first class is the VirtualFlacCreator. Given a flac file and an array of starting and ending sample numbers (parsed from apl files in this case), it will create a .virtualflac file, an xml file, which contains general information on every track – its name, size (byte size calculated while the stream is being read) and information required to patch the streaminfo block of a flacstream. Further, each track will contain a list of all the frames over which the track is spread. What this means is that nearly every time, a frame will be listed as the last one of a previous track as well as the first one of the current track. For each frame, its starting offset in the image, its virtual offset when it is part of a single track, and a few other values are recorded. To save space and processing power (I hope), only the track information is stored in human readable form. The frame information is stored in base64-encoded form. The .virtualflac generated at the end of this process can be used in conjunction with its companion flac file to generate one flac file for every track. And that, precisely, is the function of the second class—VirtualFlacTrack (call it VFT).

VirtualFlacTrack takes two arguments—the path to the flac image, and a reference to a VirtualTrack object. I know the names are confusing—I gave them the first names that came to mind— but VirtualTrack is the representation of the track node in the .virtualflac file, basically a dumb structure, whereas VirtualFlacTrack uses the information available in the structure to perform on-the-fly substitutions. VFT exposes exactly one function, a standard filesystem read call which takes a preallocated buffer, an offset and a count as arguments and returns the number of bytes read.

This is the end of the virtualFlac module. While the virtual file system can simply call a VFT object and get data as if it already exists on the disk, as I mentioned above, VFT can also perform another function. If you have xyz.flac and xyz..virtualflac, the following function

private static void WriteTracks(string virtualFlacFile)
	VirtualTrack [] tracks = VirtualTrack.GetTracks(virtualFlacFile);
	foreach(VirtualTrack vt in tracks)
		string dir = Directory.GetParent(virtualFlacFile).FullName;
		string flacImage = Path.Combine(dir, vt.ImageFile);
		VirtualFlacTrack ft = new VirtualFlacTrack(flacImage, vt);

		FileStream fs = new FileStream(Path.Combine(dir, vt.FileNameWithExt), FileMode.Create);

		const int BUFFER_SIZE = 1024 * 128;
		byte[] data = new byte[BUFFER_SIZE];
		int offset = 0;
		while (offset < vt.Size)
			int read = BUFFER_SIZE;
			ft.Read(data, offset, read);
			offset += read;
			fs.Write(data, 0, read);

will dump all the tracks in the flac image to the same directory.

Putting it all together – flacFS
I said at the beginning that the FUSE application is the easy part. Because the whole project only uses what can be termed as the lowest common denominator when it comes to .net class usage, and is a console application to boot, it is platform agnostic. It references both Dokan as well as Mono-Fuse. At runtime, it detects the environment and uses the fuse implementation applicable to the platform. To avoid writing the same code in both classes—flacDokan and flacFuse—the whole filesystem is implemented in a third class and the Dokan and Fuse classes simply talk to this third class—flacFs, which is implemented like so. (It is a .cs file with a .docx extension.)

One just has to provide the directory to be mounted and a mountpoint (drive letter in the case of Dokan) and a new volume will appear with virtual flac files on them.

This is how a .virtualflac file looks.

These are the application files involved.

And this one shows the application at work.

A lot remains to be done. For one, metadata support is missing though it should be easy to add, at least the “readonly” part. A vorbis comments metadata block is defined by the flac specification. The application will simply translate between vorbis comments and ape tags or wherever else the metadata is stored. It might even be possible to store it in the virtualflac container. Further, the flac module is brittle because of assumptions regarding the size of the file. And the second Rice partition method is not yet implemented. As for performance, I am not sure how it will work with a few thousand tracks. But I will leave these for another day. The fact that this works is, the pun is very much intended, music to my ears.