Free Software, Free Data

Plenty of articles have been written on these issues, but it seems that simple campaigning just isn’t enough. One more article won’t hurt the cause, but it probably won’t achieve much either. Perhaps the problem is that FS activists confine themselves to verbal propaganda. Also, as the “alternative” movements appeared, in a form that many people would call more practical than the “radical” Free Software movement, our camp was decimated. And then there’s the troubled relationship of rms and the FSF with the wider movement of free software activists.

Let’s deal with the actual issue at hand. A fair part of our society today uses computers or computer-like devices. According to the Internet World Statistics there are already 2.267 billion internet users. That’s a pretty huge number. They don’t necessarily own a computer or similar device, but they can use computers and the internet in internet cafés. But we need to take this further. Today every government agency, NGO and larger business organisation relies heavily on computers. I would risk saying that almost every person on our planet is either a computer user herself, or the organisations responsible for some of her dealings use computers and internet-connected applications. I have no data on this, so don’t take it as a fact, but I think it is an educated guess given the number of internet users above and the fact that computers were invented for statistical, bureaucratic tasks in the first place. Also consider the widespread web-cam sex business, in which a considerable number of the women are from countries like Thailand or Nigeria. And as a quick search will show, there are more than 90 million GSM subscribers in Nigeria alone, and according to the Wikipedia page about the telecommunication sector of Nigeria, internet access points are as widespread in the cities as in any European country. In short, there are many hundreds of millions of computer users and many more who are indirect users of telecommunication and are affected by these technologies, willingly or unwillingly. Indeed, it could just as well be most of the human race. (Not to mention cats. Just googling the term ‘cat video’ gives 1.75 billion hits, which is quite impressive in itself.)

With this staggering number of users one can imagine what a huge market has opened up in the last 44 years. For a great part of the people today, life is inconceivable without constant lines of communication between the world and themselves. For many, losing the connection to Google Maps combined with GPS would equal a physiological loss of the sense of navigation, and losing access to their Facebook account or BlackBerry Messenger would mean catastrophic isolation. It is no exaggeration to say that in this heavily computerised world, if the software isn’t free (free as in free speech) and the so-called public institutions do not make use of Free Software, our life is exposed to all sorts of threats, ranging from severe disruption of our social life to giving away control over our life completely. (I know this has mostly been the case already, due to wage labour, patriarchal relationships and central democracy, but at least in consumption we have so far been treated as if we were free and sovereign.)

But this has come to an end, because economic life has a natural drive to invade whatever territory it can find in order to proliferate. It started with undisclosed source code, and now we have hardware-supported DRM, walled gardens, software patents, the cloud and the copyright mafia. Of course, the attention of the internet is mostly on internet-related policies, but there is something profound in the sheer magnitude of effort that individual companies and entire business cartels are willing to put into lobbying for their favoured internet legislation, or into “protecting” their products from their own users(!). The number of different types of business related to the internet reveals the scale of the investment and capital flowing into the computing and telecommunication sector. That’s because these are strategic industries from both a business and a political point of view.

For some reason, most users don’t care about this whole thing. Numerous software developers ignore the free software movement at best, while others are openly hostile. So what are these reasons? One of the most fundamental questions the free software movement should answer at this point in time is: why are we ignored at large? Of course, there’s hype around Open Source, and at times it looks (or rather looked) trendy even among politically motivated users. Activists of all sorts have discovered that if their efforts go against some powerful interest, they have to make damn sure that the tools they use are clear of any hijacking, and that their paths of communication are as secure as the level of threat they face requires. As communication, and especially the internet, is a big copy machine, it is most reasonable to be cautious and suspicious of electronic communication. This need for electronic privacy brings the question of Free Software to a practical level. I emphasize Free Software over Open Source here because the latter is only designed to be open for production, while Free Software also emphasizes (among other things) the freedom of users at any level. A tool is only as reliable as its user can verify it to be. Black box products are extremely dangerous in any area of life, and so they are in the software/hardware world of computation and communication. Indeed, in some areas people tend to be more conscious of the products they consume, as is the case with food: there is already an ongoing movement to raise general consciousness about what one eats and what ingredients and processes go into it. It also advocates non-processed foods, because processed ones can and do have untold, unknown components and processes which pose a significant risk to our health and general well-being.
It is however bewildering to see that the same people who argue zealously for consuming unprocessed, organic food do not pay the same attention to their computerized or other devices, and are happy to jump on the current trendy technological bandwagon at any given turn. Although I can only speak from personal experience, I see some correlation between consumers of Apple products and the movement for healthy, unprocessed food.

So why aren’t people conscious of their gadgets? Why aren’t people conscious of their personal creations and data in general? The previous metaphor can be taken even further. Although the healthy food movement is way more successful than the Free Software movement, it is light years away from complete success. The ready-made food business, along with fast-food chains and restaurants in general, is flourishing despite the now widespread organic food stores and the constant networked propaganda for healthy food and a healthy lifestyle. Although it is a complicated issue, I believe most of it can be traced to a single cause that is common to healthy food, hardware and software. That is capital.

Many Free Software advocates would explicitly deny that Free Software is a socialist, communist or any such movement. And they are right of course, because Free Software in itself does not have any inherently socialist feature. One can demonstrate an ideal production process based on FS that sits well within the capitalist, market-driven framework. Take a possible scenario: let’s say Intel creates a new CPU with a new instruction set. We also have a free compiler collection, called the GNU Compiler Collection. Intel’s management is well aware that the success of its new product depends on how quickly the new features can be adopted. They can hire a company to create an exclusive set of compilers, which is an enormous task to begin with and takes years of development to reach reasonable stability. While doing so, they lose a lot of time in introducing their new product to the world of software developers. Or they can hire a company to create a GCC back end. This has the following advantages. First, given the compatibility between the different generations of Intel CPUs, they have a good chance of finding an existing back end which lacks only the very new features that Intel wants to introduce to the world. Second, the GNU Compiler Collection is a Free Software project, meaning that you can access and read the source code, you can modify the source code, and you can distribute the changes you made. So the development company, bearing of course the obligation to release the source code of any modifications it distributes, can implement the new features in the compiler collection based on the technological support it gets from Intel.
Also, after they release the new back end for the new generation of CPU, the GCC implementation can serve as a reference for many other projects, where developers can take advantage of already working, useful software to implement whatever software they would like. Financially, Intel is better off because the new CPU will be supported in many software environments, which in turn will help the new CPU sell. The company that was hired gets paid for the job and can pay the developers who participated in the project. Nothing has been given away for free in this process: a contract was made, and the contract was fulfilled. Nothing socialist, nothing communist, no intervention in the market process. And last but not least, the people who worked got paid.

The world, however, is not an ideal place. Software companies, like any other companies, are seeking new ways to make bigger profits. In doing so, they realised that if they withhold the source code of their products and give away only the binaries, no other company can be hired for the next generation of changes to the software. In the previous description the source code of GCC represented the public domain (in the sense of a commons), and Intel and the software company the private sector. In the example above I described a free-market ideal, where private economic entities like Intel and the software development company act in their own interest and make a profit, while the general public gains more wealth in the process. But if a business entity withholds the source code, it effectively destroys the public domain; it no longer acts in the interest of society as such, and its profit motive does not improve the life of the rest of society. In other words, these companies are expropriating the public domain to become monopolies.

This blog entry is only the beginning. I will try to dig deeper into the topic, as I have many ideas and I spend a good deal of time thinking about the future of our computing, and indeed about the methods of organising society and producers.

ANTLR based parser passes work phase 1.

This one took a while. Almost too long: I nearly dropped the whole thing at least twice. But fortunately I didn’t. So I proudly present the ANTLR parser, which in its current state is able to pass the “DDT Tests – Core” suite. The first difficulty was that I had no previous experience with ANTLR; indeed I had no previous experience with compiler/parser generators at all. Then there was the problem of understanding how ambiguous parsing rules can be resolved in ANTLR.

Then there was the problem of getting out of sync with Bruno’s work. And there was an enormous amount of work on the AST classes to get the proper constructors in place, so that the AST can be created without the descent.compiler format converter. And then I had to chew my way through 300 test cases to make them work. All done.

This work started with a few attempts before I settled on the basic framework for how to proceed. At this point I can see major drawbacks in how it works, but they come from the fact that I didn’t want to invent a completely different AST management, so that the rest of DDT could go on as if no parser replacement had occurred. This is the reason why the grammar file is so bloated with action code.

Let’s see what is yet to be done:

  • Error handling. There’s only minimal error reporting/recovery support in the parser, enough to get it running, but it is not sufficient at all from the user’s point of view.
  • Review of all the latest supported language features, and correcting the functionality accordingly.
  • Incremental parsing. This is something I won’t be able to address in the near future, because it seems a quite tedious piece of work. However, it seems reasonable to implement, as it could be a huge improvement for performance. (Let’s say the user has a line like this: “class Foo {}” and types inside the block: “class Foo { int a; }”. An incremental parser would cut right to the declDefs of the DefinitionClass instead of going another full round to build the AST from scratch.)
  • Performance: At the moment the grammar file results in huge parser code, and there are plenty of forward checks in the grammar. Perhaps some of them are avoidable with some cleverness, and parsing is where performance improvement is most needed.
  • ANTLR compatible AST hierarchy. This would result in some linear performance gains and some code clarity (by removing the action code blocks). The problem, when I looked into this solution, is that ANTLR doesn’t seem to have decent heterogeneous AST support. All we have is a token based constructor. It could be useful though for operators or single-token rules, like some attributes.
  • Building the parser will need the ANTLR binaries and additional build rules. For the releases the lexer/parser sources should be attached to the code tree as they are.
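
The incremental parsing idea from the list above can be sketched roughly as follows. This is only an illustration, not DDT or ANTLR code: the Node class, its offset fields and both helper methods are made up for the example, and only the names DefinitionClass and declDefs are borrowed from the parser. The sketch finds the text range that changed between two versions of the source and then looks up the innermost node enclosing that range, which is the only part that would have to be parsed again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Node, changedRange and innermostEnclosing are
// invented for illustration and are not part of DDT or ANTLR.
class IncrementalReparseSketch {

    /** An AST node that remembers the source range it was built from. */
    static final class Node {
        final String name;
        final int start; // inclusive offset in the old source
        final int end;   // exclusive offset in the old source
        final List<Node> children = new ArrayList<>();

        Node(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns {start, oldEnd, newEnd}: the span that differs between the texts. */
    static int[] changedRange(String oldText, String newText) {
        int max = Math.min(oldText.length(), newText.length());
        int prefix = 0;
        while (prefix < max && oldText.charAt(prefix) == newText.charAt(prefix)) {
            prefix++;
        }
        int suffix = 0;
        // The common suffix must not overlap the common prefix.
        while (suffix < max - prefix
                && oldText.charAt(oldText.length() - 1 - suffix)
                        == newText.charAt(newText.length() - 1 - suffix)) {
            suffix++;
        }
        return new int[] { prefix, oldText.length() - suffix, newText.length() - suffix };
    }

    /** Finds the innermost node whose old range fully contains [from, to). */
    static Node innermostEnclosing(Node root, int from, int to) {
        Node best = root;
        boolean descended = true;
        while (descended) {
            descended = false;
            for (Node child : best.children) {
                if (child.start <= from && to <= child.end) {
                    best = child;
                    descended = true;
                    break;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String oldText = "class Foo {}";
        String newText = "class Foo { int a; }";

        // Tree for the old text; the declDefs range covers "{}" (offsets 10..12).
        Node declDefs = new Node("declDefs", 10, 12);
        Node root = new Node("Module", 0, 12)
                .add(new Node("DefinitionClass", 0, 12).add(declDefs));

        int[] r = changedRange(oldText, newText);
        Node target = innermostEnclosing(root, r[0], r[1]);
        System.out.println("reparse only: " + target.name);
        System.out.println("inserted text: '" + newText.substring(r[0], r[2]) + "'");
    }
}
```

A real implementation would also have to shift the offsets of every node after the edit and splice the freshly parsed subtree back into the AST, which is where the tedious part lies.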

All these things would be nice to have, but only the first is needed for the complete replacement of the descent.compiler.
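
To make that missing piece a bit more concrete: “reporting and recovering” usually means that the parser records a problem and keeps going instead of stopping at the first bad token. Below is a minimal sketch of one common approach, panic-mode recovery, which skips ahead to the next ';' and resumes from there. It is plain Java over a list of token strings, not ANTLR’s own recovery mechanism and not the DDT code; the class, the method names and the toy “type name ;” grammar are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of panic-mode error recovery; not ANTLR or DDT code.
class PanicModeSketch {

    static boolean isIdent(String tok) {
        return tok.matches("[A-Za-z_][A-Za-z0-9_]*");
    }

    /**
     * Parses a toy grammar of declarations, each of the form: type name ;
     * Well-formed declarations contribute their name to the result.
     * Malformed input is reported in 'problems' and skipped up to the next ';'.
     */
    static List<String> parseDecls(List<String> tokens, List<String> problems) {
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            boolean wellFormed = i + 2 < tokens.size()
                    && isIdent(tokens.get(i))
                    && isIdent(tokens.get(i + 1))
                    && tokens.get(i + 2).equals(";");
            if (wellFormed) {
                names.add(tokens.get(i + 1));
                i += 3;
                continue;
            }
            problems.add("syntax error at token " + i + ": '" + tokens.get(i) + "'");
            // Panic mode: discard tokens until the synchronizing ';' ...
            while (i < tokens.size() && !tokens.get(i).equals(";")) {
                i++;
            }
            i++; // ... and step over it so parsing resumes at the next declaration
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> problems = new ArrayList<>();
        // The second declaration is missing its name.
        List<String> tokens = Arrays.asList("int", "a", ";", "int", ";", "int", "b", ";");
        List<String> names = parseDecls(tokens, problems);
        System.out.println("declared: " + names);
        System.out.println("problems: " + problems);
    }
}
```

For the token list in main, the broken second declaration produces a single reported problem, and the declarations of a and b on either side of it are still recognised. From the user’s point of view that is the difference between one marked error and a file that stops being understood at the first typo.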

Corresponding code here: http://code.google.com/a/eclipselabs.org/r/gyulagubacsi-ddt-contrib/source/list?name=feature-antlr-parser