Well, the story of Rune is both short and long. It's short because I've rarely discussed the language in detail outside of a few friends, and long because my desire to write a language began in the late 1980's after I had learned C and started looking for something better. Fifteen years later and people will note that I am still a hard-core C programmer, working mainly on the DragonFly kernel code base. You might think (if you don't know me) that I'd be able to find *something* out there that would satisfy me. But I haven't.
For various reasons I have very little love for most of the programming languages out in the world today. I hate the design-by-committee mess people call C++, I hate the kitchen-sink-control-the-programmer approach taken by Eiffel, the cacophony that is Perl, the limited Objective-C, the ever-changing let's-break-everyone-again TCL (though I will admit that, of late, it seems to have settled down. Still, TCL isn't for me!). I'm not interested in the politics associated with Java that has destroyed it as a viable programming language, or the wordy semantics and constructs. (Java is a toy that should never have seen the light of day IMHO). And my reasons go on. I have been unenthused with nearly every language I've come across. Languages like Scheme, Python, and Ruby are better, but still don't have the level of flexibility of abstraction that I desire.
I'm a great believer in the concept of a class hierarchy and in object-oriented (or at least object-centric) design, yet as you can see I have done little but write in C in all the intervening time.
Of course, another reason I have not switched to a more object-oriented language is simply because every time I try I find myself constantly comparing it to my ideal, and since Rune is the culmination of almost everything I want in a language it would be rather difficult for me to like anything else better!
Throughout this period (15+ years), I began thinking of my dream language as "D", in my head, but I was ambivalent about trying to name it that in public. I finally came up with a great name, "Rune", after running through literally a thousand different possibilities. I don't care if it's already being used for other things... so is virtually every interesting word in the dictionary.
So why has it taken so long to write Rune? Well, there are many reasons. Some of you might remember the 'DICE' system I wrote for the Amiga and sold as shareware in the early 90's. DICE is a full ANSI-C compiler. And a preprocessor. And an assembler. And a linker. And a full set of standard C libraries, and even a profiler. I wrote it all, for the 68000. Unfortunately this points to a severe character flaw of mine when it comes to me and writing programs... I am a perfectionist, and other people's work rarely achieves the level of perfection I am comfortable with. I was able to build DICE quickly because the C language standard already existed, as well as standards for all the other necessary pieces, so I just had to start coding it up. But Rune is a different matter. With Rune I am designing the language as well as implementing it, and that amplifies the problem when one has my particular character flaw.
Over these many years my language has evolved. I have added many features and removed many more. I have written over two dozen drafts of the grammar but it has only been in the last five or so years that it has settled down into something I consider to be reasonable. Another time factor is what I call the 'Dillon' factor. When I set out to design something I expect to be able to grasp and hold the entirety of the project in my head without external references. Rune is complex enough that it has taken quite a while to achieve this state of mind. I want to be able to scale simple concepts into extremely sophisticated capabilities. I hate special cases, so when I set out to write the lexical scanner, the parser, the semantic search code, and the other pieces that make up a language, I expected to get the core written and proven out in a week or less or it was no good. Needless to say I have had many false starts. A false start for me typically means wiping the code (rm -rf) in disgust. The only thing left after a false start is the memory of what went right and what went wrong, and I used that to begin the next iteration.
The reason I do things like this is actually quite simple. If I've done the low level pieces correctly the sophisticated pieces wind up being trivial to implement. I believe, strongly, that getting the core right greatly increases the flexibility of a project and I knew that flexibility would be key in further development of the language once the initial release occurred. Whatever time I've apparently wasted getting this far is going to be repaid tenfold in the future.
Well, I finally did it! I got over the hump in late January 2008 and now have something I really like. That was when I managed to get all the major language constructs for Rune either working or proven. I knew I had hit the mark when I looked up from the keyboard one day and realized that I had just implemented variable arguments, constant procedure and operator call optimizations, and default value aggregation for types all in one day without realizing it. After the second week I could actually start writing code in Rune, and after the third week I could actually start writing *sophisticated* code in Rune. And, as a bonus, the interpreter has wound up being pretty damn fast, able to execute a simple Rune for() loop counting from 0 to 10,000,000 in 0.80 seconds on a 1GHz desktop. I expect to eventually get this down into the 0.3 to 0.5 seconds range, but it's good enough now that I can simply not worry about it and focus on other issues. After spending a month coding I set it all aside to catch up on other things, and now I've dusted it off again, found that I still understand all the concepts and all the code (which must mean the code is pretty good in my view), and I'm going to move forward with the work necessary for the first release.
So that's the history in a nutshell. I've conveniently left out what few discussions I've had with friends and my many attempts to kickstart the process. In the end it's only the final draft that matters, and this is it. This is the end of the beginning and the beginning of a new beginning.
Right off the bat I will say that I absolutely can't stand languages that try to implement everything. Kitchen sink languages tend to be so complex that they wind up being unusable or unreadable or unlearnable. When a programmer is able to implement some concept ten thousand different ways, you will wind up with ten thousand different implementations and you can shuck any efficient, collaborative development right out the window. I have thought long and hard in regards to which concepts I should include. Should I have inheritance, multiple-inheritance, subclassing, organization of class hierarchies, and so forth? I've thrown out a lot of concepts as simply not meshing well with the rest of Rune. For example, you will not find traditional multiple-inheritance in Rune. Why? It's overkill to the point of confusion (kinda like C++ actually). But you will find the concept of an 'Interface' which gives you multiple-inheritance-like features. You will not find fancy exception handling features like multiple code paths (which I consider to be junk since exception code paths are rarely tested and not likely to work as intended in a complex project anyway), though due to the implementation Rune could actually implement a raise-like feature. You might see a raise in the future. You will not find a formal single-root class hierarchy. Been there, done that, stupid idea. Nor will you find complex overloading features. Rune is designed in a manner that avoids most potential namespace conflicts without over-burdening the programmer with long dotted identifier sequences or complex specifications. Since most conflicts are naturally avoided in the first place, Rune doesn't need sophisticated overloading or conflict handling features. What will you find in Rune? Read on!
* Import mechanism. Rune uses an import mechanism to collect project files and libraries together into a program. There is NO include mechanism. If you import a directory then Rune will locate "$directory/main.d" and import that. "main.d" then typically imports whatever libraries are required as well as the other project files and provides a main() to start execution of the program. Library-to-library dependencies are typically resolved simply by having the library itself import whatever other libraries it needs (and as a welcome side effect this makes libraries self contained). Duplicate imports of self-contained entities are allowed and do not result in any significant extra overhead unless you are really, really stupid.
* Semantic search is not class-hierarchical. Rune locates an identifier through a semantic search based on how you stack and name imports. Within a procedure the semantic search begins in a manner similar to C, where the search checks local declarations within the procedure and then within the current file. If the identifier cannot be found the semantic search progresses upward through the import hierarchy until it hits the root (of the import hierarchy) -- the original file that kicked the whole thing off in the first place. There is no object root. Identifiers are searched level-by-level and can be forward referenced (Rune makes no distinction between backward and forward references except in one parse-time convenience case). You can push down into an import, type, or class, by using a dotted sequence of identifiers. For example, if "main.d" imports "File" (file support) and "a.d", "b.d", and "c.d", then code sitting in "c.d" can access the file support library simply with a construct like File.FILE *fi = ....;. Rune also includes a mechanism which allows you to push into an import without specifying the import's id, called the import ... as self mechanism.
The import ... as self mechanism is typically used only to collect tightly integrated source files together... for example, source files in a subdirectory. The main system library is typically imported by a project as self, but library imports in general are almost universally named entities rather than self entities in order to avoid namespace conflicts.
* Typedefs. Rune supports typedefs. Typedefs are used to 'alias' classes, with or without modification, making them available at multiple semantic points. A typedef can be placed anywhere a declaration can be placed: at the top level, in a class definition, in a procedure... even inside a compound type if you want (though I might call you crazy). Typedefs are extremely convenient constructs which allow you to short-cut common types to avoid having to specify class paths with lots of dots. For example, if your project uses File.FILE a lot, you could simply typedef File.FILE FILE; in your project's "main.d" source file and just use FILE in all the project source files it imports.
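The upward search described above can be modelled roughly as follows. This is Python rather than Rune, and the Scope class and its method names are made up purely for illustration; only the search order (procedure, file, then up the import hierarchy to the root) and the File.FILE example come from the text.

```python
# Illustrative model only: Scope and its methods are hypothetical names,
# not part of Rune.

class Scope:
    """One level of the search: a procedure, a file, or an import."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # next level up the import hierarchy
        self.decls = {}           # identifier -> declaration
        self.imports = {}         # named imports visible at this level

    def declare(self, ident, value):
        self.decls[ident] = value

    def add_import(self, ident, scope):
        self.imports[ident] = scope

    def lookup(self, ident):
        """Search this level, then walk upward until past the root."""
        scope = self
        while scope is not None:
            if ident in scope.decls:
                return scope.decls[ident]
            if ident in scope.imports:
                return scope.imports[ident]
            scope = scope.parent
        raise NameError(ident)

    def lookup_dotted(self, path):
        """Resolve 'File.FILE': find the head by search, then push down."""
        head, *rest = path.split(".")
        node = self.lookup(head)
        for part in rest:
            node = node.decls[part]
        return node


# main.d imports the File library; c.d is one of main.d's project files.
main_d = Scope("main.d")
file_lib = Scope("File")
file_lib.declare("FILE", "class FILE")
main_d.add_import("File", file_lib)

c_d = Scope("c.d", parent=main_d)
proc = Scope("some_procedure", parent=c_d)
proc.declare("count", "local int")
```

Here proc.lookup("count") is satisfied locally, while proc.lookup_dotted("File.FILE") climbs from the procedure through "c.d" to "main.d", finds the named import, and then pushes down into it. Note that nothing in the walk consults a class hierarchy.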
Typedefs can also be used to change default assignments. For example, typedef myint int a = 4; creates a new type myint. When you instantiate myint the object will default to a contents of 4 unless you specifically give it another default.
* Classes. Rune supports a class mechanism. All types except compound types are based on a class somewhere. This includes core Rune types such as int. What you know as an int in C serves the same function in Rune but int is actually a class in Rune. Classes contain declarations: typedefs, storage declarations, operators, procedures, and constants. It is important to note that Rune uses a C-like pass-by-value model. Rune also supports bounded pointers and so can pass objects by reference, and also supports passing and returning an object by lvalue. An lvalue is effectively an object which appears to be passed by value but is actually passed by reference, allowing the procedure to modify it. lvalue is necessary to implement operators like "++" and "+=". Rune makes no distinction between core classes and user-defined classes.
* Subclasses. Rune supports a subclassing mechanism. A subclass is basically a merge of its superclass plus additional declarations and/or refinements of declarations inherited from the superclass. Namespace overloading is explicit: if you declare something using an identifier that conflicts with a declaration in the superclass and you do not explicitly say that you are refining the superclass declaration, then your declaration will be used by anything you define in your subclass but the 'hidden' declaration in the superclass will be used by anything defined by the superclass. Rune allows you to refine methods, typedefs, and storage (changing the size of preexisting storage elements in an object). However, because I have no wish to force the entire language to be dynamically typed at run time, only subclasses which refine procedures and extend existing storage are compatible with dynamically bound reference pointers (see later).
* Refinement. When you subclass a class you can refine declarations made in the superclass. You can only refine declarations by making them more specific. For example, if the declaration in the superclass is Integer x = 4; then you could refine it to be int8 x; but you could not refine it to be float x;. You can refine types and default values and you can partially resolve procedure arguments (and/or refine argument and return types). Refinement is an extremely powerful mechanism. The core library uses refinement to declare all of Numeric's operators in the Numeric superclass using a typedef'd type called 'T'. The core library then simply refines the typedef T in various subclasses in order to automatically re-form the superclass operators and procedures into more specific operators -- without having to redeclare the operators or procedures at all. This means, in effect, that you can write generic procedures in the superclass that operate on declarations defined by the superclass in the context of the subclass. Subclass refinement is one of the most sophisticated and important features of the Rune language.
Refinement also works well with method procedures. These are procedures which execute with an object context called this. If you create a subclass which refines a superclass declaration, and then call an unrefined superclass method, the superclass method will use your refined declaration(s). Superclass procedures which are overridden by subclass refinement are not automatically called. Instead your refined procedure must use the super.funcname(...) mechanism to chain the superclass method.
* Pass-by-value and embedded objects. Rune uses the C pass-by-value and embedded-type model. In C you can embed one structure inside another. You can do the same with classes in Rune. Rune also has what we call an lvalue extension which allows a procedure to enforce a pass-by-reference and/or return-by-reference for an object that is passed to it literally, which Rune operators use to implement variable-modifying behavior (such as ++x) and which you will be able to use for the same purpose.
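To make the 'T' trick a little more concrete, here is a loose Python analogy. It is not Rune syntax, and the Int, Float, and LoudInt classes are hypothetical; Python class attributes are only a stand-in for Rune's typedef refinement, which is resolved statically rather than looked up at run time.

```python
# Python stand-in for Rune refinement; Int, Float and LoudInt are
# hypothetical classes used only for illustration.

class Numeric:
    T = None                     # stand-in for the typedef'd type 'T';
                                 # left unresolved here, subclasses refine it

    def __init__(self, value):
        self.value = self.T(value)

    def add(self, other):
        # Written once in the superclass, but it uses whatever T the
        # subclass refined, so the result is of the more specific type.
        return type(self)(self.value + other.value)

    def describe(self):
        return "Numeric"


class Int(Numeric):
    T = int                      # refine T; add() is not redeclared


class Float(Numeric):
    T = float                    # same operator, different refinement


class LoudInt(Int):
    def describe(self):
        # The superclass method is not called automatically; it is
        # chained explicitly, in the spirit of super.funcname(...).
        return "LoudInt, a kind of " + super().describe()
```

Int(2).add(Int(3)) and Float(1).add(Float(2)) both run the single add() written in Numeric, yet each produces a value of its own refined type.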
* Forward referenced identifiers. Rune does not impose many requirements on the ordering of declarations within a file or between files and libraries. There are no prototypes in Rune. The only thing you cannot do is directly embed classes to form a loop (class A into class B and class B into class A). You can, of course, embed mutual pointers. Certain identifiers cannot be forward referenced due to being handled at parse time or being used before being initialized in a procedure, but most do not have that restriction.
* Bounded pointers. Rune's interpreter uses bounded pointers. For example, a pointer into an array can be manipulated with addition and subtraction, just like C. But Rune will not allow the pointer to be accessed if it goes out of bounds. Rune also does not allow you to arbitrarily cast pointers from one type to another (XXX ok, it does at the moment, but soon it won't). You can think of this as C with 'safe' pointers. Pointers have been given a bad name by C, but pointers are not inherently a bad concept.
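A small Python model of the idea, for illustration only. The BoundedPointer class is hypothetical, and it assumes that pointer arithmetic is always permitted and that the bounds check happens at the moment of access, which is how the description above reads but is not spelled out in detail.

```python
class BoundedPointer:
    """Hypothetical model: a pointer that remembers its array's bounds."""

    def __init__(self, storage, index=0):
        self._storage = storage
        self._index = index

    def __add__(self, n):
        # Assumption: arithmetic is allowed even past the ends.
        return BoundedPointer(self._storage, self._index + n)

    def __sub__(self, n):
        return BoundedPointer(self._storage, self._index - n)

    def _check(self):
        # Reject negative indexes too, so Python's wrap-around never applies.
        if not 0 <= self._index < len(self._storage):
            raise IndexError("pointer out of bounds")

    def load(self):
        self._check()
        return self._storage[self._index]

    def store(self, value):
        self._check()
        self._storage[self._index] = value
```

With arr = [10, 20, 30] and p = BoundedPointer(arr), (p + 2).load() reads the last element, while (p + 3).load() raises instead of reading whatever happens to sit past the array.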
* Dynamically bound pointers (known as reference types). In Rune something like 'Frame *x;' represents a pointer to a frame. A dynamically bound reference is something like 'Frame @x;'. This creates a pointer object that can be bound to the specified class or any compatible subclass of the specified class at run-time. You can only access the fields defined by the superclass (Frame in this case) through a reference type, but that's why we have refinement. Your method calls will call into the version of the procedures refined by the associated subclass, for example. Reference types are most often used to allow a superclass to define procedures to manage linked lists of the object which mix various subclassed versions of the object.
* There is no union. Believe me, this is a feature. I might still wind up adding a union feature later but, so far, I have found that the subclassing mechanism is sufficient to handle situations that would normally use a union in C. And if the subclassing mechanism is not sufficient, I will make it sufficient.
* User-defined operators. All operators in Rune are physically declared. Even Rune's core operators on integers and other types are declared in the core Rune "sys" library. You can define your own unary and binary operators and you can use almost any sequence of operatic characters to do it. Rune imposes one bit of sanity, however. You cannot specify the precedence. The precedence of an operator is fixed based on the characters making up the operator. All C-like operators wind up with C-like precedences. Your own operators will wind up with C-like precedences too, using simple rules defined in the Rune grammar.
User-defined operators can be emplaced in any class belonging to the arguments you supply to the operator when you use it, and may also be emplaced in your main semantic search path. So, for example, you cannot add a new operator to the core Integer class but you can certainly add a new operator that operates on various kinds of ints semantically, such as in your main.d module.
* LValue operators and procedures (deserves separate mention). An LValue operator is capable of assigning the left hand side as well as returning a result. The left hand side must be an lvalue. For example, the C "+=" and "++" operators are LValue operators, while "+" is a normal RValue operator which returns a temporary result. LValue operators allow Rune to implement almost the full complement of C operators and additionally allows you to build your own user operators with similar characteristics.
* User-defined casts. All casts in Rune are physically declared, including core castings such as int8->int16. The Rune class library implements the more common casts but does not implement all casts. For example, you cannot cast int8 directly to uint32. You would have to do something like (uint32)(uint8)int8_value.
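One way to picture "all casts are declared" is a lookup table of conversions, sketched here in Python. The DECLARED_CASTS table and the cast() helper are hypothetical, and the int8 to uint8 conversion assumes a two's complement bit reinterpretation, which the text does not specify.

```python
# Hypothetical cast registry; the entries mirror the examples in the text.
DECLARED_CASTS = {
    ("int8", "int16"): lambda v: v,
    ("int8", "uint8"): lambda v: v & 0xFF,   # assumed two's complement
    ("uint8", "uint32"): lambda v: v,
}


def cast(value, src, dst):
    """Apply a cast only if it has been declared; never guess a path."""
    try:
        convert = DECLARED_CASTS[(src, dst)]
    except KeyError:
        raise TypeError(f"no cast declared from {src} to {dst}") from None
    return convert(value)
```

cast(-1, "int8", "uint32") fails because no such cast is declared, whereas chaining cast(cast(-1, "int8", "uint8"), "uint8", "uint32") succeeds, in the spirit of (uint32)(uint8)int8_value.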
* Method calls. A method call is something like file->setMode(10). That is, making a call through an object. In Rune a method declaration or call is similar to a procedure call passing the object as an lvalue as the first argument to the procedure and, in fact, the Rune resolver converts method declarations and calls to exactly that. You can also create global method calls. These can be called through the object or through the object's type or class, but rather than passing an object as the first argument the type is instead passed. Something like FILE *fi = FILE.fopen(...) is an example of a global method call.
Rune normally constructs the first argument for a method call silently and automatically. The argument is named this and will be the object the method is acting upon as an lvalue. However, there are cases where you might want to declare this argument yourself. Specifically there are cases where you may wish to pass an lvalue pointer to an object rather than the lvalue object itself. Constructors are typically used to initialize objects in-place, while method calls are typically used to create new objects. Having a method call create an object can get complex if the caller really wants the method to create a subclass instance of the object. While the caller could certainly refine the creation method to allocate the subclass, it is almost always easier to simply have the original method procedure take a reference pointer lvalue as the this argument and allocate the correct subclass object itself. For example, the createFrame method in "classes/gfx/window.d" needs to be passed a reference pointer as an lvalue so it can allocate a new Frame object and then assign it to the pointer. The reference pointer is typically NULL to begin with.
* Rune implements the equivalent of multiple-inheritance through the concept of an interface. In Rune, an interface is simply a subclass definition nested within a class or subclass definition. The nested subclass has full access to its parent's object and the parent object may be passed to, assigned, or cast to entities expecting a superclass of the subclass.
It is somewhat confusing to describe, but the gist is that you can very trivially declare as many interfaces to other classes within your class definition as you like and then refine those interfaces as appropriate, with full access to your larger class. For example, if you want the standard I/O FILE.show() function to work on objects of a custom class you define, all you need to do is declare interface Stdio.Showable { ... refinements... } within your custom class definition and it will just work.
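Python happens to treat method calls in much the same way, so it makes a convenient illustration of the conversion, though it is only an analogy. The File and LogFile classes below are invented for the example, and Python's classmethod is used as a rough counterpart to a Rune global method.

```python
class File:
    def __init__(self, name):
        self.name = name
        self.mode = 0

    def set_mode(self, mode):
        # 'self' plays the role of Rune's implicit 'this' argument.
        self.mode = mode
        return self.mode

    @classmethod
    def fopen(cls, name):
        # Rough analogue of a global method: it receives the class
        # (the type) as its first argument instead of an object.
        return cls(name)


class LogFile(File):
    pass
```

The call f.set_mode(10) and the explicit form File.set_mode(f, 10) are the same procedure call with the object supplied as the first argument. Because fopen receives the type it was called through, LogFile.fopen("x") builds a LogFile without the method being rewritten.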
* Declaration/Class merging. You do not have to build all the declarations making up a class in the class definition itself. You can place the declarations outside the class definition and even place them in other source files as long as library scope is adhered to and the declarations do not forward-reference the class (this is because the merge occurs at parse time rather than resolve-time). This allows very large classes to be built without ruining the readability of the source code. You cannot merge a declaration into a class outside of the library defining the class, but you can of course create a subclass and add elements to the subclass (remember the part where I said being able to do something a thousand different ways is not a good idea? Well, this is one of those).
* No magic type promotion. It should be noted that Rune does not automatically promote integer types for standard operators. If you add two int8's together the result is going to be another int8. Rune will not allow you to compare an int8 against an int32, at least not unless you create a specific operator to do it. My intent is to remove the single largest problem C has when porting between architectures (including tiny 8-bit architectures). If you want something else you need to cast the arguments beforehand or create your own operator (which you can do, in fact). This allows Rune to be used across the entire range of platforms, from 8 bit cpu's to 128 bit cpu's, without having to change the language specification or create special cases.
Automatic type promotion is not the convenience you might think it is. I have occasionally tried to compare or assign integers with characters while programming in Rune but it has not been a burden. In fact I believe requiring the use of a specific cast or integer constant suffix in these cases has actually clarified the code in question.
* Inherent casts. If Rune knows the type an expression needs to be, for example an argument to a procedure, it will attempt to cast your expression to that type. Rune can only cast based on cast operators in the class hierarchy, so Rune can automatically cast, say, an int8 to an int32 but cannot automatically cast an int8 to a uint32 without a little help from the programmer. Keep in mind that such casting occurs after the expression has been evaluated. If you add two int8's together, you will get an int8 result and that result might then be cast to something larger. The cast will not change the fact that the "+" operator in this case still returns an int8 result.
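The behavior can be imitated in Python with a pair of small wrapper classes. This is only a model: Int8 and Int32 here are hypothetical Python classes, and the wrap-around on overflow is an assumption (two's complement), since the text only says that int8 + int8 yields an int8.

```python
class Int32:
    """Hypothetical 32-bit signed integer."""

    def __init__(self, value):
        self.value = (value + 2**31) % 2**32 - 2**31


class Int8:
    """Hypothetical 8-bit signed integer with no implicit promotion."""

    def __init__(self, value):
        # Assumption: values wrap into -128..127 (two's complement).
        self.value = (value + 128) % 256 - 128

    def __add__(self, other):
        if type(other) is not Int8:
            raise TypeError("no '+' operator declared for these types")
        return Int8(self.value + other.value)   # result stays an int8

    def __eq__(self, other):
        if type(other) is not Int8:
            raise TypeError("no '==' operator declared for these types")
        return self.value == other.value
```

Int8(100) + Int8(100) stays an Int8 rather than quietly widening, and Int8(1) == Int32(1) is rejected outright. An explicit conversion such as Int32(Int8(5).value) plays the part of the cast the programmer has to write.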
* Object-oriented. Everything is an object and eventually bases at some class or compound type. Oh wait, I said that. Ok, I'll say it again. Even the lowly int is a class.
* Partial procedure argument resolution. This allows you to make a more complex procedure compatible with a less complex call interface by pre-evaluating and pre-supplying missing arguments. For example, a library that takes a procedure callback does not need to know about additional information you supply with the procedure, such as application-specific structures and pointers. Partial resolution looks like a procedure call in C but you take the address of the call. For example, if you have a function that takes two arguments you can turn it into a function taking one argument by partially resolving the other, like '&func(a:23)'. alias can also be used to partially resolve a procedure.
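Readers who know functools.partial from Python will recognize the shape of this feature. The sketch below is an analogy only: func and run_callback are made-up names, and Python binds the argument at run time, whereas the text describes Rune pre-evaluating and pre-supplying the missing arguments.

```python
from functools import partial


def func(a, b):
    """A two-argument procedure."""
    return a * 100 + b


def run_callback(callback):
    """Stands in for a library that only knows one-argument callbacks."""
    return callback(7)


# Bind the first parameter, a, to 23: a rough analogue of '&func(a:23)'.
one_arg = partial(func, 23)
```

run_callback(one_arg) ends up calling func(23, 7), yet the library never needs to know that the extra argument exists.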
* Memory and code efficient. I stick to a model that does not require structural or support bloat. The run-time is separated from the relatively static information (such as statement and expression trees) in order to allow the relatively static info to be cached in a file (e.g. so it can be shared across multiple instances of the program and to avoid unnecessary copy-on-writes). This will also allow us to implement run-time threading fairly trivially. I've only written an interpreter so far but I fully intend to take advantage of the compartmentalization I've built into the language to produce an intermediate (pre-parsed) form as well as assembly/object code.
* Built-in Interpreter. My intention with Rune is to be able to interpret it, compile it, and even do an on-the-fly hybrid of the two. I also intend to eventually allow dynamic compilation... loading and unloading source modules on the fly. At the moment I am concentrating on the interpreter. I cannot really start work on the compiler until the interpreter is nearly finished because the compiler is going to rely on the interpreter's constant expression resolution/optimization code.
* No #includes. No preprocessor. Yes, this is a feature. The language implements an import mechanism. You pull everything. There is no export mechanism but there are three levels of semantic scope: private, library, and public. My intention is to give the lexer a simple preprocessing capability, similar to what you might find in make, but I have no intention of having a full fledged C-like preprocessor (even though I could just borrow the one from DICE) because I believe that preprocessing, and most especially #include files, take what could be a nice modular object-oriented application and library set and turn it into a portability nightmare.
* Compartmentalized lexer, parser, interpreter, compiler, and library manager. At each stage the intention is to be able to save an image to the file that can be mmap()'d and accessed directly in a later stage, eventually allowing whole projects to be partially pre-parsed, pre-compiled, etc. The features will also be necessary when we allow code modification on the fly.
* Threads. Very cool threads. The traditional problem one has with threading is that one must explicitly declare all sorts of infrastructure to create and manage multiple threads and related data structures. Rune simplifies this greatly by introducing the concept of a threaded procedure call. If thread A calls the threaded procedure X, thread A will be suspended until the threaded procedure X issues a result(exp); statement. The value is returned to A and A's thread resumes execution. The threaded procedure also continues to run. For example, a threaded procedure could allocate and return an object representing the thread to the caller, then continue running as the new thread. Rune threads are cooperative and single-tasking by default. I'm sure lots of people will hate that, since one can't take advantage of multiple cpu's when running a single Rune program, but the fact of the matter is that it is far easier to write a program in a cooperative threading environment than in a preemptive environment. You can avoid nearly all the locking you would otherwise have to deal with and you are not as likely to introduce bugs in your program due to data races. I have not entirely given up on preemption, but I do not intend to turn the language into a dogpile just to make preemption work.
Rune provides statements to force a thread switch, request a 'reasonable' thread switch, and to turn on and off preemption (the interpreter just checks its cycle counter automatically in that case). The mode you use depends on what you are doing. There is no preemption by default and Rune supplies simple scope qualifiers to allow preemption to be turned on on a procedure-by-procedure basis.
It is important to note that controlled preemption is a design feature, not a mistake. I'm sure some people will believe it to be a mistake simply by the fact that they cannot have a single program utilize all available cpu's simultaneously, but those people almost certainly have never actually tried building a large project in a fully preemptive threading environment. Or, if they have, they are still working the bugs out to this day. Full preemption requires a lot of object locking and very careful programming and can actually wind up being slower on multiple cpus than cooperative scheduling or controlled preemption is on a single cpu. Even languages like Java which attempt to integrate object locking into the run time are still unable to protect programmers from making stupid mistakes even with simple threaded programs. Rune's threading mechanisms bring the power of threads to the programmer without bringing the mess that usually accompanies it.
My intention is to eventually allow multi-cpu threading through the use of additional scope qualifiers on procedures, allowing mixed scheduling types, but this sort of threading is intended only for those core pieces of a project for which it might actually make sense to do so. Trying to preemptively multi-thread elements that do not require that level of performance is just plain stupid.
* Compound types and expressions. Rune implements compound types and expressions via a clever use of parentheses and commas. There is no comma-operator in Rune. Instead, there is the concept of the compound expression. A compound expression is something like this: (1, 2, 3). Note that the arguments you supply to a procedure are, in fact, interpreted as one big compound type. Additionally, the elements of a compound type can be named in a manner similar to the declaration of a procedure's arguments. For example, you can do this:
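The threaded procedure call can be approximated with Python generators, shown here purely as a model. Everything in it (call_threaded, run_all, worker, the run queue) is invented for the sketch. The first yield stands in for result(exp), later bare yields stand in for voluntary thread switches, and unlike Rune the new thread only continues once the scheduler loop is run.

```python
from collections import deque

run_queue = deque()              # threads waiting for their next turn
log = []                         # records the order in which work happens


def call_threaded(proc, *args):
    """Start proc as a new thread and return what it hands back first."""
    thread = proc(*args)         # create the generator (the new thread)
    result = next(thread)        # run it up to its first yield: 'result(exp)'
    run_queue.append(thread)     # the thread keeps running later
    return result


def run_all():
    """Round-robin scheduler: each yield is a voluntary thread switch."""
    while run_queue:
        thread = run_queue.popleft()
        try:
            next(thread)
            run_queue.append(thread)
        except StopIteration:
            pass                 # the thread ran to completion


def worker(name, steps):
    handle = {"name": name, "done": False}
    yield handle                 # hand an object to the caller, keep going
    for i in range(steps):
        log.append((name, i))
        yield                    # cooperative switch point
    handle["done"] = True
```

Calling call_threaded(worker, "a", 2) returns the handle immediately, and the worker's remaining steps are interleaved with other threads by run_all(). Since switches happen only at the yields, the shared log list needs no locking.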
(int a = 2, int b, int c) x = (b:23, c:44);
x.c = x.a + x.b;
Now is that cool or what? As with any declaration you can specify default values. When you make a call to a procedure which returns a compound type you can either store the result into a compatible compound type (throwing away elements you don't request), or you can store the result into a normal type in which case only the first element of the returned compound type is retained. When building a compound type any elements which have defaults are optional, and since a procedure's argument set is a compound type you can extend a procedure in a backwards compatible fashion by adding a new argument to it and giving it a default. 'Older' users of the procedure who don't supply the argument are still ok.
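For comparison, the same two ideas (named elements with defaults, and extending a procedure by adding a defaulted argument) look like this in Python. It is an analogy, not Rune: make_x and draw are hypothetical names, and keyword-only parameters are used so that a defaulted element can come before required ones, as in the declaration above.

```python
from types import SimpleNamespace


def make_x(*, a=2, b, c):
    """Model of the compound type (int a = 2, int b, int c)."""
    return SimpleNamespace(a=a, b=b, c=c)


x = make_x(b=23, c=44)           # like x = (b:23, c:44); a keeps its default
x.c = x.a + x.b                  # 2 + 23


def draw(width, height, depth=1):
    """'depth' was added later with a default, so old callers still work."""
    return width * height * depth
```

An 'older' call such as draw(2, 3) keeps working after depth is added, which is the backwards-compatibility property described above.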
* Multi-valued returns. If you haven't guessed yet, a multi-valued return is simply a compound type. The intention here is to eventually provide a mechanism to allow a multi-valued return to be cast into a compatible compound type or even a degenerate non-compound type. While Rune has no 'raise' or exception mechanism, the intent is to eventually tie one into this casting mechanism so, for example, we can raise out of a procedure if the caller does not explicitly access the element representing the exception.
* Constructors and Destructors. Classes and compound types may have any number of constructors and destructors. A constructor or destructor is a method procedure that takes no arguments. Constructors are called when an object is instantiated, after any type and declarative defaults are set. Destructors are called when the last reference to an object goes away. The constructors and destructors associated with the superclass will be called before the constructors and destructors associated with a subclass. Constructors and destructors are called in order, first to last.
Destructors can resurrect an object. When an object's ref count reaches 0 Rune will set the ref count back to 1 and call the destructors. Rune then checks the ref count. If the ref count is still 1 Rune blows the object away. If the ref count is greater than 1 then the object is considered to have been resurrected.
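The resurrection rule is easier to see in code. What follows is a minimal sketch in C, not Rune's actual implementation: object_t, obj_ref, obj_unref, and the freed flag are all names made up for illustration. The text also does not say what happens to the temporary reference after a resurrection, so dropping it again is my assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header; field names are illustrative, not Rune's. */
typedef struct object {
    int refs;                             /* current reference count */
    void (*destructor)(struct object *);  /* may be NULL */
    int freed;                            /* set instead of really freeing, for the demo */
} object_t;

object_t *saved;  /* somewhere a destructor can stash a new reference */

void obj_ref(object_t *o) { ++o->refs; }

/* Drop one reference. Returns 1 if the object was destroyed, 0 if it lives on. */
int obj_unref(object_t *o)
{
    assert(o->refs > 0);
    if (--o->refs > 0)
        return 0;
    o->refs = 1;              /* count goes back to 1 while the destructor runs */
    if (o->destructor != NULL)
        o->destructor(o);
    if (o->refs > 1) {        /* the destructor handed out a reference: resurrected */
        --o->refs;            /* assumption: the temporary reference is dropped */
        return 0;
    }
    o->freed = 1;             /* still 1: blow the object away */
    return 1;
}

/* A destructor that does nothing, and one that resurrects the object. */
void plain_dtor(object_t *o) { (void)o; }
void resurrecting_dtor(object_t *o) { saved = o; obj_ref(o); }
```

With plain_dtor the last obj_unref destroys the object. With resurrecting_dtor the same call leaves the object alive, holding the single reference stored in saved.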
* Heap storage via pointers. In Rune pointers are bounded and can be ref-counted. Heap storage is nothing more than a bounded pointer that is always ref-counted. To create storage on the heap you simply declare a pointer and execute the new method on it. For example, MyObject *a; a.new();. It's that simple. Rune will free the storage when the last reference goes away. Note that Rune does not garbage-collect. Creating an unreachable cycle is considered a run-time error though, at the moment, Rune does not detect the condition. I consider creating an unreachable cycle to be sloppy programming. Also note that while Rune will recursively free chains, it is not a good idea to depend on the feature to delete very long chains because you might run the thread out of stack space. Instead we recommend that you NULL-out the chain pointers manually.
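To illustrate the stack-space concern, here is a rough C analogy of a ref-counted chain. This is not Rune code and the names (node_t, node_unref_recursive, chain_release, live_nodes) are hypothetical. The recursive release uses one stack frame per link, while the manual walk detaches each link before dropping it and so uses constant stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ref-counted list node. */
typedef struct node {
    int refs;
    struct node *next;   /* holds one reference on the next node */
} node_t;

int live_nodes;          /* allocation counter, for demonstration only */

/* Create a node that takes over the caller's reference to 'next'. */
node_t *node_new(node_t *next)
{
    node_t *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->refs = 1;
    n->next = next;
    ++live_nodes;
    return n;
}

/* Recursive release: recursion depth equals the length of the chain. */
void node_unref_recursive(node_t *n)
{
    if (n == NULL || --n->refs > 0)
        return;
    node_unref_recursive(n->next);
    free(n);
    --live_nodes;
}

/* Manual release: unhook each link first, so freeing never cascades. */
void chain_release(node_t *head)
{
    while (head != NULL && --head->refs == 0) {
        node_t *next = head->next;
        head->next = NULL;   /* the "NULL-out the chain pointer" step */
        free(head);
        --live_nodes;
        head = next;         /* we now own the reference 'head' held on next */
    }
}

/* Build a chain of n nodes and return its head. */
node_t *chain_build(int n)
{
    node_t *head = NULL;
    for (int i = 0; i < n; ++i)
        head = node_new(head);
    return head;
}
```

Both routines free the same nodes. Only chain_release is safe to call on a chain of arbitrary length.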
* Scripting. Rune supplies a nifty little scripting mechanism which makes writing a program in Rune and executing it via the interpreter trivial.
* No version control. Rune does not attempt to integrate version control mechanisms into the language. Folks, the plain fact of the matter is that even the most robust Eiffel-like version control mechanism does not make up for actually testing a project with the latest set of so-and-so library. In fact, I strongly believe that integrating such features into a language creates even more problems than it solves because, like exceptions, it results in code paths which are virtually untestable and as likely to blow up as to work the first time they occur in real life. In balancing one evil against the other I would much prefer documentation over enforced language-level compatibility. Rune provides mechanisms, such as argument defaulting, that allow you to create backwards compatible libraries but does not require that you use them. There is also nothing preventing you from including a major version number in the names of the libraries you import, nor is there anything preventing you from importing different versions of the same library (though the usability of doing so is suspect). Rune is designed to facilitate, but not enforce, backwards compatibility.
Please refer to the Rune core class module, ../classes/sys/class.d.
Generally speaking core types are as you might expect coming from a C environment, except:
* char's are always unsigned 8 bit quantities regardless of the architecture.
* int's are always signed 32 bit quantities regardless of the architecture.
* long's are always signed 64 bit quantities regardless of the architecture.
Types in Rune do not have the same kind of wiggle room as you see in C. 'int' is always 32 bits, for example, and 'char' is always an unsigned 8 bit quantity. Period, end of story.
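For readers coming from C, the closest equivalents are the exact-width types in <stdint.h>. The sketch below is only a comparison aid: the rune_*_t names and rune_char_from are invented for this example and are not part of Rune.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical C spellings of the three Rune core types listed above. */
typedef uint8_t rune_char_t;   /* char: always unsigned, 8 bits  */
typedef int32_t rune_int_t;    /* int:  always signed, 32 bits   */
typedef int64_t rune_long_t;   /* long: always signed, 64 bits   */

/* The widths are checked at compile time instead of being left to the platform. */
static_assert(sizeof(rune_char_t) * CHAR_BIT == 8,  "char must be 8 bits");
static_assert(sizeof(rune_int_t)  * CHAR_BIT == 32, "int must be 32 bits");
static_assert(sizeof(rune_long_t) * CHAR_BIT == 64, "long must be 64 bits");

/* Because the char type is unsigned, a negative input wraps modulo 256. */
rune_char_t rune_char_from(int v)
{
    return (rune_char_t)v;
}
```

In plain C, whether char is signed and how wide int and long are depend on the platform. The typedefs above pin all three down the way Rune does.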
* Lexer
The Rune lexer converts a Rune source file into a stream of tokens. The lexer is responsible for detecting and skipping comments and whitespace, and for separating tokens out.
The Rune lexer is written directly in C and is extremely fast. Tokens are tracked through a token_t structure which is typically manipulated by reference, and an integer identifier is returned along with the new token. The parser switches on this identifier almost universally. The lexer has a look-ahead capability which the parser uses on occasion.
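The shape of such a lexer can be sketched in a few dozen lines of C. Only the name token_t comes from the description above. The fields, the TOK_* identifiers, lex_next, lex_peek, and the '#' comment syntax are assumptions made for the sake of a small example, and the real lexer certainly recognizes far more.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token identifiers. */
enum { TOK_EOF = 0, TOK_ID, TOK_INT, TOK_PUNCT };

/* token_t is named in the text; these fields are assumed. */
typedef struct token {
    int type;            /* integer identifier of the current token */
    const char *base;    /* start of the token text */
    size_t len;          /* length of the token text */
    const char *next;    /* where scanning resumes */
} token_t;

void lex_init(token_t *tok, const char *src)
{
    assert(tok != NULL && src != NULL);
    tok->type = TOK_EOF;
    tok->base = src;
    tok->len = 0;
    tok->next = src;
}

/* Skip whitespace and (assumed) '#' comments running to end of line. */
static const char *skip_blank(const char *p)
{
    for (;;) {
        while (isspace((unsigned char)*p))
            ++p;
        if (*p != '#')
            return p;
        while (*p != '\0' && *p != '\n')
            ++p;
    }
}

/* Advance to the next token, update *tok, and return its identifier. */
int lex_next(token_t *tok)
{
    const char *p = skip_blank(tok->next);

    tok->base = p;
    if (*p == '\0') {
        tok->type = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            ++p;
        tok->type = TOK_ID;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            ++p;
        tok->type = TOK_INT;
    } else {
        ++p;                       /* any other single character */
        tok->type = TOK_PUNCT;
    }
    tok->len = (size_t)(p - tok->base);
    tok->next = p;
    return tok->type;
}

/* Look-ahead: report the next identifier without consuming it. */
int lex_peek(const token_t *tok)
{
    token_t copy = *tok;
    return lex_next(&copy);
}
```

A parser would call lex_next in a loop and switch on the returned identifier, using lex_peek for the occasional one-token look-ahead.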
* Parser
The Rune parser takes a stream of tokens from the lexer and (well Duhhh!) parses the language. The Rune parser will also recursively open and parse lexical streams related to import statements.
The Rune parser is recursive-descent, written directly in C. Parser elements are simply procedures. However, the language is LALR(1) (or LALR(2) depending on how you look at it) and the parser never has to back up. This makes the parser extremely fast.
The Rune parser constructs a hierarchy of statements, expressions, and declarations on the fly. The primary hierarchical component that glues everything together is the statement. For example, the statement representing an import will have a reference to the imported entity.
However, declarations are used as the cornerstone of the semantic identifier search and management subsystem. Any identifier that can be looked up will have an associated declaration somewhere. Declarations also earmark storage offsets within structures, compound objects, procedures, and procedure arguments.
The Rune parser does not attempt to resolve identifiers at parse-time. This is one reason why Rune can have forward-referenced identifiers. Identifiers representing definitions are hashed at parse time, however.
* Resolver
The resolver is responsible for taking a parse set and turning it into something that can be interpreted, compiled, and/or saved and restored. The resolver takes three passes on the parse-set:
* Pass 1 - Resolve subclass hierarchy. The resolver must first integrate declarations from the superclass into the subclass. Refinement issues are handled at this time. Note that the statement, expression, and declaration subhierarchies are duplicated for each subclass in order to allow Rune to optimize the refined procedures on a subclass-by-subclass basis. This greatly simplifies the job of the interpreter and compiler. I intend to have Rune figure out which elements in the superclass are not used at all by the subclass and to delete them / not generate them. For now though Rune will generate them.
Resolving the subclass hierarchy is a recursive affair. For example, the declarations in Numeric must be integrated with the declarations in Integer and from there integrated with the declarations in, say, int8.
* Pass 2 - Resolve types, expressions, and declarations. This pass is responsible for assigning an intermediate type to each expression, resolving the types (figuring out actual storage and alignment requirements for a type), and resolving non-procedural declarations (figuring out the actual storage requirements for a declaration). This includes completely resolving compound types, the top-level, and classes, all of which are really just a list of declarations.
This pass is responsible for looking up and resolving operators, procedures, and for enforcing casts. All of these operations involve manipulation of the expression hierarchy.
* Pass 3 - Resolve temporary storage requirements. The interpreter and compiler need to know about temporary storage requirements for expressions and aggregate storage requirements for procedures. For example, if you do something like int a = (b + c) * d; then we need to store the intermediate result from (b + c) somewhere. This pass figures out what those requirements are so the interpreter can allocate the necessary space.
This pass allows the interpreter or compiler to know how much space a procedure requires to operate inclusive of variable declarations and temporary space. The Rune interpreter will reserve the space for the entire procedure up-front, on the stack, when the procedure is called. The resolver is also able to figure out legal overlappings for storage so when you have multiple executable blocks in a procedure, like this: { int a; ... } { int b; ... }, Rune will only need to allocate 4 bytes of actual stack space to represent both a and b. The same goes for temporary space used by an expression. The resolver is aware of the difference between lvalue and non-lvalue intermediate storage and only needs to allocate actual space for non-lvalue storage.
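The overlap calculation for sibling blocks can be sketched as a simple recursion: a block needs its own declarations plus the largest requirement among its nested blocks, since siblings are never live at the same time. This is my simplified reading, written in C with hypothetical names (block_t, frame_bytes). It ignores alignment and expression temporaries, which the real resolver has to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one executable block inside a procedure. */
typedef struct block {
    size_t local_bytes;            /* storage declared directly in this block */
    const struct block *children;  /* nested sibling blocks, or NULL */
    size_t nchildren;
} block_t;

/* Bytes of stack needed for a block when sibling blocks share storage. */
size_t frame_bytes(const block_t *b)
{
    size_t widest = 0;

    assert(b != NULL);
    for (size_t i = 0; i < b->nchildren; ++i) {
        size_t need = frame_bytes(&b->children[i]);
        if (need > widest)
            widest = need;         /* siblings overlap: keep only the largest */
    }
    return b->local_bytes + widest;
}
```

For the { int a; ... } { int b; ... } case, two children of 4 bytes each under a procedure with no other locals give a frame of 4 bytes, not 8.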
* Interpreter
The interpreter will execute a Rune program. It provides internal tie-ins to core classes, operators, and system calls, which are implemented by the interpreter itself rather than in Rune code. For example, adding two integers together ("+" on two ints) requires tie-ins to a native in-interpreter implementation of the operator.
The Rune interpreter will do further on-the-fly optimization of the statement and expression hierarchy. For example, if you have a constant-producing procedure and you call it, the interpreter will interpret the procedure, recognize the constant result, and then shortcut the expression hierarchy so the constant can simply be supplied directly the next time that particular call is made. Another example is our internal bindings. When the interpreter sees an operator bound to an internal function it will resolve the binding and from then on simply call the function directly.
Other common optimizations are things like variable + constant operations, which Rune will do for certain selected types. In most cases the interpreter simply changes the ex_Func function vector in the expression or st_Func function vector in the statement to achieve the optimization.
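Swapping a function vector is a compact trick, so here is a small C sketch of the constant-call case. Only the name ex_Func comes from the text. The exp_t layout, the function signatures, and the helper names are assumptions, and the sketch takes for granted that the callee is already known to produce a constant.

```c
#include <assert.h>
#include <stddef.h>

typedef struct exp exp_t;

/* Hypothetical expression node; ex_Func is named in the text, the rest is assumed. */
struct exp {
    long (*ex_Func)(exp_t *);  /* how to evaluate this expression right now */
    long (*callee)(void);      /* the constant-producing procedure being called */
    long cached;               /* result remembered after the first evaluation */
};

int slow_calls;                /* counts real calls, for demonstration only */

/* Optimized form: hand back the remembered constant. */
long run_const(exp_t *e)
{
    return e->cached;
}

/* Unoptimized form: run the call once, then patch the vector. */
long run_call_once(exp_t *e)
{
    e->cached = e->callee();
    e->ex_Func = run_const;    /* later evaluations skip the call entirely */
    return e->cached;
}

/* A stand-in for a constant-producing procedure. */
long answer(void)
{
    ++slow_calls;
    return 42;
}

/* Evaluate an expression through whatever vector is currently installed. */
long exp_eval(exp_t *e)
{
    assert(e != NULL && e->ex_Func != NULL);
    return e->ex_Func(e);
}
```

The caller never tests a flag. It always goes through ex_Func, and the node rewrites itself the first time it runs.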
The Rune interpreter holds all state necessary to make a procedure call on the stack, allowing us to create pthreads threads for Rune threads on a 1:1 basis.
* Compiler
* Assembler
* Linker