This is the first of several articles I'm going to write about the Rebol programming language. To learn more about it, you can visit rebol.com. But my hope is to demystify some of its strengths and weaknesses in a way that their website currently does not, so if you read what I write first then it might help. :)
UPDATE 12-Dec-2012
Rebol version 3 is now open source under the Apache 2.0 license, so all those disclaimers I used to have in my Rebol articles about not getting attached to a proprietary language are no longer applicable! Use it without hesitation!!!
Rebol is built on the foundational concept of a flexible series, and that's what I'm going to focus on here. It's one of the many things that give the language a kind of elegant uniformity. In my article about Computer Languages as Artistic Medium, someone likened it to modeling clay (mold is even a keyword in the language). I thought that might even make a good marketing visual and made a prototype, so keep the words "flexible", "artistic", "uniform", "workable" and "fun" in mind.
How Rebol is like other Interpreted Languages
At first blush, Rebol's series might remind you of other languages. For instance, Lisp has Sequences that are used to represent not only data, but also the program code itself. A simple Lisp program to print "Hello World" could look like this:
(print "Hello world")
Lisp programmers understand that although that looks like source code, the interpreter also thinks of it as a "list" with two items. The first item is a token for the print command, and the second item is a string.
Depending on what suits our purposes, we can treat this information as data or code. If we use a certain evaluator, we will get the desired effect of printing out the string. Or we could write some other code that would process the list in some way. Let's look at this principle in action with Rebol.
Note
In the samples below, I'm showing code being evaluated in the Rebol interpreter. When you see two greater than signs, that's just the command line prompt for the interpreter
>>
When you run something, it will print any output... and after that, if the expression evaluated to a value, it will show the value next to two equals signs ==
.
>> first [ print "Hello world" ]
== print
The expression evaluated to
print
because the print command is the first element in the list. Notice that Rebol uses brackets the way Lisp uses parentheses. If you use the more generalized method of picking an element by its index position, you can get the second item in the series, which is a string in this case:>> pick [ print "Hello world" ] 2
== "Hello world"
Now let's do some "meta-programming"... a bit of code which manipulates our HelloWorld program a little bit before running it. You could use the
replace
command, and get a representation of a program that would print a different message:>> replace [print "Hello world"] "Hello world" "Goodbye world"
== [ print "Goodbye world" ]
To take that one step further and actually run this new program, you'd evaluate the expression using
do
:>> do replace [print "Hello world"] "Hello world" "Goodbye world"
Goodbye world
This style will come as no shock to anyone who's worked with interpreted languages with heritage from LISP. My own first exposure to such thinking was actually John Ousterhout's wildly popular Tool Command Language (Tcl).
The Series! Interface: Turtles All the Way Down
There's a funny part in Stephen Hawking's Brief History of Time, where he gives a retelling of a story known in science circles:
A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the center of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: "What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise." The scientist gave a superior smile before replying, "What is the tortoise standing on?" "You're very clever, young man, very clever," said the old lady. "But it's turtles all the way down!"
Once you get past the oddity of what's on Rebol's full plate, it can start to look surprisingly like turtles all the way down. For instance, when you write:
[ print "Hello world" ]
What you're getting under the hood of the interpreter is a lot more like if you'd written it as:
[ print [ #"H" #"e" #"l" #"l" #"o" #" " #"w" "o" #"r" #"l" #"d" ] ]
This is because Rebol makes many data types expose the
Series!
interface. So strings and other objects can be treated like lists even if they are not "lists". So let's try using some of the commands we saw working on blocks operate on strings instead:>> first "Hello world"
== #"H"
pick "Hello world" 2
== #"e"
>> replace "Hello world" #"o" #"a"
== "Hella world"
Note
The
replace
function only replaces the first instance by default. If you want it to replace all the instances in a series, you use replace/all
If you're only vaguely familiar with Lisp you might want to say that it does this too, since Strings are actually Sequences. But Lisp's Strings and Vectors are in truth a subclass of Sequence known as "Array", which have the nasty property that you can't change their size once they've been created! The results aren't pretty:
* (defparameter *myString* (string "Karl Marx"))
*MYSTRING*
* (subseq *myString* 0 4)
"Karl"
* (setf (subseq *myString* 0 4) "Harpo")
"Harpo"
* *myString*
"Harp Marx"
* (subseq *myString* 4)
" Marx"
* (setf (subseq *myString* 4) "o Marx")
"o Marx"
* *myString*
"Harpo Mar"
Whether you know Lisp or not, I'll just say that the credibility toward elegance sort of falls apart there. Here's the equivalent code in Rebol, showing how it should have been done:
Note
If you're new to Rebol, you'll notice that it doesn't use equals signs for assignments. It uses a name followed by a colon and then a value, such as
number-of-pizzas: 10
. Another thing is that functions support optional parameters known as "refinements" via slashes, so really just model copy
and copy/part
as two completely different functions for now.
>> my-string: "Karl Marx"
== "Karl Marx"
>> copy/part my-string 4
== "Karl"
>> change/part my-string "Harpo" 4
== " Marx"
>> print my-string
"Harpo Marx"
>> skip my-string 5
== " Marx"
>> insert skip my-string 5 " of the Brothers"
== " Marx"
>> print my-string
"Harpo of the Brothers Marx"
This is a powerful concept, especially because there are 13 fundamental types in Rebol that have this behavior. Here they are:
block!
→ Blocks of valuesparen!
→ Blocks of values enclosed in parenthesespath!
→ Paths of valueslist!
→ Linked listshash!
→ Associative arraysstring!
→ Character stringsbinary!
→ Byte stringstag!
→ HTML and XML tagsfile!
→ File namesurl!
→ Internet uniform resource locatorsemail!
→ Email namesimage!
→ Image dataissue!
→ Sequence codes
Rebol has Safe Iterators Built In
Iterators aren't a new thing. In fact, the C++ standard template library was built on the notion of generic algorithms that work with any object that elects to expose the iteration interface. You can binary search characters in a string using the same code that would search an array, or a linked list, or a hash table. Sorting and many other operations are defined abstractly in a powerful fashion.
Yet doing things "the right way" in C++ took a long time to standardize. These features aren't built into the language... and many people who program in C++ are either oblivious to them or avoid learning how to use them effectively. In part, that's because it looks like some crazy moon language:
std::string myString = "Hello world";
std::string::iterator myIterator = myString.begin();
while(myIterator != myString.end()) {
cout << *myIterator << std::endl;
myIterator++;
}
But Rebol took generalized iteration to heart and implemented this very important interface within the core interpreter. The result is efficient, terse, and every program takes it for granted. For starters, I'll show you a rather "wordy" line-by-line parallel of the above code in Rebol:
my-string: "Hello world"
myIterator: head my-string
while [my-iterator <> tail my-string] [
print (first my-iterator)
my-iterator: next my-iterator
]
But interestingly, there is no separate data type for iterators. Operations like
next
return a subsequence of the original series. That subsequence starts at the element you are currently referencing and goes to the end:>> my-string: "Hello world"
== "Hello world"
>> my-iterator: my-string
== "Hello world"
>> my-iterator: next my-iterator
== "ello world"
>> my-iterator: next my-iterator
== "llo world"
>> unset 'my-string
>> print my-string ; ** Just to prove it's gone **
** Script Error: my-string has no value
** Near: print my-string
>> print my-iterator
== "llo world"
>> my-iterator: head my-iterator
== "Hello world"
Note the entire series is reference counted, and any sub-series you have indexing into it keeps the data alive...even if you destroy the original series reference! You can still find the head of the series again when this happens. So it's possible to use the series to iterate itself, and then reset it when you are done:
my-string: "Hello world"
while [not tail? my-string] [
print (first my-string)
my-string: next my-string
]
my-string: head my-string
In fact, this is exactly how the
forall
idiom in Rebol works:my-string: "Hello world"
forall my-string [
print (first my-string)
]
Of course, you don't have to buy into the idea of modifying a series during an iteration. It's a little bit dodgy, especially regarding the semantics when you break out of the loop! But there are many ways to approach the problem, such as the
foreach
statement:my-string: "Hello world"
foreach my-char my-string [
print my-char
]
In Conclusion
There is something special about the specific trade-offs in Rebol
series!
. I hope I've shed a little bit of light on that specialness! It comes down to a kind of universal iteration and homogeneous structure that Lisp seemed to promise, but failed to follow all the way on...
You can find out more about it by reading the Series Documentation... as well as some specific documentation for block! series and string! series.