Feed Icon RSS 1.0 XML Feed available

The Flexible Series as a Core Concept of Rebol

Date: 5-Sep-2008/22:44

Tags:

Characters: (none)

This is the first of several articles I'm going to write about the Rebol programming language. To learn more about it, you can visit rebol.com. But my hope is to demystify some of its strengths and weaknesses in a way that their website currently does not, so if you read what I write first then it might help. :)
UPDATE 12-Dec-2012 Rebol version 3 is now open source under the Apache 2.0 license, so all those disclaimers I used to have in my Rebol articles about not getting attached to a proprietary language are no longer applicable! Use it without hesitation!!!
Rebol is built on the foundational concept of a flexible series, and that's what I'm going to focus on here. It's one of the many things that give the language a kind of elegant uniformity. In my article about Computer Languages as Artistic Medium, someone likened it to modeling clay (mold is even a keyword in the language). I thought that might even make a good marketing visual and made a prototype, so keep the words "flexible", "artistic", "uniform", "workable" and "fun" in mind.
My rendition of the Rebol logo in clay

How Rebol is like other Interpreted Languages

At first blush, Rebol's series might remind you of other languages. For instance, Lisp has Sequences that are used to represent not only data, but also the program code itself. A simple Lisp program to print "Hello World" could look like this:
(print "Hello world")
Lisp programmers understand that although that looks like source code, the interpreter also thinks of it as a "list" with two items. The first item is a token for the print command, and the second item is a string.
Depending on what suits our purposes, we can treat this information as data or code. If we use a certain evaluator, we will get the desired effect of printing out the string. Or we could write some other code that would process the list in some way. Let's look at this principle in action with Rebol.
Note In the samples below, I'm showing code being evaluated in the Rebol interpreter. When you see two greater than signs, that's just the command line prompt for the interpreter >> When you run something, it will print any output... and after that, if the expression evaluated to a value, it will show the value next to two equals signs ==.
>> first [ print "Hello world" ]

== print
The expression evaluated to print because the print command is the first element in the list. Notice that Rebol uses brackets the way Lisp uses parentheses. If you use the more generalized method of picking an element by its index position, you can get the second item in the series, which is a string in this case:
>> pick [ print "Hello world" ] 2
== "Hello world"
Now let's do some "meta-programming"... a bit of code which manipulates our HelloWorld program a little bit before running it. You could use the replace command, and get a representation of a program that would print a different message:
>> replace [print "Hello world"] "Hello world" "Goodbye world"
== [ print "Goodbye world" ]
To take that one step further and actually run this new program, you'd evaluate the expression using do:
>> do replace [print "Hello world"] "Hello world" "Goodbye world"
Goodbye world
This style will come as no shock to anyone who's worked with interpreted languages with heritage from LISP. My own first exposure to such thinking was actually John Ousterhout's wildly popular Tool Command Language (Tcl).

The Series! Interface: Turtles All the Way Down

There's a funny part in Stephen Hawking's Brief History of Time, where he gives a retelling of a story known in science circles:
A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the center of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: "What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise." The scientist gave a superior smile before replying, "What is the tortoise standing on?" "You're very clever, young man, very clever," said the old lady. "But it's turtles all the way down!"
Once you get past the oddity of what's on Rebol's full plate, it can start to look surprisingly like turtles all the way down. For instance, when you write:
[ print "Hello world" ]
What you're getting under the hood of the interpreter is a lot more like if you'd written it as:
[ print [ #"H" #"e" #"l" #"l" #"o" #" " #"w" "o" #"r" #"l" #"d" ] ]
This is because Rebol makes many data types expose the Series! interface. So strings and other objects can be treated like lists even if they are not "lists". So let's try using some of the commands we saw working on blocks operate on strings instead:
>> first "Hello world"
== #"H"

pick "Hello world" 2
== #"e"

>> replace "Hello world" #"o" #"a"
== "Hella world"
Note The replace function only replaces the first instance by default. If you want it to replace all the instances in a series, you use replace/all
If you're only vaguely familiar with Lisp you might want to say that it does this too, since Strings are actually Sequences. But Lisp's Strings and Vectors are in truth a subclass of Sequence known as "Array", which have the nasty property that you can't change their size once they've been created! The results aren't pretty:
* (defparameter *myString* (string "Karl Marx"))
*MYSTRING*

* (subseq *myString* 0 4)
"Karl"

* (setf (subseq *myString* 0 4) "Harpo")
"Harpo"

* *myString*
"Harp Marx"

* (subseq *myString* 4)
" Marx"

* (setf (subseq *myString* 4) "o Marx")
"o Marx"

* *myString*
"Harpo Mar"
Whether you know Lisp or not, I'll just say that the credibility toward elegance sort of falls apart there. Here's the equivalent code in Rebol, showing how it should have been done:
Note If you're new to Rebol, you'll notice that it doesn't use equals signs for assignments. It uses a name followed by a colon and then a value, such as number-of-pizzas: 10. Another thing is that functions support optional parameters known as "refinements" via slashes, so really just model copy and copy/part as two completely different functions for now.
>> my-string: "Karl Marx"
== "Karl Marx"

>> copy/part my-string 4
== "Karl"

>> change/part my-string "Harpo" 4
== " Marx"

>> print my-string
"Harpo Marx"

>> skip my-string 5
== " Marx"

>> insert skip my-string 5 " of the Brothers"
== " Marx"

>> print my-string
"Harpo of the Brothers Marx"
This is a powerful concept, especially because there are 13 fundamental types in Rebol that have this behavior. Here they are:
  • block! → Blocks of values
  • paren! → Blocks of values enclosed in parentheses
  • path! → Paths of values
  • list! → Linked lists
  • hash! → Associative arrays
  • string! → Character strings
  • binary! → Byte strings
  • tag! → HTML and XML tags
  • file! → File names
  • url! → Internet uniform resource locators
  • email! → Email names
  • image! → Image data
  • issue! → Sequence codes

Rebol has Safe Iterators Built In

Iterators aren't a new thing. In fact, the C++ standard template library was built on the notion of generic algorithms that work with any object that elects to expose the iteration interface. You can binary search characters in a string using the same code that would search an array, or a linked list, or a hash table. Sorting and many other operations are defined abstractly in a powerful fashion.
Yet doing things "the right way" in C++ took a long time to standardize. These features aren't built into the language... and many people who program in C++ are either oblivious to them or avoid learning how to use them effectively. In part, that's because it looks like some crazy moon language:
std::string myString = "Hello world";
std::string::iterator myIterator = myString.begin();
while(myIterator != myString.end()) {
    cout << *myIterator << std::endl; 
    myIterator++;
}
But Rebol took generalized iteration to heart and implemented this very important interface within the core interpreter. The result is efficient, terse, and every program takes it for granted. For starters, I'll show you a rather "wordy" line-by-line parallel of the above code in Rebol:
my-string: "Hello world"
myIterator: head my-string
while [my-iterator <> tail my-string] [
    print (first my-iterator)
    my-iterator: next my-iterator
]
But interestingly, there is no separate data type for iterators. Operations like next return a subsequence of the original series. That subsequence starts at the element you are currently referencing and goes to the end:
>> my-string: "Hello world"
== "Hello world"

>> my-iterator: my-string
== "Hello world"

>> my-iterator: next my-iterator
== "ello world"

>> my-iterator: next my-iterator
== "llo world"

>> unset 'my-string

>> print my-string ; ** Just to prove it's gone **
** Script Error: my-string has no value
** Near: print my-string

>> print my-iterator
== "llo world"

>> my-iterator: head my-iterator
== "Hello world"
Note the entire series is reference counted, and any sub-series you have indexing into it keeps the data alive...even if you destroy the original series reference! You can still find the head of the series again when this happens. So it's possible to use the series to iterate itself, and then reset it when you are done:
my-string: "Hello world"
while [not tail? my-string] [
    print (first my-string)
    my-string: next my-string
]

my-string: head my-string
In fact, this is exactly how the forall idiom in Rebol works:
my-string: "Hello world"
forall my-string [
    print (first my-string)
]
Of course, you don't have to buy into the idea of modifying a series during an iteration. It's a little bit dodgy, especially regarding the semantics when you break out of the loop! But there are many ways to approach the problem, such as the foreach statement:
my-string: "Hello world"
foreach my-char my-string [
    print my-char
] 

In Conclusion

There is something special about the specific trade-offs in Rebol series!. I hope I've shed a little bit of light on that specialness! It comes down to a kind of universal iteration and homogeneous structure that Lisp seemed to promise, but failed to follow all the way on...
You can find out more about it by reading the Series Documentation... as well as some specific documentation for block! series and string! series.
Business Card from SXSW
Copyright (c) 2007-2015 hostilefork.com

Project names and graphic designs are All Rights Reserved, unless otherwise noted. Software codebases are governed by licenses included in their distributions. Posts on blog.hostilefork.com are licensed under the Creative Commons BY-NC-SA 4.0 license, and may be excerpted or adapted under the terms of that license for noncommercial purposes.