This is a really intriguing project from 2013, exploring what programming languages in non-Latin scripts could look like. The fact that this is a Lisp in Arabic makes it plain beautiful to behold, too.
Not being able to grasp anything of what's going on onscreen, with more than a logical or syntactic barrier standing in the way, is a really unique experience for me, and it is how I imagine people who don't know English experience the innards of most modern programs. Really, really fascinating.
We have a Portuguese programming language called Portugol. It’s meant for educational purposes. It has no other reason to exist. It resembles Pascal. It sucks balls.
One needs to be careful not to conflate writing systems with languages, programming languages with natural languages, and software tools with programming languages.
I’m all for anything that pushes tool-makers to make software tools more amenable to non-English, non-Latin script text. But let’s not make assumptions about cultural biases! One of my first questions for this language would be: how do you, for instance, express numbers in قلب? Fundamentally, all that computers understand are numbers, and in particular, digital computers only understand binary integers. We develop a lot of abstractions over those numbers in order to translate data into something more convenient for humans to manipulate (such as textual representations of variable or constant identifiers). Since humans understand text better than numbers, we do all kinds of things for our convenience, like converting binary numbers to hexadecimal or decimal. In fact, we even express text as other kinds of text! Unicode assigns integers to every written character (or at least, a lot of them), but we don’t display text as a sequence of integers. This is not a cultural bias, but rather a universal human bias. Computers don’t have biases; only the engineers who design, operate, and program them do!
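To make the point about numbers and their textual renderings concrete, here is a minimal Python sketch. It says nothing about how قلب itself handles numbers; it only shows one integer rendered several ways, including Arabic-Indic digits, which Python's `int()` parses at runtime. The variable names are illustrative.

```python
# One integer, several textual renderings chosen for human convenience.
n = 1602  # happens to be the code point of ARABIC LETTER QAF

renderings = {
    "binary": bin(n),
    "decimal": str(n),
    "hexadecimal": hex(n),
    "character": chr(n),
}

# The same number written with Arabic-Indic digits (U+0660..U+0669),
# given as escapes so the logical order is unambiguous: 1, 6, 0, 2.
arabic_indic = "\u0661\u0666\u0660\u0662"

# int() accepts any Unicode decimal digits, not just ASCII 0-9.
parsed = int(arabic_indic)

# Going the other way: map each ASCII digit onto its Arabic-Indic counterpart.
localized = "".join(chr(0x0660 + int(d)) for d in str(n))

print(renderings, parsed, localized)
```

Note that this works for strings parsed at runtime; Python source code itself only accepts ASCII digits in numeric literals.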
For example, let’s look at the name 'قلب' in more detail:
| index | code-point | ordinal | unicode-name | category | width | utf-8 | character |
|---|---|---|---|---|---|---|---|
| 0 | U+0642 | 1602 | ARABIC LETTER QAF | Lo | N | b'\xd9\x82' | ق |
| 1 | U+0644 | 1604 | ARABIC LETTER LAM | Lo | N | b'\xd9\x84' | ل |
| 2 | U+0628 | 1576 | ARABIC LETTER BEH | Lo | N | b'\xd8\xa8' | ب |
I generated the table above with a Python script (even though I don’t know much if any Arabic myself). For a non-Arabic reader, this kind of representation of an array of characters is useful. One of my first questions would be, for an Arabic reader who doesn’t know English or the Latin script very well, what would be a more approachable and useful representation of arbitrary text to help them? Could you feasibly write such a useful program in قلب? If قلب doesn’t help a programmer to fluently manipulate non-Arabic text data just as easily as Arabic text, then it is guilty of the exact same human biases that are being foisted on other programming languages.
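For anyone who wants to reproduce a table like the one above, here is a minimal sketch using only Python's standard `unicodedata` module. It is not necessarily the script that produced the table; the function name `describe` is hypothetical, and it assumes the "width" column is the Unicode East Asian Width property.

```python
import unicodedata


def describe(text):
    """Return one dict per character with the columns used in the table."""
    rows = []
    for index, char in enumerate(text):
        rows.append({
            "index": index,
            "code-point": f"U+{ord(char):04X}",
            "ordinal": ord(char),
            "unicode-name": unicodedata.name(char, "<unnamed>"),
            "category": unicodedata.category(char),
            # Assumption: "width" means the East Asian Width property.
            "width": unicodedata.east_asian_width(char),
            "utf-8": char.encode("utf-8"),
            "character": char,
        })
    return rows


# The three letters of the name, given as escapes to avoid display-order issues.
for row in describe("\u0642\u0644\u0628"):
    print(" | ".join(str(value) for value in row.values()))
```

Because it works on code points rather than on any particular script, the same function can describe Latin, Arabic, or any other text.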
I think the goal should be for arbitrary programming languages and tools to work just as easily for any human, regardless of what languages (natural or programming) they know. I don’t think the goal should be for every natural language population to go off and develop ad-hoc programming languages and tools that only work for them.
When one knows a natural language (say English) and does not know a programming language (say Python), I’d argue that the majority of the existing barriers to entry have to do with the tools the programmer uses, and not with the orthography, vocabulary, or syntax. If I want to make my identifiers Arabic words in Arabic script in Python, I am free to do so:
    In [3]: %paste
    def قلب(ا):
        return ا

    قلب(1)
    ## -- End pasted text --
    Out[3]: 1
Note that while the Python interpreter is happy to interpret the code above, the browser (or other software tool) you are using to display it may not render the mixed Arabic and Latin script correctly.
This goes to my main thesis: the tools that display the text and let me manipulate it on my screen make assumptions about character set and text direction (and often make these assumptions poorly). This is a problem with the tools and not with the programming language itself! (Python's syntax and pragmatics are not very opinionated when it comes to natural language, other than documentation resources.) If you wanted to alias all the built-ins and keywords in Python with Arabic equivalents, there is not much to prevent you from doing so (though operators in Python are less amenable to such orthographic transitions without wider pragmatic repercussions). LISP dialects are even more amenable to lexical variation because they have much less syntactic sugar (so I’m not surprised that قلب is a dialect of LISP).
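As a rough sketch of the aliasing idea, the snippet below binds a few built-ins to Arabic names inside a namespace. The Arabic names are illustrative guesses, not an established vocabulary, and `ALIASES` and `install_aliases` are hypothetical helpers. It covers built-ins only: keywords such as `def` and `return` cannot be rebound by assignment, so aliasing them would need a separate source-translation step.

```python
import builtins

# Hypothetical aliases; the Arabic names are illustrative guesses only.
ALIASES = {
    "طول": "len",     # roughly "length"
    "مجموع": "sum",   # roughly "sum"
    "اطبع": "print",  # roughly "print"
}


def install_aliases(namespace, aliases=ALIASES):
    """Bind each alias to the built-in it names inside the given namespace."""
    for alias, target in aliases.items():
        namespace[alias] = getattr(builtins, target)


scope = {}
install_aliases(scope)

# Code written against the aliases evaluates normally in that namespace:
# sum([1, 2, 3]) plus the length of a three-letter string.
result = eval("مجموع([1, 2, 3]) + طول('قلب')", scope)
print(result)
```

Whether the expression inside `eval` displays in a sensible order on your screen is, once again, up to the tool rendering it rather than to Python.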
Really cool, I wonder if there are other non-Latin programming scripts out there too! Imagine what programming in kanji could look like...