ArSpr is a set of 2 utility functions that lets you write Arabic text in Pico8. You input Latin text and it transliterates it to Arabic. I made it because I want to make localized Pico8 games, and to practice writing my own implementation of Right-to-Left text and Arabic Shaping.
You can get it from the Pico8 forum:
Or you can download the following p8.png cart and import it to Pico8:
Usage is easy, but slightly different from the regular print() function in Pico8: you have to initialize a variable to contain the text data, then draw it. Here’s an example:
function _init()
  text1=create_ar_spr([string])
end

function _draw()
  cls()
  draw_ar(text1, [x], [y])
end
The text has 3 colors: one for the letters, one for the dots, and one for the tashkeel. We can easily change colors like this:
function _init()
  text1=create_ar_spr([string])
end

function _draw()
  cls()
  ar_col1=[1-16] --letters color
  ar_col2=[1-16] --dots color
  ar_col3=[1-16] --tashkeel color
  draw_ar(text1, [x], [y])
end
[Reading past this point is optional]
TECHNICAL INFO ABOUT AR_SPR
This project was inspired by Tiny Text, which uses a similar method to encode a Pico8 font.
510 out of 8192 tokens
~30% of compressed space
No dropped frames when filling the screen
Each character’s height is fixed at 8 pixels max. The width is variable, but most characters are 5 pixels wide including an empty 1 pixel space on the right of Initial and Isolated glyphs.
I decided early that I wasn’t going to do a mono-spaced font because some letters need more space to be legible, such as ض, which is as wide as 2 typical letters when pixelated.
FONT ENCODING AND PARSING
The compressed font is made up of 2 sprites total:
Basically, we draw text on screen by copying and pasting chunk areas from the sprite map.
The chunk data is encoded in a few long strings, because Pico8 counts a string literal as a single token no matter how long it is. Let’s take a look at one of them. This one contains all the tashkeel characters:
_ar_tash = ",^__121__,/__121_6,~_112__1,▥_22111_+__32____,❎_2211__+__221__1+__321___,🐱_2211__,ˇ_2211y_+__2211__,⬇️_2211_6,✽_2211y6+__2211_6"
The string _ar_tash has the crop and draw locations for every drawn chunk. Each character in the string is converted to a letter or number. The format is “[character], [character advance], [sprite X pos], [sprite Y pos], [sprite width], [sprite height], [draw offset X], [draw offset Y], [end chunk or add a new chunk to same character]”. Every bracket in the list is a single character, and an underscore equals zero or null.
Let’s take the last entry in the string as an example: ✽_2211y6+__2211_6
That translates to “The letter ✽ will advance the line 0 pixels. It has a chunk at (x=2,y=2), size (1,1), drawn at offset (-2, 6) and another chunk (2, 2), size (1, 1), drawn at offset (0, 6).”
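The decoding described above can be sketched in a few lines. This is a rough illustration in plain Lua (not Pico8's dialect, which lacks gmatch and string.byte), not ArSpr's actual parser. The names decode_field, parse_entry and parse_font are made up for this sketch. It assumes that "," separates entries and "+" separates chunks, as the string suggests, and that letters count down from z = -1; only "y" = -2 is confirmed by the example.

```lua
-- Decode one field character into a number.
-- From the article: "_" is 0 and "y" is -2. Digits are taken at face value.
local function decode_field(c)
  if c == "_" then return 0 end
  local n = tonumber(c)
  if n then return n end
  -- ASSUMPTION: lowercase letters count down from z = -1, so y = -2.
  return string.byte(c) - string.byte("z") - 1
end

-- Parse one entry such as "✽_2211y6+__2211_6".
-- Each chunk ends with 7 single-character fields:
-- advance, sprite x, sprite y, width, height, offset x, offset y.
-- Whatever precedes them is the glyph key ("_" on continuation chunks).
local function parse_entry(entry)
  local key, advance, chunks = nil, 0, {}
  for part in entry:gmatch("[^+]+") do
    local fields = part:sub(-7)
    if not key then
      key = part:sub(1, -8)
      advance = decode_field(fields:sub(1, 1))
    end
    chunks[#chunks + 1] = {
      sx = decode_field(fields:sub(2, 2)),
      sy = decode_field(fields:sub(3, 3)),
      w  = decode_field(fields:sub(4, 4)),
      h  = decode_field(fields:sub(5, 5)),
      ox = decode_field(fields:sub(6, 6)),
      oy = decode_field(fields:sub(7, 7)),
    }
  end
  return key, advance, chunks
end

-- Parse a whole font string into a table keyed by glyph character.
local function parse_font(data)
  local font = {}
  for entry in data:gmatch("[^,]+") do
    local key, advance, chunks = parse_entry(entry)
    font[key] = { advance = advance, chunks = chunks }
  end
  return font
end
```

Slicing the 7 fields off the end of each chunk means the multi-byte key character never has to be measured, which is why this works in plain Lua. Inside a cart, the same idea would probably be written with Pico8's split(), sub() and ord() instead.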
Letters have 4 glyph variations (initial, medial, final, isolated). Arabic Shaping is when we pick the needed glyph based on A) the position of the letter in the word and B) the “Joining Type” of the letter.
In the alphabet, there are 2 Joining Types: 1- Some letters can connect to the previous and following letter and we call them Dual-Joining. They have all 4 glyph variations (initial, medial, final, isolated), such as ب
ببب ب – بـ ـبـ ـب ب
2- Other letters can only join the previous letter and we call them Previous-Joining. They have 2 glyph variations (final, isolated), such as د
ددد بدددد – ـد د
Other characters, such as spaces, numbers, and punctuation, do not join with letters and so only have a single, isolated glyph.
Tashkeel (diacritics) are completely ignored by the Arabic Shaper, and they have zero width so they can be drawn in the same position as a letter. They only have a single glyph each.
My Arabic Shaping implementation is a simple case-switch (pseudo code):
function _get_font(previousChar,currentChar,nextChar)
  if currentChar is tashkeel
    return _ar_tashkeel_glyph
  elseif currentChar is (space, number, punctuation)
    return _ar_nonletter_glyph
  elseif previousChar is any except dual-joining letter
     and currentChar is dual-joining letter
     and nextChar is any except (space, number, punctuation)
    return _ar_init_glyph
  elseif previousChar is dual-joining
     and currentChar is dual-joining
     and nextChar is any except (space, number, punctuation)
    return _ar_medial_glyph
  elseif previousChar is dual-joining
     and currentChar is dual-joining
     and nextChar is (space, number, punctuation)
    return _ar_final_glyph
  elseif previousChar is dual-joining letter
     and currentChar is previous-joining letter
    return _ar_final_glyph
  else
    -- all previous checks failed.
    return _ar_isolated_glyph
  end
end
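To see the case-switch in action, here is a small runnable version in plain Lua. It is only a sketch that mirrors the pseudo code, not the code that ships in ArSpr. The JOINING table, the Latin stand-ins ("b" for ب, "d" for د, "^" for a tashkeel) and the names kind, get_form and shape are all invented for the example. It also assumes that the start and end of the string behave like a space.

```lua
-- HYPOTHETICAL joining-type table keyed by Latin stand-in characters.
local JOINING = {
  b = "dual",          -- stand-in for a dual-joining letter like ب
  d = "prev",          -- stand-in for a previous-joining letter like د
  ["^"] = "tashkeel",  -- stand-in for a diacritic
}

-- Classify a character. Anything unknown (space, digit, punctuation) and
-- the string boundaries (nil) count as non-joining. ASSUMPTION for this sketch.
local function kind(c)
  if c == nil or c == "" then return "none" end
  return JOINING[c] or "none"
end

-- Same order of checks as the pseudo code, returning a form name
-- instead of a glyph table.
local function get_form(prev, cur, nxt)
  local p, c, n = kind(prev), kind(cur), kind(nxt)
  if c == "tashkeel" then
    return "tashkeel"
  elseif c == "none" then
    return "nonletter"
  elseif p ~= "dual" and c == "dual" and n ~= "none" then
    return "initial"
  elseif p == "dual" and c == "dual" and n ~= "none" then
    return "medial"
  elseif p == "dual" and c == "dual" and n == "none" then
    return "final"
  elseif p == "dual" and c == "prev" then
    return "final"
  else
    return "isolated"
  end
end

-- Pick a form for every character of an ASCII stand-in string.
local function shape(str)
  local forms = {}
  for i = 1, #str do
    local prev = i > 1 and str:sub(i - 1, i - 1) or nil
    local nxt = i < #str and str:sub(i + 1, i + 1) or nil
    forms[i] = get_form(prev, str:sub(i, i), nxt)
  end
  return forms
end
```

With these stand-ins, shape("bbb") gives initial, medial, final, which matches the ببب example, and shape("db") gives two isolated forms because د never joins the letter after it. Like the pseudo code, this looks only at the immediate neighbours, so a tashkeel sitting between two letters would need to be skipped before the lookup.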
For a different implementation of Arabic Shaping you can check out MiniBidi by Ahmad Khalifa.
- We can eliminate duplicate chunk data in the strings by letting characters reference common chunks.
- We can try using caaz’s sprite compression technique.
- In an earlier version, I had one function called printAr(str,x,y) that didn’t need strings to be initialized, because it cached each string after drawing it for the first time. It was neat to have everything in one function! But I removed it to save tokens. It would be nice to bring it back…
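The caching idea from the last point could look roughly like this. It is a guess at the general shape, not the removed printAr code. print_ar and ar_cache are hypothetical names, and create_ar_spr and draw_ar are replaced with stand-ins here so the sketch runs in plain Lua outside Pico8.

```lua
-- Stand-ins for the real ArSpr functions, only so this runs outside a cart.
-- create_calls counts how often a string actually gets converted.
local create_calls = 0
local function create_ar_spr(str)
  create_calls = create_calls + 1
  return { source = str }
end
local function draw_ar(text, x, y)
  return text
end

-- HYPOTHETICAL cache: one converted text object per source string.
local ar_cache = {}

-- Convert the string the first time it is seen, then reuse the result.
local function print_ar(str, x, y)
  local text = ar_cache[str]
  if not text then
    text = create_ar_spr(str)
    ar_cache[str] = text
  end
  return draw_ar(text, x, y)
end
```

Calling print_ar every frame from _draw() would then only pay the conversion cost once per distinct string. The cache is never cleared in this sketch, so a game that prints many changing strings (a score counter, for example) would need some way to evict old entries.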