SHOC: A Hebrew language compiler

5 devlogs
23h 44m 37s

a compiler for my ancient-biblical-hebrew programming language
the name of the language is Shoresh (“root” in hebrew)
compiles to native x86_64 or AArch64 linux asm
!this project was started way before i registered it for hackclub!

This project uses AI

no use of ai at all in code generation/debugging
minor use of gemini for overall project architecture

Demo Repository

acetim100

Shipped this project!

Hours: 7.3
Cookies: 🍪 133
Multiplier: 18.22 cookies/hr

finally added AArch64 support (cross-compilation) for the shoresh language!

acetim100

FAAAAAHHHH

it is currently 5 am.
i spent 2 hours debugging this single fibonacci program.
because i'm developing arm64 support for my compiler and my processor is x86_64, i have to use an emulator

the process of debugging through an emulator is pure torture

  • i have to run the emulator with a flag to make the program pause at start and be ready for gdb to attach at a specific port
  • then i need to run a cross debugger and set the architecture to arm64
  • then i need to connect to said port to start debugging the executable
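a rough sketch of those three steps as command lines. the devlog does not name the tools, so this assumes QEMU user-mode emulation (`qemu-aarch64`, whose `-g` flag waits for a gdb connection on a port) and `gdb-multiarch` as the cross debugger; the binary name `./fib` and the port `1234` are made up for illustration.

```python
def emulator_command(binary, port=1234):
    # Assumption: QEMU user-mode emulation. "-g PORT" makes the emulated
    # program pause at start and wait for gdb to attach on that port.
    return ["qemu-aarch64", "-g", str(port), binary]


def debugger_command(binary, port=1234):
    # Assumption: gdb-multiarch is the cross debugger.
    # Set the target architecture first, then attach to the waiting emulator.
    return [
        "gdb-multiarch", binary,
        "-ex", "set architecture aarch64",
        "-ex", f"target remote localhost:{port}",
    ]


if __name__ == "__main__":
    # "./fib" is a hypothetical binary name.
    print(" ".join(emulator_command("./fib")))
    print(" ".join(debugger_command("./fib")))
```

the two commands run in separate terminals: the emulator first, then the debugger.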
the problem was this:

in my arm64 asm generator i accidentally swapped the w0 and w1 registers
w0 - is the right side of the expression
w1 - is the left side of the expression
which i had documented for myself at the head of this function

so i was actually checking if w0 < w1 (right < left)
instead of w1 < w0 (left < right)
when checking a “lower than” expression
(the bug was also present in “higher than” expressions)
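a minimal sketch of what the corrected comparison could look like in a code generator. only the register convention (w1 = left, w0 = right) comes from the devlog; the function name, the signed comparison, and putting the result back into w0 are assumptions for illustration.

```python
# Hypothetical mapping from source operator to AArch64 condition code
# (signed comparison is an assumption).
CONDITIONS = {"<": "lt", ">": "gt"}


def emit_compare(op):
    # Convention from the devlog: w1 holds the left operand, w0 the right.
    # "cmp w1, w0" sets the flags from (left - right), so the condition
    # code then reads naturally as "left <op> right".
    cond = CONDITIONS[op]
    return [
        "cmp w1, w0",          # left vs right (the buggy version had w0, w1)
        f"cset w0, {cond}",    # w0 = 1 if the condition holds, else 0
    ]


if __name__ == "__main__":
    print("\n".join(emit_compare("<")))
```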

then it finally worked!

some other changes i made

fixed some other bugs i caught during static analysis, relating to function prologues and epilogues

acetim100

working on AArch64 support

  • finished arm64 code generation
  • currently testing it

ALSO: i had to learn arm for this, so it's been a while since the last devlog!

arm asm is way more annoying than the “regular” x86 architecture: every instruction is a fixed 32 bits wide, and with that come challenges mainly revolving around immediate value loading and other stuff like literal addresses
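to illustrate the immediate-loading problem: a 64-bit constant does not fit inside a 32-bit instruction, so one common approach is to build it 16 bits at a time with `movz` followed by `movk`. this is a sketch of that general technique, not necessarily what shoc does; the function name is made up, and it assumes a 64-bit `x` register.

```python
def load_immediate(reg, value):
    """Emit a movz/movk sequence that loads a 64-bit constant into an x register."""
    value &= 0xFFFFFFFFFFFFFFFF  # treat the value as an unsigned 64-bit pattern
    lines = []
    for shift in (0, 16, 32, 48):
        chunk = (value >> shift) & 0xFFFF
        if chunk == 0:
            continue  # zero chunks are already covered by movz clearing the register
        # The first instruction zeroes the rest of the register (movz);
        # later ones keep the other bits and patch in 16 more (movk).
        op = "movz" if not lines else "movk"
        suffix = f", lsl #{shift}" if shift else ""
        lines.append(f"{op} {reg}, #0x{chunk:x}{suffix}")
    if not lines:
        lines.append(f"movz {reg}, #0x0")  # the constant was zero
    return lines


if __name__ == "__main__":
    print("\n".join(load_immediate("x0", 0x12345678)))
```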

but! arm also has some pretty neat features like three-operand instructions, meaning i can take 2 registers, add them together and put the calculated value in another register, all in one instruction
this saved me a few lines of code, but nothing significant

some patches

  • removed annoying stderr message each time the compiler enters sync mode
acetim100

Shipped this project!

Hours: 16.45
Cookies: 🍪 297
Multiplier: 18.03 cookies/hr

i built a compiler for my own language, which is inspired by ancient biblical hebrew. the hardest part was debugging the generated asm file with gdb to ensure it works as intended. during debugging i learned about the 16-byte stack alignment requirement that libc functions (and a lot of other libraries) rely on (the System V ABI requires it so that aligned SSE loads and stores work). because i used 0 external libraries for this compiler, i had to learn a lot about how compilers really work and implement everything by myself, which was really enjoyable
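as a small illustration of the alignment rule, a compiler can round the space it reserves for locals up to the next multiple of 16 before emitting the prologue. this is a generic sketch of that rounding, with a made-up function name, and not taken from shoc's source.

```python
def aligned_frame_size(local_bytes, alignment=16):
    # Round up to the next multiple of `alignment` (must be a power of two).
    # Adding alignment-1 and masking off the low bits is the usual trick.
    return (local_bytes + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    for size in (0, 1, 16, 24):
        print(size, "->", aligned_frame_size(size))
```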

acetim100

FINAL DEVLOG (BEFORE SHIPPING)

shoc- now available on npm!!!
full documentation is available on the github repo in the link below; it covers how to install the compiler, how to use it & how to write in the language

what did i do since the last devlog?

i just finished testing and fixing bugs in the assembly part and in the program itself
also added a shabatchk function to the compiled asm: it checks if it's sabbath today, and if it is, the program will not run (no work is permitted on sabbath)
i also have a repo in my github account that contains the notepad++ config for the user defined language; it provides syntax highlighting for shoresh and a nice development env
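the idea behind the sabbath check can be sketched in a few lines. the real shabatchk lives in the generated asm and the devlog does not say how it decides the day, so this is only a simplified stand-in: it assumes "sabbath" means the local calendar day is Saturday and ignores the Friday-sunset start.

```python
import datetime
import sys


def is_sabbath(day):
    # Simplifying assumption: sabbath == Saturday on the local calendar.
    # date.weekday() counts Monday as 0, so Saturday is 5.
    return day.weekday() == 5


def shabbat_check(today=None):
    # Hypothetical equivalent of the check run at program start:
    # refuse to do any work if today is sabbath.
    today = today or datetime.date.today()
    if is_sabbath(today):
        sys.exit(1)


if __name__ == "__main__":
    shabbat_check()
    print("not sabbath, running")
```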

planning to add:

i will add support for arm64 linux native asm
and i will probably add arrays sometime soon

github repo

acetim100

CODE GENERATION & I/O & BUG FIXES

fixed some bugs in semantic validation (concerning stack alignment in x64 asm)

implemented some of the code generation functions (to x86_64 GAS intel syntax asm), but not all of them yet
the most important one is AstExpression, for which i made a recursive algorithm that traverses the AST and converts it to asm code
the algorithm itself isn't the most efficient because it uses the stack to store intermediate values during calculation, and i will probably implement a register map to use the registers instead of the stack (until they are all in use ofc)
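a minimal sketch of the stack-based approach described above, in intel syntax. the tuple-shaped AST, the function name, and the choice of rax/rcx are assumptions for illustration; the real AstExpression code is not shown in the devlog.

```python
# Hypothetical AST shape: an int literal, or a tuple (op, left, right).
BINARY_OPS = {"+": "add rax, rcx", "-": "sub rax, rcx", "*": "imul rax, rcx"}


def gen_expr(node, out=None):
    """Emit asm that leaves the value of `node` in rax."""
    if out is None:
        out = []
    if isinstance(node, int):
        out.append(f"mov rax, {node}")
        return out
    op, left, right = node
    gen_expr(left, out)
    out.append("push rax")        # park the left value on the stack
    gen_expr(right, out)
    out.append("mov rcx, rax")    # right value -> rcx
    out.append("pop rax")         # left value back into rax
    out.append(BINARY_OPS[op])    # rax = left <op> right
    return out


if __name__ == "__main__":
    print("\n".join(gen_expr(("+", 1, ("*", 2, 3)))))
```

every binary node costs one push and one pop, which is the inefficiency a register map would avoid.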

i also added io functions: one that lets you input a number into a variable, one that lets you print an expression and one that lets you print a string

printing and input are handled by libc

acetim100

ADDED UNARY EXPRESSIONS AND FIXED SOME BUGS

i added the possibility to use unary expressions like - and ! inside my expression parser
the image shown is the output of a debugging function that prints out the structured nodes that define the program
the highlighted part is the not operator (לא in hebrew) being parsed correctly
the expression is printed in prefix notation, or as some may call it, “polish notation”
i also fixed a bug in the type checker that caused it to skip checking the validity of the expressions it passes to the function
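the prefix-notation dump mentioned above can be sketched like this. the tuple-shaped nodes and the function name are assumptions for illustration; unary nodes simply carry one operand instead of two.

```python
def to_prefix(node):
    """Render a hypothetical AST node in prefix (polish) notation."""
    # Leaves (numbers, variable names) print as themselves.
    if not isinstance(node, tuple):
        return str(node)
    # Operator first, then each operand; works for unary and binary nodes.
    op, *operands = node
    return "(" + " ".join([op] + [to_prefix(o) for o in operands]) + ")"


if __name__ == "__main__":
    # "not (a < 5)" as a unary node wrapping a binary node
    print(to_prefix(("!", ("<", "a", 5))))
```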
