add --trim option for generating smaller binaries

Since I am a big fun of programming languages and compilers, I always follow the latest news about them. I am also a big fun of Julia programming language. I follow the latest commits almost everyday, especially the ones that are related to the compiler. But one of the most important commits that I have ever seen was the one that is related to the --trim option for generating smaller binaries. This feature is the start of a game changer one for the Julia programming language. Finally, I want to share my first tastes of this feature.

In 2024, September 29th, I got up early and turned my computer on. I was excited to see that the legendary pull request was merged.

Here is the link for the pull request:
add --trim option for generation smaller binaries

This pull request introduces a new feature of trimming binary size of the compiled Julia code. Note that the feature is not enabled by default or well-integrated into the compiler. We have now three Julia scripts in the contrib folder of the source code:

These files can be manually downloaded if you are not compiling your own Julia instance from the source code. The juliac.jl script is the main script that is used to compile the Julia code. The juliac-buildscript.jl is a helper script that is used by the juliac.jl script. The julia-config.jl script is used to configure the Julia compiler. I prepared a bash script to download and update these files easily. Here is the script:

download-script.sh

!#/bin/bash
rm -f juliac-buildscript.jl
rm -f juliac.jl
rm -f julia-config.jl

wget https://raw.githubusercontent.com/JuliaLang/julia/refs/heads/master/contrib/juliac-buildscript.jl
wget https://raw.githubusercontent.com/JuliaLang/julia/refs/heads/master/contrib/juliac.jl
wget https://raw.githubusercontent.com/JuliaLang/julia/refs/heads/master/contrib/julia-config.jl

The Early Bird

Since the feature is not well-integrated into the compiler, we need to manually install the newest Julia, that is, we need a nightly build or a custom compiled instance from source. Juliaup is the best tool for this purpose. We can add the nightly build of Julia to our system by running the following command:

#> juliaup add nightly

I always have an instance of the latest mainstream, the alpha channel, and the nigtly build of Julia. Here is the list of the installed Julia versions on my system:

#> juliaup status
Default  Channel  Version                              Update 
--------------------------------------------------------------
         alpha    1.11.0-rc4+0.x64.linux.gnu                  
         nightly  Development version 1.12.0-DEV.1282         
      *  release  1.10.5+0.x64.linux.gnu        

We can always switch between the installed versions by running the following command:

#> julia +nightly --version
julia version 1.12.0-DEV

#> julia +alpha --version
julia version 1.11.0-rc4

#> julia --version
julia version 1.10.5

helloworld.jl

Let's get started and delve into it :) Wait, I don't use chatGPT here, just kidding. Here is the simplest Julia code that prints "Hello, World!" to the console. Note that the code is wrapped in a module. This is important for the compiler to generate a binary file. The main function is annotated with the @ccallable macro. The main function returns an integer, but not in type of Int, it's Cint which is native integer type of C language. The function returns 0, which is the exit code, indicating that the program has been executed successfully.

module HelloWorld

Base.@ccallable function main()::Cint
    println(Core.stdout, "Hello, World!")
    return 0
end

end # End of module HelloWorld 

Compiling in terminal

The --trim option is mandatory for generating smaller binaries. If you don't use it the compiler will generate a binary file that is not trimmed. The --output-exe option is used to specify the name of the output file. This option can take 3 arguments. An exe (--output-exe), a dynamic link library (--output-lib), or a system image (--output-sysimage). We choose the exe option here. The last argument is the name of the source file.

#> julia +nightly juliac.jl --output-exe hello --trim helloworld.jl 

Compiled file

In my Linux machine, the output is an ELF 64-bit executable file. The file size is 1652184 bytes which is considerably large for a simple "Hello, World!" program. But remember the non-trimmed version of the file is much larger (about 150 mbs). That's a huge difference.

#> file hello
hello: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, 
BuildID[sha1]=c6748be5508f17a4163a6e0ddcfc345c80eddf8d, 
for GNU/Linux 3.2.0, with debug_info, not stripped

#> ls -al hello
-rwxrwxr-x 1 hako hako 1652184 Oct  2 10:02 hello*

Running the file

Here is the time it takes to execute the file. The total time is 34.03 milliseconds. Yes, it's a subjective measurement. But we can use this number to compare it to more complex ones.

#> time ./hello
Hello, World!

________________________________________________________
Executed in   34.03 millis    fish           external
   usr time   27.45 millis    0.00 micros   27.45 millis
   sys time    7.15 millis  603.00 micros    6.55 millis    

Let's take a look at the linked shared libraries of the executable file.

#> ldd ./hello
linux-vdso.so.1 (0x00007ffe954ff000)
libjulia.so.1.12 => .../lib/libjulia.so.1.12 (0x0000724c69598000)
libjulia-internal.so.1.12 => .../lib/julia/libjulia-internal.so.1.12 (0x0000724c68e00000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000724c68a00000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000724c69575000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000724c69570000)
libunwind.so.8 => .../julia/libunwind.so.8 (0x0000724c68400000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x0000724c69569000)
libz.so.1 => .../lib/julia/libz.so.1 (0x0000724c67e00000)
libatomic.so.1 => .../julia/libatomic.so.1 (0x0000724c6955f000)
libstdc++.so.6 => .../julia/libstdc++.so.6 (0x0000724c67a00000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000724c69474000)
libgcc_s.so.1 => .../julia/libgcc_s.so.1 (0x0000724c6944f000)
/lib64/ld-linux-x86-64.so.2 (0x0000724c696ae000)

It seems the executable file is linked to the Julia shared libraries, which are libjulia.so and libjulia-internal.so (among others). That means the executable file is not standalone. It needs the Julia shared libraries to run. When the compiler gets the ability of generating statically linked binaries, we can expect the executable file to be standalone. That means the file will not need any shared libraries to run. Yes, like we do it in Rust. This would be huge step for Julia, I think. Time will show and show must go on.

Let's get it more complex

Here is a more complex example, but not that much. We are going to integrate the standard normal distribution function from -1.96 to 1.96. The result is nearly 0.95. The code is wrapped in a module and the main function is annotated with the @ccallable macro again. The other things are the same for a regular Julia user. The important thing here is the code should be type-safe. It's guaranteed by the Julia compiler that the normal function take a Float64 as an argument and returns a Float64, like we do in other type-safe languages. The method used for numerical integration is the simplest one, the rectangle rule, a.k.a Riemann sums. The code is not optimized for performance. It's just for demonstration purposes. Smaller eps values will give more accurate results. But it will take more time to compute.

module NumericalIntegration

    function normal(x::Float64)::Float64
      return (1 / sqrt(2 * pi)) * exp(-0.5 * x * x)
    end

    function integrate(f::Function, a::Float64, b::Float64, eps::Float64)::Float64
      total = 0.0
      x = a
      while x <= b
        total += f(x) * eps
        x += eps
      end
      return total
    end

    Base.@ccallable function main()::Cint
      result = integrate(normal, -1.96, 1.96, 0.0001)
      println(Core.stdout, result)
      return 0
    end

end # End of module NumericalIntegration    

Let's compile the code and see the results.

#> julia +nightly juliac.jl --output-exe integrate --trim integrate.jl 

The file is a 64-bit ELF executable again. But this time the file size is 1756192 bytes. It's larger than the previous one. But remember the previous one was a simple "Hello, World!" program.

#> file integrate
integrate: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, 
BuildID[sha1]=c430370463ca9af258eccc17657d8e1ef2861a30, 
for GNU/Linux 3.2.0, with debug_info, not stripped

#> ls -al integrate
-rwxrwxr-x 1 hako hako 1756192 Oct  2 10:08 integrate*

The time it takes to execute the file is 34.88 milliseconds. It's nearly the same as the previous one.

#> time ./integrate
0.9500100536071496

________________________________________________________
Executed in   34.88 millis    fish           external
   usr time   23.67 millis    0.00 micros   23.67 millis
   sys time   13.57 millis  903.00 micros   12.66 millis

Yes, I think that's all for now. I hope you enjoyed the article. I will be back with more interesting topics. Stay tuned and take care of yourself. Bye for now. Sorry for the style of this web page, by the way, I am not a React developer :)

- jbytecode(2024-10-02)