Packaging TeX Live For MSYS2

tl;dr: it's ready now, and I wrote a Python library for it.


On 2021/04/02, I read on Norbert's blog that TeX Live 2021 had been released, and I wanted to download it. I am on Windows and had recently learned about MSYS2, so I decided to give them some love 💗.

The first thing I did was look at how Arch Linux does the packaging. Did you know that Arch Linux and MSYS2 use the same package manager, Pacman? So, naturally, the best way to start an MSYS2 package is to look at the Arch Linux one. That got me to https://archlinux.org/packages/?q=texlive, where I could see there were quite a few packages. The packages texlive-bin and texlive-core caught my eye, so I started with the PKGBUILD of texlive-bin. Going through its depends and makedepends, I found that some packages were missing in MSYS2. I tried building them; many failed, so I finally disabled them. They're something people usually won't use anyway.

Next, I edited that PKGBUILD to add the MSYS2 dependencies, removed the things that aren't used, and started compiling. Everything went smoothly until the build hit a redeclaration in the headers. After some searching, I came up with a patch that worked, 0001-Remove-DLUASOCKET_INET_PTON-from-Makefile.patch. Then the whole beast built successfully. That sounds easy, but it actually took 3 days.

With the binary package, texlive-bin, ready, I decided to tackle texlive-core next. I had a look at its PKGBUILD, but this time I was confused: there were things I didn't know, and I was unsure how they produced the sources (later I discovered that they are a copy from CTAN). I went to the mailing list and asked for help, hoping that someone would reply.

I got a reply the next morning from Norbert. He explained the specific configuration files and the ways to generate them. I found that everything was in one file, texlive.tlpdb; I could either parse it or do things manually and take pains to update them each year. I chose to do things automatically :).

Birth of Msys2-Texlive

I decided to parse texlive.tlpdb automatically. People have already used Perl for this purpose, and TeX Live has an official API for it; sadly, I didn't know Perl (and I didn't know about the API initially). So I wrote a parser myself in Python: the birth of msys2-texlive ;-).
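To give an idea of what parsing texlive.tlpdb involves, here is a minimal sketch. It is not the actual msys2-texlive code. It assumes the usual layout of the file: one stanza per package, stanzas separated by blank lines, each line being a key followed by a value, and indented lines continuing the previous key (that is how file lists are stored). The package names in the sample are made up for illustration.

```python
from typing import Dict, List, Optional

# Illustrative sample only; the package names are hypothetical.
SAMPLE = """\
name examplepkg
category Package
shortdesc An example package
runfiles size=1
 texmf-dist/tex/latex/examplepkg/examplepkg.sty

name otherpkg
category Package
depends examplepkg
"""


def parse_tlpdb(text: str) -> Dict[str, Dict[str, List[str]]]:
    """Parse TLPDB-style text into {package name: {key: [values]}}."""
    packages: Dict[str, Dict[str, List[str]]] = {}
    current: Dict[str, List[str]] = {}
    last_key: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current package stanza.
            current, last_key = {}, None
            continue
        if line.startswith(" "):
            # Indented lines belong to the previous key (e.g. a file list).
            if last_key is not None:
                current[last_key].append(line.strip())
            continue
        key, _, value = line.partition(" ")
        if key == "name":
            # A "name" line starts the entry for a new package.
            current = packages.setdefault(value, {})
        current.setdefault(key, []).append(value)
        last_key = key
    return packages


db = parse_tlpdb(SAMPLE)
for name, fields in db.items():
    print(name, fields.get("depends", []), fields.get("runfiles", []))
```

Keeping every value as a list is a deliberate shortcut: keys such as depends can repeat, and a file list ends up as the size entry followed by the paths. A real tool would split those apart.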

When I asked the MSYS2 people about this, they created a repository called msys2-texlive and added me as a maintainer.

With that project, I automated the creation of the sources from CTAN, including the config files. Every file is generated on GitHub Actions and uploaded as a release asset, which the PKGBUILD downloads while building.

Then I opened a PR to the MINGW-packages repo with the new PKGBUILD. It compiled successfully after quite a few tries. Then came another hurdle: the launchers weren't working on one platform. They just SegFaulted on 32-bit.

Debugging SegFault - 32-bits

I noticed this happened when installing the texlive-core package and running its hooks on GitHub Actions, so I decided to debug them. I looked into the SegFault and found that it came from mtxrun.exe; I grepped around to find its source. It was a Perl script, and I was like, "oh no!", how can an interpreted language create an exe file? Then I found that the executable is a launcher that calls Perl when it is run. The launcher was called runscript.exe; I looked at its source, which was written in C. I didn't know enough C to find the problem in it, so I jumped into a help channel on the MSYS2 Discord server. (Read the linked chat to get an insight into what the issue was.)
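The launcher idea itself is simple, and a rough sketch may help. This is not the real runscript.exe (that one is C); it is a Python stand-in that assumes a launcher does little more than start an interpreter with some arguments and hand the child's exit code back. The function name launch is made up, and sys.executable plays the role of the real interpreter.

```python
import subprocess
import sys
from typing import List


def launch(interpreter: str, args: List[str]) -> int:
    """Run `interpreter` with `args` and hand back the child's exit code."""
    completed = subprocess.run([interpreter, *args])
    return completed.returncode


# Stand-in for "interpreter + script": a child process that exits with code 3.
exit_code = launch(sys.executable, ["-c", "import sys; sys.exit(3)"])
print(exit_code)
```

Whatever the child returns is what the caller of the launcher sees, so a crash inside the interpreter looks like a crash of the launcher.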

@1480c1 jumped in to help and told me that a calculation done with an int could be a possible reason for the SegFault on 32-bit. But that wasn't it; after changing it, it was still SegFaulting. I thought it would be great to drop 32-bit...

With more debugging (never give up!), you know, "with print statements", I found that the cause of the SegFault wasn't in that launcher; it happened when it called _spawnvp and returned the child's return code. That was something strange I found: returning a weird return code makes Bash on Windows (Cygwin) report a SegFault. For example, a sample program like the one below would SegFault.

int main() {
    /* 0xC0000005 read as a signed 32-bit int */
    return -1073741819;
}
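The number in that sample is not random. Read as an unsigned 32-bit value it is 0xC0000005, which is the Windows status code STATUS_ACCESS_VIOLATION, and that is presumably why the shell reports it as a SegFault. A small Python check of the arithmetic (the helper name as_ntstatus is made up for this sketch):

```python
# Windows status code for an access violation.
STATUS_ACCESS_VIOLATION = 0xC0000005


def as_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


code = -1073741819
print(hex(as_ntstatus(code)))
print(as_ntstatus(code) == STATUS_ACCESS_VIOLATION)
```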

The launcher was calling a program called texlua.exe, which I then found out is a Lua interpreter. I ran it on its own, and it didn't SegFault. A hello world script worked too, so the problem had to be in the script the launcher runs, runscript.tlu. Again, with "print-statement debugging", I found that the script called the FFI module, and that is where it crashed. Boom!

Reported at: https://mailman.ntg.nl/pipermail/dev-luatex/2021-April/006484.html

It was a hard debugging session that took 6 hours.

And finally, after all this, I was able to get the build assets from the CI and use TeX Live to compile a simple document. https://twitter.com/syrus_dark/status/1386241640252145664 🎉

