As I continue my journey to learn Rust, I occasionally run into hardships and difficulties. Most of them are related to lifetimes and ownership, but they lead to a pleasant moment of enlightenment when I figure them out. For the last couple of days, though, I struggled with a much dumber error, one which I should have figured out much faster.
I was trying to load some word vector embeddings from the finalfusion package. Word vectors have grown up since I last used them (back in the word2vec days): these are 3.85 GB.
I tried to load them up and to play with the API on my Linux desktop. The loading time was about 10 seconds, but then it was fast. And it worked.
Fast forward a month, during which I worked on other projects, and I get around to working with the word vectors again, this time from my Windows desktop. The rest of the project runs fine, but when it comes to loading the word vectors, it errors out with a menacing stack trace:
thread 'main' panicked at 'capacity overflow', src\liballoc\raw_vec.rs:750:5 ...
I look at the library's repo: there was a new release in the meantime. They warn about a breaking change, so maybe the old code can't read newer vectors? I update the library; I change the code; I still get the same error.
Maybe it's my Rust version that's old and something broke. I update Rust. Nope, not it.
I try to DuckDuckGo the error, but I don't find anything relevant. So I open an issue on the GitHub repo of the library and ask about it. I get an answer within 5 minutes (thank you for the prompt answer, Daniel!): am I using the 32-bit or the 64-bit toolchain?
I facepalm hard, because I realize that's probably the cause: the word vector file is right around the limit of what a 32-bit process can address (4 GiB at most), and there are probably some extra allocations done while loading, so it goes overboard.
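The 'capacity overflow' message fits this explanation. Rust's `Vec` refuses any single allocation larger than `isize::MAX` bytes, and on a 32-bit target that is just under 2 GiB. Here is a minimal sketch of the arithmetic, assuming the embeddings end up in one contiguous buffer (I have not checked how finalfusion actually allocates). The function name and the rounded byte count are mine, for illustration only.

```rust
/// Hypothetical helper: can a buffer of `bytes` bytes be requested at all,
/// given the largest single allocation (`max_alloc` bytes) the target allows?
fn fits_in_one_allocation(bytes: u64, max_alloc: u64) -> bool {
    bytes <= max_alloc
}

fn main() {
    // Approximate size of the embeddings file from this post (3.85 GB, rounded).
    let embeddings_bytes: u64 = 3_850_000_000;

    // Vec caps a single allocation at isize::MAX bytes on the target it was built for.
    let max_here = isize::MAX as u64;
    // On i686 targets isize is 32 bits wide, so its maximum equals i32::MAX.
    let max_on_32_bit = i32::MAX as u64;

    println!("pointer width of this build: {} bits", usize::BITS);
    println!(
        "fits on this target: {}",
        fits_in_one_allocation(embeddings_bytes, max_here)
    );
    println!(
        "fits on a 32-bit target: {}",
        fits_in_one_allocation(embeddings_bytes, max_on_32_bit)
    );
}
```

Built with a 64-bit toolchain, the first check passes and the second fails, which matches what happened on the two machines.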
I check with rustup what toolchain I have:
> rustup toolchain list
stable-i686-pc-windows-msvc (default)
That's the 32-bit toolchain, my friends. So I install the x86_64 toolchain and set it as default:
> rustup toolchain install stable-x86_64-pc-windows-msvc
> rustup default stable-x86_64-pc-windows-msvc
> rustup toolchain list
stable-i686-pc-windows-msvc
stable-x86_64-pc-windows-msvc (default)
And lo and behold, the word vectors are now successfully loaded and I can start playing around more seriously with them.
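A quick way to catch this kind of mismatch early is to have the program report the pointer width it was compiled for. This is only a sketch: the helper name is made up, and where to put such a check in a real project is a judgment call.

```rust
/// Hypothetical helper: the pointer width, in bits, of the target this binary was built for.
fn pointer_width_bits() -> u32 {
    usize::BITS
}

fn main() {
    // cfg! is resolved at compile time and reflects the *target* of the toolchain,
    // not the operating system the binary happens to run on.
    if cfg!(target_pointer_width = "64") {
        println!("built for a {}-bit target", pointer_width_bits());
    } else {
        eprintln!(
            "warning: built for a {}-bit target; multi-gigabyte buffers will not fit",
            pointer_width_bits()
        );
    }
}
```

For a stricter guard, putting `#[cfg(not(target_pointer_width = "64"))] compile_error!("this project needs a 64-bit target");` at the top of the crate should make the build fail outright on a 32-bit toolchain.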
Why is the 32-bit toolchain the default one on Windows in 2020?
I’m publishing this as part of 100 Days To Offload - Day 12.