What is Pacoloco?
Pacoloco behaves like a regular Pacman mirror, except that it only downloads a package the first time it is requested. It acts as a cache between your local Pacman package manager and the remote mirror. When run on a local server, it massively reduces the time Pacman needs to download new and updated packages.
Pacoloco can also download new versions of packages automatically, for example every night; this is called “prefetching”. It is really nice: even on a 50 Mbps connection it cuts the download part of a large upgrade from minutes to just seconds, and it shares the downloaded packages between the machines in your local network. Very nice for rarely updated machines!
Installation
This is fairly straightforward, so I will keep it short: First, install the pacoloco
package on a local server, configure it to use some mirrors of your choice and set a schedule for how often it should prefetch packages.
You can also configure the directory where you want Pacoloco to store the files it caches.
The config file is at /etc/pacoloco.yaml
Second, update your Pacman mirrorlist to use the Pacoloco cache. By default, the URL for a regular x86 installation looks like this:
http://pacolocohost.example:9129/repo/archlinux/$repo/os/$arch
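As a small sketch, the matching mirrorlist entry could be generated like this. The helper name pacoloco_server_line is made up for illustration; the port and URL layout are simply taken from the URL above. The single quotes keep $repo and $arch literal so that Pacman can fill them in itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Pacman "Server" line pointing at a Pacoloco host.
# $1 = host including port, $2 = Pacoloco repo name (defaults to "archlinux").
pacoloco_server_line() {
    local host="$1" repo_name="${2:-archlinux}"
    # $repo and $arch stay literal here; Pacman substitutes them when syncing.
    printf 'Server = http://%s/repo/%s/$repo/os/$arch\n' "$host" "$repo_name"
}

# Example: print the line for the host used in this post.
pacoloco_server_line 'pacolocohost.example:9129'
```

Put the printed line above the other Server entries in /etc/pacman.d/mirrorlist, since Pacman tries the mirrors in the order they are listed.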
Priming the Cache
Pacoloco only reaches its full prefetching potential when it knows your currently installed packages, so it’s a good idea to prime it. This is done by getting the list of currently installed packages for each repo (“core”, “extra”) and requesting the corresponding package URL, just like Pacman would when (re-)installing the package.
The size of the package cache naturally depends on the number of packages you have installed, but to give you an idea: My cache size with all installed packages from core was just about 3 GB and with all installed ones from extra it is under 7GB, so nothing to worry about. Isn’t zstandard a very nice and efficient compression algorithm? I LOVE it!
You can prime the cache by running this gigantic and relatively inefficient Bash one-liner (set the REPO and CACHE_HOST variables first and, if needed, adjust the Pacoloco repo URL if your repo isn’t called archlinux
or you are running on ARM):
export REPO=core; export CACHE_HOST='pacolocohost:9129'; curl -s -q -I -K <(pacman -Qiq | grep -Eo "Name\s+:.*|Version\s+:.*|Architecture\s+:.*" | awk '{print $3}' | grep . | awk 'ORS=NR%3?FS:RS' | (grep -of <(pacman -Slq $REPO | sed 's/.*/\^& .*\$/')) | awk -F" " '{print "url=\"http://"ENVIRON["CACHE_HOST"]"/repo/archlinux/"ENVIRON["REPO"]"/os/x86_64/"$1"-"$2"-"$3".pkg.tar.zst\""}')
Depending on your internet connection and the number of installed packages of that repo, this can take some time. You can monitor the progress in Pacoloco’s log.
Here is a breakdown of what this does:
- First,
pacman -Qiq
returns the list of all installed packages, together with some info. We only need the name, version and architecture of each package, which we extract with
grep -Eo "Name\s+:.*|Version\s+:.*|Architecture\s+:.*"
- We only want the value itself and not its label in each line, so we use awk to keep only the third column:
awk '{print $3}'
- We remove any empty lines, which would interfere with the next step (there shouldn’t be any, but just to make sure):
grep .
- We take three consecutive lines with name, version and architecture and combine them into a single line:
awk 'ORS=NR%3?FS:RS'
- Now, as Pacman doesn’t return any info about which repository (core/extra/…) the packages are from, we need to filter.
pacman -Sql core
returns all packages from the core repo, including those that are not installed. We prepend each line with a
^
and append
.*$
to the end so we can use it as a regex (this is not 100% safe, as package names are now interpreted as regexes too).
- Using
grep -of
we use the aforementioned list of all packages of that repo as a list of patterns to match against our list of all installed packages.
- We parse each line with the three values name, version and architecture to build the URL we will fetch later. Here we have to use awk’s
ENVIRON["ENV_VARIABLE"]
syntax to access the outer environment variables.
- We can now send a HEAD request for each of these URLs using curl. With this, Pacoloco will download the package and prefetch it later, without us having to deal with the result in our Bash script. We use curl’s parameter
-K
to supply the list of URLs as a file so that curl can reuse TCP connections for all requests. This is also the reason all URLs were wrapped in
url="..."
earlier.
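For readability, the same steps could also be written as a small script with one function per stage. This is only a sketch under a few assumptions: the function names are made up, it expects English Pacman output (hence LC_ALL=C), and it swaps the grep -of step for an exact-name lookup in awk, which sidesteps the regex caveat mentioned above. REPO and CACHE_HOST mean the same as in the one-liner.

```shell
#!/usr/bin/env bash
# Sketch: the priming one-liner split into named steps (names are illustrative).
REPO="${REPO:-core}"
CACHE_HOST="${CACHE_HOST:-pacolocohost:9129}"

# Turn `pacman -Qi` output on stdin into "name version arch" lines.
installed_triples() {
    grep -E '^(Name|Version|Architecture)[[:space:]]+:' \
        | awk '{ print $3 }' \
        | awk '{ printf "%s%s", $0, (NR % 3 ? " " : "\n") }'
}

# Keep only triples whose package name appears in the file given as $1
# (one name per line, e.g. the output of `pacman -Slq core`).
# Assumes that file is not empty.
filter_by_repo() {
    awk 'NR == FNR { in_repo[$1] = 1; next } ($1 in in_repo)' "$1" -
}

# Turn "name version arch" lines into curl config lines of the form url="...".
to_curl_config() {
    awk -v host="$CACHE_HOST" -v repo="$REPO" '{
        printf "url=\"http://%s/repo/archlinux/%s/os/x86_64/%s-%s-%s.pkg.tar.zst\"\n",
               host, repo, $1, $2, $3
    }'
}

# Run the whole pipeline; curl reads its URL list from stdin via "-K -".
prime_cache() {
    LC_ALL=C pacman -Qi \
        | installed_triples \
        | filter_by_repo <(pacman -Slq "$REPO") \
        | to_curl_config \
        | curl -s -I -K - > /dev/null
}
```

After sourcing the file on an Arch machine, calling prime_cache with REPO and CACHE_HOST set should trigger the same HEAD requests as the one-liner. Each stage can also be run on its own to inspect the intermediate output.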
Metrics
Metrics are nice and important, and Pacoloco provides an endpoint for Prometheus at
http://pacolocohost.example:9129/metrics
The most relevant metrics are pacoloco_cache_hits_total
and pacoloco_cache_size_bytes.
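To take a quick look at these two metrics without a full Prometheus setup, something like the following could work. It is a sketch: the function name is made up, and it assumes the endpoint returns the usual Prometheus text format, where a metric name may be followed by labels in curly braces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the two metrics named above from
# Prometheus text output on stdin (comment lines starting with # are dropped).
pacoloco_key_metrics() {
    grep -E '^pacoloco_cache_(hits_total|size_bytes)([{ ]|$)'
}

# Example (needs a reachable Pacoloco instance):
# curl -s http://pacolocohost.example:9129/metrics | pacoloco_key_metrics
```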
Using Pacoloco with multiple architectures at the same time
If you have both regular x86 Arch boxes and some ArchLinuxARM boxes, you can use a single Pacoloco installation for all of them. You just need to add mirrors for each architecture’s repository in your config file.
For ArchLinux on ARM and x86 your config file could look like this:
...
repos:
  archlinux:
    urls: ## add or change official mirror urls as desired, see https://archlinux.org/mirrors/status/
      - http://mirrors.kernel.org/archlinux
  archlinux_armv7h:
    url: http://de4.mirror.archlinuxarm.org
...
with the Pacman mirrorlist on your ARM machine containing this mirror URL:
http://pacolocohost.example:9129/repo/archlinux_$arch/$arch/$repo
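To see how this URL lines up with the repo name in the config file, here is a tiny sketch that mimics the substitution of $repo and $arch. The function is purely illustrative and not part of Pacman or Pacoloco.

```shell
#!/usr/bin/env bash
# Illustrative only: mimic how $repo and $arch get filled into a Server URL.
# $1 = URL template, $2 = repo name, $3 = architecture.
expand_server_url() {
    local template="$1" repo="$2" arch="$3"
    local url="${template//\$repo/$repo}"   # replace every literal $repo
    url="${url//\$arch/$arch}"              # replace every literal $arch
    printf '%s\n' "$url"
}

expand_server_url 'http://pacolocohost.example:9129/repo/archlinux_$arch/$arch/$repo' core armv7h
# prints http://pacolocohost.example:9129/repo/archlinux_armv7h/armv7h/core
```

The path segment right after /repo/ becomes archlinux_armv7h, which is exactly the repo name defined in the config above. A different ARM architecture would need its own correspondingly named entry.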