Updated:
I created a GitHub action which work as this solution, less code and better optimizations. Cache Anything New
This solution is similar to the most voted. I tried the proposed solution but it didn't work for me because I was installing texlive-latex
, and pandoc
which has many dependencies and sub-dependencies.
I created a solution which should help many people. One case is when you install a couple of packages (apt install
), the other solution is when you make
a program and it takes for a while.
Solution:
- Step which has all the logic, it will cache.
- Use
find
to create a list of all the files in the container.
- Install all the packages or
make
the programs, whatever that you want to cache.
- Use
find
to create a list of all the files in the container.
- Use
diff
to get the new created files.
- Add these new files to the cache directory. This directory will automatically store with
actions/cache@v2
.
- Step which load the created cache.
- Copy all the files from the cache directory to the main path
/
.
- Steps which will be benefited by the cache and other steps that you need.
When to use this?
- I didn't use cache, the installation of the packages was around ~2 minutes to finish all the process.
- With the cache, it takes 7~10 minutes to create it the first time.
- Using the cache takes ~ 1 minute to finish all the process.
- It is useful only if your main process take a lot of time also it is convenient if you're deploying very often.
Implementation:
release.yml
name: CI - Release books
on:
release:
types: [ released ]
workflow_dispatch:
jobs:
build:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- uses: actions/cache@v2
id: cache-packages
with:
path: ${{ runner.temp }}/cache-linux
key: ${{ runner.os }}-cache-packages-v2.1
- name: Install packages
if: steps.cache-packages.outputs.cache-hit != 'true'
env:
SOURCE: ${{ runner.temp }}/cache-linux
run: |
set +xv
echo "# --------------------------------------------------------"
echo "# Action environment variables"
echo "github.workspace: ${{ github.workspace }}"
echo "runner.workspace: ${{ runner.workspace }}"
echo "runner.os: ${{ runner.os }}"
echo "runner.temp: ${{ runner.temp }}"
echo "# --------------------------------------------------------"
echo "# Where am I?"
pwd
echo "SOURCE: ${SOURCE}"
ls -lha /
sudo du -h -d 1 / 2> /dev/null || true
echo "# --------------------------------------------------------"
echo "# APT update"
sudo apt update
echo "# --------------------------------------------------------"
echo "# Set up snapshot"
mkdir -p "${{ runner.temp }}"/snapshots/
echo "# --------------------------------------------------------"
echo "# Install tools"
sudo rm -f /var/lib/apt/lists/lock
#sudo apt install -y vim bash-completion
echo "# --------------------------------------------------------"
echo "# Take first snapshot"
sudo find / \
-type f,l \
-not \( -path "/sys*" -prune \) \
-not \( -path "/proc*" -prune \) \
-not \( -path "/mnt*" -prune \) \
-not \( -path "/dev*" -prune \) \
-not \( -path "/run*" -prune \) \
-not \( -path "/etc/mtab*" -prune \) \
-not \( -path "/var/cache/apt/archives*" -prune \) \
-not \( -path "/tmp*" -prune \) \
-not \( -path "/var/tmp*" -prune \) \
-not \( -path "/var/backups*" \) \
-not \( -path "/boot*" -prune \) \
-not \( -path "/vmlinuz*" -prune \) \
> "${{ runner.temp }}"/snapshots/snapshot_01.txt 2> /dev/null \
|| true
echo "# --------------------------------------------------------"
echo "# Install pandoc and dependencies"
sudo apt install -y texlive-latex-extra wget
wget -q https://github.com/jgm/pandoc/releases/download/2.11.2/pandoc-2.11.2-1-amd64.deb
sudo dpkg -i pandoc-2.11.2-1-amd64.deb
rm -f pandoc-2.11.2-1-amd64.deb
echo "# --------------------------------------------------------"
echo "# Take second snapshot"
sudo find / \
-type f,l \
-not \( -path "/sys*" -prune \) \
-not \( -path "/proc*" -prune \) \
-not \( -path "/mnt*" -prune \) \
-not \( -path "/dev*" -prune \) \
-not \( -path "/run*" -prune \) \
-not \( -path "/etc/mtab*" -prune \) \
-not \( -path "/var/cache/apt/archives*" -prune \) \
-not \( -path "/tmp*" -prune \) \
-not \( -path "/var/tmp*" -prune \) \
-not \( -path "/var/backups*" \) \
-not \( -path "/boot*" -prune \) \
-not \( -path "/vmlinuz*" -prune \) \
> "${{ runner.temp }}"/snapshots/snapshot_02.txt 2> /dev/null \
|| true
echo "# --------------------------------------------------------"
echo "# Filter new files"
diff -C 1 \
--color=always \
"${{ runner.temp }}"/snapshots/snapshot_01.txt \
"${{ runner.temp }}"/snapshots/snapshot_02.txt \
| grep -E "^\+" \
| sed -E s/..// \
> "${{ runner.temp }}"/snapshots/snapshot_new_files.txt
< "${{ runner.temp }}"/snapshots/snapshot_new_files.txt wc -l
ls -lha "${{ runner.temp }}"/snapshots/
echo "# --------------------------------------------------------"
echo "# Make cache directory"
rm -fR "${SOURCE}"
mkdir -p "${SOURCE}"
while IFS= read -r LINE
do
sudo cp -a --parent "${LINE}" "${SOURCE}"
done < "${{ runner.temp }}"/snapshots/snapshot_new_files.txt
ls -lha "${SOURCE}"
echo ""
sudo du -sh "${SOURCE}" || true
echo "# --------------------------------------------------------"
- name: Copy cached packages
if: steps.cache-packages.outputs.cache-hit == 'true'
env:
SOURCE: ${{ runner.temp }}/cache-linux
run: |
echo "# --------------------------------------------------------"
echo "# Using Cached packages"
ls -lha "${SOURCE}"
sudo cp --force --recursive "${SOURCE}"/. /
echo "# --------------------------------------------------------"
- name: Generate release files and commit in GitHub
run: |
echo "# --------------------------------------------------------"
echo "# Generating release files"
git fetch --all
git pull --rebase origin main
git checkout main
cd ./src/programming-from-the-ground-up
./make.sh
cd ../../
ls -lha release/
git config --global user.name 'Israel Roldan'
git config --global user.email 'israel.alberto.rv@gmail.com'
git add .
git status
git commit -m "Automated Release."
git push
git status
echo "# --------------------------------------------------------"
Explaining some pieces of the code:
Here the action cache, indicate a key
which will be generated once and compare in later executions. The path
is the directory where the files should be to generate the cache compressed file.
- uses: actions/cache@v2
id: cache-packages
with:
path: ${{ runner.temp }}/cache-linux
key: ${{ runner.os }}-cache-packages-v2.1
This conditional search for the key
cache, if it exits the cache-hit
is 'true'.
if: steps.cache-packages.outputs.cache-hit != 'true'
if: steps.cache-packages.outputs.cache-hit == 'true'
It's not critical but when the du
command executes at first time, Linux indexed all the files (5~8 minutes), then when we will use the find
, it will take only ~50 seconds to get all the files. You can delete this line, if you want.
The suffixed command || true
prevents that 2> /dev/null
return error otherwise the action will stop because it will detect that your script has an error output. You will see during the script a couple of theses.
sudo du -h -d 1 / 2> /dev/null || true
This is the magical part, use find
to generate a list of the actual files, excluding some directories to optimize the cache folder. It also will be executed after the installations and make
programs. In the next snapshot the file name should be different snapshot_02.txt
.
sudo find / \
-type f,l \
-not \( -path "/sys*" -prune \) \
-not \( -path "/proc*" -prune \) \
-not \( -path "/mnt*" -prune \) \
-not \( -path "/dev*" -prune \) \
-not \( -path "/run*" -prune \) \
-not \( -path "/etc/mtab*" -prune \) \
-not \( -path "/var/cache/apt/archives*" -prune \) \
-not \( -path "/tmp*" -prune \) \
-not \( -path "/var/tmp*" -prune \) \
-not \( -path "/var/backups*" \) \
-not \( -path "/boot*" -prune \) \
-not \( -path "/vmlinuz*" -prune \) \
> "${{ runner.temp }}"/snapshots/snapshot_01.txt 2> /dev/null \
|| true
Install some packages and pandoc
.
sudo apt install -y texlive-latex-extra wget
wget -q https://github.com/jgm/pandoc/releases/download/2.11.2/pandoc-2.11.2-1-amd64.deb
sudo dpkg -i pandoc-2.11.2-1-amd64.deb
rm -f pandoc-2.11.2-1-amd64.deb
Generate the text file with the new files added, the files could be symbolic files, too.
diff -C 1 \
"${{ runner.temp }}"/snapshots/snapshot_01.txt \
"${{ runner.temp }}"/snapshots/snapshot_02.txt \
| grep -E "^\+" \
| sed -E s/..// \
> "${{ runner.temp }}"/snapshots/snapshot_new_files.txt
At the end copy all the files into the cache directory as an archive to keep the original information.
while IFS= read -r LINE
do
sudo cp -a --parent "${LINE}" "${SOURCE}"
done < "${{ runner.temp }}"/snapshots/snapshot_new_files.txt
Step to copy all the cached files into the main path /
.
- name: Copy cached packages
if: steps.cache-packages.outputs.cache-hit == 'true'
env:
SOURCE: ${{ runner.temp }}/cache-linux
run: |
echo "# --------------------------------------------------------"
echo "# Using Cached packages"
ls -lha "${SOURCE}"
sudo cp --force --recursive "${SOURCE}"/. /
echo "# --------------------------------------------------------"
This step is where I'm using the installed packages generated by the cache, the ./make.sh
script use pandoc
to do some conversions. As I mentioned, you can create other steps which use the cache benefits or another which not use the cache.
- name: Generate release files and commit in GitHub
run: |
echo "# --------------------------------------------------------"
echo "# Generating release files"
cd ./src/programming-from-the-ground-up
./make.sh