2022-04-20

How are Docker buildx layer cache hashes calculated?

I'm digging into how Docker buildx caching works in order to debug an issue, and I'm trying to figure out exactly how buildx checks whether a layer is available in the local cache. I've searched fairly extensively, but I can't find any documentation on this.

Looking at the local cache files themselves, I see a bunch of files with hash names. My assumption is that it works as follows (assuming use of type=local,mode=max):

  1. For each line in the Dockerfile, it uses some combination of parameters to calculate a SHA hash.
  2. It checks the --cache-from directory to see whether a file with that hash as its name exists.
  3. If it does exist, it uses that file as the layer and doesn't rebuild anything (and copies that file to the --cache-to directory).
  4. If it does not exist, it builds the layer and saves it as a file, with that hash as the name, in the --cache-to directory.
  5. This results in an output cache with 1 file for each line in the Dockerfile.
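
To make the model I'm assuming concrete, here is a minimal Python sketch of steps 1 through 4. This is only my guess at the process, not how buildx is known to work: the function names (layer_key, resolve_layer), the choice of SHA-256, and the inputs to the hash are all hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def layer_key(instruction: str, input_files: list[Path]) -> str:
    """Hypothetical step 1: hash the Dockerfile line plus the contents of its input files."""
    h = hashlib.sha256()
    h.update(instruction.encode())
    for path in sorted(input_files):
        h.update(b"\0")
        h.update(path.name.encode())
        h.update(b"\0")
        h.update(path.read_bytes())
    return h.hexdigest()


def resolve_layer(key: str, cache_from: Path, cache_to: Path, build) -> bool:
    """Hypothetical steps 2-4. Returns True on a cache hit.

    `build` is a placeholder callable that returns the layer as bytes.
    """
    cache_to.mkdir(parents=True, exist_ok=True)
    cached = cache_from / key
    target = cache_to / key
    if cached.is_file():
        # Step 3: reuse the cached file and carry it over to the output cache.
        if cached != target:
            shutil.copyfile(cached, target)
        return True
    # Step 4: build the layer and store it under its hash.
    target.write_bytes(build())
    return False
```

In this toy model, the first call to resolve_layer for a given key is a miss and writes a file named after the key; a later call that points cache_from at that directory is a hit.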

So my questions are:

  1. Is my understanding of this process correct? Am I missing any key elements?
  2. For step (1) above, what are the "parameters" that it uses to calculate the hash? I would think it's the string value of the line itself, plus the contents of any files that are copied by the line (e.g. ADD), but does it use anything else? For example, does it use the last-modified timestamp of any files that it copies?
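
To illustrate the distinction in question 2, here is a small Python sketch of the two candidate hashing schemes I have in mind. Both functions are hypothetical and say nothing about what buildx actually does; they only show how a contents-only hash and a contents-plus-timestamp hash would behave differently.

```python
import hashlib
from pathlib import Path


def content_digest(path: Path) -> str:
    """Candidate A: hash of the file contents only."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def content_and_mtime_digest(path: Path) -> str:
    """Candidate B: hash of the file contents plus the last-modified time."""
    h = hashlib.sha256(path.read_bytes())
    h.update(str(path.stat().st_mtime_ns).encode())
    return h.hexdigest()
```

Under candidate A, touching a file without changing its contents leaves the hash unchanged; under candidate B, the same touch changes the hash and would invalidate the cached layer. One way to tell the two apart empirically would be to touch a copied file, rebuild, and check whether the step is still reported as CACHED.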

