feat/bust-cache-validators-change #14

Merged
brianmcgee merged 2 commits from feat/bust-cache-validators-change into main 2024-01-03 08:08:57 +00:00
Member

Tracks the mod time and size of a formatter's executable in bolt.

The cache is busted using the following criteria:

  • a new formatter has been configured.
  • an existing formatter has changed (mod time or size)
  • an existing formatter has been removed from config

Also implemented better resolution of symlinks when determining a formatters executable path.

Tracks the mod time and size of a formatter's executable in bolt. The cache is busted using the following criteria: - a new formatter has been configured. - an existing formatter has changed (mod time or size) - an existing formatter has been removed from config Also implemented better resolution of symlinks when determining a formatters executable path.
zimbatm was assigned by brianmcgee 2024-01-02 17:46:49 +00:00
brianmcgee added 2 commits 2024-01-02 17:46:50 +00:00
Signed-off-by: Brian McGee <brian@bmcgee.ie>
Tracks the mod time and size of a formatter's executable in bolt.

The cache is busted using the following criteria:

- a new formatter has been configured.
- an existing formatter has changed (mod time or size)
- an existing formatter has been removed from config

Also implemented better resolution of symlinks when determining a formatters executable path.

Closes #2

Signed-off-by: Brian McGee <brian@bmcgee.ie>
zimbatm approved these changes 2024-01-02 19:10:16 +00:00
zimbatm left a comment
Owner

Looks good 👍

I remember it taking me a bit of time getting these things so it feels right in the original implementation.

Random thought: for the formatted file paths, it might make sense to skip symlinks entirely. This removes the issue where treefmt could potentially format /etc/passwd because it's the target of a symlink in the repo. Repos really should only symlink to files that exist in the repo, and those are then also part of the Walking anyways.

Looks good 👍 I remember it taking me a bit of time getting these things so it feels right in the original implementation. Random thought: for the formatted file paths, it might make sense to skip symlinks entirely. This removes the issue where treefmt could potentially format `/etc/passwd` because it's the target of a symlink in the repo. Repos really should only symlink to files that exist in the repo, and those are then also part of the Walking anyways.
@ -69,0 +71,4 @@
// check for any newly configured or modified formatters
for name, formatter := range formatters {
stat, err := os.Lstat(formatter.Executable())

Probably we care more about which executable is running in case it's a symlink. I would use os.Stat here. And maybe store the resolved path as well?

Probably we care more about which executable is running in case it's a symlink. I would use `os.Stat` here. And maybe store the resolved path as well?
Author
Member

I went with os.Lstat here because gofmt resolved to a symlink which was then another relative symlink within the nix output path. This ensures we recursively resolve down to the underlying binary.

For paths that are being formatted we already skip symlinks when traversing the filesystem

internal/cache/cache.go Lines 102 to 115 in ada9a72a7e
return filepath.Walk(root, func(path string, info fs.FileInfo, err error) error {
if err != nil {
return fmt.Errorf("%w: failed to walk path", err)
} else if ctx.Err() != nil {
return ctx.Err()
} else if info.IsDir() {
// todo what about symlinks?
return nil
}
if info.Mode()&os.ModeSymlink == os.ModeSymlink {
// skip symlinks
return nil
}

I went with `os.Lstat` here because `gofmt` resolved to a symlink which was then another relative symlink within the nix output path. This ensures we recursively resolve down to the underlying binary. For paths that are being formatted we already skip symlinks when traversing the filesystem https://git.numtide.com/numtide/treefmt/src/commit/ada9a72a7e56ffea11af5899cb711b6bb66a7293/internal/cache/cache.go#L102-L115
brianmcgee marked this conversation as resolved
@ -69,0 +94,4 @@
entry = &Entry{
Size: stat.Size(),
Modified: stat.ModTime(),
}

This could be seen as a "CacheKey" or something like that. If any of the values have changed, invalidate. I know this is what you are doing already; just wondering about semantics a bit.

This could be seen as a "CacheKey" or something like that. If any of the values have changed, invalidate. I know this is what you are doing already; just wondering about semantics a bit.
Author
Member

I currently think of it as a cache.Entry given the package it's within, as it's more of a Cache Entry for me than a Cache Key since it's not being used as the key in bolt.

I currently think of it as a `cache.Entry` given the package it's within, as it's more of a Cache Entry for me than a Cache Key since it's not being used as the key in bolt.
brianmcgee marked this conversation as resolved
@ -110,12 +111,15 @@ func (f *Format) Run() error {
eg.Go(func() error {
batchSize := 1024
batch := make([]string, batchSize)
batch = batch[:0]

Aren't those two lines equivalent to batch := []string{}?

Aren't those two lines equivalent to `batch := []string{}`?
Author
Member

What I believe I'm doing here is creating an instance of batch with a pre-initialised backing array of batchSize but resetting the slice's length to 0 so that subsequent logic looking at len(batch) and range batch all work correctly.

Looking at it with a debugger seems to confirm it's doing what I think it's doing, with the capacity being set to 1024 but length matching the latest batch to process.

What I _believe_ I'm doing here is creating an instance of `batch` with a pre-initialised backing array of `batchSize` but resetting the slice's length to 0 so that subsequent logic looking at `len(batch)` and `range batch` all work correctly. Looking at it with a debugger seems to confirm it's doing what I think it's doing, with the capacity being set to `1024` but length matching the latest batch to process.
brianmcgee marked this conversation as resolved
brianmcgee merged commit 811f883a2b into main 2024-01-03 08:08:57 +00:00
zimbatm deleted branch feat/bust-cache-validators-change 2024-01-03 10:15:32 +00:00
This repo is archived. You cannot comment on pull requests.
No reviewers
No Milestone
No project
No Assignees
2 Participants
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: numtide/treefmt#14
No description provided.