git-sizer GitHub

Use this command to install git-sizer:

winget install --id=GitHub.git-sizer -e

git-sizer is a command-line tool designed to analyze the size and health of a local Git repository. It computes various metrics, such as overall repository size, number of references (branches and tags), object counts, and file sizes, flagging potential issues that could impact performance or usability.

Key Features:

- Repository Size Analysis: Identifies if the repository is too large, providing recommendations to optimize storage and reduce clone times.
- Reference Management: Flags excessive branches or tags, suggesting strategies to streamline and manage references effectively.
- Large File Detection: Highlights oversized blobs (files) and suggests alternatives like Git-LFS for handling them efficiently.
- Tree Structure Evaluation: Detects directories with an unusually high number of entries, offering guidance on sharding files into smaller directories.
- Duplicate Files Identification: Warns about repeated or similar files across paths, suggesting better organization practices.

Audience & Benefit:

git-sizer is aimed at developers and teams who maintain Git repositories. By identifying size-related issues early, it helps prevent common pain points such as slow cloning, excessive disk usage, and slow repository operations.

On Windows, the tool can be installed via winget using the command shown above.

Happy Git repositories are all alike; every unhappy Git repository is unhappy in its own way. —Linus Tolstoy

git-sizer

Is your Git repository bursting at the seams?

git-sizer computes various size metrics for a local Git repository, flagging those that might cause you problems or inconvenience. For example:

Is the repository too big overall? Ideally, Git repositories should be under 1 GiB, and (without special handling) they start to get unwieldy over 5 GiB. Big repositories take a long time to clone and repack, and take a lot of disk space. Suggestions:
- Avoid storing generated files (e.g., compiler output, JAR files) in Git. It would be better to regenerate them when necessary, or store them in a package registry or even a fileserver.
- Avoid storing large media assets in Git. You might want to look into Git-LFS or git-annex, which allow you to version your media assets in Git while actually storing them outside of your repository.
- Avoid storing file archives (e.g., ZIP files, tarballs) in Git, especially if compressed. Different versions of such files don't delta well against each other, so Git can't store them efficiently. It would be better to store the individual files in your repository, or store the archive elsewhere.
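The 1 GiB and 5 GiB guidelines above can be checked roughly with plain Git. The following is a minimal sketch, not git-sizer's own method: it assumes the on-disk size reported by `git count-objects -v` is an acceptable stand-in for "overall size", and the function names and the labels "ok", "large", and "unwieldy" are hypothetical.

```python
import subprocess

GIB = 1024 ** 3


def parse_on_disk_bytes(count_objects_output):
    """Sum loose and packed object sizes from `git count-objects -v` output."""
    fields = {}
    for line in count_objects_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # Without -H, git reports `size` (loose) and `size-pack` (packed) in KiB.
    kib = int(fields.get("size", 0)) + int(fields.get("size-pack", 0))
    return kib * 1024


def classify_size(total_bytes):
    """Apply the guideline: under 1 GiB is ideal, over 5 GiB gets unwieldy."""
    if total_bytes < GIB:
        return "ok"
    if total_bytes <= 5 * GIB:
        return "large"
    return "unwieldy"


def repo_size_status(repo_path="."):
    """Run git in repo_path and classify the repository's on-disk size."""
    out = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    ).stdout
    return classify_size(parse_on_disk_bytes(out))
```

Calling `repo_size_status("path/to/repo")` requires `git` on the PATH. On-disk size reflects compression and packing, so it may differ from the figures git-sizer reports.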
Does the repository have too many references (branches and/or tags)? They all have to be transferred to the client for every fetch, even if your clone is up-to-date. Try to limit them to a few tens of thousands at most. Suggestions:
- Delete unneeded tags and branches.
- Avoid pushing your "remote-tracking" branches to a shared repository.
- Consider using "git notes" rather than tags to attach auxiliary information to commits (for example, CI build results).
- Perhaps store some of your rarely-needed tags and branches in a separate fork of your repository that is not fetched from by normal developers.
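To see how many references a repository has, and of what kind, one option is to tally the output of `git for-each-ref`. This is a rough sketch rather than what git-sizer does internally; the function names are hypothetical, and the default limit of 20,000 is only an assumed reading of "a few tens of thousands".

```python
import subprocess
from collections import Counter


def count_refs(refnames):
    """Tally reference names (one per line) by kind."""
    counts = Counter()
    for name in refnames.splitlines():
        name = name.strip()
        if not name:
            continue
        if name.startswith("refs/heads/"):
            counts["branches"] += 1
        elif name.startswith("refs/tags/"):
            counts["tags"] += 1
        elif name.startswith("refs/remotes/"):
            counts["remote-tracking"] += 1
        else:
            counts["other"] += 1
        counts["total"] += 1
    return counts


def too_many_refs(counts, limit=20_000):
    """Return True if the total exceeds the (assumed) limit."""
    return counts["total"] > limit


def list_refs(repo_path="."):
    """Return all reference names in the repository, one per line."""
    return subprocess.run(
        ["git", "for-each-ref", "--format=%(refname)"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    ).stdout
```

For example, `count_refs(list_refs("path/to/repo"))` gives a per-kind breakdown, which helps decide whether tags, branches, or pushed remote-tracking branches are the main contributor.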
Does the repository include too many objects? The more objects, the longer it takes for Git to traverse the repository's history, for example when garbage-collecting. Suggestions:
- Think about whether you are storing very many tiny files that could easily be collected into a few bigger files.
- Consider breaking your project up into multiple subprojects.
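One way to look for "very many tiny files" is to scan the long listing of a single commit's tree. The sketch below makes several assumptions: it inspects only `HEAD` rather than the whole history that git-sizer traverses, the 100-byte cutoff for "tiny" is arbitrary, and the function names are hypothetical.

```python
import subprocess


def tiny_blob_paths(ls_tree_output, max_bytes=100):
    """Return paths of blobs no larger than max_bytes.

    Expects `git ls-tree -r -l` output, where each line looks like
    "<mode> <type> <object> <size>\t<path>".
    """
    tiny = []
    for line in ls_tree_output.splitlines():
        meta, sep, path = line.partition("\t")
        if not sep:
            continue
        parts = meta.split()
        if len(parts) != 4:
            continue
        _mode, obj_type, _oid, size = parts
        # Submodule entries have type "commit" and size "-"; skip them.
        if obj_type == "blob" and size.isdigit() and int(size) <= max_bytes:
            tiny.append(path)
    return tiny


def list_head_tree(repo_path="."):
    """Return the recursive long listing of the tree at HEAD."""
    return subprocess.run(
        ["git", "ls-tree", "-r", "-l", "HEAD"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    ).stdout
```

A large result from `tiny_blob_paths(list_head_tree("path/to/repo"))` suggests candidates for merging into fewer, bigger files. Git may quote paths containing unusual characters in this output, so treat the returned paths as display strings.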
Does the repository include gigantic blobs (files)? Git works best with small- to medium-sized files. It's OK to have a few files in the megabyte range, but they should generally be the exception. Suggestions:

Getting started