Housekeeping
At high volume or over time, Bugsink can fill up disk space. Although it’s designed to minimize manual cleanup, larger or long-running installations may require occasional housekeeping to keep the system running smoothly. This page outlines the key areas to focus on.
Note: since Bugsink is a relatively new project, a lot of the housekeeping features are still evolving and somewhat scattered across different parts of the system. This page serves as a starting point for understanding the current state of housekeeping in Bugsink, but expect more unified and comprehensive documentation in the future.
On this page, we will cover:
- Where to look: what kinds of things can fill up, and how to detect whether cleanup is needed
- What to do: how to clean up the things that can fill up, which commands are available, and how to automate them
We’ll focus on Bugsink itself; but in the final section we’ll briefly touch on database-level cleanup, since the database is typically the thing that fills up first and is the most opaque.
General setup tips
If you’re considering how to run cleanup tasks, you’re likely operating at the scale where external event storage is worth considering.
Bugsink’s default setup stores full event payloads in the database. That’s great for simplicity, but has downsides at scale:
- Disk usage is tied to DB size
- Migrations get slower and riskier
- Backups and restores become unwieldy
- It’s more opaque: seeing what is taking up space is harder
Bugsink supports storing event payloads as flat files (or other backends) while keeping the metadata in the DB. It’s a good idea in any high-throughput setup, not just for archival. For details, see moving event data out of the database.
Event eviction and retention
Bugsink automatically deletes events over time to manage disk usage. This is based on a smart retention algorithm that tries to keep the most relevant events while discarding older or less useful ones.
The default number of events to keep per project is 10,000; this setting can be adjusted in the project settings UI. Retention is applied automatically during normal operation (as part of the digest process), so you don’t need to run a separate cleanup job for this.
When an event is deleted, Bugsink also removes most related data like tags and metadata, although tag values may become orphaned (see below).
For more background, see Rate Limits and Retention.
Locations on disk
These are the main places where Bugsink stores data that can grow over time.
- Database: Primary storage for all Bugsink-related data, including issues, tags, etc. In the default setup this includes the event payload verbatim; if you’re using the event storage feature, the event payloads are stored in a separate location, but the metadata is still in the database.
- Ingestion store: events during ingestion; this location is short-lived in principle, but if you’re running a high-volume Bugsink, it can fill up (in proportion to the backlog of events to process), and it may also fill up if the ingestion worker is not running, misconfigured or crashing. Configured as INGEST_STORE_BASE_DIR, by default: /tmp/bugsink/ingestion.
- File event storage: if you’re using the file event storage feature, this is where the event payloads are stored. This is configured in bugsink_conf.py via BUGSINK["EVENT_STORAGES"], or in Docker via FILE_EVENT_STORAGE_PATH and FILE_EVENT_STORAGE_USE_FOR_WRITE.
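To see how much space the on-disk locations above actually take, a small script can walk a directory and report its size, file count, and the age of its oldest file. The following is a minimal sketch, not part of Bugsink: the names DirUsage and measure_dir are made up for illustration, and the path in the usage comment is simply the default ingestion directory mentioned above. An old "oldest file" in the ingestion store may hint at a backlog or a stalled ingestion worker.

```python
import os
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirUsage:
    """Summary of one directory tree (hypothetical helper type)."""
    path: str
    file_count: int
    total_bytes: int
    oldest_age_seconds: Optional[float]  # None if the tree holds no files


def measure_dir(path: str, now: Optional[float] = None) -> DirUsage:
    """Walk `path` recursively and summarize the files found in it."""
    now = time.time() if now is None else now
    file_count = 0
    total_bytes = 0
    oldest_mtime = None
    # os.walk yields nothing for a missing directory, so the result is all zeros
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                st = os.stat(os.path.join(root, name))
            except OSError:
                continue  # file disappeared while we were walking; skip it
            file_count += 1
            total_bytes += st.st_size
            if oldest_mtime is None or st.st_mtime < oldest_mtime:
                oldest_mtime = st.st_mtime
    oldest_age = None if oldest_mtime is None else now - oldest_mtime
    return DirUsage(path, file_count, total_bytes, oldest_age)


# Example usage (path is the documented default; adjust to your setup):
# print(measure_dir("/tmp/bugsink/ingestion"))
```

The same function can be pointed at your file event storage directory, if you have one configured.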
Identifying what needs cleanup
You can get a rough overview of the contents of the database using the /counts/ page in the UI (as a superuser; simply visit https://YOURBUGSINK/counts/). On this page you can see counts of various objects in the system, such as issues, events, tags, and more.
This can help you get a sense of how much data you have stored in your database, and can serve as a starting point for identifying what might need cleanup.
For example, if you see a very high number of tags or events compared to issues, it may indicate that there are orphaned rows that need cleanup. Alternatively: if you see relatively low numbers across the board, but your database keeps growing, your problem may be at the level of the database itself, and you may need to look into database-level cleanup.

Bugsink’s built-in cleanup features
Use these tools to detect and resolve housekeeping issues:
- bugsink-manage vacuum_tags: Removes unused TagKey and TagValue entries left behind when Events or Issues are deleted. (For efficiency reasons, unused tag values are not checked at each event eviction or issue deletion.) Run periodically to keep the tag tables from growing indefinitely.
- bugsink-manage cleanup_eventstorage <storage>: Removes stored event payloads that no longer have a matching Event in the database. In theory, this happens during Event deletion, but in practice it may not always be reliable because the event storage is disconnected from the database by design.
- bugsink-manage make_consistent [--dry-run]: Deletes dangling objects (Events, Issues, etc.) and updates counters. In theory Bugsink should do this itself, but it may be needed if the database was modified directly, and perhaps after crashes (though those would be a bug in Bugsink, so feel free to report them on GitHub).
It might make sense to set up a periodic job (e.g. daily or weekly) to run vacuum_tags and cleanup_eventstorage, while make_consistent is typically run on demand.
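One way to wire up such a periodic job is a small wrapper script that your scheduler (cron, a systemd timer, or similar) invokes. The sketch below only uses the two commands named above; the function names are made up for illustration, and the storage name in the usage comment is a hypothetical placeholder for whatever name you configured. It assumes bugsink-manage is on the PATH of the user running the job.

```python
import subprocess
from typing import Callable, List, Sequence


def build_housekeeping_commands(storage_names: Sequence[str]) -> List[List[str]]:
    """Return the periodic commands: vacuum_tags once, then one
    cleanup_eventstorage call per configured storage name."""
    commands = [["bugsink-manage", "vacuum_tags"]]
    for name in storage_names:
        commands.append(["bugsink-manage", "cleanup_eventstorage", name])
    return commands


def run_housekeeping(
    storage_names: Sequence[str],
    runner: Callable = subprocess.run,
) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in build_housekeeping_commands(storage_names):
        # check=True makes subprocess.run raise CalledProcessError
        # when a command exits with a non-zero status
        runner(cmd, check=True)


# Example usage ("my_storage" is a hypothetical storage name):
# run_housekeeping(["my_storage"])
```

Passing an empty list runs only vacuum_tags, which fits setups that keep event payloads in the database. make_consistent is deliberately left out, since it is typically run on demand.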
Deleting projects and issues may make sense too, though you should be aware that deleting issues is often not what you want.
DB-level cleanup
All relational databases have trade-offs around how they handle deletions and disk usage. Most don’t immediately reclaim space when rows are deleted; they mark the space as reusable instead. Over time, especially with frequent insert/delete cycles, this can cause database files to grow unexpectedly.
If you suspect this is happening, you might need to run a vacuum or similar command to reclaim space. To do this, you can refer to your database’s documentation for the specific command to run. Search for terms like:
- VACUUM <your database>
- table bloat <your database>
- optimize table <your database>
or consult the docs for your specific backend.
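To make the "deleted rows do not shrink the file" behaviour concrete, here is a minimal sketch using SQLite through Python's standard library. It runs against a throwaway temporary database, not a Bugsink database; the table name payload and the function demo_vacuum are made up for illustration, and other backends (such as PostgreSQL or MySQL) differ in the details.

```python
import os
import sqlite3
import tempfile
from typing import Tuple


def demo_vacuum(rows: int = 5000) -> Tuple[int, int, int]:
    """Return file sizes in bytes: after insert, after delete, after VACUUM."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        # isolation_level=None puts the connection in autocommit mode,
        # so VACUUM is not issued inside an open transaction
        conn = sqlite3.connect(path, isolation_level=None)
        try:
            conn.execute("CREATE TABLE payload (id INTEGER PRIMARY KEY, body TEXT)")
            # Insert all rows in one explicit transaction for speed
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO payload (body) VALUES (?)",
                (("x" * 1000,) for _ in range(rows)),
            )
            conn.execute("COMMIT")
            size_full = os.path.getsize(path)

            # Deleting rows typically leaves the freed pages inside the file
            conn.execute("DELETE FROM payload")
            size_after_delete = os.path.getsize(path)

            # VACUUM rebuilds the database file without the free pages
            conn.execute("VACUUM")
            size_after_vacuum = os.path.getsize(path)
        finally:
            conn.close()
    return size_full, size_after_delete, size_after_vacuum


# Example usage:
# full, after_delete, after_vacuum = demo_vacuum()
# print(full, after_delete, after_vacuum)
```

On a real installation, take a backup first and check your backend's documentation: a full vacuum can need extra free disk space and may lock the database while it runs.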
Future improvements
Bugsink is a relatively new project, and many of the housekeeping features are still evolving. The goal is to eventually have more automated and comprehensive cleanup processes that require less manual intervention.
At the time of writing, some issues are “work in progress” on GitHub; you can check that progress here:
- #138 (Auto-)Cleanup for Orphaned Issues
- #135 Cleanup of orphaned TagKey and TagValue rows
- #134 IssueTag cleanup
Conclusion
Bugsink is built to clean up after itself where possible, deleting related data when issues or events are removed. Still, over time, and especially at higher volumes, some manual housekeeping may be needed to free up disk space or clarify what’s actually being stored.
The tools and options described above help you stay ahead of that, until more of this becomes automatic.