Ensuring Gitlab is being used correctly
An overview of example queries to verify usage data sanity of Gitlab’s usage.

In continuation of the my previous article https://medium.com/@godwin-pinto/how-well-maintained-are-your-git-management-processes-9edb4ffed6e6, this article provides you a hands on developer experience to articulate the queries which will extract the information once you have completed running the open source utility https://github.com/godwinpinto/gitlab-etl.
Recommended: Its best to have a Business Intelligence / Reporting tool that can query MongoDB else developing ones own panel is a tedious task, specially creating further drill downs on the summary data.
Post running the above job you would find below mentioned collections in your MongoDB;
- groups: Group metadata
- group_users: Group and user mapping and their rights
- group_milestones: Milestones created under respective groups
- projects: Project metadata
- project_users: Project and user mapping and their rights
- project_milestones: Milestones created under respective projects
- project_branches: Repositories are mapped to projects and this will store all Git repository (branches) metadata
- users: All users metadata from Gitlab server
- issues: Gitlab’s issue metadata is stored in this collection
- issue_notes: The history of various conversation under every issue
In majority of the queries below, you might see “created_at” field with a date criteria. This is set in such a way so that all backward non-sanitized data can be ignored as part of the report. e.x. suppose you start following the conventions from a date like 1st April 2023.
Milestones
Below are some checkpoints that you could derive in the Gitlab’s milestone area (collections name: project_milestones, group_milestones)
//Collections: group_milestones, project_milestones
//Milestones that don't have due dates or start dates
{ "state" : "active", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "$or" : [{ "due_date" : null}, { "start_date" : null}]}
//Milestones that dont have atleast one Documentation issue. Usually used to store FSD, BRS, Design documents, etc.
{ "$match" : { "state" : "active", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}}}, { "$project" : { "title" : 1, "web_url" : 1, "id" : 1}}, { "$lookup" : { "from" : "issues", "localField" : "id", "foreignField" : "milestone.id", "as" : "issues"}}, { "$match" : { "issues.labels" : { "$ne" : "Work: Documentation"}}}, { "$project" : { "title" : 1, "web_url" : 1, "id" : 1, "_id" : 0}}
//Milestones that dont have atleast one Testing issue. Usually used to store testcases
{ "$match" : { "state" : "active", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}}}, { "$project" : { "title" : 1, "web_url" : 1, "id" : 1}}, { "$lookup" : { "from" : "issues", "localField" : "id", "foreignField" : "milestone.id", "as" : "issues"}}, { "$match" : { "issues.labels" : { "$ne" : "Work: Testing"}}}, { "$project" : { "title" : 1, "web_url" : 1, "id" : 1, "_id" : 0}}, { "$count" : "total_records"}
Issues
Below are some checkpoints that you could derive in the Gitlab’s issue area (collections name: issues).
//Issues that dont have deadlines
{ "state" : "opened", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "due_date" : null}
//No milestones attached to issues
{ "state" : "opened", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "milestone" : null}
//No assignees mapped to issues
{ "state" : "opened", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "assignee" : null}
//Issues with no Work label
{ "state" : "opened", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "labels" : { "$nin" : ["Work: Documentation", "Work: Testing", "Work: Enhancement", "Work: BugFix", "Work: Learning", "Work: Interview", "Work: Support", "Work: Meeting", "Work: Call", "Work: Release", "Work: Available", "Work: UnableToWork", "Work: KnowledgeTransfer", "Work: Training", "Work: Analysis", "Work: DevOps", "Work: Discussion", "Work: Suggestion"]}}
//Issues that dont have Environment label
{ "state" : "opened", "$and" : [{ "labels" : "Work: BugFix"}, { "labels" : { "$nin" : ["Env: SIT", "Env: UAT", "Env: PROD"]}}], "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}}
//Issues that dont have severity tag
{ "state" : "opened", "$and" : [{ "labels" : "Work: BugFix"}, { "labels" : { "$nin" : ["Severity: Critical", "Severity: High", "Severity: Low", "Severity: Medium"]}}], "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}}
//Issues that are open but their milestone is closed
{ "state" : "opened", "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "milestone.state" : "closed"}
//Time estimates have not been entered for issues
{ "time_stats.time_estimate" : 0, "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "updated_at" : { "$gt" : { "$java" : 2023-03-06 16:58:42.022 } } }
//Time spent not entered for issues
{ "created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "time_stats.total_time_spent" : 0}
Branches
Below are some checkpoints that you could derive in the Gitlab’s branches / repository area
//Branches that dont have any commit since 'x' date
{ "commit.committed_date" : { "$lt" : { "$java" : 2022-12-21 16:58:42.093 } }, "commit.created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "name" : { "$nin" : ["master", "main"]} }
//Assuming milestone level branch is created with prefix mil-. This tells if they are protected or not
{ "name" : { "$regularExpression" : { "pattern" : "^mil-", "options" : ""}}, "commit.created_at" : { "$gt" : { "$date" : "2022-08-31T18:30:00Z"}}, "protected" : false}
Note: The queries have been printed from JAVA logs. Some formatting might be required when running on MongoDB compatible IDE.
A Separate article will be covered on the developer performance insights that can be extracted.
While these are some basic examples but should give an idea of how an organization could add this process in their devOps to automate and verify Gitlab’s data usage.