As noted in previous posts, victi.ms is undergoing architectural changes to meet the updated needs of users. In this post I hope to give a quick overview of what to expect over the next few months.
In victims v1 and v2 the API was front and center. The API was how clients retrieved definitions. This worked pretty well with v1. With v2, things started to get a little bit hairy as more fields were added that didn’t always match the needs of different package types. It also became apparent that a single API was not an ideal way to deliver content that may differ between data streams.
This led us to a new idea of how to structure the victims data. Many people are aware of the victims-cve-db. This repository houses information in a standard YAML format which can be used by clients who are scanning against versions. Instead of housing hash data in a REST API and YAML data in a git repository, we’ve decided to move to using the victims-cve-db for both. This means we will be adding some more fields to facilitate hash information. It also gets the victi.ms project out of the external API business. Clients will now be able to clone the git repository and use/import the results in whatever form makes sense.
Basic Flow and Architecture
Instead of pushing everything through a central hosted database with API and web front ends, we will be using microservices to update and release the victims-cve-db.
Adding to the database
- A user submits a PR to the victims-cve-db with everything filled out except the hash information
- The PR is reviewed. Once verified, the PR is merged into master.
- A GitHub webhook is caught by the victims-bot, which:
  - Clones the repo
  - Looks at the merged item
  - Downloads the relevant artifacts
  - Submits the artifacts to the proper hashing system (example)
    - The hashing system opens up the artifact and hashes the internals
    - The hashing system returns the results back to the caller with any relevant metadata
  - Updates the merged items with hashes
  - Commits and pushes the results back to master
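To make the hashing step a bit more concrete, here is a minimal sketch of what "opening up the artifact and hashing the internals" could look like for a zip-style archive such as a jar. It is an illustration only, not the actual hashing service: the function names, the shape of the returned metadata, and the choice of SHA-512 are all assumptions made for this example.

```python
import hashlib
import io
import zipfile


def hash_archive_members(archive, algorithm="sha512"):
    """Return {member name: hex digest} for every file inside a zip-style archive.

    `archive` may be a filesystem path or a binary file-like object.
    The default algorithm is an assumption made for this sketch.
    """
    digests = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries have no content to hash
            digest = hashlib.new(algorithm)
            with zf.open(info) as member:
                # Read in chunks so large members are never fully held in memory.
                for chunk in iter(lambda: member.read(65536), b""):
                    digest.update(chunk)
            digests[info.filename] = digest.hexdigest()
    return digests


def hash_artifact(archive, algorithm="sha512"):
    """Bundle per-file digests with some metadata (hypothetical result shape)."""
    files = hash_archive_members(archive, algorithm)
    return {"algorithm": algorithm, "file_count": len(files), "files": files}


if __name__ == "__main__":
    # Build a tiny in-memory archive so the sketch runs without any real artifact.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("com/example/Demo.class", b"demo bytes")
    buf.seek(0)
    print(hash_artifact(buf))
```

A caller such as the bot could then copy the returned digests into the merged entry before committing. How those fields are named in the database is not settled by this sketch.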
Clients can do one of the following to get content to scan with:
- Download the tarball for current master as one large data set
- Use git to clone/update content and use the new content
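For the git option, a client-side update could be as small as the sketch below. It assumes `git` is installed and on the `PATH`. The helper names are hypothetical, and the clone URL is deliberately left as a parameter rather than hard-coded.

```python
import subprocess
from pathlib import Path


def sync_command(repo_url, dest):
    """Build the git command that brings `dest` up to date with `repo_url`."""
    dest = Path(dest)
    if (dest / ".git").is_dir():
        # Existing checkout: fast-forward to the latest upstream content.
        return ["git", "-C", str(dest), "pull", "--ff-only"]
    # First run: a shallow clone is enough when only current content is needed.
    return ["git", "clone", "--depth", "1", repo_url, str(dest)]


def sync(repo_url, dest):
    """Clone or update the local copy; raises CalledProcessError if git fails."""
    subprocess.run(sync_command(repo_url, dest), check=True)
    return Path(dest)
```

Calling `sync(clone_url, "victims-cve-db")` on a schedule would keep a local copy current, after which the client can load the YAML files in whatever form it needs.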
What About Old Clients?
We will be hosting the current output of the v2 API in a static fashion. This means clients will be able to snag the entire database if they need to do so. However, filtering by time frame or database version will not be supported. After 6 months we will decommission the static database download.
Some future additions we are considering include:
- Automatic submissions from trusted public sources
- Python/Ruby/Go hashing services
- Notification of content updates through some method
We believe this change in architecture will create a more stable project and let the community grow the content more easily. If you’d like to help, don’t hesitate to jump in and lend a hand!