Overview

The HNSW algorithm for approximate k-nearest neighbor graph search provides a scalable approach to identifying a close match to a query vector in a very large database while requiring many fewer comparison operations than are needed for a full database scan, at the cost of a controllable degree of random error in the recall accuracy of the system. (See here for the original paper of Malkov and Yashunin, first published as a preprint in 2016, and here for an accessible summary of the technique as implemented in Facebook Research’s Faiss library.) For the MPCv3 design we are aiming to implement the HNSW algorithm on top of an MPC privacy layer, so that all iris biometric data is stored in secret-shared form by several independent server nodes, and all comparisons between biometric templates are done via a Secure Multiparty Computation protocol as in the MPCv2 design.

In the following, we give a more detailed description of the system architecture we plan to implement for the MPCv3 release to integrate an HNSW index which supports scaling of the private iris uniqueness system to a very large iris code database. In particular, we outline the high-level design of the system, and discuss changes and extensions to the HNSW algorithm compared to the specification given in the original HNSW paper. The most important details are the following:

A somewhat more in-depth tour of the design features can be found in this page. The remainder of this document provides more details and justification for our design choices, a list of the primary system parameters, and detailed pseudocode of the main functions which will need to be implemented.

Algorithm Specification

Design Choices