Data Association on Large Quantities of Complex Data
Tauer, Gregory William
MetadataShow full item record
Data association is the problem of identifying when multiple sensors have observed the same entity, which is important since it allows multiple sensors to work together in observing a common environment. This thesis will address three issues facing modern data association algorithms, which result from the changing nature of data upon which data association algorithms must operate. First, graph association is developed for use on richly relational data to improve the accuracy of association. Graph association is modeled as a binary linear program, and it is shown that the graph association problem is NP-Hard, and closely related to approximate graph matching and the quadratic assignment problems. Since graph association is NP-Hard, a Lagrangian heuristic is developed for its solution, which also provides a performance guarantee in the form of an optimality gap. Second, to cope with the increasing quantity of data resulting from advances in sensor infrastructures, including advances in natural language processing, the Lagrangian heuristic developed for graph association is adapted for implementation in a map-reduce framework which allows the algorithm to be distributed across a cluster of computers, and address the real-time and large data requirements of data association. Finally, to handle streaming updates to a data association problem, an incremental graph partitioning approach is considered. The developed algorithm allows a current solution to be updated when new data arrives without recomputing a solution in full. This is accomplished in a way that considers the unique properties of data association, including the existence of robust gating techniques, non-metric distances, and sparse problems.