Investigating Domain-Specific Attacks on Data Attribution*

Session Number

1

Advisor(s)

Jiaqi Ma, UIUC

Location

A133

Discipline

Computer Science

Start Date

15-4-2026 10:15 AM

End Date

15-4-2026 11:00 AM

Abstract

Data attribution methods, which estimate the influence of each training sample on a model, have received growing attention in the literature. This line of work has important implications for data privacy law, copyright compensation, and more. However, recent work has shown that data attribution methods can be vulnerable to adversarial manipulation. In this project, we investigate how such vulnerabilities can be exploited using domain-specific insights rather than purely black-box optimization techniques. We focus on discrete generative AI models in the symbolic music domain, where musical structures such as chords, motifs, and arpeggios offer interpretable patterns backed by a rich body of music theory. Using these structural relationships, we develop simple, concrete strategies for manipulating the attribution scores of training samples. Our results demonstrate that effective adversarial examples can be generated using intuitive, human-understandable strategies grounded in the structure of the domain. This suggests that similar approaches may exist in other discrete generative settings, including text and code, where structural patterns can likewise be exploited. By exposing domain-specific weaknesses in data attribution, our work highlights the broader implications of vulnerabilities in machine learning systems.
