TY - GEN
T1 - Probabilistic Robustness for Data Filtering
AU - Yu, Yu
AU - Khan, Abdul Rafae
AU - Khadivi, Shahram
AU - Xu, Jia
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - We introduce our probabilistic robustness rewarded data optimization (PRoDO) approach as a framework to enhance the model's generalization power by selecting training data that optimizes our probabilistic robustness metrics. We use proximal policy optimization (PPO) reinforcement learning to approximately solve the computationally intractable training subset selection problem. The PPO reward is defined as our (α, ϵ, γ)-Robustness, which measures performance consistency over multiple domains by simulating unknown test sets in real-world scenarios using a leave-one-out strategy. We demonstrate that PRoDO effectively filters data that lead to significantly higher prediction accuracy and robustness on unknown-domain test sets. Our experiments achieve up to a +17.2% increase in accuracy (+25.5% relative) in sentiment analysis, and a -28.05 decrease in perplexity (-32.1% relative) in language modeling. In addition, our probabilistic (α, ϵ, γ)-Robustness definition serves as an evaluation metric with a higher level of agreement with human annotations than typical performance-based metrics.
AB - We introduce our probabilistic robustness rewarded data optimization (PRoDO) approach as a framework to enhance the model's generalization power by selecting training data that optimizes our probabilistic robustness metrics. We use proximal policy optimization (PPO) reinforcement learning to approximately solve the computationally intractable training subset selection problem. The PPO reward is defined as our (α, ϵ, γ)-Robustness, which measures performance consistency over multiple domains by simulating unknown test sets in real-world scenarios using a leave-one-out strategy. We demonstrate that PRoDO effectively filters data that lead to significantly higher prediction accuracy and robustness on unknown-domain test sets. Our experiments achieve up to a +17.2% increase in accuracy (+25.5% relative) in sentiment analysis, and a -28.05 decrease in perplexity (-32.1% relative) in language modeling. In addition, our probabilistic (α, ϵ, γ)-Robustness definition serves as an evaluation metric with a higher level of agreement with human annotations than typical performance-based metrics.
UR - http://www.scopus.com/inward/record.url?scp=85159857480&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85159857480&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85159857480
T3 - EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
SP - 2942
EP - 2951
BT - EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
T2 - 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023
Y2 - 2 May 2023 through 6 May 2023
ER -