Attention, Compilation, and Solver-based Symbolic Analysis are All You Need

06/11/2023
by Prithwish Jana, et al.

In this paper, we present a Java-to-Python (J2P) and Python-to-Java (P2J) back-to-back code translation method, and an associated tool called CoTran, based on large language models (LLMs). Our method leverages the attention mechanism of LLMs, compilation, and symbolic execution-based test generation for equivalence testing between the input and output programs. More precisely, we modify the typical LLM training loop to incorporate compiler and symbolic execution loss. Via extensive experiments comparing CoTran with 10 other transpilers and LLM-based translation tools over a benchmark of more than 57,000 Java-Python equivalent pairs, we show that CoTran outperforms them on relevant metrics such as compilation and runtime equivalence accuracy. For example, our tool gets 97.43% equivalence accuracy for J2P translation, whereas the nearest competing tool only gets 96.44%.
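The abstract does not give the exact form of the compiler and symbolic execution losses, so the following is only a minimal sketch of the general idea: a standard token-level loss is augmented with one penalty when the candidate translation fails to compile and another when it fails equivalence tests. The function names `compiles` and `combined_loss`, the weights `w_compile` and `w_equiv`, and the additive form of the penalties are all assumptions made for illustration, not CoTran's actual formulation. Python's built-in `compile()` stands in for a compiler check, and the test counts stand in for the outcome of symbolic execution-generated tests.

```python
def compiles(python_source: str) -> bool:
    """Return True if the candidate translation is syntactically valid Python.

    Uses the built-in compile() as a stand-in for a real compiler check;
    it catches syntax errors only, not runtime errors.
    """
    try:
        compile(python_source, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def combined_loss(
    token_loss: float,
    candidate: str,
    tests_passed: int,
    tests_total: int,
    w_compile: float = 1.0,  # hypothetical weight on the compiler penalty
    w_equiv: float = 1.0,    # hypothetical weight on the equivalence penalty
) -> float:
    """Add compiler and equivalence-test penalties to a token-level loss.

    Assumed form (not from the paper):
        token_loss + w_compile * [does not compile] + w_equiv * (fraction of tests failed)
    """
    compile_penalty = 0.0 if compiles(candidate) else 1.0
    # With no tests available, treat equivalence as unverified (full penalty).
    equiv_penalty = 1.0 - (tests_passed / tests_total) if tests_total else 1.0
    return token_loss + w_compile * compile_penalty + w_equiv * equiv_penalty
```

In this sketch, a candidate that compiles and passes every test contributes only its token-level loss, while one that fails to compile or fails tests is penalized further. A real training loop would also need a way to pass this non-differentiable feedback back to the model, which the sketch does not address.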
