Tommaso's Portfolio

Seq2Seq Models for Symbolic Expression Recovery from Taylor Series

Neural Symbolic Regression from Truncated Taylor Series

This project explores the task of recovering symbolic mathematical expressions from truncated Taylor series using a sequence-to-sequence deep learning model.
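To make the task concrete, here is a small illustrative sketch (not the project's actual code) of how an input/target pair could look. It assumes Python with sympy, which the project may or may not use: the target is a closed-form function, and the model's input is its truncated Taylor expansion.

```python
# Hypothetical illustration of the task (sympy assumed; not the project's code).
# Target: a closed-form expression. Input: its truncated Taylor expansion,
# which is all the seq2seq model gets to see.
import sympy as sp

x = sp.symbols("x")
target = sp.sin(x) * sp.exp(x)   # closed-form function to recover
order = 6                        # truncation order (an arbitrary choice here)

# Truncated Taylor series around 0; removeO() drops the O(x**6) remainder,
# leaving a plain polynomial of degree < 6.
approx = sp.series(target, x, 0, order).removeO()
print(approx)
```

The model then has to map the polynomial back to sin(x)*exp(x), even though infinitely many functions share the same truncated expansion.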

Motivation

Taylor series provide a way to approximate functions locally, but the inverse problem of reconstructing the original symbolic function from only a truncated expansion is far from trivial. This task has important applications in areas such as symbolic regression, mathematical physics, and automated reasoning.

I became interested in this challenge because I wanted to test whether the exact function can be recovered from its Taylor expansion alone. The scenario is more common than it might seem: in many scientific and engineering domains we have access only to truncated series approximations of a function, or to indirect representations of its local behavior, yet deeper analysis or reasoning requires the original closed-form expression. Reconstructing it from such limited information is therefore both a difficult and highly relevant problem.

Approach
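The project's implementation details are not reproduced here, but a common design choice in neural symbolic regression (used, for example, by seq2seq models over expressions) is to serialize each expression tree into prefix (Polish) notation, which yields a flat, parenthesis-free token sequence the encoder and decoder can consume. A minimal sketch of that serialization, assuming expressions are represented as nested tuples:

```python
# Hedged sketch of expression serialization (not the project's actual code):
# prefix (Polish) notation flattens an expression tree into a token sequence
# with no parentheses, which suits seq2seq vocabularies well.

def to_prefix(node):
    """Flatten a nested (op, child, ...) tuple, or a leaf string, into tokens."""
    if isinstance(node, tuple):
        op, *args = node
        tokens = [op]
        for arg in args:
            tokens.extend(to_prefix(arg))   # depth-first, operator first
        return tokens
    return [node]                           # leaf: variable or constant token

# e.g. sin(x) * exp(x) as a hypothetical expression tree:
tree = ("mul", ("sin", "x"), ("exp", "x"))
print(to_prefix(tree))   # ['mul', 'sin', 'x', 'exp', 'x']
```

Because prefix notation is unambiguous given each operator's arity, the decoder's output tokens can be deterministically parsed back into an expression tree.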

Results

The trained model shows that it is indeed possible to reconstruct non-trivial symbolic functions solely from their truncated Taylor expansions. Despite the apparent loss of information in moving from a full closed-form expression to a local approximation, the model recovers the original structure with a surprisingly high degree of accuracy.

This suggests that neural networks, when trained on carefully constructed datasets, can capture and generalize aspects of symbolic regression, a task traditionally considered outside the scope of purely data-driven methods.

The results are promising, but they also open several directions for improvement. In particular, integrating mechanisms such as self-attention could give the model a more flexible way to handle long-range dependencies within symbolic sequences, potentially improving both the robustness and the accuracy of the reconstruction.
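For reference, the self-attention mechanism mentioned above can be sketched in a few lines of numpy. This is a generic scaled dot-product attention with identity projections (no learned query/key/value weights), purely to show how every sequence position gets direct access to every other position:

```python
# Minimal numpy sketch of scaled dot-product self-attention (illustrative only;
# real transformer layers add learned Q/K/V projections and multiple heads).
import numpy as np

def self_attention(X):
    """X: (seq_len, dim) token representations -> same-shape mixed output."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ X                               # convex mix of all tokens

X = np.random.default_rng(0).normal(size=(5, 8))     # 5 tokens, dim 8
out = self_attention(X)
print(out.shape)   # (5, 8)
```

Unlike a recurrent encoder, each output row here depends on the whole sequence in a single step, which is exactly the long-range flexibility the paragraph above refers to.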


Repository structure: