I've been trying to understand paraphrasing from a learning and ML perspective, and something confusing keeps showing up.
When a model rewrites a sentence, it often either stays too close to the original structure or changes it so much that the meaning gets subtly distorted.
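To make that concrete, here's roughly how I'd try to measure the two failure modes: token overlap as a proxy for "stayed too structurally close" and embedding similarity as a proxy for "kept the meaning". Just a sketch, it assumes the sentence-transformers package and the all-MiniLM-L6-v2 checkpoint (any sentence encoder would do), and the example sentences are made up.

```python
# Rough sketch of the two failure modes above. Assumes the sentence-transformers
# package and the all-MiniLM-L6-v2 checkpoint; any sentence encoder would do.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def surface_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercased tokens: high means structurally close to the source."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def meaning_similarity(a: str, b: str) -> float:
    """Cosine similarity of sentence embeddings: low means the meaning has drifted."""
    ea, eb = encoder.encode([a, b], convert_to_tensor=True)
    return util.cos_sim(ea, eb).item()

source    = "The cache is invalidated whenever the underlying table is updated."
too_close = "The cache is invalidated whenever the underlying table gets updated."
distorted = "The table is cleared whenever the cache is refreshed."

for name, cand in [("too_close", too_close), ("distorted", distorted)]:
    print(name, "overlap:", round(surface_overlap(source, cand), 2),
          "meaning:", round(meaning_similarity(source, cand), 2))
```

A good paraphrase should land in the low-overlap, high-similarity corner; the two failure modes above each miss on one axis.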
What makes this harder to reason about is that humans seem to do something similar while still learning, especially after reading a lot of technical text and then trying to restate it in our own words.
I've noticed it myself while going through tutorials and taking notes: my writing still mirrors the structure of the source even when I understand the concept.
It makes me wonder whether this is mainly a limitation of current training objectives, or whether evaluation methods for originality are still too shallow to capture what good paraphrasing actually is.
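By "shallow" I mean something like this: a purely surface-based originality check can't tell a faithful rewording from one that flips the meaning, because both look equally novel. Another tiny sketch, standard library only, with made-up sentences:

```python
# Why a surface-only "originality" score is shallow: a faithful rewording and a
# meaning-flipping rewrite can look equally novel if all you measure is how few
# tokens they share with the source. Standard library only; sentences are made up.
def novelty(source: str, candidate: str) -> float:
    """Fraction of candidate tokens that never appear in the source."""
    source_tokens = set(source.lower().split())
    candidate_tokens = candidate.lower().split()
    return sum(t not in source_tokens for t in candidate_tokens) / len(candidate_tokens)

source   = "Retries are disabled when the request is idempotent."
faithful = "If a request can be safely repeated, we turn retries off."
flipped  = "If a request can be safely repeated, we turn retries on."

print("faithful:", round(novelty(source, faithful), 2))  # looks original, meaning kept
print("flipped: ", round(novelty(source, flipped), 2))   # same score, meaning inverted
```

Both rewrites get the same novelty score even though one of them inverts the claim, which is the kind of gap that makes me doubt surface-level originality metrics.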
Curious how people here think about this from a model design or training perspective.