If it’s doing a search for the code, pulling it into the context, and then spitting it back out in slightly modified form, then it can attribute the source it pulled in. That’s a very different thing from the AI generating code purely from its training data, because code that is pulled into context by a search has a strong influence on the output. The output is still generated the same way, but it would be reasonable to credit the author of the code that was pulled in. The code in the training data, however, cannot be credited. How you would pull in just the right piece of code in the first place is a bit of a mystery to me, though.
There are a few ways of finding which code is relevant, but one way is to use some sort of vector database and perform the search using embeddings generated from the questions, the answers, and the query.
Embeddings are essentially semantic representations of the text, stored as numeric vectors, which can be compared to each other for similarity.
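To make the idea a bit more concrete, here is a rough sketch of that kind of similarity search. Everything in it is made up for illustration: `embed` is a toy word-count stand-in for a real embedding model (so it only captures word overlap, not actual meaning), the `documents` list plays the role of the vector database, and the author names are hypothetical. The point is just that each stored snippet keeps its author alongside its vector, so whatever gets pulled into context can be credited.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (an assumption for this sketch).

    A real system would call a trained model that returns a dense vector;
    here we just count words, which only captures word overlap.
    """
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=1):
    """Return the top_k (score, document) pairs most similar to the query."""
    query_vec = embed(query)
    scored = [
        (cosine_similarity(query_vec, embed(doc["text"])), doc)
        for doc in documents
    ]
    # Sort on the score only, so the dicts themselves are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical stored Q&A snippets, each kept with its author for attribution.
documents = [
    {"author": "alice",
     "text": "How do I reverse a list in Python? Use reversed or slicing."},
    {"author": "bob",
     "text": "How do I read a file line by line in Python? Use open and a for loop."},
]

if __name__ == "__main__":
    for score, doc in search("read a file line by line", documents):
        # The retrieved snippet carries its author, so it can be credited.
        print(f"{score:.2f} by {doc['author']}: {doc['text']}")
```

A real vector database would precompute and index the stored vectors instead of re-embedding every document per query, but the comparison step is the same idea.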