Integer sqrt is usually not a library function and it’s very easy to implement, just a few lines of code. Algorithm is well defined on Wikipedia you read a lot. And yes, it doesn’t use FPU at all and it’s quite fast even on i8086.
I doubt doing it in software like that outperforms sqrtss/sqrtsd. Modern CPUs can do the conversions and the floating point sqrt in approximately 20-30 cycles total. That’s comparable to one integer division. But I wouldn’t mind being proven wrong.
Integer sqrt is usually not a library function and it’s very easy to implement, just a few lines of code. Algorithm is well defined on Wikipedia you read a lot. And yes, it doesn’t use FPU at all and it’s quite fast even on i8086.
I doubt doing it in software like that outperforms sqrtss/sqrtsd. Modern CPUs can do the conversions and the floating point sqrt in approximately 20-30 cycles total. That’s comparable to one integer division. But I wouldn’t mind being proven wrong.
Integer sqrt can be used for integers with any length, not only for integers fit into f64