aarch64: Use memcpy_simd as the default memcpy
Since __memcpy_simd is the fastest memcpy on almost all cores, replace the generic memcpy with it. (cherry picked from commit 91ac82d0)
Loading
Since __memcpy_simd is the fastest memcpy on almost all cores, replace the generic memcpy with it. (cherry picked from commit 91ac82d0)