I wrote my own malloc, new, and realloc for my C++ project. Some of the allocations are whole pages of 4K or more. When I call my malloc, is there a way to zero out a 4K+ page without reading the data into cache? I vaguely remember reading about something like this in either the Intel or AMD x86-64 documentation, but I can't remember what it's called.
Does gcc (or clang) have an intrinsic I can use? If not, what assembly instructions should I look up? I have 3 common use cases after a malloc: zeroing the memory, memcpy-ing a buffer, and mixing both (64 or 512 bytes of memcpy, then the rest as zeros). I'm not sure what the minimum architecture I'll support will be, but it will be no older than Haswell. Likely it will be Intel Skylake/AMD Zen and up.
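To make the mixed case concrete, this is roughly what the initialization looks like with ordinary library calls. It is only a minimal sketch: `init_block` and its parameter names are hypothetical, made up for illustration, and it uses plain `memcpy`/`memset`, which write through the cache in the usual way. The part I'd like to replace is the zeroing of the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from any library): copy the first copy_len
 * bytes from src into dst, then zero the remainder of the block up to
 * total bytes. Uses ordinary cached stores via memcpy/memset. */
static void init_block(void *dst, size_t total, const void *src, size_t copy_len)
{
    assert(copy_len <= total);

    /* Copy the header portion (e.g. 64 or 512 bytes), if any. */
    if (copy_len != 0)
        memcpy(dst, src, copy_len);

    /* Zero everything after the copied portion. */
    memset((unsigned char *)dst + copy_len, 0, total - copy_len);
}
```

With `copy_len == 0` this covers the pure zeroing case, and with `copy_len == total` it is a plain memcpy.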
-Edit- I rolled back the C++ tag to C because intrinsics are generally written in C.