High-Resolution Performance Timer


I recently ran into this issue while trying to compile some code on an ARM-based computer: the program I wanted to build contained inline x86 assembly! I am not going to get into the details of which method might be faster or have higher resolution. From what I have learned, the approach shown here is the most compact and portable one I have found for a high-resolution counter that can be used for something like performance profiling.


This was the original code that was causing the issue. The non-Windows branch depends on the x86 instruction set, so it cannot be compiled for ARM.


#ifdef _WIN32
    unsigned long long tick;
    QueryPerformanceCounter((LARGE_INTEGER *)&tick); // works great on Windows ONLY
    return tick;
#else
    uint32_t hi, lo;
    asm volatile ("rdtsc" : "=a"(lo), "=d"(hi)); // x86 only: low half in EAX, high half in EDX
    return ((uint64_t)lo) | (((uint64_t)hi) << 32); // combine the two 32-bit halves
#endif



Thanks to improvements in POSIX, we now have the clock_gettime function:
int clock_gettime(clockid_t clk_id, struct timespec *tp);
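
Whether a clock really ticks at nanosecond granularity depends on the system. As a rough sketch, the companion POSIX call clock_getres can report the advertised resolution. The helper name monotonic_resolution_ns is made up for this illustration, and it queries CLOCK_MONOTONIC on the assumption that it is the most widely available monotonic clock.

```c
#if !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L /* expose clock_getres under strict -std modes */
#endif
#include <time.h>

/* Hypothetical helper: advertised resolution of CLOCK_MONOTONIC in
   nanoseconds, or -1 if clock_getres reports an error. */
long long monotonic_resolution_ns(void)
{
    struct timespec res;
    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + (long long)res.tv_nsec;
}
```

The value returned is what the system claims for that clock; the cost of the call itself still limits how short an interval can be measured in practice.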
 
We can replace the non-portable assembly code with an easy-to-use clock_gettime call, like so:



#ifdef _WIN32
    unsigned long long tick;
    QueryPerformanceCounter((LARGE_INTEGER *)&tick); // Windows ONLY; value is in ticks (see QueryPerformanceFrequency)
    return tick;
#else
    struct timespec timeInfo;
    // CLOCK_MONOTONIC_RAW is not part of POSIX; CLOCK_MONOTONIC is the portable fallback
    clock_gettime(CLOCK_MONOTONIC_RAW, &timeInfo); // value is reported in nanoseconds
    unsigned long long int nanosecs = ((unsigned long long)timeInfo.tv_sec) * 1000000000ULL +
                                      ((unsigned long long)timeInfo.tv_nsec);
    return nanosecs;
#endif
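
For profiling, what usually matters is the difference between two readings rather than a single value. Here is a minimal sketch of that, covering only the POSIX branch. The names now_ns and elapsed_ms are invented for this example, and CLOCK_MONOTONIC stands in for CLOCK_MONOTONIC_RAW on the assumption that the raw clock may not exist everywhere.

```c
#if !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L /* expose clock_gettime under strict -std modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current CLOCK_MONOTONIC reading in nanoseconds,
   or 0 if clock_gettime reports an error. */
uint64_t now_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: milliseconds between two now_ns() readings. */
double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    return (double)(end_ns - start_ns) / 1e6;
}

/* Usage (do_work is a placeholder for the code being profiled):
 *     uint64_t t0 = now_ns();
 *     do_work();
 *     double ms = elapsed_ms(t0, now_ns());
 */
```

Because the clock is monotonic, the second reading should never be smaller than the first, so the subtraction does not wrap around.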

Best of luck.

References:
  1. http://man7.org/linux/man-pages/man2/clock_gettime.2.html
  2. http://tdistler.com/2010/06/27/high-performance-timing-on-linux-windows
  3. http://en.wikipedia.org/wiki/High_Precision_Event_Timer