I assume that you have the platform-specific drivers available for the I2C/GPIO/PWM/UART interfaces on your board (they should be part of the BSP [board support package]).
It is just that you don't want to use the kernel device driver framework and want to do things from user space. I have been in this situation, so I know how tempting it can be, especially if you are not well-versed in kernel device drivers.
a. Speed: You mentioned it, but you probably didn't grasp the reason completely.
The speed advantage comes from avoiding the context switching between the kernel and the user-space process. Here is an example:
    /* A loop in kernel code which reads a register 100 times */
    for (i = 0; i < 100; i++)
    {
        __kernel_read_reg(...);
    }

    /* A loop in user-space code which reads a register 100 times */
    for (i = 0; i < 100; i++)
    {
        __user_read_reg(...);
    }
Functionally, both *_read_reg() calls do the same thing. Assuming that __user_read_reg() goes through the typical system-call procedure, it has to switch from user space into the kernel and back for every single __user_read_reg(...) call, which is too costly.
You may argue, "We can mmap() the hardware registers and avoid system call for such operations".
Of course, you could do that, but the point I was making is:
What is close to the hardware (for example, a register read or write, or handling an interrupt) should be done as fast as possible. The latencies involved in context switching will hurt performance.
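To get a rough feel for the per-call cost, here is a minimal sketch that you can run on any Linux box without special hardware. It is only a stand-in: `fake_reg` is an ordinary volatile variable playing the role of a memory-mapped register, and `getppid()` is used purely as a cheap system call that has to enter the kernel each time. The function names are made up for this illustration, and the numbers will vary a lot between machines.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Assumption: a plain volatile word standing in for a mapped register. */
static volatile uint32_t fake_reg = 0x5a;

/* Time n direct reads of the stand-in register (no kernel entry). */
uint64_t time_direct_reads(unsigned n, uint32_t *sink)
{
    uint64_t start = now_ns();
    uint32_t acc = 0;

    for (unsigned i = 0; i < n; i++)
        acc += fake_reg;
    *sink = acc;            /* keep the reads observable */
    return now_ns() - start;
}

/* Time n reads that each enter the kernel and return to user space. */
uint64_t time_syscall_reads(unsigned n, uint32_t *sink)
{
    uint64_t start = now_ns();
    uint32_t acc = 0;

    for (unsigned i = 0; i < n; i++)
        acc += (uint32_t)getppid();
    *sink = acc;
    return now_ns() - start;
}

#ifdef DEMO_MAIN
int main(void)
{
    uint32_t sink;

    printf("direct : %llu ns\n",
           (unsigned long long)time_direct_reads(100, &sink));
    printf("syscall: %llu ns\n",
           (unsigned long long)time_syscall_reads(100, &sink));
    return 0;
}
#endif
```

Build it with `-DDEMO_MAIN` to get the small driver program. On typical systems the system-call loop should come out noticeably slower per iteration, which is the overhead a driver running in the kernel does not pay for each register access.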
b. Existing, tested, well-built subsystems:
If you look at the I2C subsystem in the Linux kernel, it provides a well-tested, robust framework which can easily be reused. You don't have to write a full I2C subsystem (handling all device types, speeds, various configurations, etc.) in user space.
Reusing what is already done is one big advantage of going for kernel device drivers.
c. Move from a polling-based approach to an interrupt-based mechanism:
If you are not handling interrupts in a kernel driver, you must be using some sort of polling mechanism in the user-space process. Depending on the system, that might not be a very reliable way of handling hardware changes. It is definitely not accurate or reliable for fast devices.
An interrupt-based mechanism, where you handle the critical changes as fast as possible (in hardware interrupt context) and move the non-critical workload either to user space or to some other kernel mechanism, is in general a more reliable way of handling devices.
Of course, there could be several more arguments and counter-arguments besides the above three.
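The difference between the two styles can be sketched from the user-space side as well. This is only an illustration under stated assumptions: the file descriptor stands in for whatever event source a driver exposes (a pipe works for trying it out), and `wait_for_event()` and `busy_poll_for_event()` are hypothetical helper names, not part of any kernel or library API. The first lets the kernel wake the process when something happens; the second keeps asking.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Event-driven wait: sleep in the kernel until fd becomes readable or
 * timeout_ms expires. Returns 1 if an event is pending, 0 on timeout,
 * -1 on error. A signal restarts the wait with the full timeout.
 */
int wait_for_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Polling counterpart: check the same fd up to max_checks times without
 * ever sleeping. Returns the number of checks it took to see the event,
 * 0 if max_checks was exhausted, -1 on error.
 */
long busy_poll_for_event(int fd, long max_checks)
{
    for (long i = 1; i <= max_checks; i++) {
        int rc = wait_for_event(fd, 0);   /* zero timeout: never blocks */

        if (rc < 0)
            return -1;
        if (rc == 1)
            return i;
    }
    return 0;
}
```

With the polling version you either burn CPU by checking too often or miss short events by checking too rarely, while the blocking version costs nothing until the kernel has something to report. Even then, the time-critical part of the work still happens in the kernel's interrupt handler, which is the point made above.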
Another thread which might be of interest to you is here:
Userspace vs kernel space driver