Many OpenMP implementations "outline" the parallel region. Essentially, the compiler moves the code of the parallel region into a separate subroutine and replaces the region with a call to that subroutine, which each thread then executes. Private variables become local (automatic) variables in this subroutine, so every thread gets its own copy on its own stack. When you call a function from inside the region, all the compiler has to do is pass that local copy (or its address), and the function only ever sees the private copy. If the variable is shared, no local variable is created in the "outlined" routine. Instead, the one shared instance is passed on to the function, and you will have a race if you don't protect updates to it. Global variables used in the function are also shared and need to be protected.
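To make the outlining idea concrete, here is a rough hand-written sketch in C of what a compiler might generate. All of the names (`bump`, `region_args`, `outlined_region`, `region_outlined_by_hand`) are made up for illustration, and the plain loop is only a stand-in for the runtime handing the outlined routine to a team of threads. Real implementations differ in the details.

```c
#include <assert.h>

/* Hypothetical callee: it only sees whatever pointer it is handed. */
static void bump(int *p) { *p += 1; }

/*
 * Source as the programmer might write it (shown as a comment):
 *
 *     int shared_total = 0, priv;
 *     #pragma omp parallel private(priv) shared(shared_total)
 *     {
 *         priv = 10;
 *         bump(&priv);
 *         #pragma omp critical
 *         shared_total += priv;
 *     }
 */

/* Only shared variables need to travel to the outlined routine,
 * and they travel by address so every thread sees the same object. */
struct region_args { int *shared_total; };

/* Hand-written stand-in for the compiler-generated outlined routine. */
static void outlined_region(struct region_args *args)
{
    int priv;                     /* private -> automatic local, one per call */
    priv = 10;
    bump(&priv);                  /* callee touches only this call's copy */
    *args->shared_total += priv;  /* shared -> same object for everyone;
                                     with real threads this needs protection */
}

/* Stand-in for the runtime: runs the routine once per "thread",
 * sequentially, so no locking is needed in this sketch. */
int region_outlined_by_hand(int nthreads)
{
    assert(nthreads > 0);
    int shared_total = 0;
    struct region_args args = { &shared_total };
    for (int t = 0; t < nthreads; ++t)
        outlined_region(&args);
    return shared_total;
}
```

Calling `region_outlined_by_hand(4)` returns 44: each of the four calls gets a fresh `priv`, bumps it to 11, and adds it to the single `shared_total`.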
The simple answer to your question is that the compiler can take care of much of the work for you using the information it gets from looking at the code. If you look at the OpenMP V2.5 spec, you can see this in the section that discusses predetermined sharing attributes. The number of items that the compiler is expected to predetermine has grown over the years. The original OpenMP spec really didn't expect a compiler to do much (or to have much intelligence with regard to code flow, parallelism, etc.). Later specs have expected compilers to be smarter in how they handle variables. However, there are still plenty of situations where a compiler needs help, especially when it doesn't see all of the code at once. In those cases the information that you supply through the clauses is essential to getting the program to work correctly.
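As a small illustration of where the compiler decides for you and where you still have to tell it something, here is a sketch in C. The function names and the `call_count` variable are hypothetical, and the comments reflect my reading of the data-sharing rules, so check them against the spec. If the file is built without OpenMP enabled, the pragmas are ignored and the code simply runs serially with the same result.

```c
#include <assert.h>

/* File-scope variable: shared by every thread that calls square(). */
static int call_count = 0;

/* Called from inside the parallel loop. The compiler may not see the
 * caller when it compiles this, so the protection has to be written here. */
static long square(int x)
{
    #pragma omp atomic
    call_count++;                  /* shared, so the update is protected */

    long result = (long)x * x;     /* automatic local: private to each call */
    return result;
}

long sum_of_squares(int n)
{
    assert(n >= 0);
    long total = 0;   /* would be shared by default; the reduction clause
                         is the extra information the programmer supplies */
    int i;            /* loop iteration variable: made private for us */

    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; ++i) {
        long sq = square(i);   /* declared inside the construct: private */
        total += sq;
    }
    return total;
}
```

For example, `sum_of_squares(4)` returns 14 (0 + 1 + 4 + 9) regardless of how many threads run the loop, and `call_count` ends up at 4. Dropping the `reduction` clause would leave `total` shared and unprotected, which is exactly the kind of case where the clause is doing essential work.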
I am not sure whether I answered your question in enough detail. If not, let me know and I will see if I can find one of the more technical presentations or papers that covers this in more depth.