There is currently no way to do this with a reduction clause in OpenMP. There are two main options. The first is to protect the update of A with critical (or with atomic, which may be more efficient). The second is to expand A into an array of vectors, one per thread: each thread accumulates partial sums into its own vector, indexed using omp_get_thread_num(), and the per-thread vectors are added together at the end (this final summation can itself be parallelised across the components of the vector). Which is more efficient will depend on a number of factors, including the cost of evaluating the Os, the value of N, the number of components in the vector and the number of threads.
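As a rough illustration of those two options, here is a minimal sketch in C. The function names, the flat array layout and the term() function are all made up for the example (term() is only a stand-in for whatever each iteration really contributes to component k), so treat it as an outline rather than a drop-in solution.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for the real per-iteration contribution
   to component k of the vector. */
static double term(int i, int k)
{
    return (double)(i + 1) * (double)(k + 1);
}

/* Option 1: every update of the shared vector A is made atomic. */
void vector_sum_atomic(double *A, int ncomp, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < ncomp; k++) {
            double t = term(i, k);
            #pragma omp atomic
            A[k] += t;
        }
    }
}

/* Option 2: one private partial vector per thread, combined at the end.
   Returns 0 on success, -1 if the scratch array cannot be allocated. */
int vector_sum(double *A, int ncomp, int n)
{
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();   /* upper bound on the team size */
#endif
    /* partial[t * ncomp + k] holds thread t's sum for component k. */
    double *partial = calloc((size_t)nthreads * (size_t)ncomp, sizeof *partial);
    if (partial == NULL)
        return -1;

    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        double *mine = partial + (size_t)tid * (size_t)ncomp;

        /* Each thread accumulates into its own row only. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            for (int k = 0; k < ncomp; k++)
                mine[k] += term(i, k);

        /* The implicit barrier above guarantees all rows are complete.
           The final summation is parallelised across components. */
        #pragma omp for
        for (int k = 0; k < ncomp; k++) {
            double s = 0.0;
            for (int t = 0; t < nthreads; t++)
                s += partial[(size_t)t * (size_t)ncomp + k];
            A[k] += s;
        }
    }

    free(partial);
    return 0;
}
```

Both functions add into A, so A should be initialised by the caller. With small vectors the per-thread rows may share cache lines, so padding each row could be worth trying if the second version scales poorly.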
OpenMP 4.0 will support user-defined reduction operators, so you will be able to define what reduction means for an n-vector and use it in a reduction clause. Working implementations are likely 6-12 months away, though.
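For reference, a user-defined reduction for a fixed-length vector might look roughly like the sketch below, using the declare reduction directive from the OpenMP 4.0 specification. The type vecn, the length NCOMP, the helper functions and the loop body are all hypothetical names chosen for the example, and it needs a compiler that actually implements the directive.

```c
#define NCOMP 3   /* hypothetical fixed number of components */

/* Wrapping the array in a struct gives the reduction a single type to name. */
typedef struct { double v[NCOMP]; } vecn;

/* Sets every component to zero; used to initialise each private copy. */
void vecn_zero(vecn *x)
{
    for (int k = 0; k < NCOMP; k++)
        x->v[k] = 0.0;
}

/* Component-wise out += in; this is what "reduction" means for vecn here. */
void vecn_add(vecn *out, const vecn *in)
{
    for (int k = 0; k < NCOMP; k++)
        out->v[k] += in->v[k];
}

/* omp_out, omp_in and omp_priv are the identifiers the specification
   defines for use inside the combiner and initializer expressions. */
#pragma omp declare reduction(vecn_plus : vecn : vecn_add(&omp_out, &omp_in)) \
        initializer(vecn_zero(&omp_priv))

/* Sums a stand-in contribution (i + 1) * (k + 1) over i in [0, n). */
vecn vector_sum_udr(int n)
{
    vecn A;
    vecn_zero(&A);

    #pragma omp parallel for reduction(vecn_plus : A)
    for (int i = 0; i < n; i++)
        for (int k = 0; k < NCOMP; k++)
            A.v[k] += (double)(i + 1) * (double)(k + 1);

    return A;
}
```

The loop body then looks the same as the serial code, and the runtime takes care of the private copies and the final combination.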
Hope that helps,