different results vs # of threads or proc's'

Re: different results vs # of threads or proc's'

Post by ftinetti » Tue Nov 05, 2013 3:47 am

Hi David,

Let's restart, then, from where we left off: please label each argument as IN, OUT, or INOUT (only for the subroutines called in the parallel region). Also, I would like to know whether it is possible to avoid print and write in the parallel region. Is it? If not, remember that each thread will print/write its "own" values (i.e. those computed for its share of the workload). You are taking that into account when comparing the output, right?

Fernando.
ftinetti
 
Posts: 558
Joined: Wed Feb 10, 2010 2:44 pm

Re: different results vs # of threads or proc's'

Post by dva2tlse » Wed Nov 06, 2013 1:07 am

Hi Fernando,
Things are on the right track again; I started over from an old version that worked in one respect, added the improvements I had made elsewhere, and, step by step, the solution is coming together.
Bye, and thanks to you and Mark for helping me find what was wrong,
David
dva2tlse
 
Posts: 18
Joined: Sun Sep 08, 2013 2:52 am
Location: Toulouse, France

Re: different results vs # of threads or proc's'

Post by dva2tlse » Sun Nov 10, 2013 6:24 am

Hi Mark and Fernando,
well, in the end it does not work that well; once again, at some step that I could not identify (otherwise the problem would be almost solved), it gives different results depending on the number of threads. It looks as if some data from one thread were copied to another thread, and this second thread then goes on computing with the wrong set of intermediate data, so that its final result is wrong.
So I will try to run the process on only two threads, with files recording all the intermediate data, so that I can compare everything and find what goes wrong and when.
My boss runs the process on a single thread, and it is his responsibility to tune the cleaning subroutines doublons(... ) and ValInt(... ), and the final subroutine rainflow(... ) that counts events.
(It is my responsibility to create one matrix per entity with fabmat(... ), whose columns will be cleaned and counted, and I am also trying to make the whole thing work correctly on several processors.)

Here again is the pseudo-code, or skeleton, of our process:
Code:
      program rfomp1
      open(1, file=file1, err=799) ! access='READONLY',
      call lecana(lina, LNGANA) ! LECture of the "ana" file.
      close(1)
c
      call leccvt(Ccvt1, Ccvt2, FScvt2) ! LECture of the "cvt" file. (open and close of fortran unit 2)
c
      open(3, file='1D/elements.input', err=700) ! file containing the numbers of the entities
   70 EID=EID+1
   71 READ(3, '(A)', err=702, end=77)LINE ! leave the read loop at end of file
      if(line(1:1).eq.'#')goto 71
   77 print '(A, I6)', 'rfomp1:00 end=77 fin de elements.input-3 Ok'
      close(3) ! close the elements.input file
c
      CALL OMP_SET_NUM_THREADS(40) ! 1) ! 2) ! 3) ! 4) ! 16) ! 8) !
c
C$OMP PARALLEL PRIVATE(cont, Diml, EID, Sstf, TID, tmp)
C$OMP+ SHARED(Ccvt1, Ccvt2, elem, elemout, lina)
C$OMP DO SCHEDULE(DYNAMIC)
c
      do EID=1, EIDMAX ! loop on the entities which numbers were read in the third input file
c
          call fabmat(cont, Diml, elem(EID), lina, LNGANA, nocc, TID) ! FABricate MATrix
c
          Stot=0
          do col=1, 326 ! loop on the 326 columns of the previous matrix
              call doublons(cont, Diml, ntf, valinter) ! first part of the cleaning
              call ValInt(Arr, Diml, ntf, valinter) ! second part of the cleaning
              call rainflow(Arr, ntf, Diml, EID, FAT, p, q) ! counting of events
c
              Stot=Stot+nocc(ntf)*(FAT)**p ! cont**p
              Voltot=Voltot+nocc(ntf)
          enddo ! end of the loop on the columns of the matrix
c
          Stotfinal=(Stot/Voltot)**(1/p) !DVA: ZZZ a similar equation already exists in rainflow.
          print
     +'(A, I3, A, I3, A, I6, A, A, A, E9.3, A, E13.6, A, I5, A, E9.3)',
     +'rfomp1:00 vol=', ntf,
     +', nocc=', nocc(ntf),
     +', EID=', EID,
     +', elem(EID)=', elem(EID),
     +', Sfat=', FAT,
     +' Stot=', Stot,
     +' Voltot=', Voltot,
     +' Sequi=', Stotfinal ! Stotfinal is the final interesting output from rainflow
c
c C$OMP ATOMIC
          write(elemout(EID), '(I3, 7X, A, F9.3)') EID, elem(EID), Stotfinal
          print '(A, I2.2, A, I6, A)',
     +'rfomp1:', TID,
     +' boucle Ok pour EID=', EID,
     +', elem(EID)='//elem(EID)//', elemout(EID)='//elemout(EID)//'.'
          print '(A)'
c
      enddo ! end of the loop on the different entities processed
C$OMP END DO
C$OMP END PARALLEL
c
c Open the output file:
c
      open(4, file='rfomp1.out', status='replace')
      do EID=1, EIDMAX
          write(4, '(A)') elemout(EID)
      enddo ! end of the loop writing elemout.
      close(4) ! close the output file rfomp1.out
c
      print '(A)'
      print '(A)', 'rfomp1:00 Fin.'
      print '(A)'
      end

I still have not added the INTENT IN/OUT/INOUT attributes at the beginning of the Fortran subroutines; I am a bit lazy about it since they did not exist in the Fortran 77 that I learned in the 80's. But the OMP sentinels, directives and clauses did not exist either, and I use them anyway, so I will have to force myself to write down this INTENT stuff...

[.../...]

Okay, it's done, or at least prepared, so that on Wednesday, when I go back to work, I will just have to paste what I have prepared at the beginning of the different subroutines of my program.
BTW, I have just consulted some websites about the syntax of these INTENT declarations (I had forgotten the comma between "integer" and "intent", e.g. in integer, intent(inout)::i, but now it's okay). These websites point out that an integer variable I in one subroutine is of course completely independent of another integer variable I in another subroutine. Can you confirm that the same holds between an integer variable "I" in a subroutine run by thread TID1 and the "I" of the same subroutine run by thread TID2, provided this variable is not explicitly declared as SHARED between the threads? I mean, if the variable is completely internal to the subroutine, neither in the argument list nor in a common block, I expect the copy of thread TID1 to be independent of that of thread TID2.
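
For reference, here is a minimal sketch of the INTENT syntax discussed above, in free-form Fortran. The subroutine scale_column and its arguments are made up for illustration; they are not the real interfaces of fabmat, doublons, ValInt or rainflow, which are not shown in full in this thread.

```fortran
! Hypothetical example: names and sizes are assumptions, not the real code.
module intent_demo
  implicit none
contains
  subroutine scale_column(n, factor, col, total)
    integer, intent(in)    :: n        ! only read by the subroutine
    real,    intent(in)    :: factor   ! only read by the subroutine
    real,    intent(inout) :: col(n)   ! read, then overwritten
    real,    intent(out)   :: total    ! only written by the subroutine
    integer :: i                       ! purely local variable

    total = 0.0
    do i = 1, n
      col(i) = factor * col(i)
      total = total + col(i)
    end do
  end subroutine scale_column
end module intent_demo

program main
  use intent_demo
  implicit none
  real :: col(3), total

  col = [1.0, 2.0, 3.0]
  call scale_column(3, 2.0, col, total)
  ! col is now (2, 4, 6), so total is 12
  if (abs(total - 12.0) > 1.0e-5) error stop 'unexpected total'
  print '(A, F6.1)', 'total = ', total
end program main
```

With the intents written down, the compiler rejects an assignment to an intent(in) argument, which helps when checking what each subroutine called in the parallel region may modify.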

Now I wonder something else about the outputs; for each entity processed, there is a matrix of 326 columns.
Each column is cleaned then some events happening within it are counted.
Then for each entity, the counts are added-up, and this is the result to be kept.
You can see above that these results are written to an "elemout" character array, one line at a time. Since the lines of elemout are indexed by the integer EID, the number of the entity being processed, and each entity is processed in a single loop iteration, there should not be any race condition between elemout(EID1) and elemout(EID2). Hence the ATOMIC directive that appears above can remain commented out and does not need to be used. Do you agree? (I formed my own opinion while writing.)
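
A small sketch of this pattern, assuming each iteration writes only to its own element of the character array. The array size, the record length and the stand-in formula for Stotfinal are invented for the example. Note also that ATOMIC is meant for simple scalar assignments such as x = x + expr, not for a WRITE statement, so it would not fit here in any case.

```fortran
! Hypothetical example: sizes and the "computation" are assumptions.
program elemout_demo
  implicit none
  integer, parameter :: eidmax = 8
  character(len=32) :: elemout(eidmax)
  integer :: eid
  real :: stotfinal

  !$omp parallel do default(none) shared(elemout) private(stotfinal) schedule(dynamic)
  do eid = 1, eidmax
    stotfinal = 1.5 * real(eid)      ! stand-in for the real per-entity result
    ! Each iteration writes to a different element, so no two threads
    ! ever touch the same elemout(eid).
    write(elemout(eid), '(I3, 7X, F9.3)') eid, stotfinal
  end do
  !$omp end parallel do

  ! Written serially afterwards, the output order does not depend on threads.
  do eid = 1, eidmax
    print '(A)', trim(elemout(eid))
  end do
end program elemout_demo
```

Compiled with OpenMP enabled (for example gfortran -fopenmp), the printed lines should be the same for any number of threads, because the array is only printed after the parallel loop has finished.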

These two silly questions (the independence of variables between threads, and whether the ATOMIC directive is needed) could have explained the strange behaviour of my results, but they definitely seem NOT to be the cause of the dependence of the results on the number of threads.

Well, it seems that I have not yet found the cause of the strange results that I get, so I will probably write here again as soon as my brain needs more explanations.
Good bye,
David

Re: different results vs # of threads or proc's'

Post by ftinetti » Mon Nov 11, 2013 4:26 am

Hi David,

I'm a little bit confused, the "real" program rfomp1 does have "implicit none", right?

At a quick look, I think there are some race conditions on Stot, Voltot, and Stotfinal.
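
To illustrate the kind of race meant here: in the skeleton as posted, Stot, Voltot and Stotfinal do not appear in the PRIVATE clause, so unless DEFAULT(NONE) forces a choice they would be shared by all threads, and Voltot is never reset for each entity. Below is a minimal sketch of a race-free version, assuming each accumulator belongs to a single entity. The sizes, the exponent p and the stand-in formula are hypothetical.

```fortran
! Hypothetical example: the arithmetic only mimics the shape of the skeleton.
program accum_demo
  implicit none
  integer, parameter :: eidmax = 16, ncol = 326
  real :: results(eidmax)
  real :: stot, voltot, p
  integer :: eid, col

  p = 3.0
  !$omp parallel do default(none) shared(results, p) private(col, stot, voltot)
  do eid = 1, eidmax
    stot = 0.0                        ! both accumulators are reset per entity
    voltot = 0.0
    do col = 1, ncol
      stot = stot + real(eid)**p      ! stand-in for nocc(ntf)*FAT**p
      voltot = voltot + 1.0           ! stand-in for nocc(ntf)
    end do
    results(eid) = (stot / voltot)**(1.0 / p)   ! real division, not 1/p
  end do
  !$omp end parallel do

  ! With private accumulators, results(eid) should equal eid for any thread count.
  if (any(abs(results - [(real(eid), eid = 1, eidmax)]) > 1.0e-3)) error stop 'mismatch'
  print '(A)', 'results do not depend on the number of threads'
end program accum_demo
```

If stot and voltot were shared instead, several threads would add into the same variables at once and the final values would vary from run to run.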

dva2tlse wrote: BTW, I have just consulted some websites about the syntax of these INTENT declarations (I had forgotten the comma between "integer" and "intent", e.g. in integer, intent(inout)::i, but now it's okay). These websites point out that an integer variable I in one subroutine is of course completely independent of another integer variable I in another subroutine. Can you confirm that the same holds between an integer variable "I" in a subroutine run by thread TID1 and the "I" of the same subroutine run by thread TID2, provided this variable is not explicitly declared as SHARED between the threads? I mean, if the variable is completely internal to the subroutine, neither in the argument list nor in a common block, I expect the copy of thread TID1 to be independent of that of thread TID2.


I do not follow all of your explanation/question, but I would agree on this: "if the variable is completely internal to the subroutine, not in the argument list nor a common block," and not SAVE(d) (I think there are some more cases, but those mentioned are good enough for the point we are focusing on), then that variable is completely private to each thread running the subroutine. More specifically,
Code:
subroutine sub1(arg1, ... argn)
! intents of arg1, argn
integer i
...

i is private to each thread.
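
A slightly fuller sketch of the same point, with a hypothetical subroutine sum_to called from a parallel loop. The commented-out declarations show two cases that would NOT be private: a SAVEd variable, and a variable initialized in its declaration (which implies SAVE).

```fortran
! Hypothetical example: sum_to is an illustration, not part of the real program.
module local_demo
  implicit none
contains
  subroutine sum_to(n, total)
    integer, intent(in)  :: n
    integer, intent(out) :: total
    integer :: i               ! local and not SAVEd: one copy per call, per thread
    ! integer, save :: i       ! would be a single copy shared by all threads
    ! integer :: k = 0         ! initialization implies SAVE, so also shared

    total = 0
    do i = 1, n
      total = total + i
    end do
  end subroutine sum_to
end module local_demo

program main
  use local_demo
  implicit none
  integer :: eid, totals(64)

  !$omp parallel do default(none) shared(totals)
  do eid = 1, 64
    call sum_to(eid, totals(eid))   ! each call works on its own i
  end do
  !$omp end parallel do

  do eid = 1, 64
    if (totals(eid) /= eid * (eid + 1) / 2) error stop 'wrong total'
  end do
  print '(A)', 'all totals correct'
end program main
```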

HTH,

Fernando.

Re: different results vs # of threads or proc's'

Post by dva2tlse » Mon Nov 11, 2013 5:53 am

Hello Fernando,
ftinetti wrote: Hi David,
I'm a little bit confused, the "real" program rfomp1 does have "implicit none", right?

Of course it does; I may not have copied and pasted correctly, but as I explained to you previously, I always put that statement, and now the DEFAULT(NONE) clause for OMP as well.

ftinetti wrote: At a quick look, I think there are some race conditions on Stot, Voltot, and Stotfinal

Yes, thank you, that is probably the case; and it explains what I wrote previously, that it looks as if "some data from one thread was copied to another thread, and that this second thread goes on computing with the wrong set of intermediate data, so that at the end of that second thread, the result is wrong".
I think that this is exactly the effect of a race condition, so what you noticed is likely true. I will check that on Wednesday, when I am back at work.

ftinetti wrote: I would agree on this: "if the variable is completely internal to the subroutine, not in the argument list nor a common block," then that variable is completely private to each thread running the subroutine.

Okay for that; it is basic, but I saw such strange behaviour that I lost my trust in the basics.

Have a good day, and "see" you on Wednesday,
David

Re: different results vs # of threads or proc's'

Post by dva2tlse » Fri Nov 15, 2013 4:50 am

Hello Fernando,
well, there is probably no race condition on Stot, Voltot, and Stotfinal, because these variables are used in a totally private loop, and because the interesting result, Stotfinal, is written out of that loop by a write statement to a file whose Fortran unit number is also totally private:
Code:
      write(TID+20, '(I3, 7X, A, F9.3)') EID, elem(EID), Stotfinal
(EID and elem(EID) are integer and character variables used to index Stotfinal and show which entity it relates to)

On the other hand, I found an internal read statement that reads from an internal ASCII line named lint, between bounds A and B that were improperly set:
Code:
      read(lint(A:B), *)FSTcvt, FScvt2(FSTcvt, 0, 2),
     +(Ccvt2(FSTcvt, T(FSTcvt), i), i=1, 8)

As a result, the values read were truncated and came out roughly a hundred times too low, so that a later comparison evaluated to .false. instead of .true., and all the subsequent operations were wrong, for reasons that had nothing to do with OpenMP.
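
A tiny sketch of how wrong substring bounds can silently truncate a value in an internal read. The line contents and the bounds are invented for the example; the real lint, A and B are not shown in this thread.

```fortran
! Hypothetical example: the text in lint and the bounds are assumptions.
program internal_read_demo
  implicit none
  character(len=20) :: lint
  real :: full, cut

  lint = ' 12345.0  678.0'
  read(lint(1:9), *) full   ! substring covers the whole first field
  read(lint(1:4), *) cut    ! upper bound too small: only ' 123' is seen

  ! No error is raised; the second value is simply about 100 times too low.
  if (abs(full - 12345.0) > 0.01 .or. abs(cut - 123.0) > 0.01) error stop 'unexpected'
  print '(A, F10.1)', 'full = ', full
  print '(A, F10.1)', 'cut  = ', cut
end program internal_read_demo
```

Because list-directed input accepts whatever digits fall inside the substring, this kind of mistake produces plausible-looking numbers rather than a runtime error.
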
So it's all right now; there are lots of other problems, but none regarding OpenMP.
Thank you for having helped me with that messy stuff,
David
