I wrote a small program https://github.com/Beliavsky/FortranTip/blob/main/xxdirect_access.f90 to understand how direct access files, also copied below. Intel Fortran gives output
Be Beryllium 9.0120
H Hydrogen 1.0070
could not read record 10
He Helium 4.0020
which is what I expected, but gfortran gives
Be Beryllium 9.0120
H Hydrogen 1.0070
0.0000
He Helium 4.0020
Three questions:
(1) Are both compilers standard-conforming here?
(2) How should I set the parameter atom_recl?
(3) Is there a way to find out the maximum record number of a direct access file other than trial and error?
I wrote a small program https://github.com/Beliavsky/FortranTip/blob/main/xxdirect_access.f90 to understand how direct access files, also copied below. Intel Fortran gives output
Be Beryllium 9.0120
H Hydrogen 1.0070
could not read record 10
He Helium 4.0020
which is what I expected, but gfortran gives
Be Beryllium 9.0120
H Hydrogen 1.0070
0.0000
He Helium 4.0020
Three questions:
(1) Are both compilers standard-conforming here?
(2) How should I set the parameter atom_recl?
(3) Is there a way to find out the maximum record number of a direct access file other than trial and error?
Here is the code from GitHub.
module chemistry_mod
implicit none
integer, parameter :: atom_recl = 100
type :: atom_t
character :: symbol*2, name*10
real :: mass
end type atom_t
end module chemistry_mod
!
program direct_access
use chemistry_mod, only: atom_t, atom_recl
implicit none
integer, parameter :: outu = 10, inu = 11, irec(4) = [4,1,10,2]
character (len=*), parameter :: data_file = "atomic_mass.dat"
integer :: i,j,ierr
type(atom_t) :: atom
! write to a direct access data file
open (unit=outu,file=data_file,access="direct",recl=atom_recl,action="write") write (outu,rec=1) atom_t("H" ,"Hydrogen" ,1.007)
write (outu,rec=2) atom_t("He","Helium" ,4.002)
write (outu,rec=3) atom_t("Li","Lithium" ,6.941)
write (outu,rec=4) atom_t("Be","Beryllium",9.012)
close (outu)
! open the direct access data file for reading -- must specify recl
open (unit=inu,file=data_file,access="direct",recl=atom_recl,action="read") do i=1,size(irec)
j = irec(i) ! which record to read
read (inu,rec=j,iostat=ierr) atom ! try to read record j from data file
if (ierr == 0) then
write (*,"(a2,1x,a10,1x,f8.4)") atom ! if it exists, print it
else
write (*,"('could not read record ',i0)") j
end if
end do
end program direct_access
Direct access records are all the same length, so you can INQUIRE the file size and divide by the size of a line to get the number of lines..
I wrote a small program https://github.com/Beliavsky/FortranTip/blob/main/xxdirect_access.f90 to understand how direct access files, also copied below. Intel Fortran gives output
Be Beryllium 9.0120
H Hydrogen 1.0070
could not read record 10
He Helium 4.0020
which is what I expected, but gfortran gives
Be Beryllium 9.0120
H Hydrogen 1.0070
0.0000
He Helium 4.0020
Three questions:
(1) Are both compilers standard-conforming here?
(2) How should I set the parameter atom_recl?
(3) Is there a way to find out the maximum record number of a
direct access file other than trial and error?
Three questions:
(1) Are both compilers standard-conforming here?
(2) How should I set the parameter atom_recl?
(3) Is there a way to find out the maximum record number of a direct access file other than trial and error?
I wrote a small program https://github.com/Beliavsky/FortranTip/blob/main/xxdirect_access.f90 to understand how direct access files, also copied below. Intel Fortran gives output
Three questions:
(1) Are both compilers standard-conforming here?
(2) How should I set the parameter atom_recl?
(3) Is there a way to find out the maximum record number of a direct access file other than trial and error?
I wrote a program where each record consists of 2 reals and timed writing and reading large numbers of such records using unformatted direct, unformatted stream, and formatted sequential. For this case unformatted stream looks good since both writingand reading are fast. On Windows, Intel Fortran was much slower than gfortran for writing unformatted direct.
In article <3c9ee6bc-13d3-412a...@googlegroups.com>,
Beliavsky <beli...@aol.com> writes:
Reading the Stream Input Output article by Clive Page at the Fortran Wiki h=
ttps://fortranwiki.org/fortran/show/Stream+Input+Output, it looks like one =
can get the benefits of direct access files using unformatted stream I/O an=
d setting the POS in the read or write statements to access the desired arr=
ay element. Dividing the storage_size of the derived type by the file_stora=
ge_size constant from the iso_fortran_env module gives the position offset =
of successive array elements in the file. The code is at https://github.com=
/Beliavsky/FortranTip/blob/main/stream_pos_dt.f90 and also below.
I definitely prefer record-oriented format to stream format for usenet
posts. :-|
Reading the Stream Input Output article by Clive Page at the Fortran Wiki h= ttps://fortranwiki.org/fortran/show/Stream+Input+Output, it looks like one = can get the benefits of direct access files using unformatted stream I/O an= d setting the POS in the read or write statements to access the desired arr= ay element. Dividing the storage_size of the derived type by the file_stora= ge_size constant from the iso_fortran_env module gives the position offset = of successive array elements in the file. The code is at https://github.com= /Beliavsky/FortranTip/blob/main/stream_pos_dt.f90 and also below.
On Thursday, March 17, 2022 at 2:36:57 PM UTC-4, Phillip Helbig (undress to reply) wrote:
In article <3c9ee6bc-13d3-412a...@googlegroups.com>,
Beliavsky <beli...@aol.com> writes:
Reading the Stream Input Output article by Clive Page at the Fortran Wiki h=
ttps://fortranwiki.org/fortran/show/Stream+Input+Output, it looks like one =
can get the benefits of direct access files using unformatted stream I/O an=
d setting the POS in the read or write statements to access the desired arr=
ay element. Dividing the storage_size of the derived type by the file_stora=
ge_size constant from the iso_fortran_env module gives the position offset =
of successive array elements in the file. The code is at https://github.com=
/Beliavsky/FortranTip/blob/main/stream_pos_dt.f90 and also below.
I definitely prefer record-oriented format to stream format for usenet posts. :-|
I will use shorter lines in future messages, cutting them off at position 70 or 75.
Many people assume that the way the text is shown on their own screen is
the same as other people will see it, especially with regard to line-breaking, but also with regard to encoding and decoding of 8-bit characters and so on.
When I started using Fortran, I set the editor to automatically wrap
after 72 characters. That is a good number for usenet and for many
other text documents as well.
On Thursday, March 17, 2022 at 12:34:13 PM UTC-7, Phillip Helbig (undress to reply) wrote:
(snip)
Many people assume that the way the text is shown on their own screen is
the same as other people will see it, especially with regard to
line-breaking, but also with regard to encoding and decoding of 8-bit
characters and so on.
When I started using Fortran, I set the editor to automatically wrap
after 72 characters. That is a good number for usenet and for many
other text documents as well.
Some news hosts require it. I forget now which one I used to use,
but it required reformatting quoted text that was too long. I still
do that sometimes, remembering that one. With variable width fonts,
it is hard to know how wide things are, but I do try to keep them
not too wide.
I wrote a program where each record consists of 2 reals and timed writing and reading large numbers of such records using unformatted direct, unformatted stream, and formatted sequential. For this case unformatted stream looks good since both writingand reading are fast. On Windows, Intel Fortran was much slower than gfortran for writing unformatted direct. The code is below and also at https://github.com/Beliavsky/FortranTip/blob/main/xdirect_access_array.f90 . The gfortran results with -O3 are
n, nreals = 10000000 2My preference is for "elapse_time" rather than CPU_TIME, especially for I/O timing, although elapsed time and I/O performance can be effected by disk buffering.
iol= 8
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
task time
write unformatted direct 1.156250
read unformatted direct 0.000000
write unformatted stream 0.031250
read unformatted stream 0.015625
write formatted sequential 14.250000
read formatted sequential 8.015625
and the Intel Fortran results for Version 2021.5.0 Build 20211109_000000 with -O3 are
n, nreals = 10000000 2
iol= 2
0.4931603 0.8502113 0.4931603 0.8502113
0.4931603 0.8502113 0.4931603 0.8502113
0.4931603 0.8502113 0.4931603 0.8502113
task time
write unformatted direct 30.156250
read unformatted direct 0.000000
write unformatted stream 0.031250
read unformatted stream 0.015625
write formatted sequential 30.859375
read formatted sequential 5.046875
Here is the code.
program direct_access
implicit none
integer, parameter :: n = 10**7, & ! # of records
nreals = 2, & ! values per record
iu = 10, ntimes = 7, ndt = ntimes - 1, nlen = 35
character (len=*), parameter :: unformatted_file = "temp.bin", & unformatted_seq_file = "temp_seq.bin",formatted_seq_file = "temp_seq.txt" integer :: i,iol
real :: xmat(n,nreals),ymat(n,nreals),xlast(nreals),times(ntimes),dt(ndt) character (len=nlen) :: labels(ndt)
call random_number(xmat)
inquire (iolength=iol) xmat(1,:) ! store record length in iol
print*,"n, nreals =",n,nreals
print*,"iol=",iol
call cpu_time(times(1))
! write unformatted direct
open (unit=iu,file=unformatted_file,access="direct", & recl=iol,action="write")
do i=1,n
write (iu,rec=i) xmat(i,:)
end do
close (iu)
call cpu_time(times(2))
open (unit=iu,file=unformatted_file,access="direct", & recl=iol,form="unformatted",action="read")
! read the last record without looping over previous records
read (iu,rec=n) xlast
print*,xmat(n,:),xlast
close (iu)
call cpu_time(times(3))
! write unformatted stream
open (unit=iu,file=unformatted_seq_file,form="unformatted", & access="stream",action="write")
write (iu) xmat
close (iu)
call cpu_time(times(4))
! read unformatted stream
open (unit=iu,file=unformatted_seq_file,form="unformatted", & access="stream",action="read")
read (iu) ymat
print*,xmat(n,:),ymat(n,:)
close (iu)
call cpu_time(times(5))
! write formatted sequential
open (unit=iu,file=formatted_seq_file,action="write")
do i=1,n
write (iu,*) xmat(i,:)
end do
close (iu)
call cpu_time(times(6))
! read formatted sequential
open (unit=iu,file=formatted_seq_file,action="read")
do i=1,n
read (iu,*) ymat(i,:)
end do
print*,xmat(n,:),ymat(n,:)
call cpu_time(times(7))
dt = times(2:) - times(:ntimes-1)
labels = [character (len=nlen) :: &
"write unformatted direct","read unformatted direct", &
"write unformatted stream","read unformatted stream", &
"write formatted sequential","read formatted sequential"]
print "(/,a35,1x,a9)", "task","time"
print "(a35,1x,f9.6)",(trim(labels(i)),dt(i),i=1,ndt)
end program direct_access
On Friday, March 18, 2022 at 12:38:06 AM UTC+11, Beliavsky wrote:and reading are fast. On Windows, Intel Fortran was much slower than gfortran for writing unformatted direct. The code is below and also at https://github.com/Beliavsky/FortranTip/blob/main/xdirect_access_array.f90 . The gfortran results with -O3 are
I wrote a program where each record consists of 2 reals and timed writing and reading large numbers of such records using unformatted direct, unformatted stream, and formatted sequential. For this case unformatted stream looks good since both writing
My preference is for "elapse_time" rather than CPU_TIME, especially for I/O timing, although elapsed time and I/O performance can be effected by disk buffering.
n, nreals = 10000000 2
iol= 8
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
task time
write unformatted direct 1.156250
read unformatted direct 0.000000
write unformatted stream 0.031250
read unformatted stream 0.015625
write formatted sequential 14.250000
read formatted sequential 8.015625
and the Intel Fortran results for Version 2021.5.0 Build 20211109_000000 with -O3 are
n, nreals = 10000000 2
iol= 2
0.4931603 0.8502113 0.4931603 0.8502113
0.4931603 0.8502113 0.4931603 0.8502113
0.4931603 0.8502113 0.4931603 0.8502113
task time
write unformatted direct 30.156250
read unformatted direct 0.000000
write unformatted stream 0.031250
read unformatted stream 0.015625
write formatted sequential 30.859375
read formatted sequential 5.046875
Here is the code.
program direct_access
implicit none
integer, parameter :: n = 10**7, & ! # of records
nreals = 2, & ! values per record
iu = 10, ntimes = 7, ndt = ntimes - 1, nlen = 35
character (len=*), parameter :: unformatted_file = "temp.bin", &
unformatted_seq_file = "temp_seq.bin",formatted_seq_file = "temp_seq.txt"
integer :: i,iol
real :: xmat(n,nreals),ymat(n,nreals),xlast(nreals),times(ntimes),dt(ndt)
character (len=nlen) :: labels(ndt)
call random_number(xmat)
inquire (iolength=iol) xmat(1,:) ! store record length in iol
print*,"n, nreals =",n,nreals
print*,"iol=",iol
call cpu_time(times(1))
! write unformatted direct
open (unit=iu,file=unformatted_file,access="direct", &
recl=iol,action="write")
do i=1,n
write (iu,rec=i) xmat(i,:)
end do
close (iu)
call cpu_time(times(2))
open (unit=iu,file=unformatted_file,access="direct", &
recl=iol,form="unformatted",action="read")
! read the last record without looping over previous records
read (iu,rec=n) xlast
print*,xmat(n,:),xlast
close (iu)
call cpu_time(times(3))
! write unformatted stream
open (unit=iu,file=unformatted_seq_file,form="unformatted", &
access="stream",action="write")
write (iu) xmat
close (iu)
call cpu_time(times(4))
! read unformatted stream
open (unit=iu,file=unformatted_seq_file,form="unformatted", &
access="stream",action="read")
read (iu) ymat
print*,xmat(n,:),ymat(n,:)
close (iu)
call cpu_time(times(5))
! write formatted sequential
open (unit=iu,file=formatted_seq_file,action="write")
do i=1,n
write (iu,*) xmat(i,:)
end do
close (iu)
call cpu_time(times(6))
! read formatted sequential
open (unit=iu,file=formatted_seq_file,action="read")
do i=1,n
read (iu,*) ymat(i,:)
end do
print*,xmat(n,:),ymat(n,:)
call cpu_time(times(7))
dt = times(2:) - times(:ntimes-1)
labels = [character (len=nlen) :: &
"write unformatted direct","read unformatted direct", &
"write unformatted stream","read unformatted stream", &
"write formatted sequential","read formatted sequential"]
print "(/,a35,1x,a9)", "task","time"
print "(a35,1x,f9.6)",(trim(labels(i)),dt(i),i=1,ndt)
end program direct_access
There are arguments for either timer, although they do represent different performance. The granularity of CPU_TIME is always a problem on Windows.
Given how annoying "recl=iol" is when converting to ifort, I would not fix that although it should be updated.
I wrote a program where each record consists of 2 reals and timed writing and reading large numbers of such records using unformatted direct, unformatted stream, and formatted sequential. For this case unformatted stream looks good since both writingand reading are fast. On Windows, Intel Fortran was much slower than gfortran for writing unformatted direct. The code is below and also at https://github.com/Beliavsky/FortranTip/blob/main/xdirect_access_array.f90 . The gfortran results with -O3 are
n, nreals = 10000000 2Another problem that this example demonstrates is the poor "write formatted sequential" performance of 14.25 sec for gfortran and 30.86 sec for Intel. This is not due to disk I/O delays.
iol= 8
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
7.01291561E-02 0.410597622 7.01291561E-02 0.410597622
task time
write unformatted direct 1.156250
read unformatted direct 0.000000
write unformatted stream 0.031250
read unformatted stream 0.015625
write formatted sequential 14.250000
read formatted sequential 8.015625
and the Intel Fortran results for Version 2021.5.0 Build 20211109_000000 with -O3 are
n, nreals = 10000000 2
iol= 2
0.4931603 0.8502113 0.4931603 0.8502113
0.4931603 0.8502113 0.4931603 0.8502113
0.4931603 0.8502113 0.4931603 0.8502113
task time
write unformatted direct 30.156250
read unformatted direct 0.000000
write unformatted stream 0.031250
read unformatted stream 0.015625
write formatted sequential 30.859375
read formatted sequential 5.046875
Here is the code.
program direct_access
implicit none
integer, parameter :: n = 10**7, & ! # of records
nreals = 2, & ! values per record
iu = 10, ntimes = 7, ndt = ntimes - 1, nlen = 35
character (len=*), parameter :: unformatted_file = "temp.bin", & unformatted_seq_file = "temp_seq.bin",formatted_seq_file = "temp_seq.txt" integer :: i,iol
real :: xmat(n,nreals),ymat(n,nreals),xlast(nreals),times(ntimes),dt(ndt) character (len=nlen) :: labels(ndt)
call random_number(xmat)
inquire (iolength=iol) xmat(1,:) ! store record length in iol
print*,"n, nreals =",n,nreals
print*,"iol=",iol
call cpu_time(times(1))
! write unformatted direct
open (unit=iu,file=unformatted_file,access="direct", & recl=iol,action="write")
do i=1,n
write (iu,rec=i) xmat(i,:)
end do
close (iu)
call cpu_time(times(2))
open (unit=iu,file=unformatted_file,access="direct", & recl=iol,form="unformatted",action="read")
! read the last record without looping over previous records
read (iu,rec=n) xlast
print*,xmat(n,:),xlast
close (iu)
call cpu_time(times(3))
! write unformatted stream
open (unit=iu,file=unformatted_seq_file,form="unformatted", & access="stream",action="write")
write (iu) xmat
close (iu)
call cpu_time(times(4))
! read unformatted stream
open (unit=iu,file=unformatted_seq_file,form="unformatted", & access="stream",action="read")
read (iu) ymat
print*,xmat(n,:),ymat(n,:)
close (iu)
call cpu_time(times(5))
! write formatted sequential
open (unit=iu,file=formatted_seq_file,action="write")
do i=1,n
write (iu,*) xmat(i,:)
end do
close (iu)
call cpu_time(times(6))
! read formatted sequential
open (unit=iu,file=formatted_seq_file,action="read")
do i=1,n
read (iu,*) ymat(i,:)
end do
print*,xmat(n,:),ymat(n,:)
call cpu_time(times(7))
dt = times(2:) - times(:ntimes-1)
labels = [character (len=nlen) :: &
"write unformatted direct","read unformatted direct", &
"write unformatted stream","read unformatted stream", &
"write formatted sequential","read formatted sequential"]
print "(/,a35,1x,a9)", "task","time"
print "(a35,1x,f9.6)",(trim(labels(i)),dt(i),i=1,ndt)
end program direct_access
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 157 |
Nodes: | 16 (0 / 16) |
Uptime: | 16:19:40 |
Calls: | 3,193 |
Files: | 10,512 |
Messages: | 2,978,769 |