Currently, the FORT_CONVERT environment variable has the highest priority when
setting the endianness conversion for unformatted files. In discussing the
appropriate priority for the fconvert option, convert specifiers were decided
to take highest priority.
This patch also initializes the open statement convert state to unknown
to disambiguate cases where the convert specifier was not provided from
cases where convert=native was set. This makes it possible to defer to the
environment setting where appropriate.
Differential Revision: https://reviews.llvm.org/D133237
The runtime I/O library correctly handles endianness conversions on payload
data I/O and on the output of sequential record headers and footers, but
does not swap endianness when required when reading sequential record headers
and footers back in for READ and BACKSPACE statements. Mea culpa. Fix.
Fixes https://github.com/llvm/llvm-project/issues/57126
Differential Revision: https://reviews.llvm.org/D132168
When an I/O statement contains a function call that attempts
to perform I/O on the same unit, detect the recursive I/O
and terminate with a useful message rather than deadlocking in
the threading library.
Differential Revision: https://reviews.llvm.org/D131097
An OPEN statement that implies closing a connection must invalidate
the unit's frame buffer so as to prevent stale data from the old
connection from being read into the newly-connected unit.
Differential Revision: https://reviews.llvm.org/D130430
When the I/O runtime is truncating an external file due to an
implied ENDFILE or explicit ENDFILE, ensure that the unit's frame
buffer for the file discards any data that have become obsolete.
This bug caused trouble with ACCESS='STREAM' I/O using POS= on
a WRITE, but it may have not been limited to that scenario.
Differential Revision: https://reviews.llvm.org/D129673
An ENDFILE statement executed when a non-advancing READ has
left the unit in the middle of a record must truncate the file
at that position.
Differential Revision: https://reviews.llvm.org/D129019
First, ExternalFileUnit::SetPosition was being used both as a utility
within the class' member functions as well as an API from I/O statement
processing. Make it private, and add APIs for SetStreamPos and SetDirectRec.
Second, ensure that SetStreamPos for POS= positioning in a stream
doesn't leave the current record number and endfile record number
in an arbitrary state. In stream I/O they are used only to manage
end-of-file detection, and shouldn't produce false positive results
from IsAtEnd() after repositioning.
Differential Revision: https://reviews.llvm.org/D128388
Fortran does have negative unit numbers -- they show up in child I/O
subroutines for defined I/O and for OPEN(NEWUNIT=) -- but the runtime
needs to catch the cases where a negative unit number that wasn't
generated by the runtime is passed in for OPEN or for an I/O statement
that would ordinarily create an anonymous "fort.NNN" file for a
hitherto unseen unit.
Differential Revision: https://reviews.llvm.org/D127788
A REWIND of a unit that's in the middle of a record due to a READ
or WRITE statement with ADVANCE='NO' needs to reset the left tab
limit so that the next transfer takes place at the beginning of
the first record.
Differential Revision: https://reviews.llvm.org/D127783
When an unconnected unit number is used in a BACKSPACE statement
with ERR=, IOSTAT=, &/or IOMSG= control specifiers, don't crash,
but let the program deal with the error.
Differential Revision: https://reviews.llvm.org/D127782
After a recoverable error condition in a READ statement with ADVANCE='NO',
skip the remainder of the current record.
Differential Revision: https://reviews.llvm.org/D127433
Track pending "asynchronous" I/O operation IDs so that WAIT statements can
report errors about bad ID numbers.
Lowering will need to extended to call GetAsynchronousId() for a READ or
WRITE statement with ID=n.
Differential Revision: https://reviews.llvm.org/D127421
Diagnose OPEN(FILE=f) when f is already connected by the same name to
a distinct external I/O unit.
Differential Revision: https://reviews.llvm.org/D127035
The initial size of the file was not being captured as the file position
on which the first output buffer should be framed.
Differential Revision: https://reviews.llvm.org/D127029
An external READ(END=) that hits the end of the file must
also note the virtual position of the endfile record that
has just been discovered, so that a later BACKSPACE statement
won't end up at the wrong record.
Differential Revision: https://reviews.llvm.org/D126146
A BACKSPACE statement on a unit after a READ or WRITE with ADVANCE="NO"
must reset the position to the beginning of the record, not to the
beginning of the previous one.
Differential Revision: https://reviews.llvm.org/D125057
When the last operation on a foramtted sequential or stream file (prior
to an implied or explicit ENDFILE) is a non-advancing WRITE, ensure
that any partial record data is emitted to the file without a line
terminator. Further, when that last record is read with a non-advancing
READ, ensure that it won't raise an end-of-record condition after its
data, but instead will signal an end-of-file.
Differential Revision: https://reviews.llvm.org/D124546
When an error occurs in a formatted sequential output statement
and no output was ever emitted, don't emit a blank record.
This matches the error case behavior of other Fortran compilers.
Differential Revision: https://reviews.llvm.org/D123734
A predicate expression made ENDFILE statements significant
only for sequential files, but it's applicable to formatted
stream output as well.
Differential Revision: https://reviews.llvm.org/D123730
Correct the implementation of non-advancing I/O after some testing
to ensure that T tab edit descriptors are not allowed to back up
into positions of a record prior to where it stood at the beginning
of the I/O statement.
Differential Revision: https://reviews.llvm.org/D123709
Implements UTF-8 encoding and decoding for external units
with OPEN(ENCODING='UTF-8'). This encoding applies to default
CHARACTER values that are not 7-bit ASCII as well as to
the wide CHARACTER kinds 2 and 4. Basic testing is in place
via direct calls to the runtime I/O APIs, but serious checkout
awaits lowering support of the wide CHARACTER kinds.
Differential Revision: https://reviews.llvm.org/D122038
Some I/O error situations are current handled with fatal
runtime asserts, but should be exposed for user program
error recovery.
Differential Revision: https://reviews.llvm.org/D122049
Where possible, I added additional information to the messages to help
programmers figure out what went wrong. I also removed all uses of the word
"bad" from the messages since (to me) that implies a moral judgement rather
than a programming error. I replaced it with either "invalid" or "unsupported"
where appropriate.
Differential Revision: https://reviews.llvm.org/D121493
Rather than reading default character variables in formatted
input one byte at a time via NextInField(), skip and read
them via blocks of available buffer data. This eliminates
a bottleneck that affected reads of large character values.
(It also exposed a problem with sequential reads with RECL=
set on the OPEN statement, so that's fixed too.)
Differential Revision: https://reviews.llvm.org/D121144
A data transfer statement must have REC= in its control list
if (and only if) the unit was opened with ACCESS='DIRECT'.
The runtime wasn't catching this error, but was just silently
advancing to the next record as if the access were sequential.
Differential Revision: https://reviews.llvm.org/D120838
The runtime crashes on several fundamental I/O data transfer statement
control list errors, like list I/O on a direct-access unit, or
input from a write-only unit, &c. These errors should not be fatal
when ERR= or IOSTAT= are present.
This patch creates a new ErroneousIoStatementState class and
uses it for the state of an I/O statement that is doomed to fail
from these errors. If there is no ERR= label or IOSTAT= variable,
the error will be raised at the end of the statement. Data transfer
operations along the way will be no-op failures.
Differential Revision: https://reviews.llvm.org/D120745
Corrects the runtime implementation of I/O on files with
the access mode ACCESS='STREAM'. This is a collection
of edge-case tweaks to ensure that the distinctions between
stream and direct/sequential files, unformatted or formatted,
are respected where appropriate.
Moves NextInField() from io-stmt.h to io-stmt.cpp --
it was getting too big to keep in a header.
This patch exposed a problem with the I/O runtime
on Windows and it was reverted. This version also
fixes that problem; files are now opened on Windows
in binary mode to prevent inadvertent insertions of
carriage returns before line feeds, and those line
endings (CR+LF) are now explicitly generated.
Differential Revision: https://reviews.llvm.org/D119015
Corrects the runtime implementation of I/O on files with
the access mode ACCESS='STREAM'. This is a collection
of edge-case tweaks to ensure that the distinctions between
stream and direct/sequential files, unformatted or formatted,
are respected where appropriate.
Moves NextInField() from io-stmt.h to io-stmt.cpp --
it was getting too big to keep in a header.
Differential Revision: https://reviews.llvm.org/D118834
When RECL= is set on OPEN(), ensure that it:
1) enforces a max output record payload size
(not including header+footer or newline), and
2) causes padding of short output records only
for ACCESS='DIRECT'
The previous code was causing some false overrun errors
and applying padding to sequential/stream output files.
Differential Revision: https://reviews.llvm.org/D118630
After an ENDFILE statement, a WRITE is an error without
a prior BACKSPACE. Also fix the return value for the case
of formatted integer input with no input digits to be false
(exposed by new test).
Differential Revision: https://reviews.llvm.org/D117346
RECL= is required for direct access I/O, but is permitted
as well for sequential I/O, where it is defined by the
standard to specify a maximum record (line) length.
The standard does not say what should happen when an
sequential formatted input record appears whose length is
unequal to RECL= when it is specified.
Precedents from other compilers are unclear: one raises an error,
some honor RECL= as an effective truncation, and a few ignore the
situation. On output, all other compilers tested raised an
error when an attempt is made to emit a record longer than RECL=.
This patch treats RECL= as effective truncation on input and
as a hard limit with error on output, and also ensures that
RECL= can be set *longer* than the actual input record lengths.
Differential Revision: https://reviews.llvm.org/D115102
INQUIRE(POSITION=)'s results need to reflect the POSITION=
specifier used for the OPEN statement until the unit has been
repositioned. Preserve the POSITION= from OPEN and used it
for INQUIRE(POSITION=) until is becomes obsolete.
INQUIRE(PAD=) is implemented here in the case of an unconnected unit
with Fortran 2018 semantics; i.e., "UNDEFINED", rather than Fortran 90's
"YES"/"NO" (see 4.3.6 para 2). Apparent failures with F'90-only tests
will persist with INQUIRE(PAD=); these discrepancies don't seem to warrant
an option or environment variable.
To make the implementation of INQUIRE more closely match the language
in the standard, rename IsOpen() to IsConnected(), and use it explicitly
for the various INQUIRE specifiers.
Differential Revision: https://reviews.llvm.org/D114755
This is a near-universal language extension; external unit 0
is preconnected to the standard error output.
Differential Revision: https://reviews.llvm.org/D114298
The predefined units were not being initialized with FORM='FORMATTED',
so INQUIRE(PAD=) was failing if no I/O had already been done.
INQUIRE(POSITION=) was returning 'REWIND' on stdin/stdout (which
is somewhat defensible from the definition, and is what Intel Fortran
does), but most implementations return 'ASIS'. Change the runtime
to return 'REWIND' only for positionable external files, but 'ASIS'
for terminals, sockets, &c.
Differential Revision: https://reviews.llvm.org/D114028
1. To avoid overwriting the part of the record read in the non advancing read,
the furtherPositionInRecord field must be set to the max of the
furtherPositionInRecord and the positionInRecord at the beginning of the
IO write.
2. To allow any further read to succeed after the write, the unit
beganReadingRecord_ must be set to false when resetting the recordLength
during the write, otherwise, recordLength will not be computed in further
read and an assert is hit (at unit.cpp(398)).
The added unit test exercises both of these scenarios.
Differential Revision: https://reviews.llvm.org/D113740
Profiling a basic internal real input read benchmark shows some
hot spots in the code used to prepare input for decimal-to-binary
conversion, which is of course where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but not the code in the Fortran runtime that calls it,
and there are some obvious light changes worth making here.
Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.
Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes -- so that input routines can have the
ability to acquire more than one input character at a time
and amortize overhead.
These changes speed up the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which on par with Intel Fortran and much faster than
GNU Fortran.
Differential Revision: https://reviews.llvm.org/D113697
When processing the devious NAMELIST input
&group logarray = t t t
= 666 /
for LOGICAL::logarray(3) and INTEGER::t, the runtime library
needs to do some look-ahead on the input stream to make sure
that the last "t" on the first line is a truth value rather than
an item name -- which in this case it is. This look-ahead
was implemented in a previous patch but only worked for internal
input cases; this patch implements look-ahead capabilities for
input from an external file, too (and also adjusts repeated
list-directed input items to use this infrastructure, too).
Differential Revision: https://reviews.llvm.org/D113311
A recent change caused some variable-length sequential formatted
output statements with record positioning at the end of a FORMAT
(e.g., FORMAT('hi',10X) to append blanks at the end of the completed
record when emitting it.
Differential Revision: https://reviews.llvm.org/D112742
The runtime library was emitting unformatted record headers and
footers when an external unit had no fixed RECL=. This is wrong
for sequential files, which should have headers & footers even
with RECL. Change to omit headers & footers from unformatted
I/O only for direct access files.
Differential Revision: https://reviews.llvm.org/D112243
Count input characters corresponding to formatted edit descriptors
for READ(SIZE=); count output bytes for INQUIRE(IOLENGTH=).
The I/O APIs GetSize() and GetLength() were adjusted to return
std::size_t as function results.
Basic unit tests were added (and others fixed).
Differential Revision: https://reviews.llvm.org/D110291
When an end of record is met in non advancing IO:
- Set IOSTAT if present according to 12.11.4 (5).
- Position the file to the next record (12.11.4 (4)).
The previous code was only signaling EOR for fixed record length IO.
Reading at 12.11.4, I do not find the rational for this condition, so I
removed it.
It also does not seem the presence of padding should prevent
the EOR signaling.
The positionning to the next record was block when EOR is signaling
in FinishReadingRecord because ErrorHandler.isError() is true in this
case.
EOR in input is not an error, but I am not confident to modify
ErrorHandler.isError() to cover that. However, In FinishReadingRecord,
the code should not bail if the error is simply an end of record.
I did not check the SIZE requirements here because GetSize runtime is
not yet implemented.
Differential Revision: https://reviews.llvm.org/D109505
BACKSPACE leaves "recordLength" set, which is fine for a later READ,
but it causes a later WRITE to fail due to a misinterpretation of the
knowledge of the record length as indication of a fixed-length record
file (RECL=). Fix.
Differential Revision: https://reviews.llvm.org/D108594
A recent runtime I/O change[1] was meant to improve the handling of
input from external files missing a terminal newline on their last
records; the change was "triggered" by the wrong circumstances and
causing reads that should have pulled more data into the buffer to be
treated as EOFs. So fix that, and also don't retain input data
in the buffer once an input record has been finished unless it's
known that list-directed or NAMELIST input of a repeated input item
may need to backspace a non-positionable external unit to return
to the beginning of the repeated item.
[1] 6578893a0453384346f149479f8574dfff977ace
Differential Revision: https://reviews.llvm.org/D108164
Fix the external-io unittest under Windows.
In particular, fixes the following issues:
1. When creating a temporary file, open it with read+write permissions
using the _O_RDWR flag. _S_IREAD and _S_IWRITE are for the file
permissions of the created file.
2. _chsize returns 0 on success (just like ftruncate).
3. To set a std::optional, use its assign-operator overload instead of
getting a reference to its value and overwrite that. The latter is
invalid if the std::optional has no value, and is caught by
msvc's debug STL.
The non-GTest unittest is currently not executed under Windows because
of the added .exe extension to the output file: external-io.text.exe.
llvm-lit skips the file because .exe is not in the lists of test
suffixes (.test is). D105315 is going to change that by converting it
to a GTest-test.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D106726
When a WRITE overwrites an endfile record, we need to forget
that there was an endfile record. When doing a BACKSPACE
after an explicit ENDFILE statement, the position afterwards
must be upon the endfile record.
Attempts to join list-directed delimited character input across
record boundaries was due to a bad reading of the standard
and has been deleted, now that the requirements are better understood.
This problem would cause a read attempt past EOF if a delimited
character input value was at the end of a record.
It turns out that delimited list-directed (and NAMELIST) character
output is required to emit contiguous doubled instances of the
delimiter character when it appears in the output value. When
fixed-size records are being emitted, as is the case with internal
output, this is not possible when the problematic character falls
on the last position of a record. No two other Fortran compilers
do the same thing in this situation so there is no good precedent
to follow.
Because it seems least wrong, with this patch we now emit one copy
of the delimiter as the last character of the current record and
another as the first character of the next record. (The
second-least-wrong alternative might be to flag a runtime error,
but that seems harsh since it's not an explicit error in the standard,
and the output may not have to be usable later as input anyway.)
Consequently, the output is not suitable for use as list-directed or
NAMELIST input.
If a later standard were to clarify this case, this behavior will of
course change as needed to conform.
Differential Revision: https://reviews.llvm.org/D106695
F18 was sigalling an end-of-file error condition when reading an
unformatted sequential input file without an ultimate newline
(or CR-LF). Other Fortran implementations can handle it, so change
the runtime to support it.
Differential Revision: https://reviews.llvm.org/D106321