fseek() and read() problem
I was working on an audio program on the weekend and had some weird results on BSD. Tracking it down, it seems to be related to fseek() and read(). It occurs on both

* NetBSD 4.0.1 -release i386
* OpenBSD 4.4 -release i386

What happens is this: An input file consists of a whole number of frames (frame = 2352 bytes). I fseek() a whole number of frames into the file, and then try to read() the rest of the file frame-by-frame. I noticed the wrong number of frames get read, and the last read() doesn't read a whole frame (too few bytes are read). From the read(2) man page:

Quote:
$ dd if=/dev/zero of=infile bs=2352 count=9

You can vary the count to see what happens. Check the file size:

Code:
$ ls -l infile
-rw-r--r-- 1 xxx users 21168 Apr 13 10:17 infile

Code:
#include <stdio.h>
#include <unistd.h>

int main( int argc, char* argv[] )
{
    ssize_t nr;
    char buf[2352]; /* = 1 frame */
    int ifd, j;
    FILE *ifp;

    ifp = fopen( "infile", "rb" );
    ifd = fileno( ifp );
    printf( "where = %ld\n", ftell( ifp ) );
    if( fseek( ifp, 2352, SEEK_SET ) ) /* seek 1 frame in */
        return 1;
    printf( "where = %ld\n", ftell( ifp ) );
    for( j=1; (nr = read( ifd, (void*)buf, 2352 )) != -1; j++ ) {
        printf( "%d: nr = %d\n", j, nr );
        if( nr==0 )
            break;
    }
    fclose( ifp );
    return 0;
}

$ gcc -Wall -o trd trd.c

and run it. E.g., with a count of 9:

Code:
$ trd
where = 0
where = 2352
1: nr = 2352
2: nr = 2352
3: nr = 80
4: nr = 0
|
||||
Thanks ephemera, that was a good idea. I tried lseek, and in a quick test (for count=9 again) it gave the expected result:
Code:
$ ltrd
where = 0
where = 0
1: nr = 2352
2: nr = 2352
3: nr = 2352
4: nr = 2352
5: nr = 2352
6: nr = 2352
7: nr = 2352
8: nr = 2352
9: nr = 0

As for FreeBSD, I don't have it installed and never used it, so I don't know if the results are the same there. If any of the local FreeBSD users wish to try it and report the result that would be interesting!
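The lseek() version itself was not posted. As a rough idea of what such a version might look like, here is a minimal sketch that uses only descriptor-level calls (open, lseek, read) with the same 2352-byte frames. The function name count_frames_lseek and the FRAME_SIZE macro are made up for illustration and are not from the original program.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define FRAME_SIZE 2352  /* bytes per frame, as in the thread */

/*
 * Hypothetical helper: skip `skip` frames with lseek(), then count the
 * whole frames that read() returns up to EOF. Only descriptor-level
 * calls are used, so no stdio buffer is involved.
 * Returns the frame count, or -1 on error or a trailing partial frame.
 */
long count_frames_lseek(const char *path, long skip)
{
    char buf[FRAME_SIZE];
    long frames = 0;
    ssize_t nr;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (lseek(fd, (off_t)skip * FRAME_SIZE, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    while ((nr = read(fd, buf, sizeof buf)) > 0) {
        if (nr != FRAME_SIZE) {   /* short read: treat as a partial frame */
            close(fd);
            return -1;
        }
        frames++;
    }
    close(fd);
    return nr == 0 ? frames : -1;
}
```

With the 9-frame infile from the first post, count_frames_lseek("infile", 1) should return 8, which matches the eight full reads in the output above.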
|
||||
I don't know why this is so.
But notice that ftell(3) didn't report the correct file offset.

> I'm still puzzled why fseek doesn't work with read(2), as they both seem to be rather legacy functions of this kind.

There is a difference: fseek is a stream I/O function of the C stdlib, and lseek is the system call for seeking into a file. Maybe there are different data structures in the kernel for them and perhaps they are not kept in sync? <guess/>
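One way to probe the guess above: a stdio stream keeps its own buffer and logical position in user space, layered on top of the descriptor, so ftell() and the kernel's descriptor offset are two separate numbers. The first post's output fits that picture: the three read() calls returned 2352 + 2352 + 80 = 4784 bytes of a 21168-byte file, so read() apparently started at offset 21168 - 4784 = 16384 rather than 2352. That is an inference from the posted numbers, not something confirmed in the thread. The sketch below simply reports both values after an fseek(); compare_offsets is a hypothetical name, and what the descriptor offset turns out to be depends on the C library.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fseek() a stream to `pos`, then report both the
 * stream's logical position (ftell) and the descriptor's kernel offset
 * (lseek with SEEK_CUR, which does not move the offset).
 * Returns 0 on success, -1 on error.
 */
int compare_offsets(const char *path, long pos, long *stream_pos, off_t *fd_pos)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    if (fseek(fp, pos, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }
    *stream_pos = ftell(fp);                    /* stdio's view */
    *fd_pos = lseek(fileno(fp), 0, SEEK_CUR);   /* kernel's view */
    fclose(fp);
    return (*stream_pos == -1L || *fd_pos == (off_t)-1) ? -1 : 0;
}
```

If the two values differ on a given system, a read() on the descriptor starts from the second one, not from where ftell() says the stream is.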
|
||||
Hmm, I wonder if errno gets set to anything useful.
Code:
$ vim t.c
$ dd if=/dev/zero of=infile bs=2352 count=9
$ gcc t.c -o t && ./t
where = 0
where = 2352
1: nr = 2352
2: nr = 2352
3: nr = 2352
4: nr = 2352
5: nr = 2352
6: nr = 2352
7: nr = 2352
8: nr = 608
9: nr = 0
$ uname -rms
7.2-PRERELEASE
__________________
My Journal Thou shalt check the array bounds of all strings (indeed, all arrays), for surely where thou typest ``foo'' someone someday shall type ``supercalifragilisticexpialidocious''. |
|
||||
Thanks TerryP for trying it on FreeBSD. Looks like you're also getting unexpected behaviour there, but different in detail. My sense is that, assuming the test code is properly written for the various platforms, then they should all give the same output. Since they don't, something is likely wrong.
Well, since fseek() and read() are tested for their respective error condition (-1 in each case) and those cases aren't entered, probably errno won't be set, right? |
|
||||
Quote:
FWIW: I tried creating a version that checks errno after each call via a macro, only to have it segfault on run. Then I yanked the version in your post (again) to a temp file, compiled and ran it as in the last post, and it segfaulted in exactly the same way (same machine).

Code:
Terry@dixie$ gcc -ggdb3 -Wall /tmp/t.c -o /tmp/t && gdb /tmp/t
GNU gdb 6.1.1 [FreeBSD]
...
This GDB was configured as "i386-marcel-freebsd"...
(gdb) run
Starting program: /tmp/t

Program received signal SIGSEGV, Segmentation fault.
0x08048587 in main () at /tmp/t.c:13
13 ifd = fileno( ifp );
(gdb)

== beyond that ==

I've never tried to mix standard I/O functions with I/O system calls (why does anyone need to do that, normally?), but I remember a comment in the book Programming Perl: a warning that mixing things like read() and sysread() should only be done if you are into wizardry, pain, or both. (read() and sysread() in Perl are basically equivalents of Unix/C's fread() and read() respectively.)

I would reckon that if you manipulate the file descriptor without updating the structure on the other side of a FILE *, like the f.*() functions should do, things could probably get out of sync between the integer file descriptor and the FILE * stream, and get pissed off accordingly if certain ops were done, hypothetically anyway. I'd really suggest trying it with fread() and such instead, as BSDFan suggests.

== other ==

According to its documentation, the read() system call returns the number of bytes read, 0 if the read hit EOF, or -1 if a cork popped, in which case it sets errno. So if it's not reading the specified amount, I would rather assume it hit EOF and returned what was read up to that point (i.e. a number of bytes that is > 0 but < 2352).

edit: yep

Quote:
Last edited by TerryP; 14th April 2009 at 07:40 AM.
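The fread() suggestion can be sketched as follows: seeking and reading both go through the same FILE *, so the stream's buffer and position cannot drift apart. The sketch also checks the result of fopen() before using the stream. One possible, unconfirmed explanation for a crash at fileno( ifp ) is that fopen() returned NULL, for example because infile was not in the working directory. The name count_frames_fread and the FRAME_SIZE macro are made up for illustration.

```c
#include <stdio.h>

#define FRAME_SIZE 2352  /* bytes per frame, as in the thread */

/*
 * Hypothetical helper: skip `skip` frames with fseek(), then count whole
 * frames with fread(). Seeking and reading both use the same FILE*, so
 * the stream's buffer and position stay consistent.
 * Returns the frame count, or -1 on error.
 */
long count_frames_fread(const char *path, long skip)
{
    char buf[FRAME_SIZE];
    long frames = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)             /* check before using the stream at all */
        return -1;
    if (fseek(fp, skip * FRAME_SIZE, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }
    /* fread() returns the number of complete items, here whole frames */
    while (fread(buf, FRAME_SIZE, 1, fp) == 1)
        frames++;
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return frames;
}
```

For the 9-frame test file, count_frames_fread("infile", 1) should return 8 on any of the platforms discussed, since no descriptor-level call is mixed in.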
|
||||
Thanks BSDfan666 and TerryP again, that is really helpful stuff. It looks like fread() may be the missing piece. In trying to understand the origins of my confusion on this there seem to be 3 factors:
1) There are a lot of similar functions available here. Although I was aware of the distinction between those using stdio FILE*'s and the lower level ones using file descriptors, I wasn't aware of lseek(), and it seems I had forgotten about fread() due to:
2) I don't work with these things on a very regular basis, so things get fuzzy.
3) The program was originally developed on Linux, where fseek() and read() seem to work together ok. (BTW a quick check on SunOS showed it was ok there too.) This is good in a way, but it led to a false sense of security as to the general situation.

Quote:
Quote:
So ... yesterday I re-wrote the thing to use lseek() [pointed out by ephemera]. But it seems I should really use fread() and re-assess things concerning lseek vs fseeko. Thanks again for all the patient replies. |
|
||||
Ok, I sorted through the relevant functions and made a little summary. Here it is in case anyone ever finds it helpful. Notes:
* there are many other functions related to file access
* BSD = NetBSD 4.0.1 and OpenBSD 4.4
* Linux = Slackware 11.0 and 12.2
* comments on speed are on my i386 machines (take with grain of salt)
* corrections etc. are welcome

Functions using stdio library interface and FILE structures:

16- and 18-bit, exists in K&R and PDP-7 respectively but not Linux or BSD
    seek
32-bit
    fseek
    ftell
32-bit Linux, can be changed to 64 by a #define. ... also ... 64-bit BSD.
    fseeko
    ftello - does not show the result of lseek. [Either in 32 or 64 bit mode (Linux)]
fread

Functions using system/kernel calls with file descriptors:

32-bit Linux, can be changed to 64 by a #define. ... also ... 64-bit BSD.
    lseek
64-bit Linux
    lseek64 = llseek
"ltell" - does not exist, ftell[o] don't work here.
read - faster than fread (Linux, NetBSD); not faster than fread (OpenBSD)

-----------------------------------

At the moment, I think I'll stick with read() over fread(), since it can be faster. A lot of reads are done, and portability to non-Unix-like is not important. So lseek() must be used so as not to mix functions from the two categories (originally done via blundering). On Linux lseek() is 32-bit by default, which is good enough for now, and can easily be changed if needed. Not a big downside.

Last edited by IdOp; 15th April 2009 at 03:56 AM. Reason: added 18 bits and attempt to clarify headings
|
|||
Short history lesson: the first Unix system ran on the PDP-7, and that machine had 18-bit integers, not 16-bit.
Also, fseeko/ftello use off_t. Under OpenBSD this type is always 64-bit, but you'll need to define _FILE_OFFSET_BITS=64 on Linux.
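As a rough illustration of the _FILE_OFFSET_BITS point, the sketch below defines the macro before any system header and then uses fseeko()/ftello(), which take and return off_t. It assumes a glibc-style Linux or a BSD; on the BSDs the define should simply have no effect. The extra _POSIX_C_SOURCE define is an assumption added so that fseeko()/ftello() are declared under strict compiler modes, and seek_and_tell is a made-up name. Passing -D_FILE_OFFSET_BITS=64 on the compiler command line is an alternative to the in-source define.

```c
/* Both defines must come before any system header to take effect. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes fseeko/ftello */

#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: seek a stream with fseeko() and return the
 * position reported by ftello(). Both use off_t rather than long.
 * Returns (off_t)-1 on error.
 */
off_t seek_and_tell(const char *path, off_t pos)
{
    off_t where;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return (off_t)-1;
    if (fseeko(fp, pos, SEEK_SET) != 0) {
        fclose(fp);
        return (off_t)-1;
    }
    where = ftello(fp);
    fclose(fp);
    return where;
}
```

Printing sizeof(off_t) in a small test program is a quick way to check which width a given build actually ended up with.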
|
||||
Quote:
Quote:
Quote:
|
|
||||
What's so odd, DEC made Programmed Data Processors (PDPs) in 12, 16, 18, and 36 bit, among other interesting gizmos over the years.