Bugzilla – Bug 3532
ACE_OS::fopen() : Can't read files with 1 Bytes (ACE_USES_WCHAR)
Last modified: 2009-06-19 02:28:54
ACE VERSION: 5.6.4

HOST MACHINE and OPERATING SYSTEM:
    PC, Windows XP Prof. SP 2, VC 8

TARGET MACHINE and OPERATING SYSTEM, if different from HOST:

COMPILER NAME AND VERSION (AND PATCHLEVEL): -

THE $ACE_ROOT/ace/config.h FILE [if you use a link to a platform-
specific file, simply state which one]:

    #define _USE_32BIT_TIME_T 1
    #define ACE_USES_WCHAR
    #include "ace/config-win32.h"

THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE [if you
use a link to a platform-specific file, simply state which one
(unless this isn't used in this case, e.g., with Microsoft Visual
C++)]: -

CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features
(used by MPC when you generate your own makefiles):
    file does not exist

AREA/CLASS/EXAMPLE AFFECTED: ACE, File IO

DOES THE PROBLEM AFFECT:
    COMPILATION? no
    LINKING? no
    EXECUTION? yes
    OTHER (please specify)? -

SYNOPSIS:
Can't read one-byte files when ACE_USES_WCHAR is defined.

DESCRIPTION:
- Prepare a file containing a single character (one byte, no BOM);
  verify with a hex editor.
- Open the file and read the byte (see the example under REPEAT BY).
- The fread() call returns 0.

The cause is the function checkUnicodeFormat() in
%ACE_ROOT%/ace/OS_NS_stdio.cpp. ACE_OS::fopen() calls it to skip the
BOM. Its fread() call tries to read two bytes from our one-byte ASCII
file. That call fails, but the file pointer is no longer where it was
before (I guess). If I execute fseek(fp, 0, FILE_BEGIN); before my own
read, everything works fine.

    #if defined (ACE_USES_WCHAR)
    void ACE_OS::checkUnicodeFormat (FILE* fp)
    {
      if (fp != 0)
        {
          // Due to the ACE_TCHAR definition, all default input files, such as
          // svc.conf, have to be in Unicode format (small endian) on WinCE
          // because ACE has all 'char' converted into ACE_TCHAR.
          // However, for TAO, ASCII files, such as IOR file, can still be read
          // and be written without any error since given buffers are all in 'char'
          // type instead of ACE_TCHAR. Therefore, it is user's responsibility to
          // select correct buffer type.

          // At this point, check if the file is Unicode or not.
          ACE_UINT16 first_two_bytes;
          size_t numRead = ACE_OS::fread(&first_two_bytes, sizeof (first_two_bytes), 1, fp);

          if (numRead == 1)
            {
              if ((first_two_bytes != 0xFFFE) && // not a small endian Unicode file
                  (first_two_bytes != 0xFEFF))   // not a big endian Unicode file
                {
                  // set file pointer back to the beginning
    #if defined (ACE_WIN32)
                  ACE_OS::fseek(fp, 0, FILE_BEGIN);
    #else
                  ACE_OS::fseek(fp, 0, SEEK_SET);
    #endif /* ACE_WIN32 */
                }
            }
          // if it is a Unicode file, file pointer will be right next to the first
          // two-bytes
        }
    }
    #endif // ACE_USES_WCHAR

REPEAT BY:

    #include "ace/OS.h"

    int main()
    {
      FILE* pFile = ACE_OS::fopen("OneByteFile", L"rb");

      char Buffer[1000];
      size_t BytesRead = ACE_OS::fread(&Buffer, 1, 1, pFile);
      if(BytesRead == 0)
      {
        // This is an error: One byte should be read
      }

      ACE_OS::fclose(pFile);
    }

SAMPLE FIX/WORKAROUND:
I see two options:
- Check the file size before the read in checkUnicodeFormat().
- Or always do the seek if numRead is 0.
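The behaviour guessed at in the description can be reproduced without ACE. The following is a minimal sketch using only the C standard library; it is not ACE code. The helper names `write_bytes` and `read_after_probe` and the temporary file name are made up for illustration. It asks fread() for one 2-byte element on a 1-byte file, as the probe in checkUnicodeFormat() does, and then reads one byte, with or without the rewind suggested as the second workaround.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: create `path` holding exactly `n` bytes of 'A'.
bool write_bytes(const char* path, std::size_t n)
{
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "wb");
    if (fp == nullptr)
        return false;
    bool ok = true;
    for (std::size_t i = 0; i < n && ok; ++i)
        ok = (std::fputc('A', fp) != EOF);
    return (std::fclose(fp) == 0) && ok;
}

// Hypothetical helper: probe for a 2-byte BOM the way the report describes,
// optionally rewind when the probe comes up short, then read one byte.
// Returns how many bytes the caller's own read obtained (0 or 1).
std::size_t read_after_probe(const char* path, bool rewind_on_short_probe)
{
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr)
        return 0;

    std::uint16_t probe = 0;
    // One 2-byte element is requested. A 1-byte file cannot supply it, so
    // fread() returns 0 even though the single byte has been consumed.
    std::size_t items = std::fread(&probe, sizeof probe, 1, fp);

    if (items == 0 && rewind_on_short_probe)
        std::fseek(fp, 0, SEEK_SET);  // also clears the end-of-file indicator

    char byte = 0;
    std::size_t got = std::fread(&byte, 1, 1, fp);
    std::fclose(fp);
    return got;
}
```

Calling `read_after_probe` on a one-byte file with `rewind_on_short_probe` set to false mirrors the reported failure; setting it to true corresponds to the "always seek if numRead is 0" option.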
Please add a one-button test to ace_wrappers/tests as a reproducer.
Created an attachment (id=1040): Bug_3532_Regression_Test
Fri Jan 30 12:55:52 UTC 2009  Johnny Willemsen  <jwillemsen@remedy.nl>

        * tests/run_test.lst:
        * tests/tests.mpc:
        * tests/Bug_3532_Regression.cpp:
          Added a new test for bugzilla 3532. This bug is not fixed,
          just integrating the regression test. Thanks to Martin Gaus
          <Gaus@gmx.de> for creating this test.
Tue Mar 24 11:38:22 UTC 2009  Johnny Willemsen  <jwillemsen@remedy.nl>

        * ace/OS_NS_stdio.cpp (checkUnicodeFormat):
          Fixed a bug where the file pointer wasn't set back to the
          start of the file when the file is just 1 byte in size. This
          fixes bugzilla 3532, thanks to Martin Gaus
          <gaus at gmx dot de> for reporting this.
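For readers without the ACE sources at hand, the behaviour this entry describes can be sketched roughly as follows. This is not the committed patch, which is not shown in this report. It is a plain C stdio stand-in, and `skip_bom_or_rewind`, `offset_after_check` and the file name used with them are hypothetical. The idea is to rewind whenever the two-byte probe does not yield a BOM, including when it comes up short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the BOM check: leaves the stream just past a
// UTF-16 BOM if one is present, otherwise back at offset 0.
void skip_bom_or_rewind(std::FILE* fp)
{
    if (fp == nullptr)
        return;

    unsigned char first[2] = {0, 0};
    std::size_t n = std::fread(first, 1, sizeof first, fp);

    bool is_bom = (n == 2) &&
                  ((first[0] == 0xFF && first[1] == 0xFE) ||  // UTF-16 LE
                   (first[0] == 0xFE && first[1] == 0xFF));   // UTF-16 BE

    if (!is_bom)
        std::fseek(fp, 0, SEEK_SET);  // covers short reads (0 or 1 byte) too
}

// Hypothetical helper for trying it out: write `len` bytes to `path`, reopen
// it, run the check, and report the resulting stream offset (-1 on error).
long offset_after_check(const char* path, const unsigned char* data,
                        std::size_t len)
{
    assert(path != nullptr);
    std::FILE* out = std::fopen(path, "wb");
    if (out == nullptr)
        return -1;
    std::size_t written = (len > 0) ? std::fwrite(data, 1, len, out) : 0;
    if (std::fclose(out) != 0 || written != len)
        return -1;

    std::FILE* in = std::fopen(path, "rb");
    if (in == nullptr)
        return -1;
    skip_bom_or_rewind(in);
    long pos = std::ftell(in);
    std::fclose(in);
    std::remove(path);
    return pos;
}
```

Reading the two bytes individually, rather than as one 16-bit value as the original does, keeps the comparison independent of host byte order. That is a choice made for this sketch only.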
added dep