“Excel found unreadable content” on opening Excel 2007 file

Posted: Mon Mar 15, 2010 4:45 pm
by mik
I’m getting the following message when I try to open a file created by XLSReadWriteII4 in Excel 2007:

“Excel found unreadable content in ‘t.xlsx’. Do you want to recover the content of this workbook?”

After I agree, the file is opened as “repaired” and the following notification pops up:

“Removed Records: Document Theme from /xl/workbook.xml part (Workbook)”

Once the file has been saved from Excel, it opens normally.
Please find the original and the repaired content of workbook.xml below.
Does anyone have the same issue?

Thanks!

Original workbook.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<workbook xmlns="http://schemas.openxmlformats.org/sprea ... /2006/main" xmlns:r="http://schemas.openxmlformats.org/offic ... ationships">
  <fileVersion appName="XLSReadWriteII" lastEdited="4" lowestEdited="4" rupBuild="4.00.32" />
  <sheets>
    <sheet name="Sheet 1" sheetId="1" r:id="rId1" />
    <sheet name="1000 columns" sheetId="2" r:id="rId2" />
  </sheets>
  <calcPr calcId="0" />
</workbook>

Repaired workbook.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<workbook xmlns="http://schemas.openxmlformats.org/sprea ... /2006/main" xmlns:r="http://schemas.openxmlformats.org/offic ... ationships">
  <fileVersion appName="xl" lastEdited="4" lowestEdited="4" rupBuild="4506" />
  <workbookPr defaultThemeVersion="124226" />
  <bookViews>
    <workbookView xWindow="240" yWindow="570" windowWidth="18855" windowHeight="11700" activeTab="1" />
  </bookViews>
  <sheets>
    <sheet name="Sheet 1" sheetId="1" r:id="rId1" />
    <sheet name="1000 columns" sheetId="2" r:id="rId2" />
  </sheets>
  <calcPr calcId="125725" />
  <fileRecoveryPr repairLoad="1" />
</workbook>

Re: “Excel found unreadable content” on opening Excel 2007 file

Posted: Mon Mar 15, 2010 4:48 pm
by larsa
Hello

Please send me a code sample that shows how to create a file with this error.

Re: “Excel found unreadable content” on opening Excel 2007 file

Posted: Mon Mar 15, 2010 5:07 pm
by mik
Hi Lars,

I'll try to create a small demo project and e-mail it to you.

Thanks for being responsive to my requests!

Sample code

Posted: Mon Mar 15, 2010 6:09 pm
by mik
Hi,

Here is the source of the demo project.
Notes:

1. If the lines between "BEGIN Portion can be removed" and "END Portion can be removed" are removed from the source code, the issue persists. In that case the resulting document is empty.

2. If "xls.Filename := 'c:\temp\t.xls';" is uncommented and "xls.Version := xvExcel2007;" is commented out, the resulting document opens fine.

Please let me know if you can replicate the issue.

Thanks!


program Project1;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  XLSReadWriteII4,
  SheetData4,
  BIFFRecsII4;

var
  xls: TXLSReadWriteII4;
begin
  xls := TXLSReadWriteII4.Create(nil);
  try
    xls.Filename := 'c:\temp\t.xlsx';
    // xls.Filename := 'c:\temp\t.xls';
    xls.Version := xvExcel2007;
    //BEGIN Portion can be removed
    xls.Clear;
    xls.Sheets[0].Name := 'Sheet 1';
    xls.Formats.Add.NumberFormat := '0.00';
    xls.Sheets[0].IntWriteNumber(0,0,xls.Formats.Count-1,3.33);
    //END Portion can be removed
    xls.Write;
  finally
    xls.Free;
  end;
end.

Re: “Excel found unreadable content” on opening Excel 2007 file

Posted: Tue Mar 16, 2010 2:22 am
by mik
It seems to be related to the East Asian font names.
The file opens normally when I comment out the following group of lines (in TXLSWrite2007XML.WriteTheme):

'<a:font script="Jpan" typeface="MS Pゴシック" />' +
'<a:font script="Hang" typeface="맑은 고딕" />' +
'<a:font script="Hans" typeface="宋体" />' +
'<a:font script="Hant" typeface="新細明體" />' +

Please note that this group occurs twice.

“Excel found unreadable content”: I figured it out

Posted: Tue Mar 16, 2010 2:44 pm
by mik
The effect comes from running the project on Windows 7, which handles Asian fonts better than XP.
When I run the project on XP, Excel has no problem opening the resulting document, simply because all the ideographic characters are replaced with question marks.
However, when I run it on Windows 7, the Japanese and Chinese (but not Korean) characters get into the XML as they are, which makes it unreadable.
I compared the output of XLSReadWriteII4 and Excel 2007 byte by byte. It seems that XLSReadWriteII4 saves the Asian characters in the Japanese multibyte code page, whereas a file saved from Excel uses Unicode.

\xl\theme\theme1.xml

Windows XP:
<a:font script="Jpan" typeface="MS P????" /><a:font script="Hang" typeface="?? ??" /><a:font script="Hans" typeface="??" /><a:font script="Hant" typeface="????" />

Windows 7:
<a:font script="Jpan" typeface="MS Pゴシック" /><a:font script="Hang" typeface="?? ??" /><a:font script="Hans" typeface="宋体" /><a:font script="Hant" typeface="新細明體" />
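
For anyone who wants to repeat the byte-level comparison, here is a minimal sketch that prints the UTF-8 bytes of one of the font names, so they can be checked against a hex dump of theme1.xml. It only uses the RTL's UTF8Encode; the helper name Utf8Hex is made up for this illustration and is not part of XLSReadWriteII.

```pascal
program Utf8HexDump;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the UTF-8 encoding of W as space-separated hex bytes.
function Utf8Hex(const W: WideString): string;
var
  Utf8: UTF8String;
  I: Integer;
begin
  Utf8 := UTF8Encode(W);
  Result := '';
  for I := 1 to Length(Utf8) do
    Result := Result + IntToHex(Ord(Utf8[I]), 2) + ' ';
  Result := Trim(Result);
end;

var
  W: WideString;
begin
  // Build the "Hans" typeface name from code points so the result
  // does not depend on the encoding of this source file.
  SetLength(W, 2);
  W[1] := WideChar($5B8B); // U+5B8B
  W[2] := WideChar($4F53); // U+4F53
  WriteLn(Utf8Hex(W));     // prints: E5 AE 8B E4 BD 93
end.
```

If the bytes inside the typeface attribute of the generated file differ from these, the name was not written as UTF-8.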

Re: “Excel found unreadable content” on opening Excel 2007 file

Posted: Tue Mar 16, 2010 4:56 pm
by mik
I fixed it in my copy of XLSXML4.pas by changing TXLSWriteXML.AddStr so that the wide string argument is converted to UTF-8 before it is written to the stream.
After that, the theme1.xml produced by TXLSReadWriteII4 became identical to the one produced by Excel.

Here's the new code (note: "Windows" must be added to the "uses" clause).

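procedure TXLSWriteXML.AddStr(Str: WideString);
var
  // Assumes a pre-Unicode Delphi, where Char is one byte and string is AnsiString.
  buffer: array of char;
  s: string;
begin
  // UTF-8 needs up to 3 bytes per UTF-16 code unit, plus one byte for the terminator.
  SetLength(buffer, Length(Str)*3+1);
  FillChar(buffer[0], Length(buffer), 0);
  // Convert the UTF-16 text to UTF-8 instead of the system ANSI code page.
  WideCharToMultiByte(CP_UTF8, 0, PWideChar(Str), Length(Str),
    PChar(@buffer[0]), Length(buffer), nil, nil);
  // The buffer was zero-filled, so it is null-terminated.
  s := string(PChar(@buffer[0]));
  WriteString(s);
end;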
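
The same conversion can probably be done without a manually sized buffer by using the RTL's UTF8Encode. The sketch below is only an illustration under assumptions: AddStrUtf8 is a hypothetical stand-in that writes to a plain TStream, because the actual signature of the library's WriteString is not shown in this thread.

```pascal
program Utf8WriteSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical stand-in for TXLSWriteXML.AddStr: encodes Str as UTF-8
// and writes the raw bytes to Stream.
procedure AddStrUtf8(Stream: TStream; const Str: WideString);
var
  Utf8: UTF8String;
begin
  Utf8 := UTF8Encode(Str);
  if Length(Utf8) > 0 then
    Stream.WriteBuffer(Utf8[1], Length(Utf8));
end;

var
  Mem: TMemoryStream;
  W: WideString;
begin
  // Two CJK characters, built from code points.
  SetLength(W, 2);
  W[1] := WideChar($5B8B);
  W[2] := WideChar($4F53);
  Mem := TMemoryStream.Create;
  try
    AddStrUtf8(Mem, W);
    WriteLn(Mem.Size); // prints 6: three UTF-8 bytes per character
  finally
    Mem.Free;
  end;
end.
```

Letting UTF8Encode size the result avoids having to guess the worst-case buffer length.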

Japanese locale

Posted: Wed Mar 17, 2010 2:50 pm
by mik
Just want to add that my system has Japanese as the default locale. With this setting I cannot export Japanese text to an Excel file, because the function XLSUTF8EncodeWS works incorrectly when it assigns the result of the conversion back to a WideString:

Result := WideString(S);

My guess is that Delphi treats S as a multibyte Japanese string, when it is actually a UTF-8 string.
As a result, the exported text is garbled.

I was able to fix this by introducing a new function that returns XLS8String instead of WideString.
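
To illustrate the idea of keeping the UTF-8 bytes in a byte-string type rather than casting them back to WideString, here is a minimal sketch using only the RTL's UTF8Encode and UTF8Decode. It does not reproduce XLSUTF8EncodeWS or XLS8String, whose definitions are not shown in this thread; it assumes that an explicit UTF8Decode is the right way back to a wide string when one is needed.

```pascal
program Utf8RoundTrip;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  Original, Decoded: WideString;
  Encoded: UTF8String;
begin
  // Two katakana characters, built from code points.
  SetLength(Original, 2);
  Original[1] := WideChar($30B4); // U+30B4
  Original[2] := WideChar($30B7); // U+30B7

  // Keep the encoded bytes in a UTF8String; do not cast them to WideString.
  Encoded := UTF8Encode(Original);

  // Decode explicitly as UTF-8, independent of the system locale.
  Decoded := UTF8Decode(Encoded);

  WriteLn(Decoded = Original); // prints TRUE
end.
```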