[owncloud-devel] Hi! (and Improving Cross-Platform International/Unicode Support)

Hoffmann, Patrick HOP at bito.de
Fri Nov 14 06:23:19 GMT 2014


Hello Lee,

Did you test with IIS and PHP?
We use Apache on Windows and we don't have any of these problems.

Yours,

Patrick



BITO-Lagertechnik Bittmann GmbH

Obertor 29
55590 Meisenheim
Germany

Tel: +49 (0) 6753 122 0
Fax: +49 (0) 6753 122 399

E-Mail: info at bito.de
http://www.bito.de

Managing Directors: Werner Magin, Winfried Schmuck | HRB 2704 Bad Kreuznach | Shareholder: Fritz Bittmann Holding GmbH | VAT ID: DE 811 202 181


Sent on 14.11.2014 07:23 by Patrick Hoffmann

-----Original Message-----
From: devel-bounces at owncloud.org [mailto:devel-bounces at owncloud.org] On Behalf Of Lee Thompson
Sent: Friday, 14 November 2014 01:57
To: devel at owncloud.org
Subject: [owncloud-devel] Hi! (and Improving Cross-Platform International/Unicode Support)

Hi Everyone,

Like many of you, I'm sure, I've been programming for a long time on various projects.

My typical platform is Windows, although I work on Linux machines as well (usually Debian flavors).

At my work, we've recently deployed ownCloud (7.0.2, community edition) and it's working well, although Unicode characters in filenames are causing failures when the rows are inserted into the database.

The root issue appears to be the way PHP communicates with Windows. I don't know whether it affects other operating systems as well.

(I have reported this as a bug at
https://github.com/owncloud/core/issues/12112 but I'm perfectly happy to
contribute a fix to the project myself.  Some of that information is
repeated here.)


Basically, when PHP's functions read a name from the Windows file
system and it contains a non-ASCII character, the name appears to PHP's
mbstring to be encoded as UTF-8, but it is actually encoded (on en-US
anyway) as Windows-1252.

Now with this, we can get the correct codepage...

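$target_encoding  = 'UTF-8';
$default_codepage = 'UTF-8';
$codepage         = $default_codepage;

if ( 'WIN' === strtoupper( substr( PHP_OS, 0, 3 ) ) ) {
        // On Windows, setlocale() returns e.g. "English_United States.1252";
        // the digits after the dot are the ANSI codepage number.
        $locale = setlocale( LC_CTYPE, '' );
        if ( is_string( $locale ) && preg_match( '/\.(\d+)$/', $locale, $m ) ) {
                $codepage = 'Windows-' . $m[1];
        }
}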

... and then convert it

$encoded_filename = mb_convert_encoding( $filename, $target_encoding, $codepage );
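
To make the conversion step a bit more concrete, here is a minimal sketch
of it wrapped in a function. The name filename_to_utf8() is made up for
this mail and is not part of ownCloud. It also assumes that a name which
already validates as UTF-8 should be left alone. That is only a heuristic,
since some Windows-1252 byte sequences happen to be valid UTF-8 as well.

```php
<?php
// Hypothetical helper (not an ownCloud API): convert a filename read from
// the filesystem into UTF-8, given the codepage the filesystem uses.
function filename_to_utf8($filename, $codepage)
{
    // Heuristic: a name that is already valid UTF-8 is returned untouched
    // so that it is not encoded a second time.
    if (mb_check_encoding($filename, 'UTF-8')) {
        return $filename;
    }
    return mb_convert_encoding($filename, 'UTF-8', $codepage);
}

// "caf\xE9.txt" is "café.txt" as raw Windows-1252 bytes.
$utf8 = filename_to_utf8("caf\xE9.txt", 'Windows-1252');
```

Here $utf8 ends up holding the two-byte UTF-8 form of "é" (0xC3 0xA9), and
passing $utf8 through the function a second time leaves it unchanged.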



So my thought is to add a default codepage setting to config.php,
initially filled in by the installer as UTF-8 or, on Windows, with the
result of the routine above.
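
As a rough illustration of that config.php idea, the fragment below shows
what such an entry and its lookup might look like. The key name
'filesystem_codepage' is purely an assumption for this sketch; no such
option exists in ownCloud today.

```php
<?php
// config/config.php (fragment). 'filesystem_codepage' is a hypothetical
// key used only to illustrate the proposal, not an existing ownCloud option.
$CONFIG = array(
    'filesystem_codepage' => 'Windows-1252',
);

// Hypothetical lookup: fall back to UTF-8 when the key is absent.
$codepage = isset($CONFIG['filesystem_codepage'])
    ? $CONFIG['filesystem_codepage']
    : 'UTF-8';
```

Falling back to UTF-8 would keep existing installations behaving as they
do now until an admin sets the value.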

There should also be codepage settings for each of the external storages
(defaulting to the 'system one' for local/smb) in case other file
systems are in play. This would allow the admin to account for special
or mixed environments.

The sync clients should probably communicate their local codepage as
well just to ensure that it all translates properly (if needed).  (I'll
confess I haven't done any programming of WebDAV and don't know if any
codepage translation occurs.)


Other notes and potential gotchas:

1. Folder names should probably be converted from the codepage too.

2. Unfortunately we will likely need to convert back to the codepage to
open the file within PHP, e.g.
$decoded_filename = mb_convert_encoding( $encoded_filename, $codepage, $target_encoding );

3. It's also worth noting that in MySQL the utf8 charset (with utf8_bin)
stores at most three bytes per character, so it is not sufficient for
full UTF-8 support.   From MySQL 5.5.3 onward the charset needs to be
utf8mb4 with the utf8mb4_bin or utf8mb4_unicode_ci collation.   (ref:
http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html)
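
Notes 2 and 3 can be sketched roughly as follows. Both function names are
invented for this mail and are not ownCloud APIs. The first converts a
UTF-8 name back to the filesystem codepage before the file is opened. The
second checks whether a name contains a code point above U+FFFF, which is
the case a three-byte utf8 column cannot store.

```php
<?php
// Hypothetical helper for note 2: convert a UTF-8 name back to the
// filesystem codepage so PHP can open the file under its on-disk name.
function filename_from_utf8($utf8_name, $codepage)
{
    return mb_convert_encoding($utf8_name, $codepage, 'UTF-8');
}

// Hypothetical helper for note 3: true if the name contains a code point
// above U+FFFF. Such characters take four bytes in UTF-8 and therefore
// need utf8mb4 rather than MySQL's three-byte utf8 charset.
function needs_utf8mb4($utf8_name)
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $utf8_name) === 1;
}

// Round trip of "café.txt" from UTF-8 back to Windows-1252 bytes.
$native = filename_from_utf8("caf\xC3\xA9.txt", 'Windows-1252');

// U+1F600 (a four-byte UTF-8 sequence) followed by ".txt".
$emoji = "\xF0\x9F\x98\x80.txt";
```

Characters that have no equivalent in the target codepage cannot survive
the trip back, so a real implementation would have to decide how to handle
those names.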


--
Lee Thompson
thompsonl at logh.net
