gPodder Bug Tracker – Bug 1097
[PATCH] Cannot sync with non-ascii characters in title
Last modified: 2010-08-23 22:05:08 BST
gpodder 2.7 hangs when trying to sync (^S) to a directory that contains a non-ascii character. The directory name is from the podcast's title.

gpodder 2.4 does not have this problem. I did not test 2.5 or 2.6.

Example feed: http://omegataupodcast.net/category/podcast-en/feed/

The relevant entry from an exported .opml:

<outline text="Wissenschaft und Technik im Kopfhoerer / Science and Technology in your Headphones" title="omega tau » podcast (en)" type="rss" xmlUrl="http://omegataupodcast.net/category/podcast-en/feed/"/>

Note that title contains a '»' (U+00BB / RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK / angle quotation mark (right) / » / » / raquo) character.

Error message:

Exception in thread Thread-8:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File ".../gpodder-2.7/src/gpodder/gtkui/desktop/sync.py", line 186, in sync_thread_func
    device.add_tracks(episodes)
  File ".../gpodder-2.7/src/gpodder/sync.py", line 200, in add_tracks
    added = self.add_track(track)
  File "/home/chkno/local/src/gpodder-2.7/src/gpodder/sync.py", line 597, in add_track
    log('Copying %s => %s', os.path.basename(from_file), to_file.decode(util.encoding), sender=self)
  File ".../gpodder-2.7/src/gpodder/liblogger.py", line 49, in log
    print (('[%8.3f] ' % (time.time()-first_time)) + message) % args
UnicodeEncodeError: 'ascii' codec can't encode characters in position 115-116: ordinal not in range(128)

On startup, gpodder says:

[ 2.294] Using ISO-8859-15 as encoding. If this
[ 2.294] is incorrect, please set your $LANG variable.
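A note on the traceback: the failing call decodes to_file with util.encoding (ISO-8859-15 here) and the result is then printed through an ASCII codec. One plausible reading, assuming the directory name was stored on disk as UTF-8, is that the two bytes of '»' were decoded as two separate ISO-8859-15 characters, which would match the two-character span (115-116) in the error. The sketch below walks through that sequence in Python 3 syntax (the original code is Python 2); the variable names are illustrative and not taken from gPodder.

```python
# Sketch only: reproduces the suspected decode/encode mismatch.
# Assumption: the on-disk name is UTF-8 while the application assumes ISO-8859-15.

title = "omega tau \u00bb podcast (en)"  # title from the example feed

# What a UTF-8 filesystem would hold for this title.
utf8_bytes = title.encode("utf-8")

# Decoding those bytes with the wrong codec turns one character into two.
mis_decoded = utf8_bytes.decode("iso-8859-15")
non_ascii = [ch for ch in mis_decoded if ord(ch) > 127]

# Writing the text through an ASCII codec then fails on those characters,
# which is the kind of error reported by the print statement in liblogger.py.
try:
    mis_decoded.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(repr(mis_decoded))
print(non_ascii)
print(error)
```

If this reading is right, the hang is a side effect: the exception ends the sync thread while the UI keeps waiting for it to finish.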
Queueing for review and possible fix in the next release.
Created attachment 540 [details]
Fix non-utf8 locale support

This is a hackish replacement for the "isinstance(s, unicode)" expression, which does not work properly. It fixes file and folder names when a non-utf8 locale is used.
(In reply to comment #0)
> gpodder 2.7 hangs when trying to sync (^S) to a directory that contains a
> non-ascii character. The directory name is from the podcast's title.
>
> gpodder 2.4 does not have this problem. I did not test 2.5 or 2.6.
>
> Example feed: http://omegataupodcast.net/category/podcast-en/feed/
>
> The relevant entry from an exported .opml:
>
> <outline text="Wissenschaft und Technik im Kopfhoerer / Science and Technology
> in your Headphones" title="omega tau » podcast (en)" type="rss"
> xmlUrl="http://omegataupodcast.net/category/podcast-en/feed/"/>
>
> Note that title contains a '»' (U+00BB / RIGHT-POINTING DOUBLE ANGLE QUOTATION
> MARK / angle quotation mark (right) / » / » / raquo) character.
>
> Error message:
>
> Exception in thread Thread-8:
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
>     self.run()
>   File "/usr/lib/python2.6/threading.py", line 477, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File ".../gpodder-2.7/src/gpodder/gtkui/desktop/sync.py", line 186, in
> sync_thread_func
>     device.add_tracks(episodes)
>   File ".../gpodder-2.7/src/gpodder/sync.py", line 200, in add_tracks
>     added = self.add_track(track)
>   File "/home/chkno/local/src/gpodder-2.7/src/gpodder/sync.py", line 597, in
> add_track
>     log('Copying %s => %s', os.path.basename(from_file),
> to_file.decode(util.encoding), sender=self)
>   File ".../gpodder-2.7/src/gpodder/liblogger.py", line 49, in log
>     print (('[%8.3f] ' % (time.time()-first_time)) + message) % args
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 115-116:
> ordinal not in range(128)
>
>
> On startup, gpodder says:
> [ 2.294] Using ISO-8859-15 as encoding. If this
> [ 2.294] is incorrect, please set your $LANG variable.

I have a similar (locale-related) problem with 2.5 and 2.7 under Windows. My system, like other Russian Windows installations, uses the ru_RU.CP1251 locale.

If I start gPodder 'as is' it uses utf8 for filenames and folders (both downloaded episodes and synced filenames). Thus I have to start gPodder with "set LANG=ru_RU.cp1251" in a batch file.

Some investigation showed me that the "isinstance(filename, unicode)" expression does not work as expected under Windows. I'm not familiar with Python/unicode, so I can't find a proper solution; the attached patch works for me, even if it is really hackish. Perhaps it will help you too, and help the devs find and fix the bug.
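For context on the isinstance observation: in Python 2, "isinstance(s, unicode)" is false for plain byte strings, so a filename that arrives as bytes would skip any unicode-only branch. Whether that is what happens inside gPodder is not established in this report. The sketch below shows one common way to make such code indifferent to the input type, written in Python 3 syntax (bytes/str instead of str/unicode). The helper names to_text and to_filesystem_bytes are hypothetical and are not gPodder functions.

```python
# Sketch only: normalise a name to text first, then encode once for the filesystem.
# to_text and to_filesystem_bytes are hypothetical helpers, not part of gPodder.

def to_text(name, encoding):
    """Return name as text, decoding byte strings with the given encoding."""
    if isinstance(name, bytes):
        # Undecodable bytes are replaced rather than raising.
        return name.decode(encoding, "replace")
    return name


def to_filesystem_bytes(name, encoding):
    """Encode a text or byte-string name in the target encoding."""
    # Characters the target encoding cannot represent become '?'.
    return to_text(name, encoding).encode(encoding, "replace")


# Both a text title and its cp1251-encoded bytes end up identical.
as_text = "\u041f\u043e\u0434\u043a\u0430\u0441\u0442"  # Russian word for "podcast"
as_bytes = as_text.encode("cp1251")
print(to_filesystem_bytes(as_text, "cp1251") == to_filesystem_bytes(as_bytes, "cp1251"))
```

The point of the design is that the type check happens in exactly one place, so callers no longer need to know whether a name came from the feed (text) or from the filesystem (bytes).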
(In reply to comment #3)
> If I start gPodder 'as is' it uses utf8 for filenames and folders (both
> downloaded episodes and synced filenames). Thus I have to start gPodder with
> "set LANG=ru_RU.cp1251" in a batch file.

Do you think we can add code to gPodder that can detect the "LANG" variable based on some Windows registry setting or something?
(In reply to comment #4)
> (In reply to comment #3)
> > If I start gPodder 'as is' it uses utf8 for filenames and folders (both
> > downloaded episodes and synced filenames). Thus I have to start gPodder with
> > "set LANG=ru_RU.cp1251" in a batch file.
>
> Do you think we can add code to gPodder that can detect the "LANG" variable
> based on some Windows registry setting or something?

A quick Google search didn't show me any reliable methods. On my machine I found a tiny solution (patch below) that avoids setting the LANG variable. Unfortunately I tested it only with "filesystem sync" under win64 and can't say what will happen under other OSes/locales.

I also want to point out again that setting LANG was only part of the solution. There is something wrong with unicode strings (and therefore the "isunicode" routine) inside the gPodder code. As far as I can understand, Python does not treat some of the strings passed to sanitize_filename as unicode.

Here is the tiny patch:

diff --git a/lib/site-packages/gpodder/util.py b/lib/site-packages/gpodder/util.py
index 320afd8..5232f4a 100644
--- a/lib/site-packages/gpodder/util.py
+++ b/lib/site-packages/gpodder/util.py
@@ -68,7 +68,8 @@ N_ = gpodder.ngettext
 if gpodder.ui.maemo:
     encoding = 'utf8'
 else:
-    encoding = 'iso-8859-15'
+    import sys
+    encoding = sys.getfilesystemencoding()
     if 'LANG' in os.environ and '.' in os.environ['LANG']:
         lang = os.environ['LANG']
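The approach in the patch above (prefer an encoding named in LANG, otherwise ask Python for the filesystem encoding instead of hard-coding ISO-8859-15) can be sketched as a standalone function. This is an illustration in Python 3 syntax, not the code from the patch or from the eventual fix; detect_encoding and its parameters are hypothetical names, and stripping an "@modifier" suffix is an added assumption.

```python
# Sketch only: choose a filename encoding from LANG, falling back to Python's
# own filesystem encoding. detect_encoding is a hypothetical helper.
import codecs
import os
import sys


def detect_encoding(environ=None, default=None):
    """Return a codec name from LANG, or a fallback if LANG names none."""
    if environ is None:
        environ = os.environ
    lang = environ.get("LANG", "")
    if "." in lang:
        # "ru_RU.cp1251" -> "cp1251"; drop a trailing modifier such as "@euro".
        candidate = lang.rsplit(".", 1)[1].split("@", 1)[0]
        try:
            # Only accept names that Python actually has a codec for.
            return codecs.lookup(candidate).name
        except LookupError:
            pass
    # No usable LANG: use the caller's default, else the filesystem encoding.
    return default or sys.getfilesystemencoding()


print(detect_encoding({"LANG": "ru_RU.cp1251"}))
print(detect_encoding({}))
```

Validating the LANG value with codecs.lookup means a misspelled or unsupported encoding falls through to the fallback rather than failing later, at the first encode or decode call.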
Fixed: http://gpodder.org/commit/060ba862

Please test and re-open if this bug still exists in the next release. Thanks!