gPodder Bug Tracker – Bug 57
[PATCH] Podcast Files and Folder Naming
Last modified: 2009-12-16 18:03:10 GMT
I would like to see the files and folders in the podcast directory named for the show name and title of the podcast rather than a series of numbers. Is there a technical reason for the way the naming is currently implemented? Let me know if I'm missing something.
Thanks for your bug report. I just added this to the FAQ: What about these odd directory and file names that look like MD5 sums? They not only look like md5sums, they are. In short, we use them to avoid problems with awful RSS feeds (no title, description in title, etc.). You can read a long description of the reasons, and why we haven't found a real solution. A solution (for filesystems that support hard links) has been proposed in this mail, and the result is available in doc/dev/gdfs/ in the source tree of gPodder. With this, you can "mirror" the gPodder download directory to a folder with human-readable names. The files will be hard linked, so no additional space will be used (except for the directory structure, but not the file data). Related links: https://lists.berlios.de/pipermail/gpodder-devel/2007-October/001104.html and https://lists.berlios.de/pipermail/gpodder-devel/2007-November/001140.html If you can come up with a better solution, please, by all means do send in a patch. If you have your download directory stored on an ext2/ext3/reiserfs partition, you can use gdfs to get the result you want.
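To illustrate the two ideas in this comment, here is a minimal Python 3 sketch. It is not the gdfs code from doc/dev/gdfs/; the function names, and the assumption that the hash is taken over the URL, are hypothetical and only serve the example.

```python
import hashlib
import os


def hashed_name(url):
    """Return an MD5 hex digest usable as a file or folder name.

    Assumption: the digest is computed over the URL. The name never
    depends on feed titles, so broken feed metadata cannot affect it.
    """
    return hashlib.md5(url.encode('utf-8')).hexdigest()


def mirror_with_hard_links(source_dir, mirror_dir, readable_names):
    """Hard-link files from source_dir into mirror_dir under readable names.

    readable_names maps a hashed file name to a human-readable one.
    Both directories must be on the same filesystem, and that filesystem
    must support hard links; the file data is not copied.
    """
    os.makedirs(mirror_dir, exist_ok=True)
    for hashed, readable in readable_names.items():
        source = os.path.join(source_dir, hashed)
        target = os.path.join(mirror_dir, readable)
        if os.path.isfile(source) and not os.path.exists(target):
            os.link(source, target)
```

Because the mirror only holds additional directory entries for the same data, deleting a file in the mirror does not remove the download itself.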
This will be addressed in one of the upcoming releases. We discussed this on the mailing list: https://lists.berlios.de/pipermail/gpodder-devel/2008-April/001772.html https://lists.berlios.de/pipermail/gpodder-devel/2008-April/001773.html
Created attachment 52 [details] file naming patch (experimental) This patch kind of works, but it's experimental. Please try it and see if this works well enough. I'm not yet convinced that we should add this to the release. Maybe as a hidden option?
Fixed in SVN trunk, revision 736. You need to enable "experimental_file_naming" for it to work. Closing now.
Thomas, I am working on tidying this up. Though I am wondering if we should postpone as the solution to bug #12 could make this task a lot easier!
Ohh, sorry for being too quick here, I forgot that you wanted to tweak this one a bit. Was just looking at the roadmap and saw this has not been merged yet. My fault ;) Yep, we probably should wait for SQLite support, which should probably be implemented soon, depending on Justin. Anyway, for 0.11.4, we at least have basic, experimental support now. We should implement the ideas discussed somewhere - was it on the mailing list or via private mail? Can we paste/link this discussion here?
(In reply to comment #6) > Ohh, sorry for being too quick here, I forgot that you wanted to tweak this one > a bit. Was just looking at the roadmap and saw this has not been merged yet. My > fault ;) No problems, I have re-opened. > Yep, we probably should wait for SQLite support, which should probably be > implemented soon, depending on Justin. > Anyway, for 0.11.4, we at least have basic, experimental support now. We should > implement the ideas discussed somewhere - was it on the mailing list or via > private mail? Can we paste/link this discussion here? https://lists.berlios.de/pipermail/gpodder-devel/2008-June/001872.html
I have tested this experimental feature. I encountered problems with the feed http://www.europe1.fr/podcast/actualite_divertissement.jsp (a very popular French podcast). Download works correctly, but every episode's file name is identified as "europe1pod_v.mp3". This means every new download overwrites the previous one, and once the first one is downloaded, all are marked as downloaded. If you look at the feed, you'll notice that the URLs are in the form
* http://stat3.cybermonitor.com/m/media/europe1pod_v.mp3?R=divertissement&S=podcast&media_url=http%3A%2F%2Fviphttp.yacast.net%2Flgdf%2Feurope1%2Fmedia%2Fson%2Fvideo%2F0000148%2F148511_BD.mp3
* http://stat3.cybermonitor.com/m/media/europe1pod_v.mp3?R=divertissement&S=podcast&media_url=http%3A%2F%2Fviphttp.yacast.net%2Flgdf%2Feurope1%2Fmedia%2Fson%2Fvideo%2F0000148%2F148051_BD.mp3
...
I can see 3 problems here:
* the base URL (http://stat3.cybermonitor.com/m/media/europe1pod_v.mp3), although it ends with .mp3, does not link to a media file, but to the JSP server
* the real URL of the media is one parameter (media_url) within the query string
* the real URL does not contain "http://" but "http%3A%2F%2F"; "/" is encoded as "%2F" and "&" as "&amp;"
Maybe the best solution would be to get the file name by querying the URL (I mean, sending a request to the server), instead of trying to parse it from the url field. After all, I think nothing prevents somebody from providing a URL like http://myserver?id=1234. But I don't know how difficult it is to implement.
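The collision described in this comment can be reproduced with a short Python 3 sketch. The helper names are hypothetical and this is not gPodder's actual parsing code; it only shows why taking the basename of the URL path gives the same name for every episode, and where the real media URL is hiding.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def naive_basename(url):
    """Return the last path component, ignoring the query string."""
    return posixpath.basename(urlsplit(url).path)


def embedded_media_url(url, parameter='media_url'):
    """Return the percent-decoded value of a query parameter, or None.

    parse_qs already decodes sequences such as %3A and %2F.
    """
    values = parse_qs(urlsplit(url).query).get(parameter)
    return values[0] if values else None


PREFIX = ('http://stat3.cybermonitor.com/m/media/europe1pod_v.mp3'
          '?R=divertissement&S=podcast&media_url=http%3A%2F%2Fviphttp.yacast.net'
          '%2Flgdf%2Feurope1%2Fmedia%2Fson%2Fvideo%2F0000148%2F')
first = PREFIX + '148511_BD.mp3'
second = PREFIX + '148051_BD.mp3'

# Both episodes collapse to the same name when only the path is used.
print(naive_basename(first), naive_basename(second))
# The embedded URLs differ, so their basenames do too.
print(naive_basename(embedded_media_url(first)),
      naive_basename(embedded_media_url(second)))
```

Relying on a parameter called media_url would only help for this one feed, which is why the comment suggests asking the server instead.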
I think now that we have the SQLite database in place, it's easy to add support for saving the local filename in the database and then find out the local filename by sending a HEAD request to the server and saving the result in the db. What do you think?
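A rough Python 3 sketch of the HEAD request idea follows. It is an illustration under assumptions, not what gPodder does: the function names are made up, and preferring a Content-Disposition header over the final URL is my own choice.

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlsplit
from urllib.request import Request, urlopen


def filename_from_response(final_url, content_disposition=None):
    """Pick a file name from a Content-Disposition header or the final URL."""
    if content_disposition:
        header = Message()
        header['Content-Disposition'] = content_disposition
        name = header.get_filename()
        if name:
            # Drop any directory part the server may have sent.
            return posixpath.basename(name)
    return posixpath.basename(unquote(urlsplit(final_url).path))


def resolve_filename(url, timeout=10):
    """Ask the server with a HEAD request and return the resulting name.

    urlopen follows redirects, so response.url is the URL after any
    "Location:" hops. Depending on the Python version, the follow-up
    request after a redirect may be sent as GET; the body is not read
    here either way.
    """
    request = Request(url, method='HEAD')
    with urlopen(request, timeout=timeout) as response:
        return filename_from_response(
            response.url, response.headers.get('Content-Disposition'))
```

The result would be looked up once per episode and then stored, so the server is not queried again on every start.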
(In reply to comment #9) > I think now that we have the SQLite database in place, it's easy to add support > for saving the local filename in the database and then find out the local > filename by sending a HEAD request to the server and saving the result in the > db. What do you think? > That's probably the right way. I must say I don't feel comfortable with the new SQLite part, so I can only trust you ;)
Created attachment 107 [details] patches to save folder/file names in db Here's my few cents. I added some code necessary to save folder/file names in the database, to generate folder names, to migrate the downloaded episodes to the new folders transparently. I also enabled the "experimental file naming" unconditionally, but didn't touch its logic. If file names still collide for some feeds, a fix would be trivial.
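As a sketch of what saving file names in the database can look like, assuming a hypothetical "episodes" table keyed by URL (the real gPodder schema and the code in the attached patch may differ), in Python 3:

```python
import sqlite3


def ensure_column(db, table, column, declaration):
    """Add a column to an existing table if an older database lacks it.

    Table and column names are interpolated directly, so they must come
    from the program itself, never from user input.
    """
    existing = [row[1] for row in db.execute('PRAGMA table_info(%s)' % table)]
    if column not in existing:
        db.execute('ALTER TABLE %s ADD COLUMN %s %s' % (table, column, declaration))


def save_filename(db, url, filename):
    """Remember the local file name chosen for an episode."""
    db.execute('UPDATE episodes SET filename = ? WHERE url = ?', (filename, url))


def load_filename(db, url):
    """Return the stored file name, or None if nothing is stored."""
    row = db.execute('SELECT filename FROM episodes WHERE url = ?', (url,)).fetchone()
    return row[0] if row else None
```

Once a name is stored, it stays stable even if the feed later changes its titles or URLs, which also makes migrating already downloaded episodes a one-time step.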
Justin: Do you think we can include this patch in 0.13.0?
No, I'm working on making this patch better. I'll update it in a few days.
Shane encountered the same problem with another feed: "I am having troubles with the 60 second science feed. http://www.sciam.com/podcast/sciam_podcast_r_d.xml All the items are listed as podcast.mp3, with a modifier for each item after a "?". gpodder seems to remove everything after this question mark, meaning that all episodes downloaded end up with the same name, meaning that episodes overwrite each other." (In reply to comment #8) > I have tested this experimental feature. I encountered problems with the feed > http://www.europe1.fr/podcast/actualite_divertissement.jsp (a very popular > french podcast).Download works correctly, but all episodes file name are > identified as "europe1pod_v.mp3". It means every new download will overwrite > the previous one, and once the first one is downloaded, all are marked as > downloaded.
A good example feed which has this file name clashing while having "good" names after retrieving the page and getting the "Location:" URL is this podcast, posted by Shane Donohoe on gpodder-devel: http://www.sciam.com/podcast/sciam_podcast_r_d.xml
Oh, sorry for the noise - didn't read the previous post before posting my message. My fault..
*** Bug 166 has been marked as a duplicate of this bug. ***
Oh, I know what my problem was now! I couldn't find the advanced button because I was using a much older version of gPodder that I was downloading through the Ubuntu repositories. The repositories need updating seriously! I almost left the gPodder client in search of something else. Glad I figured it out.
I'd be happy if gpodder would suggest a directory name but allowed me to override it if I so desired. This would cover the odd ball cases (make no suggestion if something looks outa whack or the directory already exists) and offers enhanced functionality. I'd be just as happy if I had to enter two fields to add a new feed, url and directory name.
(In reply to comment #20) > I'd be happy if gpodder would suggest a directory name but allowed me to > override it if I so desired. This would cover the odd ball cases (make no > suggestion if something looks outa whack or the directory already exists) and > offers enhanced functionality. > > I'd be just as happy if I had to enter two fields to add a new feed, url and > directory name. Good idea. But there are some open questions: * What if the user imports an OPML file with multiple podcasts at once? How would the interface ask for the names then? * How do we handle the case of unsubscribing and re-subscribing to a podcast when the downloaded episodes are not deleted, but we want to keep them? Thanks for your input :)
I feel pretty stupid here. I enabled the "experimental_file_naming" option in the Advanced options, but every time I try it keeps using md5 sums. I've deleted the old entries, closed and re-opened the software multiple times, did everything I know to try, but it just keeps doing it. I would like to utilize one of the patches attached above, but I don't know which is the better choice, and once chosen, I have no idea how I am supposed to apply the patch. Linux has a bit of a learning curve, and I haven't covered the bit about "patch application" yet... Thanks! Using Ubuntu Studio 8.10 and gPodder 0.13.1
(In reply to comment #22) > I feel pretty stupid here. I enabled the "experimental_file_naming" option in > the Advanced options, but every time I try it keeps using md5 sums. I've > deleted the old entries, closed and re-opened the software multiple times, did > everything I know to try, but it just keeps doing it. I would like to utilize > one of the patches attached above, but I don't know which is the better choice, > and once chosen, I have no idea how I am supposed to apply the patch. Linux has > a bit of a learning curve, and I haven't covered the bit about "patch > application" yet... Thanks! No problem. We'll gladly guide you through getting this to work. As the name of the feature says, it's "experimental". We are working on a better, more solid version for one of the upcoming releases. Experimental file naming also only operates on downloaded files, not folder names. Please feel free to drop by on #gpodder on IRC (irc.freenode.net) and we'll try to help you there!
Created attachment 202 [details] proper folder naming patch, based on justin's ideas I've improved upon your last patch and used your new database code that takes care of adding fields to old databases. What I want to have with the proper folder name support is that folders are named the same as the podcasts in the list, even if the user decides to rename the podcast in the podcast folder. To fix this, we have to move around folders and files at some points (automatically migrate from the current folder structure) and make sure previously deleted folders are correctly re-added when "don't delete my downloaded files" is checked. I plan to do the same for files, although it's probably a bit more tricky for files ;)
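The folder handling described here could look roughly like the following Python 3 sketch. The function names and the "(2)" suffix scheme are assumptions made for illustration and are not taken from the attached patch.

```python
import os
import re


def folder_name_for(title, taken):
    """Derive a folder name from a podcast title, avoiding names in use."""
    # Replace characters that are unsafe in file names with underscores.
    base = re.sub(r'[^\w\- .]+', '_', title).strip(' ._') or 'podcast'
    candidate = base
    counter = 1
    while candidate in taken:
        counter += 1
        candidate = '%s (%d)' % (base, counter)
    return candidate


def rename_podcast_folder(download_root, old_folder, new_title):
    """Move a podcast's folder so that it matches its new title."""
    taken = set(os.listdir(download_root)) - {old_folder}
    new_folder = folder_name_for(new_title, taken)
    if new_folder != old_folder:
        os.rename(os.path.join(download_root, old_folder),
                  os.path.join(download_root, new_folder))
    return new_folder
```

The returned folder name would have to be written back to the database, otherwise the next start could not find the downloaded episodes again.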
Additionally, this patch probably means that we can't rename podcasts while a download for this podcast is running. I think that's not a big problem, but I just wanted to point it out :)
Just tried this out on Fedora and Maemo. I'm getting this[1] error on Maemo when trying to add a podcast[2] with a "»" in its <title>. [1] http://slexy.org/raw/s2zM7EDHwN [2] http://feeds.feedburner.com/nlo unicode, grrr.
Found the problem. On maemo the encoding defaults to iso-8859-15 because the $LANG variable is set to en_US (at least for me), on my Fedora box it's set to en_US.utf8. So what happens when you run:
    u'»'.encode('iso-8859-15').encode('iso-8859-15')
?
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/encodings/iso8859_15.py", line 12, in encode
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xbb in position 0: unexpected code byte
In podcastChannel.get_save_dir(), util.sanitize_filename() is called twice on the same string which is why this error occurs. It takes a perfectly good unicode string and attempts to convert it to iso-8859-15. The solution for me was to set the default encoding to utf8. Is there a reason why we can't always use this? More importantly, I can't figure out how to prevent this error. Passing 'ignore' or 'replace' to encode() does nothing...
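A note on why 'ignore' and 'replace' have no effect: in Python 2, calling encode() on a byte string first decodes it implicitly with the default codec, and the error handler passed to encode() does not apply to that hidden decode step. One way around the double call is to make the conversion idempotent. The following is a Python 3 sketch with a hypothetical helper name, not the actual util.sanitize_filename():

```python
def to_filesystem_text(name, encoding='iso-8859-15'):
    """Return name limited to characters that the encoding can represent.

    The result is always text, never bytes, so calling the function a
    second time on its own output is harmless.
    """
    if isinstance(name, bytes):
        # Accept bytes from older callers instead of encoding them again.
        name = name.decode(encoding, 'replace')
    # Round-trip through the target encoding; unsupported characters
    # are replaced with '?' instead of raising an error.
    return name.encode(encoding, 'replace').decode(encoding)
```

With this shape, get_save_dir() could call the helper any number of times and still get the same string back.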
*** Bug 288 has been marked as a duplicate of this bug. ***
Here's an idea - how about taking whatever the file is called on the server and appending to that the MD5 number. This would ensure that it's a unique name (so no danger of overwriting) but also that we can make sense of which file is which. That is surely a pretty easy fix which would solve this issue. I'm currently on juice with wine which is an unpleasant combination no matter how you look at it, so would love to switch to gpodder. e.g. newspod_20090123-1745a.mp3 would become something like newspod_20090123-1745a-45435298092834534.mp3
@Lee: Agree. I was thinking the same thing when I read previous posts. That would be a good solution.
Created attachment 231 [details] Make patch apply on top of current git HEAD (feb 6, 09) Fix up the patch so that it applies and works with the current git HEAD.
Created attachment 232 [details] 0002-Add-first-32-bits-of-the-sha1-hash-of-the-episode-s.patch The attached patch appends the first 32 bits of the URL's sha1 hash to the filename. This should fix the filename collision problem; I tested it with the above-mentioned feeds which all have the same basename and it handled them perfectly.
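For readers who do not want to open the attachment, the idea can be sketched in a few lines of Python 3. This is an illustration, not the patch itself; the function name and the "-" separator are assumptions. The first 32 bits of a SHA-1 digest correspond to the first 8 hexadecimal characters.

```python
import hashlib
import posixpath
from urllib.parse import urlsplit


def unique_filename(url):
    """Return the URL's basename with a short hash of the full URL appended."""
    basename = posixpath.basename(urlsplit(url).path)
    stem, extension = posixpath.splitext(basename)
    # 8 hex characters = 32 bits. The query string is part of the hashed
    # URL, so episodes that differ only after the "?" get different names.
    digest = hashlib.sha1(url.encode('utf-8')).hexdigest()[:8]
    return '%s-%s%s' % (stem, digest, extension)
```

The name stays recognizable because the server's basename comes first, and the suffix is short enough not to clutter a file listing.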
Based on Nick's latest patches, I've finished writing the support for podcast filenames, which (I think) works quite well now, and produces "beautiful" file names. It's in gPodder's git "master" branch. Please test and report back any feeds that stopped working.
(In reply to comment #33) > based on Nick's latest patches, I've finished writing the support for podcast > filename, which (I think) works quite well now, and produces "beatiful" file > names. It's in gPodder's git "master" branch. Please test and report back any > feeds that stopped working. Thomas, can this patch be applied to an existing installation, i.e. an install that has existing MD5 style filenames or should a new clean install be made with this new version?
(In reply to comment #34) > (In reply to comment #33) > > based on Nick's latest patches, I've finished writing the support for podcast > > filename, which (I think) works quite well now, and produces "beatiful" file > > names. It's in gPodder's git "master" branch. Please test and report back any > > feeds that stopped working. > Thomas, can this patch be applied to an existing installation, i.e. an install > that has existing MD5 style filenames or should a new clean install be made > with this new version?
It _should_ work, and I'd be happy if you could test this for me. Here are the instructions (assuming your downloads go into ~/gpodder-downloads/ - if not, change the paths in the commands accordingly):
    cd ~
    tar czvf ~/gpodder_config_backup.tar.gz .config/gpodder/
    tar czvf ~/gpodder_downloads_backup.tar.gz gpodder-downloads/
After that, try the version by following the instructions in the Wiki: http://wiki.gpodder.org/wiki/Running_gPodder_from_Git
If there is some problem, please report it (send me both your old backups and the results after running the new version). I don't need the gpodder_downloads_backup.tar.gz file as a whole, but I need a listing of it:
    tar tzvf ~/gpodder_downloads_backup.tar.gz >~/gpodder_downloads_listing.txt
This way, I can re-create the directory structure before and after, and test it locally here. If anything goes wrong with the new version, and you want to continue with the old version until the bug is resolved, you can do the following:
    rm -rf ~/.config/gpodder/
    rm -rf ~/gpodder-downloads/
    tar xzvf gpodder_config_backup.tar.gz
    tar xzvf gpodder_downloads_backup.tar.gz
Good luck and please report how it works for you (I've tested it with my local data here, but would be happy to fix bugs that are encountered by testers before releasing it into the wild).
The file naming works for me (for old and new episodes). But MTP synchronisation (and I think also MP3 and iPod) is broken since the episodes.local_filename prototype has changed. The new "create" parameter should be optional.
(In reply to comment #36) > The file naming works for me (for old and new episodes). > But MTP synchronisation (and I thing also mp3 and iPod) is broken since the > episodes.local_filename prototype has changed. The new "create" parameter > should be optional. Please disregard my last message. I had an old sync.py mixed into my sources. Sorry for the inconvenience.
(In reply to comment #36) > episodes.local_filename prototype has changed. The new "create" parameter > should be optional. I've made the "create" parameter obligatory on purpose, so we spot uses of the function where we still need to determine if we need to have a filename or if we are just checking for the file existence. I suppose there will be several bug reports before we get it right, but the design and implementation will be cleaner then :)
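To show the effect of a required "create" argument, here is a small Python 3 sketch. The class and its attributes are hypothetical and much simpler than gPodder's episode object; only the method name and the parameter name come from this thread.

```python
import os


class Episode:
    """Hypothetical stand-in for an episode with a lazily chosen file name."""

    def __init__(self, download_dir, suggested_name):
        self.download_dir = download_dir
        self.suggested_name = suggested_name
        self.filename = None  # nothing is stored until a download needs it

    def local_filename(self, create):
        """Return the local path, choosing a file name only if create is true.

        "create" has no default on purpose: every caller has to state
        whether it needs a name (downloading) or only wants to check
        for an existing file.
        """
        if self.filename is None:
            if not create:
                return None
            self.filename = self.suggested_name
        return os.path.join(self.download_dir, self.filename)
```

A call site that was not updated, such as local_filename() without arguments, fails immediately with a TypeError, which is how forgotten uses get spotted.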
I have been running with this latest mod for about a week now and have not stumbled across any issues! :)
Thanks for your feedback. This will be included in 0.15.x then.
Closing. Ready for 0.15.0. Please open new reports in case you encounter bugs.