
How To Url-safe Encode A String With Python? And Urllib.quote Is Wrong

Hello, I was wondering if you know any other way to encode a string so that it is URL-safe, because urllib.quote seems to be doing it wrong; the output is different than expected: If I try urllib.

Solution 1:

According to RFC 3986, %C3%A1 is correct. Characters are supposed to be converted to an octet stream using UTF-8 before the octet stream is percent-encoded. The site you link is out of date.

See "Why does the encoding's of a URL and the query string part differ?" for more detail on the history of handling non-ASCII characters in URLs.
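
As a small illustration of the UTF-8 rule described in this answer, here is a sketch written for Python 3, where urllib.quote lives at urllib.parse.quote. The variable names are only for illustration, and the character is written as an escape so the example does not depend on the file encoding.

```python
from urllib.parse import quote  # Python 3 home of what Python 2 calls urllib.quote

word = "\u00e1"  # the character á, written as an escape

# Step 1: convert the character to an octet stream using UTF-8.
utf8_bytes = word.encode("utf-8")

# Step 2: percent-encode the octets. In Python 3, quote() uses UTF-8 by default.
encoded = quote(word)

print(utf8_bytes)  # b'\xc3\xa1'
print(encoded)     # %C3%A1
```

Each of the two UTF-8 octets becomes one %XX escape, which is why the result is %C3%A1 rather than a single escape.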

Solution 2:

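Ok, got it: I have to encode to ISO-8859-1 first, like this:

# Python 2: turn the unicode string into an ISO-8859-1 (Latin-1) byte string
word = u'á'
word = word.encode('iso-8859-1')
print word  # a single byte, which can then be passed to urllib.quote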
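
For anyone on Python 3, a rough equivalent of this approach might look like the sketch below. It assumes the receiving site really does expect ISO-8859-1, which is not what RFC 3986 recommends. The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

word = "\u00e1"  # the character á

# Ask quote() to use ISO-8859-1 instead of its UTF-8 default.
latin1_encoded = quote(word, encoding="iso-8859-1")
print(latin1_encoded)  # %E1

# Decoding has to name the same encoding to get the original character back.
decoded = unquote(latin1_encoded, encoding="iso-8859-1")
print(decoded == word)  # True
```

The catch is that both sides have to agree on the encoding out of band, because nothing in the URL itself says which one was used.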

Solution 3:

Python 2 reads source files as ASCII by default, so even though your file may be saved in a different encoding, your UTF-8 character is interpreted as two separate bytes rather than as one character.

Try putting a comment like this on the first or second line of your code to match the file encoding; you might also need to use u'á'.

# coding: utf-8
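
To see the "one character, two bytes" effect this answer is talking about, here is a small sketch in Python 3. Python 3 already defaults to UTF-8 source files, so the declaration on the first line is harmless there and only really matters on Python 2. The variable names are made up for the example.

```python
# coding: utf-8
from urllib.parse import quote

text = "\u00e1"                   # one character: á
utf8_bytes = text.encode("utf-8")  # the same character as UTF-8 bytes

print(len(text), len(utf8_bytes))  # 1 2

# Percent-encoding each byte on its own shows where the two escapes come from.
per_byte = [quote(bytes([b])) for b in utf8_bytes]
print(per_byte)  # ['%C3', '%A1']
```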

Solution 4:

In this question it seems someone wrote a pretty large function to convert to ASCII URLs, and that's what I need. But I was hoping there was some encoding tool in the standard library for the job.
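
For what it's worth, a much smaller version can be put together from the standard library alone. The sketch below is for Python 3 and to_ascii_url is a hypothetical helper name, not something from the linked question. It only percent-encodes the path, query and fragment as UTF-8 and leaves the host name untouched, so it is not a complete solution for internationalized domain names.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path, query and fragment.

    Hypothetical helper: the host name is passed through unchanged.
    """
    parts = urlsplit(url)
    # '%' is kept safe so sequences that are already escaped are not escaped twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))

result = to_ascii_url("http://example.com/caf\u00e9?q=\u00e1")
print(result)  # http://example.com/caf%C3%A9?q=%C3%A1
```

Treating '%' as safe is a design choice: it avoids double-encoding, but it also means a literal percent sign in the input is assumed to be part of an existing escape.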
