Saturday, 31 August 2013

Convert Unicode escapes of the form "\uxxxxxx" to string or text

I'm writing a Python script that extracts the URL of a Facebook video.
But in the source of the video page, I see some characters of the form
\uxxxxxx in the URL.
For instance, the URL is in this form:
https\u00253A\u00255C\u00252F\u00255C\u00252Ffbcdn-video-a.akamaihd.net\u00255C\u00252Fhvideo-ak-prn2\u00255C\u00252Fv\u00255C\u00252F753002_318048581647953_53890_n.mp4\u00253Foh\u00253D64e3e8ecf7e88f1da335d88949b2dc1f\u002526oe\u00253D52226D10\u002526__gda__\u00253D1377987338_9e37fb163a1d37d4b06ab7cff668f7dc\u002522\u00252C\u002522
\u00253A is a colon (:), but how do I convert it?
When I tried
>>> x.decode('unicode_escape').encode('ascii','ignore')
I get
'https%3A%5C%2F%5C%2Ffbcdn-video-a.akamaihd.net%5C%2Fhvideo-ak-prn2%5C%2Fv%5C%2F753002_318048581647953_53890_n.mp4%3Foh%3D64e3e8ecf7e88f1da335d88949b2dc1f%26oe%3D52226D10%26__gda__%3D1377987338_9e37fb163a1d37d4b06ab7cff668f7dc%22%2C%22'
I want the exact URL, not the percent-encoded form.
I searched a lot but couldn't find any help.
Thanks in advance
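
One possible approach, sketched here under some assumptions: the string appears to be encoded in three layers. `\u0025` is the escape for `%`, so `\u00253A` becomes `%3A`, which is the percent-encoded colon. After that, `%5C%2F` decodes to `\/`, which looks like a JSON-escaped slash. The helper name `decode_escaped_url` below is made up for illustration, and the code is written for Python 3, where `unquote` lives in `urllib.parse` (in Python 2, which the `x.decode(...)` call above suggests, the equivalent is `urllib.unquote`).

```python
from urllib.parse import unquote


def decode_escaped_url(raw):
    """Hypothetical helper: undo the three layers seen in the page source."""
    # Layer 1: literal \uXXXX escapes -> characters (\u0025 -> %).
    # Assumes the input is plain ASCII, as in the sample above.
    percent_encoded = raw.encode("ascii").decode("unicode_escape")
    # Layer 2: percent-encoding -> characters (%3A -> :, %5C%2F -> \/).
    json_escaped = unquote(percent_encoded)
    # Layer 3: JSON-style escaped slashes (\/ -> /).
    return json_escaped.replace("\\/", "/")


# Shortened sample taken from the start of the URL in the question.
sample = r"https\u00253A\u00255C\u00252F\u00255C\u00252Ffbcdn-video-a.akamaihd.net\u00255C\u00252Fhvideo-ak-prn2"
print(decode_escaped_url(sample))
```

Note that the full string in the question ends in `\u002522\u00252C\u002522`, which decodes to `","`. That is probably left over from the surrounding JSON rather than part of the URL, so it would need to be stripped separately.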
