Python - Reading Emoji Unicode Characters

September 15, 2010

I have a Python 2.7 program that reads iOS text messages from a SQLite database. The text messages are Unicode strings. In the following text message:

u'that\u2019s \U0001f63b'

The apostrophe is represented by \u2019, but the emoji is represented by \U0001f63b. I looked up the code point for the emoji in question, and it's \uf63b. I'm not sure where the 0001 is coming from. I know comically little about character encodings.

When I print the text, character by character, using:

s = u'that\u2019s \U0001f63b'
for c in s:
    print c.encode('unicode_escape')

The program produces the following output:

t
h
a
t
\u2019
s
 
\ud83d
\ude3b

How can I correctly read these last characters in Python? Am I using encode correctly here? Should I just attempt to trash the 0001s before reading it, or is there an easier, less silly way?

I don't think you're using encode correctly, nor do you need to. What you have is a valid Unicode string with one 4-digit and one 8-digit escape sequence. Try this in the REPL on, say, OS X:

>>> s = u'that\u2019s \U0001f63b'
>>> print s
that’s 😻
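The difference between the two escape forms can be checked directly. The sketch below assumes Python 3, where a str is a sequence of code points, and uses illustrative variable names. It shows that a lowercase \u escape consumes exactly four hex digits while an uppercase \U escape consumes exactly eight, so the leading 0001 is simply the upper digits of the code point U+1F63B and should not be stripped.

```python
# Assumes Python 3; the variable names are illustrative only.
s = 'that\u2019s \U0001f63b'

apostrophe = s[4]   # the character written as \u2019
emoji = s[-1]       # the character written as \U0001f63b

print(len(s))                # 8
print(hex(ord(apostrophe)))  # 0x2019
print(hex(ord(emoji)))       # 0x1f63b

# A lowercase \u reads only four hex digits; the remaining text is literal.
misread = '\u0001f63b'
print(len(misread))          # 5: U+0001 followed by the letters f, 6, 3, b
```

If the emoji were written with a lowercase \u, the string would hold a control character followed by four ordinary characters, which is probably not what was intended.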

In Python 3, though:

Python 3.4.3 (default, Jul  7 2015, 15:40:07)
>>> s = u'that\u2019s \U0001f63b'
>>> s[-1]
'😻'
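The \ud83d \ude3b pair in the question's output is consistent with a "narrow" Python 2 build, which stores any code point above U+FFFF as a UTF-16 surrogate pair and iterates over the two halves separately. Here is a minimal sketch of that arithmetic, assuming Python 3. The helper names to_surrogates and from_surrogates are hypothetical, not part of any library.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F63B)
print(hex(high), hex(low))               # 0xd83d 0xde3b
print(hex(from_surrogates(high, low)))   # 0x1f63b

# Cross-check against the standard UTF-16 codec (big-endian, no BOM).
print('\U0001f63b'.encode('utf-16-be') == bytes([0xD8, 0x3D, 0xDE, 0x3B]))  # True
```

So the two escapes printed in the question are the two halves of the same emoji, not two separate characters to clean up.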
