First, this may not work in all cases; I tested and researched it only for:
The concept is the same as it appears here in Gmail.
Here is the icon alone:
Short description
They are referred to internally as goomoji, and they appear to be a non-standard UTF-8 extension. When Gmail finds one of these characters, it replaces it with the corresponding icon. I could not find any documentation about them, but I did reverse-engineer the format.
So how does this work?
We know that, in some way, 876Urg== means the 52E icon, but how?
If we base64-decode 876Urg==, we get 0xf3be94ae. This looks like the following in binary:
    11110011 10111110 10010100 10101110
These bits are consistent with a 4-byte UTF-8 encoded character:
    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
So the relevant bits are as follows:
    011 111110 010100 101110
Or, when aligned to 4-bit groups:
    0000 1111 1110 0101 0010 1110
In hexadecimal, these bits are as follows:
    FE52E
As you can see, with the exception of the FE prefix, which is presumably there to distinguish the goomoji icons from other UTF-8 characters, it matches the 52E in the icon URL. Some testing shows that this holds true for other icons.
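The manual steps above can be cross-checked with a short Python sketch. This is only an illustration of the bit extraction, not part of the original test code: the helper name `utf8_payload_bits` is made up for this example, and it handles only the 4-byte UTF-8 form discussed here.

```python
import base64


def utf8_payload_bits(raw: bytes) -> int:
    """Extract the 21 payload bits from a 4-byte UTF-8 sequence.

    Hypothetical helper for illustration; it handles only the form
    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
    """
    if len(raw) != 4 or raw[0] >> 3 != 0b11110:
        raise ValueError("not a 4-byte UTF-8 sequence")
    if any(b >> 6 != 0b10 for b in raw[1:]):
        raise ValueError("invalid continuation byte")
    value = raw[0] & 0b00000111  # 3 payload bits from the lead byte
    for b in raw[1:]:
        value = (value << 6) | (b & 0b00111111)  # 6 payload bits each
    return value


raw = base64.b64decode("876Urg==")
value = utf8_payload_bits(raw)

print(raw.hex())              # f3be94ae
print(format(value, "021b"))  # 011111110010100101110
print(format(value, "X"))     # FE52E
```

The result agrees with Python's own UTF-8 decoder, so `ord(raw.decode("utf-8"))` gives the same value.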
It seems like a lot of work; is there a converter?
This can, of course, be scripted. I created the following Python code for my tests. These functions convert the base64-encoded string to and from the short hex string found in the URL. Note that this code is written for Python 3 and is not compatible with Python 2.
Conversion functions:
    import base64

    def goomoji_decode(code):
        # Base64 decode.
        binary = base64.b64decode(code)
        # UTF-8 decode.
        decoded = binary.decode('utf8')
        # Get the code point value of the single decoded character.
        value = ord(decoded)
        # Hex encode, trim the 'FE' prefix, and uppercase.
        return format(value, 'x')[2:].upper()

    def goomoji_encode(code):
        # Add the 'FE' prefix and parse as hexadecimal.
        value = int('FE' + code, 16)
        # Convert to a character.
        encoded = chr(value)
        # Encode the character to UTF-8 bytes.
        binary = encoded.encode('utf8')
        # Base64 encode and return a string.
        return base64.b64encode(binary).decode('utf-8')
Examples:
    print(goomoji_decode('876Urg=='))
    print(goomoji_encode('52E'))
Output:
    52E
    876Urg==
And of course, finding the URL of an icon simply requires creating a new draft in Gmail, inserting the desired icon, and inspecting it with your browser's DOM inspector.
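Since Gmail replaces these characters wherever it finds them, it can be handy to locate them in a decoded string. The following is a minimal sketch under a stated assumption: that every goomoji code point is five hex digits starting with FE, that is, the range U+FE000 to U+FEFFF. Only FE52E is confirmed above; the range constants and the function name `find_goomoji` are hypothetical.

```python
import base64

# Assumption: goomoji code points are five hex digits starting with "FE",
# i.e. U+FE000..U+FEFFF. Only U+FE52E is confirmed by the analysis above.
GOOMOJI_FIRST = 0xFE000
GOOMOJI_LAST = 0xFEFFF


def find_goomoji(text):
    """Return the short hex codes of all characters in the assumed range."""
    return [
        format(ord(ch), "X")[2:]  # drop the 'FE' prefix
        for ch in text
        if GOOMOJI_FIRST <= ord(ch) <= GOOMOJI_LAST
    ]


# Sample string: plain text plus the character decoded from 876Urg==.
icon = base64.b64decode("876Urg==").decode("utf-8")
sample = "Hello " + icon + " world"
print(find_goomoji(sample))  # ['52E']
```

If the real range turns out to be narrower or wider, only the two constants need to change.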