@Gocrazy
Created July 26, 2018 08:17
# Check whether a character is CJK (Python 3).
# Source: https://stackoverflow.com/questions/30069846/how-to-find-out-chinese-or-japanese-character-in-a-string-in-python/30070664
# Inclusive code point ranges, named after their Unicode blocks.
ranges = [
    {"from": ord(u"\u3300"), "to": ord(u"\u33ff")},  # CJK Compatibility
    {"from": ord(u"\ufe30"), "to": ord(u"\ufe4f")},  # CJK Compatibility Forms
    {"from": ord(u"\uf900"), "to": ord(u"\ufaff")},  # CJK Compatibility Ideographs
    {"from": ord(u"\U0002F800"), "to": ord(u"\U0002fa1f")},  # CJK Compatibility Ideographs Supplement
    {"from": ord(u"\u30a0"), "to": ord(u"\u30ff")},  # Katakana (Hiragana, U+3040-U+309F, is not covered)
    {"from": ord(u"\u2e80"), "to": ord(u"\u2eff")},  # CJK Radicals Supplement
    {"from": ord(u"\u4e00"), "to": ord(u"\u9fff")},  # CJK Unified Ideographs
    {"from": ord(u"\u3400"), "to": ord(u"\u4dbf")},  # CJK Unified Ideographs Extension A
    {"from": ord(u"\U00020000"), "to": ord(u"\U0002a6df")},  # CJK Unified Ideographs Extension B
    {"from": ord(u"\U0002a700"), "to": ord(u"\U0002b73f")},  # CJK Unified Ideographs Extension C
    {"from": ord(u"\U0002b740"), "to": ord(u"\U0002b81f")},  # CJK Unified Ideographs Extension D
    {"from": ord(u"\U0002b820"), "to": ord(u"\U0002ceaf")},  # CJK Unified Ideographs Extension E (added in Unicode 8.0)
]
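def is_cjk(char):
    """Return True if the single character `char` lies in one of `ranges`."""
    code = ord(char)  # ord() raises TypeError unless char is exactly one character
    return any(r["from"] <= code <= r["to"] for r in ranges)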