1
clino 2017-04-19 16:17:35 +08:00 2
Then you first need to figure out what encoding the string actually is.
|
2
Feva 2017-04-19 16:22:30 +08:00
# coding=utf-8
or
# -*- coding:utf-8 -*-
Does the backend file have either of these?
|
3
JasperYanky OP @Feva The script used to work fine. Now the data in the JSON has changed, and the error is probably caused by some Chinese character.
|
4
JasperYanky OP @clino Python 3 strings are Unicode by default, aren't they?
|
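A minimal sketch of the distinction behind the two replies above, assuming Python 3: `str` holds Unicode code points, while an encoding only comes into play when text is converted to or from `bytes`. The variable names are illustrative only.

```python
# In Python 3, a str is a sequence of Unicode code points.
text = '你好'
char_count = len(text)           # 2 characters

# Encoding turns the text into bytes; UTF-8 uses 3 bytes per character here.
data = text.encode('utf-8')
byte_count = len(data)           # 6 bytes

# Decoding with the matching encoding restores the original text.
round_trip = data.decode('utf-8')
```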
5
Feva 2017-04-19 16:28:30 +08:00 1
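@JasperYanky This is clearly an error from handling non-ASCII characters.
If adding the encoding declaration at the top of the .py file doesn't help, also add the following code:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
|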
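Note that `reload(sys)` followed by `sys.setdefaultencoding` is a Python 2 idiom; in Python 3 there is no builtin `reload` and `sys.setdefaultencoding` does not exist. Since the OP mentions Python 3, a rough equivalent is to pass the encoding explicitly wherever text crosses a file boundary. The sketch below assumes the JSON data is stored as UTF-8; the file name and the sample dictionary are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical sample data containing non-ASCII text.
data = {'greeting': '你好'}

# Hypothetical file location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'data.json')

# Write with an explicit encoding instead of relying on the platform default.
with open(path, 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False)

# Read back with the same explicit encoding.
with open(path, encoding='utf-8') as f:
    loaded = json.load(f)
```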
6
ipwx 2017-04-19 17:15:33 +08:00
The Python script doesn't declare an encoding, so the interpreter is probably the one raising the error; it shouldn't be JSON's fault.
|
7
ipwx 2017-04-19 17:15:55 +08:00
$ ipython
Python 3.6.0 |Anaconda 4.3.1 (x86_64)| (default, Dec 23 2016, 13:19:00)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import json

In [2]: json.dumps('你好')
Out[2]: '"\\u4f60\\u597d"'
|
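The session above shows that `json.dumps` escapes non-ASCII characters by default, so its output is plain ASCII. As a small sketch of the alternative, the standard `ensure_ascii=False` argument keeps the characters as they are; both forms parse back to the same string. The variable names are illustrative only.

```python
import json

# Default behavior: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps('你好')

# With ensure_ascii=False the characters are emitted unchanged.
readable = json.dumps('你好', ensure_ascii=False)

# Both representations decode to the same Python string.
same = json.loads(escaped) == json.loads(readable)
```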
8
yucongo 2017-04-19 23:51:16 +08:00
It may be that the Chinese web page doesn't fully follow the standard. Try resp.encoding = 'utf-8':
url = '...'
resp = requests.get(url)
resp.encoding = 'UTF-8'
=====
In [189]: url = 'http://www.baidu.com'
In [190]: import requests
In [191]: resp = requests.get(url)
In [192]: resp.encoding
Out[192]: 'ISO-8859-1'  # bad !
In [200]: chardet.detect(resp.content)
Out[200]: {'confidence': 0.99, 'encoding': 'utf-8'}
In [201]: resp.encoding = 'utf-8'
|
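To see why the wrong `resp.encoding` matters, here is a network-free sketch that simulates the situation with plain bytes. It assumes the server sent UTF-8 bytes that were then decoded as ISO-8859-1; the `raw` bytes stand in for `resp.content` and are not real response data.

```python
# Simulated response body: UTF-8 bytes, standing in for resp.content.
raw = '你好'.encode('utf-8')

# Decoding with the wrong charset (ISO-8859-1) produces mojibake, not an error,
# because ISO-8859-1 assigns a character to every possible byte value.
wrong = raw.decode('iso-8859-1')

# Decoding with the correct charset recovers the text.
right = raw.decode('utf-8')

# For the same reason, an ISO-8859-1 mis-decode can be undone losslessly.
recovered = wrong.encode('iso-8859-1').decode('utf-8')
```

Setting `resp.encoding = 'utf-8'` before reading `resp.text` corresponds to the `right` case here.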